Internet DRAFT - draft-schanzen-r5n
draft-schanzen-r5n
Independent Stream M. Schanzenbach
Internet-Draft Fraunhofer AISEC
Intended status: Informational C. Grothoff
Expires: 20 August 2024 Berner Fachhochschule
B. Fix
GNUnet e.V.
17 February 2024
The R5N Distributed Hash Table
draft-schanzen-r5n-04
Abstract
This document contains the R^5N DHT technical specification. R^5N is
a secure distributed hash table (DHT) routing algorithm and data
structure for decentralized applications. It features an open peer-
to-peer overlay routing mechanism which supports ad-hoc
permissionless participation and support for topologies in
restricted-route environments. Optionally, the paths data takes
through the overlay can be recorded, allowing decentralized
applications to use the DHT to discover routes.
This document defines the normative wire format of protocol messages,
routing algorithms, cryptographic routines and security
considerations for use by implementers.
This specification was developed outside the IETF and does not have
IETF consensus. It is published here to guide implementation of R^5N
and to ensure interoperability among implementations including the
pre-existing GNUnet implementation.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 20 August 2024.
Schanzenbach, et al. Expires 20 August 2024 [Page 1]
Internet-Draft The R5N Distributed Hash Table February 2024
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Requirements Notation . . . . . . . . . . . . . . . . . . 4
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1. Restricted-route topologies . . . . . . . . . . . . . . . 6
3.2. Key differences to RELOAD . . . . . . . . . . . . . . . . 6
4. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 7
5. Underlay . . . . . . . . . . . . . . . . . . . . . . . . . . 9
6. Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
6.1. Routing Table . . . . . . . . . . . . . . . . . . . . . . 11
6.2. Peer Discovery . . . . . . . . . . . . . . . . . . . . . 12
6.3. Peer Bloom Filter . . . . . . . . . . . . . . . . . . . . 13
6.4. Routing Functions . . . . . . . . . . . . . . . . . . . . 14
6.5. Pending Table . . . . . . . . . . . . . . . . . . . . . . 15
7. Message Processing . . . . . . . . . . . . . . . . . . . . . 16
7.1. Message components . . . . . . . . . . . . . . . . . . . 16
7.1.1. Flags . . . . . . . . . . . . . . . . . . . . . . . . 17
7.1.2. Path Element . . . . . . . . . . . . . . . . . . . . 17
7.2. HelloMessage . . . . . . . . . . . . . . . . . . . . . . 23
7.2.1. Wire Format . . . . . . . . . . . . . . . . . . . . . 23
7.2.2. Processing . . . . . . . . . . . . . . . . . . . . . 24
7.3. PutMessage . . . . . . . . . . . . . . . . . . . . . . . 24
7.3.1. Wire Format . . . . . . . . . . . . . . . . . . . . . 24
7.3.2. Processing . . . . . . . . . . . . . . . . . . . . . 27
7.4. GetMessage . . . . . . . . . . . . . . . . . . . . . . . 28
7.4.1. Wire Format . . . . . . . . . . . . . . . . . . . . . 28
7.4.2. Result Filter . . . . . . . . . . . . . . . . . . . . 30
7.4.3. Processing . . . . . . . . . . . . . . . . . . . . . 30
7.5. ResultMessage . . . . . . . . . . . . . . . . . . . . . . 32
7.5.1. Wire Format . . . . . . . . . . . . . . . . . . . . . 32
7.5.2. Processing . . . . . . . . . . . . . . . . . . . . . 35
8. Blocks . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Schanzenbach, et al. Expires 20 August 2024 [Page 2]
Internet-Draft The R5N Distributed Hash Table February 2024
8.1. Block Operations . . . . . . . . . . . . . . . . . . . . 37
8.2. HELLO Blocks . . . . . . . . . . . . . . . . . . . . . . 38
8.3. Persistence . . . . . . . . . . . . . . . . . . . . . . . 42
8.3.1. Approximate Search Considerations . . . . . . . . . . 43
8.3.2. Caching Strategy Considerations . . . . . . . . . . . 43
9. Security Considerations . . . . . . . . . . . . . . . . . . . 44
9.1. Disjoint Underlay or Application Protocol Support . . . . 44
9.2. Approximate Result Filtering . . . . . . . . . . . . . . 44
9.3. Access Control . . . . . . . . . . . . . . . . . . . . . 44
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 45
11. GANA Considerations . . . . . . . . . . . . . . . . . . . . . 45
11.1. Block Type Registry . . . . . . . . . . . . . . . . . . 45
11.2. GNUnet URI Schema Subregistry . . . . . . . . . . . . . 45
11.3. GNUnet Signature Purpose Registry . . . . . . . . . . . 46
11.4. GNUnet Message Type Registry . . . . . . . . . . . . . . 46
12. Test Vectors . . . . . . . . . . . . . . . . . . . . . . . . 46
13. Normative References . . . . . . . . . . . . . . . . . . . . 47
14. Informative References . . . . . . . . . . . . . . . . . . . 48
Appendix A. Bloom filters in R^5N . . . . . . . . . . . . . . . 48
Appendix B. Overlay Operations . . . . . . . . . . . . . . . . . 49
B.1. GET operation . . . . . . . . . . . . . . . . . . . . . . 49
B.2. PUT operation . . . . . . . . . . . . . . . . . . . . . . 51
Appendix C. HELLO URLs . . . . . . . . . . . . . . . . . . . . . 51
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 52
1. Introduction
This specification describes the protocol of R^5N. R^5N is a
Distributed Hash Table (DHT). The name is an acronym for "randomized
recursive routing for restricted-route networks" and its first
academic description can be found in [R5N].
DHTs are a key data structure for the construction of decentralized
applications and generally provide a robust and efficient means to
distribute the storage and retrieval of key-value pairs.
The core idea behind R^5N is to combine a randomized routing
algorithm with an efficient, deterministic closest-peer algorithm.
This allows us to construct an algorithm that is able to escape and
circumvent restricted route environments while at the same time allow
for a logarithmically bounded routing complexity.
R^5N also includes advanced features like recording the path a key-
value pair took through the network, response filters and on-path
application-specific data validation.
Schanzenbach, et al. Expires 20 August 2024 [Page 3]
Internet-Draft The R5N Distributed Hash Table February 2024
This document defines the normative wire format of peer-to-peer
messages, routing algorithms, cryptographic routines and security
considerations for use by implementors.
1.1. Requirements Notation
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
2. Terminology
Address An _Address_ is a UTF-8 [RFC3629] string which can be used
to address a _peer_ through the Underlay (Section 5). The format
of an address is not enforced by this specification, but it is
expected that in most cases the address is a URI [RFC3986].
Applications _Applications_ are higher-layer components which
directly use the _Core Operations_. Possible _applications_
include the GNU Name System [RFC9498] and the GNUnet Confidential
Ad-hoc Decentralized End-to-End Transport (CADET) [cadet].
Core Operations The _Core Operations_ provide an interface to the
core operations of the DHT overlay to _applications_. This
includes storing _blocks_ in the DHT and retrieving _blocks_ from
the DHT.
Block Variable-size unit of payload stored in the DHT under a _key_.
In the context of "key-value stores" this refers to "value" stored
under a _key_.
Block Storage The _block storage_ component is used to persist and
manage _blocks_ stored by _peers_. It includes logic for
enforcing storage quotas, caching strategies and block validation.
Block-Type A unique 32-bit value identifying the data format of a
_block_. _Block-types_ are either private or registered in the
GANA block type registry (see Section 11.1).
Bootstrapping _Bootstrapping_ is the process of establishing a
connection to the peer-to-peer network. It requires an initial,
non-empty set of reachable _peers_ and corresponding _addresses_
supported by the implementation to connect to.
Initiator The _peer_ that initially creates and sends a DHT protocol
message (Section 7.2, Section 7.3, Section 7.4, Section 7.5).
Schanzenbach, et al. Expires 20 August 2024 [Page 4]
Internet-Draft The R5N Distributed Hash Table February 2024
HELLO block A HELLO block is a _block_ with a _block-type_ DHT_HELLO
(13). A HELLO block is used to store and retrieve _addresses_ of
a _peer_. HELLO blocks are used by the peer discovery mechanism
in Section 6.2.
HELLO URL HELLO URLs are HELLO blocks represented as URLs. They are
used for out-of-band exchanges of _peer_ _addresses_ and for
signalling address updates to _neighbours_. Implementation
details of HELLO URLs and examples are found in Appendix C.
Key 512-bit identifier of a location in the DHT. Multiple Blocks
can be stored under the same _key_. A _peer identity_ is also a
key. In the context of "key-value stores" this refers to "key"
under which values (_blocks_) are stored.
Message Processing The _message processing_ component of the DHT
implementation processes requests from and generates responses to
_applications_ and the _underlay interface_.
Neighbor A neighbor is a _peer_ which is directly able to
communicate with our _peer_ via the _underlay interface_.
Peer A host that is participating in the overlay by running an
implementation of the DHT protocol. Each participating host is
responsible for holding some portion of the data that has been
stored in the overlay, and they are responsible for routing
messages on behalf of other _peers_ as needed by the _routing
algorithm_.
Peer Identity The _peer identity_ is the identifier used on the
overlay to identify a _peer_. It is a SHA-512 hash of the _peer
public key_.
Peer Public Key The _peer public key_ is the key used to
authenticate a _peer_ in the underlay.
Routing The _routing_ component includes the routing table as well
as routing and _peer_ selection logic. It facilitates the R^5N
routing algorithm with required data structures and algorithms.
Underlay Interface The _underlay interface_ is an abstraction layer
on top of the supported links of a _peer_. Peers may be linked by
a variety of different transports, including "classical" protocols
such as TCP, UDP and TLS or higher-layer protocols such as GNUnet,
I2P or Tor.
3. Motivation
Schanzenbach, et al. Expires 20 August 2024 [Page 5]
Internet-Draft The R5N Distributed Hash Table February 2024
3.1. Restricted-route topologies
Restricted-route topologies emerge when a connected underlay topology
prevents (or restricts) direct connections between some of the nodes.
This commonly occurs through the use of NAT. Nodes operated behind a
NAT cause common DHT routing algorithms such as Kademlia to exhibit
degraded performance or even to fail. While excluding such nodes is
an option, this limits load distribution and is ineffective for some
physical networks.
Nodes which in terms of a classical distance metric such as XOR would
be considered close may not be reachable, for example due to a
firewall or NAT. This leads to multiple (local) minima with respect
to where data may be stored or where data can be retrieved. From a
particular fixed location in the network, a node may only be able to
find and and store data in the context of its local minimum.
R^5N addresses this problem by prepending a random walk before a
classical, deterministic XOR-based routing algorithm is employed. If
the network exhibits the properties of a small world topology, such a
random walk will cause the algorithm to land on a random node in the
network. Consequently, the deterministic part of the algorithm will
encounter a random local minimum. It is then possible to repeat this
process in order to store or retrieve data in the context of all or
at least multiple local minima. The number of repetitions expected
to cover all local minima depends on the current network size and
this one of the parameters of the R^5N routing algorithm.
3.2. Key differences to RELOAD
[RFC6940] specifies the RELOAD DHT. The R^5N DHT described in this
document differs from RELOAD in its objectives and thus in its
design. The authors of RELOAD make the case that P2P networks are
often established among a set of peers that do not trust each other.
It addresses this issue by requiring that node identifiers are either
assigned by a central authority, or self-issued in the case of closed
networks. In other words, by enforcing the P2P network to be
established among a set of _trusted_ peers. This misses the point
that this openness is a core requirement of efficient and useful DHTs
as they serve a fundamental part in a decentralized network
infrastructure. R^5N, by contrast, is intended for open overlay
networks, and thus does not include a central enrollment server to
certify participants and does not limit participation in another way.
As participants could be malicious, R^5N includes on-path
customizable key-value validation to delete malformed data and path
randomiziation to help evade malicious peers. R^5N also expects to
perform over a network where not each peer can communicate with every
other peer, and where thus its route discovery feature provides
Schanzenbach, et al. Expires 20 August 2024 [Page 6]
Internet-Draft The R5N Distributed Hash Table February 2024
utility to higher-level applications. As a result, both the features
and the security properties of RELOAD and R^5N are different, except
in that both allow storing and retrieving key-value pairs.
4. Overview
In R^5N peers provide the two fundamental core operations of any DHT
to their applications:
* PUT: This operation stores a _block_ under a _key_ on one or more
_peers_ with the goal of making the _block_ availiable for queries
using the GET operation. In the classical definition of a
dictionary interface, this operation would be called "insert".
* GET: This operation queries the network of peers for any number of
_blocks_ previously stored under or near a _key_. In the
classical definition of a dictionary interface, this operation
would be called "find".
An example for possible semantics of the above operations provided as
an API to applications by an implementation are outlined in
Appendix B.
A _peer_ does not necessarily need to expose the above operations to
_applications_, but it commonly will. A _peer_ that does not expose
the above operations could be a host purely used for _bootstrapping_,
_routing_ or supporting the overlay network with resources.
Similarly, there could be hosts on the network that participate in
the DHT but do not route traffic or store data. Examples for such
hosts would be mobile devices with limited bandwidth, battery and
storage capacity. Such hosts may be used to run applications that
use the DHT. However, we will not refer to such hosts as _peers_.
In a trivial scenario where there is only one _peer_ (on the local
host), R^5N operates similarly to a dictionary data structure.
However, the default use case is one where nodes communicate directly
and indirectly in order to realize a distributed storage mechanism.
This communication requires a lower-level peer addressing and message
transport mechanism such as TCP/IP. R^5N is agnostic to the
underlying transport protocol which is why this document defines a
common addressing and messaging interface in Section 5. The
interface provided by this underlay is used across the specification
of the R^5N protocol. It also serves as a set of requirements of
possible transport mechanisms that can be used to implement R^5N
with. That being said, common transport protocols such as TCP/IP or
UDP/IP and their interfaces are suitable R^5N underlays used by
existing implementations.
Schanzenbach, et al. Expires 20 August 2024 [Page 7]
Internet-Draft The R5N Distributed Hash Table February 2024
Specifics about the protocols of the underlays implementing the
_underlay interface_ or the _applications_ using the DHT are out of
the scope of this document.
To establish an initial connection to a network of R^5N peers, at
least one initial, addressable _peer_ is required as part of the
_bootstrapping_ process. Further _peers_, including _neighbors_, are
then learned via a peer discovery process as defined in Section 6.2.
Across this document, the functional components of an R^5N
implementation are divided into _routing_ (Section 6), _message
processing_ (Section 7) and block processing (Section 8).
_Applications_ that require application-specific _block_ payloads are
expected to register a _Block-Type_ in the GANA _Block-Type_ registry
(Section 11.1) and provide a specification of the associated block
operations (Section 8.1). to implementors of R^5N. Figure 1
illustrates the architectural overview of R^5N.
| +-----------------+ +-------+
Applications | | GNU Name System | | CADET | ...
| +-----------------+ +-------+
-------------+------------------------------------ Core Operations
| ^
| | +---------------+
| | | Block Storage |
| | +---------------+
| | ^
R5N | v v
| +--------------------+ +---------+
| | Message Processing |<-->| Routing |
| +--------------------+ +---------+
| ^ ^
| v v
-------------+------------------------------------ Underlay Interface
| +--------+ +--------+ +----------+
| |GNUnet | |IP | | QUIC |
Connectivity | |Underlay| |Underlay| | Underlay | ...
| |Link | |Link | | Link |
| +--------+ +--------+ +----------+
Figure 1: The R5N architecture.
Schanzenbach, et al. Expires 20 August 2024 [Page 8]
Internet-Draft The R5N Distributed Hash Table February 2024
5. Underlay
How peers are addressed in the underlay is out of scope of this
document. For example, a peer may have a TCP/IP address, or expose a
QUIC endpoint. While the specific addressing options and mechanisms
are out of scope, it is necessary to define a universal addressing
format in order to facilitate the distribution of _address_
information to other _peers_ in the DHT overlay. This standardized
format is the _HELLO Block_ (described in Section 8.2), which
contains sets of addresses. If the address is a URI, it may indicate
which underlay understands the respective _address_ details.
It is expected that the underlay provides basic mechanisms to manage
peer connectivity and addressing. The essence of the _underlay
interface_ is captured by the following set of API calls:
TRY_CONNECT(P, A) This call allows an implementation to signal to
the underlay that it wants to establish a connection to another
peer P using an address A. If the connection attempt is
successful, information on the new peer will be offered through
the PEER_CONNECTED signal.
HOLD(P) This call tells the underlay to keep a hold on to a
connection to a peer P. Underlays are usually limited in the
number of active connections. With this function the DHT can
indicate to the underlay which connections should preferably be
preserved.
DROP(P) This call tells the underlay to drop the connection to a
peer P. This call is only there for symmetry and used during the
peer's shutdown to release all of the remaining HOLDs. As R^5N
always prefers the longest-lived connections, it would never drop
an active connection that it has called HOLD() on before.
Nevertheless, underlay implementations should not rely on this
always being true. A call to DROP() also does not imply that the
underlay must close the connection: it merely removes the
preference to preserve the connection that was established by
HOLD().
SEND(P, M) This call allows the local peer to send a protocol
message M to a peer P.
ESTIMATE_NETWORK_SIZE() -> L2NSE A call that provides an estimate of
Schanzenbach, et al. Expires 20 August 2024 [Page 9]
Internet-Draft The R5N Distributed Hash Table February 2024
the network size. The result, L2NSE, must be the base-2 logarithm
of the estimated number of peers in the network. It is used by
the routing algorithm. If the underlay does not support a
protocol for network size estimation (such as cite paper NSE) the
value is assumed to be provided as a configuration parameter to
the implementation.
The above calls are meant to be actively executed by the
implementation as part of the peer-to-peer protocol. In addition,
the underlay is expected to emit the following signals (usually
implemented as callbacks) based on network events observed by the
underlay implementation:
PEER_CONNECTED -> P is a signal that allows the DHT to react to a
newly connected peer P. Such an event triggers, for example,
updates in the routing table and gossiping of HELLOs to that peer.
Underlays may include meta-data about the connection, for example
to indicate that the connection is from a resource-constrained
host that does not intend to function as a full _peer_ and thus
should not be considered for routing.
PEER_DISCONNECTED -> P is a signal that allows the DHT to react to a
recently disconnected peer. Such an event triggers, for example,
updates in the routing table.
ADDRESS_ADDED -> A The underlay signals indicates that an address A
was added for our local peer and that henceforth the peer may be
reachable under this address. This information is used to
advertise connectivity information about the local peer to other
peers. A must be a URI suitable for inclusion in a HELLO payload
Section 8.2.
ADDRESS_DELETED -> A This underlay signal indicates that an address
A was removed from the set of addresses the local peer is possibly
reachable under. Addresses must have been added before they may
be deleted. This information is used to no longer advertise this
address to other peers.
RECEIVE -> (P, M) This signal informs the local peer that a protocol
message M was received from a peer P.
These signals then drive updates of the routing table, local storage
and message transmission.
Schanzenbach, et al. Expires 20 August 2024 [Page 10]
Internet-Draft The R5N Distributed Hash Table February 2024
6. Routing
To enable routing, any R^5N implementation must keep information
about its current set of neighbors. Upon receiving a connection
notification from the _underlay interface_ through a PEER_CONNECTED
signal, information on the new neighbor MUST be added to the routing
table, except if the respective k-bucket in the routing table is full
or if meta-data is present that indicates that the peer does not wish
to participate in routing. Peers added to the routing table SHOULD
be signalled to the underlay as important connections using a HOLD
call. Similarly when a disconnect is indicated by the underlay
through a PEER_DISCONNECTED signal, the peer MUST be removed from the
routing table.
To achieve logarithmically bounded routing performance, the data
structure for managing neighbors and their metadata MUST be
implemented using the k-buckets concept of [Kademlia] as defined in
Section 6.1. Maintenance of the routing table (after
_bootstrapping_) is described in Section 6.2.
Unlike [Kademlia], routing decisions in R^5N are also influenced by a
Bloom filter in the message that prevents routing loops. This data
structure is discussed in Section 6.3.
In order to select peers which are suitable destinations for routing
messages, R^5N uses a hybrid approach: Given an estimated network
size L2NSE retrieved using ESTIMATE_NETWORK_SIZE (), the peer
selection for the first L2NSE hops is random. After the initial
L2NSE hops, peer selection follows an XOR-based peer distance
calculation. Section 6.4 describes the corresponding routing
functions.
6.1. Routing Table
Whenever a PEER_CONNECTED signal is received from the underlay, the
respective peer is considered for insertion into the routing table.
The routing table consists of an array of k-buckets. Each k-bucket
contains a list of _neighbors_. The i-th k-bucket stores neighbors
whose peer public keys are between distance 2^i and 2^(i+1) from the
local peer. System constraints will typically force an
implementation to impose some upper limit on the number of
_neighbors_ kept per k-bucket. Upon insertion, the implementation
MUST call HOLD on the respective _neighor_.
Implementations SHOULD try to keep at least 5 entries per k-bucket.
Embedded systems that cannot manage this number of connections MAY
use connection-level signalling to indicate that they are merely a
client utilizing a DHT and not able to participate in routing. DHT
Schanzenbach, et al. Expires 20 August 2024 [Page 11]
Internet-Draft The R5N Distributed Hash Table February 2024
peers receiving such connections MUST NOT include connections to such
restricted systems in their k-buckets, thereby effectively excluding
them when making routing decisions.
If a system hits constraints with respect to the number of active
connections, an implementation MUST evict _neighbours_ from those
k-buckets with the largest number of neighbors. The eviction
strategy MUST be to drop the shortest-lived connection per k-bucket
first.
Implementations MAY cache valid _addresses_ of disconnected _peers_
outside of the routing table and sporadically or periodically try to
(re-)establish connection to the _peer_ by making TRY_CONNECT calls
to the _underlay interface_ if the respective k-bucket has empty
slots.
6.2. Peer Discovery
Initially, implementations depend upon either the underlay providing
at least one initial connection to a _neighbor_ (signalled through
PEER_CONNECTED), or the _application_ or even end-user providing at
least one working HELLO which is then in turn used to call
TRY_CONNECT on the underlay in order to trigger a subsequent
PEER_CONNECTED signal from the _underlay interface_. This is
commonly achieved through the configuration of hardcoded bootstrap
peers or bootstrap servers either for the underlay or the R^5N
implementation. While details on how the first connection is
established MAY depend on the specific implementation, this SHOULD
usually be done by an out-of-band exchange of the information from a
HELLO block. Appendix C specifies a URL format for encoding HELLO
blocks as text strings. The URL format thus provides a portable,
human-readable, text-based serialization format that can, for
example, be encoded into a QR code for dissemination. HELLO URLs
SHOULD be supported by implementations for both import and export of
HELLOs.
Schanzenbach, et al. Expires 20 August 2024 [Page 12]
Internet-Draft The R5N Distributed Hash Table February 2024
To discover peers for its routing table, a peer will initiate
GetMessage requests (see Section 7.4) asking for blocks of type HELLO
using its own peer identity in the QUERY_HASH field of the message.
The PEER_BF is initialized and set using the peers own peer identity
as well as the identities of all currently connected _neighbors_.
These requests MUST use the FindApproximate and DemultiplexEverywhere
flags. FindApproximate will ensure that other peers will reply with
results where the keys are merely considered close-enough, while
DemultiplexEverywhere will cause each peer on the path to respond.
The combination of these flags is thus likely to yield HELLOs of
peers that are useful somewhere in the routing table. The
RECOMMENDED replication level to be set in the REPL_LVL field is 4.
The size and format of the result filter is specified in Section 8.2.
The XQUERY MUST be empty.
In order to facilitate the above, the underlay is expected to provide
the implementation with one or more addresses signalled through
ADDRESS_ADDED. Zero addresses MAY be provided if a peer can only
establish outgoing connections and is otherwise unreachable. An
implementation MUST advertise its addresses periodically to its
_neighbors_ through HelloMessages. The advertisement interval and
expiration should be configurable or chosen at the discretion of the
implementation based on external factors such as expiration of DHCP
leases. The specific frequency of advertisements MAY depend on
available bandwidth, the set of already connected neighbors, the
workload of the system and other factors which are at the discretion
of the developer, but SHOULD be a fraction of the expiration period.
Whenever a peer receives such a HELLO message from another peer that
is already in the routing table, it must cache it as long as that
peer remains in its routing table (or until the HELLO expires) and
serve it in response to GET requests for HELLO blocks (see
Section 7.4.3). This behaviour makes it unnecessary to initiate
dedicated PutMessages containing HELLO blocks by the implementation.
6.3. Peer Bloom Filter
As DHT GetMessages and PutMessages traverse a random path through the
network for the first L2NSE hops, a key design objective of R^5N is
to avoid routing loops. The peer Bloom filter is part of the routing
metadata in messages to prevent circular routes. It is updated at
each hop where the hops peer public key is added to it. It is
constant in size at L=1024 bits (128 bytes) and sets k=16 bits per
element. For the next hop selection in both the random and the
deterministic case, any peer which is in the peer Bloom filter for
the respective message is excluded from the peer selection process.
Schanzenbach, et al. Expires 20 August 2024 [Page 13]
Internet-Draft The R5N Distributed Hash Table February 2024
Any peer which is forwarding GetMessages or PutMessages (Section 7)
thus adds its own peer public key to the peer Bloom filter. This
allows other peers to (probabilistically) exclude already traversed
peers when searching for the next hops in the routing table.
The peer Bloom filter follows the definition in Appendix A. The set
of elements E consists of of all possible 256-bit peer public keys.
The mapping function M is defined as follows:
M(e) -> SHA-512 (e) as uint32[]
The element e is the peer public key which is hashed using SHA-512.
The resulting 512-bit peer identity is interpreted as an array of
k=16 32-bit integers in network byte order which are used to set and
check the bits in B using BF-SET and BF-TEST.
We note that the peer Bloom filter may exclude peers due to false-
postive matches. This is acceptable as routing should nevertheless
terminate (with high probability) in close vicinity of the key.
Furthermore, due to the randomization of the first L2NSE hops, it is
possible that false-positives will be different when a request is
repeated.
6.4. Routing Functions
Using the data structures described so far, the R^5N routing
component provides the following functions for message processing
(Section 7):
GetDistance(A, B) -> Distance This function calculates the binary
XOR between A and B. The resulting distance is interpreted as an
integer where the leftmost bit is the most significant bit.
SelectClosestPeer(K, B) -> N This function selects the neighbor N
from our routing table with the shortest XOR-distance to the key
K. This means that for all other peers N' in the routing table
GetDistance(N, K) < GetDistance(N',K). Peers with a positive test
against the peer Bloom filter B are not considered.
SelectRandomPeer(B) -> N This function selects a random peer N from
all neighbors. Peers with a positive test in the peer Bloom
filter B are not considered.
SelectPeer(K, H, B) -> N This function selects a neighbor N
depending on the number of hops H parameter. If H <
NETWORK_SIZE_ESTIMATE returns SelectRandomPeer(B), and otherwise
returns SelectClosestPeer(K, B).
Schanzenbach, et al. Expires 20 August 2024 [Page 14]
Internet-Draft The R5N Distributed Hash Table February 2024
IsClosestPeer(N, K, B) -> true | false This function checks if N is
the closest peer for K (cf. SelectClosestPeer(K, B)). Peers with
a positive test in the Bloom filter B are not considered.
ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE) -> Number This function
computes the number of _neighbors_ that a message should be
forwarded to. The arguments are the desired replication level
(REPL_LVL), the HOPCOUNT of the message so far and and the current
network size estimate (L2NSE) as provided by the underlay. The
result is the non-negative number of next hops to select. The
following figure gives the pseudocode for computing the number of
neighbors the peer should attempt to forward the message to.
function ComputeOutDegree(REPL_LVL, HOPCOUNT, L2NSE)
BEGIN
if (HOPCOUNT > L2NSE * 4)
return 0;
if (HOPCOUNT > L2NSE * 2)
return 1;
if (0 = REPL_LEVL)
REPL_LEVL = 1
if (REPL_LEVEL > 16)
REPL_LEVEL = 16
RM1 = REPL_LEVEL - 1
return 1 + (RM1 / (L2NSE + RM1 * HOPCOUNT))
Figure 2: Computing the number of next hops.
The above calculation may yield values that are not discrete.
Hence, the result MUST be rounded probabilistically to the nearest
discrete value, using the fraction as the probability for rounding
up. This probabillistic rounding is necessary to achieve the
statistically expected value of the replication level and average
number of peers a message is forwarded to.
6.5. Pending Table
R^5N performs stateful routing where the messages only carry the
query hash and do not encode the ultimate source or destination of
the request. Routing a request towards the key is doing hop-by-hop
using the routing table and the query hash. The pending table is
used to route responses back to the originator. In the pending table
each peer primarily associates a query hash with the associated
originator of the request. The pending table MUST store entries for
the last MAX_RECENT requests the peer has encountered. To ensure
that the peer does not run out of memory, information about older
requests is discarded. The value of MAX_RECENT MAY be configurable
and SHOULD be at least 128 * 10^3.
Schanzenbach, et al. Expires 20 August 2024 [Page 15]
Internet-Draft The R5N Distributed Hash Table February 2024
For each entry in the pending table, the DHT MUST track not only the
query key and the origin, but also the extended query, requested
block type and flags, and the result filter. If the query did not
provide a result filter, a fresh result filter MUST still be created
to filter duplicate replies. Details of how a result filter works
depend on the type, as described in Section 8.1.
When a second query from the same origin for the same query hash is
received, the DHT MUST attempt to merge the new request with the
state for the old request. If this is not possible (say because the
MUTATOR differs), the existing result filter MUST be discarded and
replaced with the result filter of the incoming message.
We note that for local applications, a fixed limit on the number of
concurrent requests may be problematic. Hence, it is RECOMMENDED
that implementations track requests from local applications
separately and preserve the information about requests from local
applications until the local application explicitly stops the
request.
7. Message Processing
An implementation will process messages either because it needs to
transmit messages as part of routing table maintenance, or due to
requests from local applications, or because it received a message
from a _neighbor_. If instructed through an application-facing API
such as the one outlined in Appendix B, a peer acts as an _initiator_
of GetMessages or PutMessages. The status of initiator is relevant
for peers when processing ResultMessages due to the required handover
of results to the originating _application_.
The implementation MUST listen for RECEIVE(P, M) signals from the
underlay and react to the respective messages sent by the peer P.
Whether initiated locally or received from a neighbor, an
implementation processes messages according to the wire formats and
the required validations detailed in the following sections. Where
required, the local peer public key is referred to as SELF.
7.1. Message components
This section describes some data structures and fields shared by
various types of messages.
Schanzenbach, et al. Expires 20 August 2024 [Page 16]
Internet-Draft The R5N Distributed Hash Table February 2024
7.1.1. Flags
Flags is an 8-bit vector representing binary options. Each flag is
represented by a bit in the field starting from 0 as the rightmost
bit to 7 as the leftmost bit.
0: DemultiplexEverywhere This bit indicates that each peer along the
way should process the request. If the bit is not set,
intermediate peers only route the message and only peers which
consider themselves closest to the key look for answers in their
local storage for GetMessages, or respectively cache the block in
their local storage for PutMessages and ResultMessages.
1: RecordRoute This bit indicates to keep track of the path that the
message takes in the P2P network.
2: FindApproximate This bit asks peers to return results even if the
_key_ does not exactly match the query hash.
3: Truncated This is a special flag which is set if a peer truncated
the path and thus the first hop on the path is given without a
signature to enable checking of the next signature. This flag
MUST never be set in a query.
4-7: Reserved The remaining bits are reserved for future use and
MUST be set to 0 when initiating an operation. If non-zero bits
are received, implementations MUST preserve these bits when
forwarding messages.
7.1.2. Path Element
A path element represents a hop in the path a message has taken
through the overlay network. The wire format of a path element is
illustrated in Figure 3.
Schanzenbach, et al. Expires 20 August 2024 [Page 17]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE |
| (64 bytes) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PRED PEER PUBLIC KEY |
| (32 bytes) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 3: The Wire Format of a path element.
where:
SIGNATURE is a 64 byte EdDSA signature using the current hop's
private key affirming the peer public keys of the previous and
next hops.
PRED PEER PUBLIC KEY is the EdDSA public key of the previous peer on
the path.
An ordered list of path elements may be appended to any routed
PutMessages or ResultMessages. The last signature (after which the
peer public key is omitted) is created by the current hop only after
the peer made its routing decision identifiying the successor peer.
The peer public key is not included after the last signature as it
must be that of the sender of the message and including it would thus
be redundant. Similarly, the predecessor of the first element of an
untruncated path is not stated explicitly, as it must be ZERO.
Figure 4 shows the wire format of an example path from peer A over
peers B and C and D as it would be received by peer E in the PUTPATH
of a PutMessage, or as the combined PUTPATH and GETPATH of a
ResultMessage. The wire format of the path elements allows a natural
extension of the PUTPATH along the route of the ResultMessage to the
destination forming the GETPATH. The PutMessage would indicate in
the PATH_LEN field a length of 3. The ResultMessage would indicate a
path length of 3 as the sum of the field values in PUTPATH_L and
GETPATH_L. Basically, the last signature does not count for the path
length.
Schanzenbach, et al. Expires 20 August 2024 [Page 18]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE A |
| (64 bytes) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER A |
| (32 bytes) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE B |
| (64 bytes) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER B |
| (32 bytes) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE C |
| (64 bytes) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER C |
| (32 bytes) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE D (last sig) |
| (64 bytes) |
| |
| |
Schanzenbach, et al. Expires 20 August 2024 [Page 19]
Internet-Draft The R5N Distributed Hash Table February 2024
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 4: Example of a path as found in PutMessages or
ResultMessages from A to D.
A path may be truncated in which case the signature of the truncated
path element is omitted leaving only the public key of the peer
preceeding the trunction which is required for the verification of
the subsequent path element signature. Such a truncated path is
indicated with the respective truncated flag (Section 7.1.1). For
truncated paths, the peer public key of the signer of the last path
element is again omitted as it must be that of the sender of the
PutMesssage or ResultMessage. Similarly, the public key of the
receiving peer used in the last path element is omitted as it must be
SELF. The wire format of a truncated example path from peers B over
C and D to E (possibly still originating at A, but the origin is
unknowable to E due to truncation) is illustrated in Figure 5. Here,
a ResultMessage would indicate in the PATH_LEN field a length of 1
while a PutMessage would indicate a length of 1 as the sum of
PUTPATH_L and GETPATH_L fields. Basically, the truncated peer and
the last signature do not count for the path length.
Schanzenbach, et al. Expires 20 August 2024 [Page 20]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER B (truncated) |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE C |
| (64 bytes) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER C |
| (32 bytes) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE D (last sig) |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 5: Example of a truncated path from Peer B to Peer D.
The SIGNATURE field in a path element covers a 64-bit
contextualization header, the the block expiration, a hash of the
block payload, as well as the predecessor peer public key and the
peer public key of the successor that the peer making the signature
is routing the message to. Thus, the signature made by SELF
basically says that SELF received the block payload from PEER
PREDECESSOR and has forwarded it to PEER SUCCESSOR. The wire format
is illustrated in Figure 6.
Schanzenbach, et al. Expires 20 August 2024 [Page 21]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIZE | PURPOSE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
| BLOCK HASH |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER PREDECESSOR |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER SUCCESSOR |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 6: The Wire Format of the path element for Signing.
SIZE A 32-bit value containing the length of the signed data in
bytes in network byte order. The length of the signed data MUST
be 144 bytes.
PURPOSE A 32-bit signature purpose flag. This field MUST be 6 (in
network byte order).
EXPIRATION denotes the absolute 64-bit expiration date of the block.
In microseconds since midnight (0 hour), January 1, 1970 UTC in
network byte order.
BLOCK HASH a SHA-512 hash over the block payload.
PEER PREDECESSOR the peer public key of the previous hop. If the
signing peer initiated the PUT, this field is set to all zeroes.
PEER SUCCESSOR the peer public key of the next hop (not of the
signer).
Schanzenbach, et al. Expires 20 August 2024 [Page 22]
Internet-Draft The R5N Distributed Hash Table February 2024
7.2. HelloMessage
When the underlay signals the implementation of added or removed
addresses through ADDRESS_ADDED and ADDRESS_DELETED an implementation
MAY disseminate those changes to neighbors using HelloMessages.
Initiation of such HelloMessages by the implementation itself is
RECOMMENDED. HelloMessages are used to inform neighbors of a peer
about the sender's available addresses. The recipients use these
messages to inform their respective underlays about ways to sustain
the connections and to generate HELLO blocks (see Section 8.2) to
answer peer discovery queries from other peers.
7.2.1. Wire Format
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| MSIZE | MTYPE | VERSION | NUM_ADDRS |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE /
/ (64 bytes) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ ADDRESSES (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 7: The HelloMessage Wire Format.
where:
MSIZE denotes the size of this message in network byte order.
MTYPE is the 16-bit message type. It must be set to the value 157
in network byte order as defined in the GANA "GNUnet Message Type"
registry Section 11.4.
VERSION is a 16-bit field that indicates the version of the
HelloMessage. Must be zero. In the future, this may be used to
extend or update the HelloMessage format.
NUM_ADDRS is a 16-bit number that gives the total number of
addresses encoded in the ADDRESSES field. In network byte order.
SIGNATURE is a 64 byte EdDSA signature using the sender's private
key affirming the information contained in the message. The
signature is signing exactly the same data that is being signed in
a HELLO block as described in Section 8.2.
Schanzenbach, et al. Expires 20 August 2024 [Page 23]
Internet-Draft The R5N Distributed Hash Table February 2024
EXPIRATION denotes the absolute 64-bit expiration date of the
content. The value specified is microseconds since midnight (0
hour), January 1, 1970, but must be a multiple of one million (so
that it can be represented in seconds in a HELLO URL). Stored in
network byte order.
ADDRESSES A sequence of exactly NUM_ADDRS addresses (Section 2)
which can be used to contact the peer. Each address MUST be
0-terminated. The set of addresses MAY be empty.
7.2.2. Processing
If the initiator of a HelloMessage is SELF, the message is simply
sent to all neighbors P currently in the routing table using SEND.
Otherwise, upon receiving a HelloMessage from a peer P an
implementation MUST process it step by step as follows:
1. If P is not in its routing table, the message is discarded.
2. The signature is verified, including a check that the expiration
time is in the future. If the signature is invalid, the message
is discarded.
3. The information contained in the HelloMessage can be used to
synthesize a block of type HELLO (Section 8.2). The block is
cached in the routing table until it expires, the peer is removed
from the routing table, or the information is replaced by another
message from the peer. The implementation SHOULD instruct the
underlay to connect to all now available addresses using
TRY_CONNECT in order to make the underlay aware of alternative
addresses for this connection and to maintain optimal
connectivity.
4. Received HelloMessages MUST NOT be forwarded.
7.3. PutMessage
PutMessages are used to store information at other peers in the DHT.
Any API which allows applications to initiate PutMessages needs to
provide sufficient, implementation-specific information to construct
the initial PutMessage. For example, implementations supporting
multiple applications and blocks will have block type and message
flag parameters in addition to the actual data payload and key.
7.3.1. Wire Format
Schanzenbach, et al. Expires 20 August 2024 [Page 24]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| MSIZE | MTYPE | BTYPE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| VER |FLAGS| HOPCOUNT | REPL_LVL | PATH_LEN |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER_BF /
/ (128 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| BLOCK_KEY /
/ (64 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ TRUNCATED ORIGIN (0 or 32 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ PUTPATH (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ LAST HOP SIGNATURE (0 or 64 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ BLOCK (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 8: The PutMessage Wire Format.
where:
MSIZE denotes the size of this message in network byte order.
MTYPE is the 16-bit message type. Read-only. It must be set to the
value 146 in network byte order as defined in the GANA "GNUnet
Message Type" registry Section 11.4.
BTYPE is a 32-bit block type. The block type indicates the content
type of the payload. Set by the initiator. Read-only. In
network byte order.
VER is a 8-bit protocol version. Set to zero. May be used in
future protocol versions.
FLAGS is a 8-bit vector with binary options (see Section 7.1.1).
Set by the initiator. Read-only.
HOPCOUNT is a 16-bit number indicating how many hops this message
has traversed to far. Set by the initiator to 0. Incremented by
processing peers. In network byte order.
REPL_LVL is a 16-bit number indicating the desired replication level
Schanzenbach, et al. Expires 20 August 2024 [Page 25]
Internet-Draft The R5N Distributed Hash Table February 2024
of the data. Set by the initiator. Read-only. In network byte
order.
PATH_LEN is a 16-bit number indicating the number of path elements
recorded in PUTPATH. As PUTPATH is optional, this value may be
zero. If the PUTPATH is enabled, set initially to 0 by the
initiator. Incremented by processing peers. In network byte
order.
EXPIRATION denotes the absolute 64-bit expiration date of the
content. Set by the initiator. Read-only. In microseconds since
midnight (0 hour), January 1, 1970 in network byte order.
PEER_BF A peer Bloom filter to stop circular routes (see
Section 6.3). Set by the initiator to contain the local peer and
all neighbors it is forwarded to. Modified by processing peers to
include their own peer public key using BF-SET.
BLOCK_KEY The key under which the PutMessage wants to store content
under. Set by the initiator. Read-only.
TRUNCATED ORIGIN is only provided if the TRUNCATED flag is set in
FLAGS. If present, this is the public key of the peer just before
the first entry on the PUTPATH and the first peer on the PUTPATH
is not the actual origin of the message. Thus, to verify the
first signature on the PUTPATH, this public key must be used.
Note that due to the truncation, this last hop cannot be verified
to exist. Value is modified by processing peers.
PUTPATH the variable-length PUT path. The path consists of a list
of PATH_LEN path elements. Set by the initiator to 0.
Incremented by processing peers.
LAST HOP SIGNATURE is only provided if the RECORD ROUTE flag is set
in FLAGS. If present, this is an EdDSA signature of the sender of
this message (using the same format as the signatures in PUTPATH)
affirming that the sender forwarded the message from the
predecessor (all zeros if PATH_LEN is 0, otherwise the last peer
in PUTPATH) to the target peer. Modified by processing peers (if
flag is set).
BLOCK the variable-length block payload. The contents are
determined by the BTYPE field. The length is determined by MSIZE
minus the size of all of the other fields. Set by the initiator.
Read-only.
Schanzenbach, et al. Expires 20 August 2024 [Page 26]
Internet-Draft The R5N Distributed Hash Table February 2024
7.3.2. Processing
Upon receiving a PutMessage from a peer P , or created through
initiation by an overlay API, an implementation MUST process it step
by step as follows:
1. The EXPIRATION field is evaluated. If the message is expired, it
MUST be discarded.
2. If the BTYPE is not supported by the implementation, no
validation of the block payload is performed and processing
continues at (5). If the BTYPE is ANY, then the message MUST be
discarded. Else, the block MUST be validated as defined in (3)
and (4).
3. The message is evaluated using the block validation functions
matching the BTYPE. First, the client attempts to derive the key
using the respective DeriveBlockKey procedure as described in
Section 8.1. If a key can be derived and does not match, the
message MUST be discarded.
4. Next, the ValidateBlockStoreRequest procedure for the BTYPE as
described in Section 8.1 is used to validate the block payload.
If the block payload is invalid, the message MUST be discarded.
5. The peer identity of the sender peer P SHOULD be in PEER_BF. If
not, the implementation MAY log an error, but MUST continue.
6. If the RecordRoute flag is not set, the PATH_LEN MUST be set to
zero. If the flag is set and PATH_LEN is non-zero, the local
peer SHOULD verify the signatures from the PUTPATH. Verification
MAY involve checking all signatures or any random subset of the
signatures. It is RECOMMENDED that peers adapt their behavior to
available computational resources so as to not make signature
verification a bottleneck. If an invalid signature is found, the
PUTPATH MUST be truncated to only include the elements following
the invalid signature.
7. If the local peer is the closest peer (cf. IsClosestPeer(SELF,
BLOCK_KEY, PeerFilter)) or the DemultiplexEverywhere flag ist
set, the message SHOULD be stored locally in the block storage if
possible. The implementation MAY choose not store the block if
external factors or configurations prevent this, such as limited
(alottted) disk space.
8. If the BTYPE of the message indicates a HELLO block, the peer
MUST be considered for the local routing table by using the peer
identity in BLOCK_KEY. If the peer is not either already
Schanzenbach, et al. Expires 20 August 2024 [Page 27]
Internet-Draft The R5N Distributed Hash Table February 2024
connected or the respective k-bucket is not already full the peer
MUST try to establish a connection to the peer indicated in the
HELLO block using the address information from the HELLO block
and the underlay function TRY_CONNECT. The implementation MUST
instruct the underlay to try to connect to all provided addresses
using TRY_CONNECT in order to make the underlay aware of multiple
addresses for this connection. When a connection is established,
the signal PEER_CONNECTED will cause the peer to be added to the
respective k-bucket of the routing table (Section 6).
9. Given the value in REPL_LVL, HOPCOUNT and FALSE =
IsClosestPeer(SELF, BLOCK_KEY, PeerFilter) the number of peers to
forward to MUST be calculated using ComputeOutDegree(). The
implementation SHOULD select up to this number of peers to
forward the message to using the function SelectPeer()
(Section 6.4) using the BLOCK_KEY, HOPCOUNT, and utilizing
PEER_BF as Bloom filter. For each selected peer PEER_BF is
updated with that peer in between calls to SelectPeer(). The
implementation MAY forward to fewer or no peers in order to
handle resource constraints such as limited bandwidth or simply
if there are not suitable peers. For each selected peer with
peer identity P a dedicated PutMessage_P is created containing
the original (and where applicable already updated) fields of the
received PutMessage. In each message the all selected peer
identities and the local peer identity MUST be added to the
PEER_BF and the HOPCOUNT is incremented by 1. If the RecordRoute
flag is set, a new path element is created using the predecessor
peer public key and the signature of the current peer. The path
element is added to the PUTPATH fields and the PATH_LEN field is
incremented by 1. When creating the path element signature, the
successor must be set to the recipient peer P of the PutMessageP.
The successor in the new path element is the recipient peer P of
Finally, the messages are sent using SEND(P, PutMessageP) each
recipient.
7.4. GetMessage
GetMessages are used to request information from other peers in the
DHT. Any overlay API which allows applications to initiate
GetMessages needs to provide sufficient, implementation-specific
information needed to construct the initial GetMessage. For example,
implementations supporting multiple applications and blocks will have
block type and message flag parameters.
7.4.1. Wire Format
Schanzenbach, et al. Expires 20 August 2024 [Page 28]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| MSIZE | MTYPE | BTYPE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| VER |FLAGS| HOPCOUNT | REPL_LVL | RF_SIZE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER_BF /
/ (128 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| QUERY_HASH /
/ (64 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
| RESULT_FILTER /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ XQUERY (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 9: The GetMessage Wire Format.
where:
MSIZE denotes the size of this message in network byte order.
MTYPE is the 16-bit message type. Read-only. It must be set to the
value 147 in network byte order as defined in the GANA "GNUnet
Message Type" registry Section 11.4.
BTYPE is a 32-bit block type field. The block type indicates the
content type of the payload. Set by the initiator. Read-only.
In network byte order.
VER is a 8-bit protocol version. Set to zero. May be used in
future protocol versions.
FLAGS is a 8-bit vector with binary options (see Section 7.1.1).
Set by the initiator. Read-only.
HOPCOUNT is a 16-bit number indicating how many hops this message
has traversed to far. Set by the initiator to 0. Incremented by
processing peers. In network byte order.
REPL_LVL is a 16-bit number indicating the desired replication level
of the data. Set by the initiator. Read-only. In network byte
order.
RF_SIZE is a 16-bit number indicating the length of the result
Schanzenbach, et al. Expires 20 August 2024 [Page 29]
Internet-Draft The R5N Distributed Hash Table February 2024
filter RESULT_FILTER. Set by the initiator. Read-only. In
network byte order.
PEER_BF A peer Bloom filter to stop circular routes (see
Section 6.3). Set by the initiator to include itself and all
connected neighbors in the routing table. Modified by processing
peers to include their own peer identity.
QUERY_HASH The query used to indicate what the key is under which
the initiator is looking for blocks with this request. The block
type may use a different evaluation logic to determine applicable
result blocks. Set by the initiator. Read-only.
RESULT_FILTER the variable-length result filter, described in
Section 7.4.2. Set by the initiator. Modified by processing
peers.
XQUERY the variable-length extended query. Optional. Set by the
initiator. Read-only.
7.4.2. Result Filter
The result filter is used to indicate to other peers which results
are not of interest when processing a GetMessage (Section 7.4). Any
peer which is processing GetMessages and has a result which matches
the query key MUST check the result filter and only send a reply
message if the result does not test positive under the result filter.
Before forwarding the GetMessage, the result filter MUST be updated
using the result of the BTYPE-specific FilterResult (see Section 8.1)
function to filter out all results already returned by the local
peer.
How a result filter is implemented depends on the block type as
described in Section 8.1. Result filters may be probabilistic data
structures. Thus, it is possible that a desireable result is
filtered by a result filter because of a false-positive test.
How exactly a block result is added to a result filter is specified
as part of the definition of a block type (cf. Section 8.2).
7.4.3. Processing
Upon receiving a GetMessage from a peer P, or created through
initiation by the overlay API, an implementation MUST process it step
by step as follows:
Schanzenbach, et al. Expires 20 August 2024 [Page 30]
Internet-Draft The R5N Distributed Hash Table February 2024
1. If the BTYPE is supported, the QUERY_HASH and XQUERY fields are
validated as defined by the respective ValidateBlockQuery
procedure for this type. If the result yields REQUEST_INVALID,
the message MUST be discarded and processing ends. If the BTYPE
is not supported, the message MUST be forwarded (Skip to step 4).
If the BTYPE is ANY, the message is processed further without
validation.
2. The peer identity of the sender peer P SHOULD be in the PEER_BF
Bloom filter. If not, the implementation MAY log an error, but
MUST continue.
3. The local peer SHOULD try to produce a reply in any of the
following cases: (1) If the local peer is the closest peer (cf.
IsClosestPeer (SELF, QueryHash, PeerFilter), or (2) if the
DemultiplexEverywhere flag is set, or (3) if the local peer is
not the closest and a previously cached ResultMessage also
matches this request (Section 7.5.2).
The reply is produced (if one is available) using the following
steps:
a) If the BTYPE is HELLO, the implementation MUST only consider
synthesizing its own addresses and the addresses it has
cached for the peers in its routing table as HELLO block
replies. Otherwise, if the BTYPE does not indicate a request
for a HELLO block or ANY, the implementation MUST only
consider blocks in the local block storage and previously
cached ResultMessages.
b) If the FLAGS field includes the flag FindApproximate, the
peer SHOULD respond with the closest block (smallest value of
GetDistance(QUERY_HASH, BLOCK_KEY)) it can find that is not
filtered by the RESULT_BF. Otherwise, the peer MUST respond
with the block with a BLOCK_KEY that matches the QUERY_HASH
exactly and that is not filtered by the RESULT_BF.
c) Any resulting (synthesized) block is encapsulated in a
ResultMessage. The ResultMessage SHOULD be transmitted to
the neighbor from which the request was received.
Implementations MAY not reply if they are resource-constrained.
However, ResultMessages MUST be given the highest priority among
competing transmissions.
If the BTYPE is supported and ValidateBlockReply for the given
query has yielded a status of FILTER_LAST, processing MUST end
and not continue with forwarding of the request to other peers.
Schanzenbach, et al. Expires 20 August 2024 [Page 31]
Internet-Draft The R5N Distributed Hash Table February 2024
4. The implementation SHOULD create (or merge) an entry in the
pending table Section 6.5 for the query represented by this
GetMessage. If the peer is unable to handle an additional entry
in the table, the message MUST be discarded and processing ends.
5. Using the value in REPL_LVL, the number of peers to forward to
MUST be calculated using ComputeOutDegree(). If there is at
least one peer to forward to, the implementation SHOULD select up
to this number of peers to forward the message to. The
implementation SHOULD select up to this number of peers to
forward the message to using the function SelectPeer()
(Section 6.4) using the QUERY_HASH, HOPCOUNT, an appropriate
bloom filter (FIXME: Start with PEER_BF?). The implementation
MAY forward to fewer or no peers in order to handle resource
constraints such as bandwidth. The peer Bloom filter PEER_BF
MUST be updated with the local peer identity SELF for any
forwarded message. For all peers with peer identity P chosen to
forward the message to, SEND(P, GetMessageP) is called. Here,
GetMessageP is the original message with the updated fields for
HOPCOUNT (incremented by 1), PEER_BF and RESULT_FILTER.
7.5. ResultMessage
ResultMessages are used to return information to other peers in the
DHT or to applications using the overlay API that previously
initiated a GetMessage. The initiator of a ResultMessage is a peer
triggered through the processing of a GetMessage.
7.5.1. Wire Format
Schanzenbach, et al. Expires 20 August 2024 [Page 32]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| MSIZE | MTYPE | BTYPE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| RESERVED | VER |FLAGS| PUTPATH_L | GETPATH_L |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
| QUERY_HASH /
/ (64 byte) |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ TRUNCATED ORIGIN (0 or 32 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ PUTPATH /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ GETPATH /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ LAST HOP SIGNATURE (0 or 64 bytes) /
+-----+-----+-----+-----+-----+-----+-----+-----+
/ BLOCK /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 10: The ResultMessage Wire Format
where:
MSIZE denotes the size of this message in network byte order.
MTYPE is the 16-bit message type. Set by the initiator. Read-only.
It must be set to the value 148 in network byte order as defined
in the GANA "GNUnet Message Type" registry Section 11.4.
BTYPE is a 32-bit block type field. The block type indicates the
content type of the payload. Set by the initiator. Read-only.
In network byte order.
RESERVED is a 16-bit value. Implementations MUST set this value to
zero when originating a result message. Implementations MUST
forward this value unchanged even if it is non-zero.
VER is a 8-bit protocol version in network byte order. Set to zero.
May be used in future protocol versions.
FLAGS is a 8-bit vector with binary options (see Section 7.1.1).
Set by the initiator.
Schanzenbach, et al. Expires 20 August 2024 [Page 33]
Internet-Draft The R5N Distributed Hash Table February 2024
PUTPATH_L is a 16-bit number indicating the number of path elements
recorded in PUTPATH. As PUTPATH is optional, this value may be
zero even if the message has traversed several peers. Set by the
initiator to the PATH_LEN of the PutMessage from which the block
originated. Modified by processing peers in case of path
truncation. In network byte order.
GETPATH_L is a 16-bit number indicating the number of path elements
recorded in GETPATH. As GETPATH is optional, this value may be
zero even if the message has traversed several peers. Set by the
initiator to 0. Modified by processing peers. In network byte
order.
EXPIRATION denotes the absolute 64-bit expiration date of the
content. In microseconds since midnight (0 hour), January 1, 1970
in network byte order. Set by the initiator to the expiration
value as recorded from the PutMessage from which the block
originated. Read-only.
QUERY_HASH the query hash corresponding to the GetMessage which
caused this reply message to be sent. Set by the initiator using
the value of the GetMessage. Read-only.
TRUNCATED ORIGIN is only provided if the TRUNCATED flag is set in
FLAGS. If present, this is the public key of the peer just before
the first entry on the PUTPATH and the first peer on the PUTPATH
is not the actual origin of the message. Thus, to verify the
first signature on the PUTPATH, this public key must be used.
Note that due to the truncation, this last hop cannot be verified
to exist. Set by processing peers.
PUTPATH the variable-length PUT path. The path consists of a list
of PUTPATH_L path elements. Set by the initiator to the the
PUTPATH of the PutMessage from which the block originated.
Modified by processing peers in case of path truncation.
GETPATH the variable-length PUT path. The path consists of a list
of GETPATH_L path elements. Set by processing peers.
LAST HOP SIGNATURE is only provided if the RecordRoute flag is set
in FLAGS. If present, this is an EdDSA signature of the sender of
this message (using the same format as the signatures in PUTPATH)
affirming that the sender forwarded the message from the
predecessor (all zeros if PATH_LEN is 0, otherwise the last peer
in PUTPATH) to the target peer.
BLOCK the variable-length resource record data payload. The
Schanzenbach, et al. Expires 20 August 2024 [Page 34]
Internet-Draft The R5N Distributed Hash Table February 2024
contents are defined by the respective type of the resource
record. Set by the initiator. Read-only.
7.5.2. Processing
Upon receiving a ResultMessage from a connected peer or triggered by
the processing of a GetMessage, an implementation MUST process it
step by step as follows:
1. First, the EXPIRATION field is evaluated. If the message is
expired, it MUST be discarded.
2. If the BTYPE is supported, then the BLOCK MUST be validated
against the requested BTYPE. To do this, the peer checks that
the block is valid using ValidateBlockStoreRequest. If the
result is BLOCK_INVALID, the message MUST be discarded.
3. If the PUTPATH_L or the GETPATH_L are non-zero, the local peer
SHOULD verify the signatures from the PUTPATH and the GETPATH.
Verification MAY involve checking all signatures or any random
subset of the signatures. It is RECOMMENDED that peers adapt
their behavior to available computational resources so as to not
make signature verification a bottleneck. If an invalid
signature is found, the path MUST be truncated to only include
the elements following the invalid signature. In particular, any
invalid signature on the GETPATH will cause PUTPATH_L to be set
to 0.
4. The peer also attempts to compute the key using DeriveBlockKey.
This may result in NONE. The result is used later. Note that
even if a key was computed, it does not have to match the
QUERY_HASH.
5. If the BTYPE of the message indicates a HELLO block, the peer
SHOULD be considered for the local routing table by using the
peer identity computed from the block using DeriveBlockKey. An
implementation MAY choose to ignore the HELLO, for example
because the routing table or the respective k-bucket is already
full. If the peer is a suitable candidate for insertion, the
local peer MUST try to establish a connection to the peer
indicated in the HELLO block using the address information from
the HELLO block and the underlay function TRY_CONNECT. The
implementation MUST instruct the underlay to connect to all
provided addresses using TRY_CONNECT in order to make the
underlay aware of multiple addresses for this connection. When a
connection is established, the signal PEER_CONNECTED will cause
the peer to be added to the respective k-bucket of the routing
table (Section 6).
Schanzenbach, et al. Expires 20 August 2024 [Page 35]
Internet-Draft The R5N Distributed Hash Table February 2024
6. If the QUERY_HASH of this ResultMessage does not match an entry
in the pending table (Section 6.5), then the message is discarded
and processing ends. Otherwise, processing continues for each
entry in the table as follows.
a) If the FindApproximate flag was not set in the query and the
BTYPE allowed the implementation to compute the key from the
block, the computed key must exactly match the QUERY_HASH,
otherwise the result does not match the pending query and
processing continues with the next pending query.
b) If the BTYPE is supported, result block MUST be validated
against the specific query using the respective
FilterBlockResult function. This function MUST update the
result filter if a result is returned to the originator of
the query.
c) If the BTYPE is not supported, filtering of exact duplicate
replies MUST still be performed before forwarding the reply.
Such duplicate filtering MAY be implemented
probabilistically, for example using a Bloom filter. The
result of this duplicate filtering is always either
FILTER_MORE or FILTER_DUPLICATE.
d) If the RecordRoute flag is set in FLAGS, the local peer
identity MUST be appended to the GETPATH of the message and
the respective signature MUST be set using the query origin
as the PEER SUCCESSOR and the response origin as the PEER
PREDECESSOR. If the flag is not set, the GETPATH_L and
PUTPATH_L MUST be set to zero when forwarding the result.
e) If the result filter result is either FILTER_MORE or
FILTER_LAST, the message is forwarded to the origin of the
query as defined in the entry which may either be the local
peer or a remote peer. In case this is a query of the local
peer the result may have to be provided to applications
through the overlay API. Otherwise, the result is forwarded
using SEND(P, ResultMessage') where ResultMessage' is the now
modified message. If the result was FILTER_LAST, the query
is removed from the pending table.
8. Finally, the implementation SHOULD cache ResultMessages in order
to provide already seen replies to future GetMessages. The
implementation MAY choose not no cache any or a limited number of
ResultMessages for reasons such as resource limitations.
Schanzenbach, et al. Expires 20 August 2024 [Page 36]
Internet-Draft The R5N Distributed Hash Table February 2024
8. Blocks
This section describes various considerations R^5N implementations
must consider with respect to blocks. Specifically, implementations
SHOULD be able to validate and persist blocks. Implementations MAY
not support validation for all types of blocks. On some devices,
storing blocks MAY also be impossible due to lack of storage
capacity.
Applications can and should define their own block types. The block
type determines the format and handling of the block payload by peers
in PutMessages and ResultMessages. Block types MUST be registered
with GANA (see Section 11.1).
8.1. Block Operations
Block validation may be necessary for all types of DHT messages. To
enable these validations, any block type specification MUST define
the following functions:
ValidateBlockQuery(Key, XQuery) -> RequestEvaluationResult is used
to evaluate the request for a block as part of GetMessage
processing. Here, the block payload is unkown, but if possible
the XQuery and Key SHOULD be verified. Possible values for the
RequestEvaluationResult are:
REQUEST_VALID Query is valid.
REQUEST_INVALID Query format does not match block type. For
example, a mandatory XQuery was not provided, or of the size of
the XQuery is not appropriate for the block type.
DeriveBlockKey(Block) -> Key | NONE is used to synthesize the block
key from the block payload as part of PutMessage and ResultMessage
processing. The special return value of NONE implies that this
block type does not permit deriving the key from the block. A Key
may be returned for a block that is ill-formed.
ValidateBlockStoreRequest(Block) -> BlockEvaluationResult is used to
evaluate a block payload as part of PutMessage and ResultMessage
processing. Possible values for the BlockEvaluationResult are:
BLOCK_VALID Block is valid.
BLOCK_INVALID Block payload does not match the block type.
SetupResultFilter(FilterSize, Mutator) -> RF is used to setup an
Schanzenbach, et al. Expires 20 August 2024 [Page 37]
Internet-Draft The R5N Distributed Hash Table February 2024
empty result filter. The arguments are the set of results that
must be filtered at the initiator, and a MUTATOR value which MAY
be used to deterministically re-randomize probabilistic data
structures. The specification MUST also include the wire format
for BF.
FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult,
RF') is used to filter results against specific queries. This
function does not check the validity of Block itself or that it
matches the given key, as this must have been checked earlier.
Thus, locally stored blocks from previously observed
ResultMessages and PutMessages use this function to perform
filtering based on the request parameters of a particular GET
operation. Possible values for the FilterEvaluationResult are:
FILTER_MORE Valid result, and there may be more.
FILTER_LAST Last possible valid result.
FILTER_DUPLICATE Valid result, but duplicate (was filtered by the
result filter).
FILTER_IRRELEVANT Block does not satisfy the constraints imposed
by the XQuery.
If the main evaluation result is FILTER_MORE, the function also
returns an updated result filter where the block is added to the
set of filtered replies. An implementation is not expected to
actually differenciate between the FILTER_DUPLICATE and
FILTER_IRRELEVANT return values: in both cases the block is
ignored for this query.
8.2. HELLO Blocks
For bootstrapping and peer discovery, the DHT implementation uses its
own block type called "HELLO". HELLO blocks are the only type of
block that MUST be supported by every R^5N implementation. A block
with this block type contains the peer public key of the peer that
published the HELLO together with a set of addresses of this peer.
The key of a HELLO block is the SHA-512 of the peer public key and
thus the peer's identity in the DHT.
The HELLO block type wire format is illustrated in Figure 11. A
query for block of type HELLO MUST NOT include extended query data
(XQuery). Any implementation encountering a request for a HELLO with
non-empty XQuery data MUST consider the request invalid and ignore
it.
Schanzenbach, et al. Expires 20 August 2024 [Page 38]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| PEER PUBLIC KEY |
| (32 byte) |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIGNATURE |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
/ ADDRESSES /
/ (variable length) /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 11: The HELLO Block Format.
PEER PUBLIC KEY is the public key of the peer which has generated
this HELLO.
EXPIRATION denotes the absolute 64-bit expiration date of the
content. The value specified is microseconds since midnight (0
hour), January 1, 1970, but must be a multiple of one million (so
that it can be represented in seconds in a HELLO URL). Stored in
network byte order.
ADDRESSES is a list of UTF-8 addresses (Section 2) which can be used
to contact the peer. Each address MUST be 0-terminated. The set
of addresses MAY be empty.
SIGNATURE is the signature of the HELLO. It covers a 64-bit pseudo
header derived from the information in the HELLO block. The
pseudo header includes the expiration time, a constant that
uniquely identifies the purpose of the signature, and a hash over
the addresses. The wire format is illustrated in Figure 12.
Schanzenbach, et al. Expires 20 August 2024 [Page 39]
Internet-Draft The R5N Distributed Hash Table February 2024
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| SIZE | PURPOSE |
+-----+-----+-----+-----+-----+-----+-----+-----+
| EXPIRATION |
+-----+-----+-----+-----+-----+-----+-----+-----+
| H_ADDRS |
| (64 byte) |
| |
| |
| |
| |
| |
| |
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 12: The Wire Format of the HELLO for Signing.
SIZE A 32-bit value containing the length of the signed data in
bytes in network byte order. The length of the signed data
MUST be 80 bytes.
PURPOSE A 32-bit signature purpose flag. This field MUST be 7
(in network byte order).
EXPIRATION denotes the absolute 64-bit expiration date of the
HELLO. In microseconds since midnight (0 hour), January 1,
1970 in network byte order.
H_ADDRS a SHA-512 hash over the addresses in the HELLO. H_ADDRS
is generated over the ADDRESSES field as provided in the HELLO
block using SHA-512 [RFC4634].
The HELLO block functions MUST be implemented as follows:
ValidateBlockQuery(Key, XQuery) -> RequestEvaluationResult To
validate a block query for a HELLO is to simply check that the
XQuery is empty. If it is empty, REQUEST_VALID ist returned.
Otherwise, REQUEST_INVALID.
DeriveBlockKey(Block) -> Key | NONE To derive a block key for a
HELLO is to simply hash the peer public key from the HELLO. The
result of this function is always the SHA-512 hash over the PEER
PUBLIC KEY.
ValidateBlockStoreRequest(Block) -> BlockEvaluationResult To
Schanzenbach, et al. Expires 20 August 2024 [Page 40]
Internet-Draft The R5N Distributed Hash Table February 2024
validate a block store request is to verify the EdDSA SIGNATURE
over the hashed ADDRESSES against the public key from the PEER
PUBLIC KEY field. If the signature is valid BLOCK_VALID is
returned. Otherwise BLOCK_INVALID.
SetupResultFilter(FilterSize, Mutator) -> RF The RESULT_FILTER for
HELLO blocks is implemented using a Bloom filter following the
definition from Appendix A and consists of a variable number of
bits L. L depends on the number of connected peers |E| known to
the peer creating a HELLO block from its own addresses: L is set
to the minimum of 2^18 bits (2^15 bytes) and the lowest power of 2
that is strictly larger than 2*K*|E| bits (K*|E|/4 bytes).
The k-value for the Bloom filter is 16. The elements used in the
Bloom filter consist of an XOR between the H_ADDRS field (as
computed using SHA-512 over the ADDRESSES) and the SHA-512 hash of
the MUTATOR field from a given HELLO block. The mapping function
M(H_ADDRS XOR MUTATOR) is defined as follows:
M(e = H_ADDR XOR MUTATOR) -> e as uint32[]
M is an identity function and returns the 512-bit XOR result
unmodified. This resulting byte string is interpreted as k=16
32-bit integers in network byte order which are used to set and
check the bits in B using BF-SET and BF-TEST. The 32-bit Mutator
is prepended to the L-bit Bloom filter field HELLO_BF containing B
to create the result filter for a HELLO block:
0 8 16 24 32 40 48 56
+-----+-----+-----+-----+-----+-----+-----+-----+
| MUTATOR | HELLO_BF /
+-----+-----+-----+-----+ (variable length) /
/ /
+-----+-----+-----+-----+-----+-----+-----+-----+
Figure 13: The HELLO Block Result Filter.
where:
MUTATOR The 32-bit mutator for the result filter.
HELLO_BF The L-bit Bloom filter array.
Schanzenbach, et al. Expires 20 August 2024 [Page 41]
Internet-Draft The R5N Distributed Hash Table February 2024
The MUTATOR value is used to additionally "randomize" the
computation of the Bloom filter while remaining deterministic
across peers. It is only ever set by the peer initiating the GET
request, and changed every time the GET request is repeated.
Peers forwarding GET requests MUST not change the mutator value
included in the RESULT_FILTER as they might not be able to
recalculate the result filter with a different MUTATOR value.
Consequently, repeated requests have statistically independent
probabilities of creating false-positives in a result filter.
Thus, even if for one request a result filter may exclude a result
as a false-positive match, subsequent requests are likely to not
have the same false-positives.
HELLO result filters can be merged if the Bloom filters have the
same size and MUTATOR by setting all bits to 1 that are set in
either Bloom filter. This is done whenever a peer receives a
query with the same MUTATOR, predecessor and Bloom filter size.
FilterResult(Block, Key, RF, XQuery) -> (FilterEvaluationResult,
RF') The H_ADDRS field is XORed with the SHA-512 hash of the MUTATOR
field from the HELLO block and the resulting value is checked
against the Bloom filter in RF. Consequently, HELLOs with
completely identical sets of addresses will be filtered and
FILTER_DUPLICATE is returned. Any small variation in the set of
addresses will cause the block to no longer be filtered (with high
probability) and FILTER_MORE is returned.
8.3. Persistence
An implementation SHOULD provide a local persistence mechanism for
blocks. Embedded systems that lack storage capability MAY use
connection-level signalling to indicate that they are merely a client
utilizing a DHT and are not able to participate with storage. The
local storage MUST provide the following functionality:
Store(Key, Block) Stores a block under the specified key. If an
block with identical payload exists already under the same key,
the meta data should be set to the maximum expiration time of both
blocks and use the corresponding PUTPATH (and if applicable
TRUNCATED ORIGIN) of that version of the block.
Lookup(Key) -> List of Blocks Retrieves blocks stored under the
specified key.
LookupApproximate(Key) -> List of Blocks Retrieves the blocks stored
under the specified key and any blocks under keys close to the
specified key, in order of decreasing proximity.
Schanzenbach, et al. Expires 20 August 2024 [Page 42]
Internet-Draft The R5N Distributed Hash Table February 2024
8.3.1. Approximate Search Considerations
Over time a peer may accumulate a significant number of blocks which
are stored locally in the persistence layer. Due to the expected
high number of blocks, the method to retrieve blocks close to the
specified lookup key in the LookupApproximate API must be implemented
with care with respect to efficiency.
It is RECOMMENDED to limit the number of results from the
LookupApproximate procedure to a result size which is easily
manageable by the local system.
In order to efficiently find a suitable result set, the
implementation SHOULD follow the following procedure:
1. Sort all blocks by the block key in ascending (decending) order.
The block keys are interpreted as integer.
2. Alternatingly select a block with a key larger and smaller from
the sortings. The resulting set is sorted by XOR distance. The
selection process continues until the upper bound for the result
set is reached and both sortings do not yield any closer blocks.
An implementation MAY decide to use a custom algorithm in order to
find the closest blocks in the local storage. But, especially for
more primitive approaches, such as only comparing XOR distances for
all blocks in the storage, the procedure may become ineffective for
large storages.
8.3.2. Caching Strategy Considerations
An implementation MUST implement an eviction strategy for blocks
stored in the block storage layer.
In order to ensure the freshness of blocks, an implementation MUST
evict expired blocks in favor of new blocks.
An implementation MAY preserve blocks which are often requested.
This approach can be expensive as it requires the implementation to
keep track of how often a block is requested.
An implementation MAY preserve blocks which are close to the local
peer public key.
An implementation MAY provide configurable storage quotas and adapt
its eviction strategy based on the current storage size or other
constrained resources.
Schanzenbach, et al. Expires 20 August 2024 [Page 43]
Internet-Draft The R5N Distributed Hash Table February 2024
9. Security Considerations
If an upper bound to the maximum number of neighbors in a k-bucket is
reached, the implementation MUST prefer to preserve the oldest
working connections instead of new connections. This makes Sybil
attacks less effective as an adversary would have to invest more
resources over time to mount an effective attack.
The ComputeOutDegree function limits the REPL_LVL to a maximum of 16.
This imposes an upper limit on bandwidth amplification an attacker
may achieve for a given network size and topology.
9.1. Disjoint Underlay or Application Protocol Support
We note that peers implementing disjoint sets of underlay protocols
may experience difficulties communicating (unless other peers bridge
the respective underlays). Similarly, peers that do not support a
particular application will not be able to validate application-
specific payloads and may thus be tricked into storing or forwarding
corrupt blocks.
9.2. Approximate Result Filtering
When a FindApproximate request is encountered, a peer will try to
respond with the closest block it has that is not filtered by the
result Bloom filter. Implementations MUST ensure that the cost of
evaluating any such query is reasonably small. For example,
implementations MAY consider to avoid an exhaustive search of their
database. Not doing so can lead to denial of service attacks as
there could be cases where too many local results are filtered by the
result filter.
9.3. Access Control
By design R^5N does not rely on strict admission control through the
use of either centralized enrollment servers or pre-shared keys.
This is a key distintion over protocols that do rely on this kind of
access control such as [RFC6940] which, like R^5N, provides a peer-
to-peer (P2P) signaling protocol with extensible routing and topology
mechanisms. Some decentralized applications, such as the GNU Name
System ([RFC9498]), require an open system that enables ad-hoc
participation.
Schanzenbach, et al. Expires 20 August 2024 [Page 44]
Internet-Draft The R5N Distributed Hash Table February 2024
10. IANA Considerations
IANA maintains a registry called the "Uniform Resource Identifier
(URI) Schemes" registry. The registry should be updated to include
an entry for the 'gnunet' URI scheme. IANA is requested to update
that entry to reference this document when published as an RFC.
11. GANA Considerations
11.1. Block Type Registry
GANA [GANA] is requested to create a "DHT Block Types" registry. The
registry shall record for each entry:
* Name: The name of the block type (case-insensitive ASCII string,
restricted to alphanumeric characters
* Number: 32-bit
* Comment: Optionally, a brief English text describing the purpose
of the block type (in UTF-8)
* Contact: Optionally, the contact information of a person to
contact for further information
* References: Required, references (such as an RFC) specifying the
block type and its block functions
The registration policy for this sub-registry is "First Come First
Served", as described in [RFC8126]. GANA created the registry as
follows:
Number| Name | References | Description
------+----------------+------------+-------------------------
0 ANY [This.I-D] Reserved
13 DHT_HELLO [This.I-D] Address data for a peer
Contact: r5n-registry@gnunet.org
Figure 14: The Block Type Registry.
11.2. GNUnet URI Schema Subregistry
GANA [GANA] is requested to create a "gnunet://" sub-registry. The
registry shall record for each entry:
* Name: The name of the subsystem (case-insensitive ASCII string,
restricted to alphanumeric characters)
Schanzenbach, et al. Expires 20 August 2024 [Page 45]
Internet-Draft The R5N Distributed Hash Table February 2024
* Comment: Optionally, a brief English text describing the purpose
of the subsystem (in UTF-8)
* Contact: Optionally, the contact information of a person to
contact for further information
* References: Optionally, references describing the syntax of the
URL (such as an RFC or LSD)
The registration policy for this sub-registry is "First Come First
Served", as described in [RFC8126]. GANA created this registry as
follows:
Name | References | Description
---------------+------------+-------------------------
HELLO [This.I-D] How to contact a peer.
ADDRESS N/A Network address.
Contact: gnunet-registry@gnunet.org
Figure 15: GNUnet scheme Subregistry.
11.3. GNUnet Signature Purpose Registry
GANA amended the "GNUnet Signature Purpose" registry as follows:
Purpose | Name | References | Description
--------+-----------------+------------+---------------
6 DHT PATH ELEMENT [This.I-D] DHT message routing data
7 HELLO PAYLOAD [This.I-D] Peer contact information
Figure 16: The Signature Purpose Registry Entries.
11.4. GNUnet Message Type Registry
GANA is requested to amend the "GNUnet Message Type" registry as
follows:
Type | Name | References | Description
--------+-----------------+------------+---------------
146 DHT PUT [This.I-D] Store information in DHT
147 DHT GET [This.I-D] Request information from DHT
148 DHT RESULT [This.I-D] Return information from DHT
157 HELLO Message [This.I-D] Peer contact information
Figure 17: The Message Type Registry Entries.
12. Test Vectors
Schanzenbach, et al. Expires 20 August 2024 [Page 46]
Internet-Draft The R5N Distributed Hash Table February 2024
13. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
2003, <https://www.rfc-editor.org/info/rfc3629>.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, DOI 10.17487/RFC3986, January 2005,
<https://www.rfc-editor.org/info/rfc3986>.
[RFC4634] Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
(SHA and HMAC-SHA)", RFC 4634, DOI 10.17487/RFC4634, July
2006, <https://www.rfc-editor.org/info/rfc4634>.
[RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
Specifications: ABNF", STD 68, RFC 5234,
DOI 10.17487/RFC5234, January 2008,
<https://www.rfc-editor.org/info/rfc5234>.
[RFC6940] Jennings, C., Lowekamp, B., Ed., Rescorla, E., Baset, S.,
and H. Schulzrinne, "REsource LOcation And Discovery
(RELOAD) Base Protocol", RFC 6940, DOI 10.17487/RFC6940,
January 2014, <https://www.rfc-editor.org/info/rfc6940>.
[RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for
Writing an IANA Considerations Section in RFCs", BCP 26,
RFC 8126, DOI 10.17487/RFC8126, June 2017,
<https://www.rfc-editor.org/info/rfc8126>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8324] Klensin, J., "DNS Privacy, Authorization, Special Uses,
Encoding, Characters, Matching, and Root Structure: Time
for Another Look?", RFC 8324, DOI 10.17487/RFC8324,
February 2018, <https://www.rfc-editor.org/info/rfc8324>.
[RFC9498] Schanzenbach, M., Grothoff, C., and B. Fix, "The GNU Name
System", RFC 9498, DOI 10.17487/RFC9498, November 2023,
<https://www.rfc-editor.org/info/rfc9498>.
Schanzenbach, et al. Expires 20 August 2024 [Page 47]
Internet-Draft The R5N Distributed Hash Table February 2024
[ed25519] Bernstein, D., Duif, N., Lange, T., Schwabe, P., and B.
Yang, "High-Speed High-Security Signatures", 2011,
<http://link.springer.com/
chapter/10.1007/978-3-642-23951-9_9>.
[GANA] GNUnet e.V., "GNUnet Assigned Numbers Authority (GANA)",
April 2020, <https://gana.gnunet.org/>.
14. Informative References
[R5N] Evans, N. S. and C. Grothoff, "R5N: Randomized recursive
routing for restricted-route networks", 2011,
<https://doi.org/10.1109/ICNSS.2011.6060022>.
[Kademlia] Maymounkov, P. and D. Mazieres, "Kademlia: A peer-to-peer
information system based on the xor metric.", 2002,
<http://css.csail.mit.edu/6.824/2014/papers/kademlia.pdf>.
[cadet] Polot, B. and C. Grothoff, "CADET: Confidential ad-hoc
decentralized end-to-end transport", 2014,
<https://doi.org/10.1109/MedHocNet.2014.6849107>.
Appendix A. Bloom filters in R^5N
R^5N uses Bloom filters in several places. This section gives some
general background on Bloom filters and defines functions on this
data structure shared by the various use-cases in R^5N.
A Bloom filter (BF) is a space-efficient probabilistic datastructure
to test if an element is part of a set of elements. Elements are
identified by an element ID. Since a BF is a probabilistic
datastructure, it is possible to have false-positives: when asked if
an element is in the set, the answer from a BF is either "no" or
"maybe".
Bloom filters are defined as a string of L bits. The bits are
initially always empty, meaning that the bits are set to zero. There
are two functions which can be invoked on the Bloom filter "bf": BF-
SET(bf, e) and BF-TEST(bf, e) where "e" is an element that is to be
added to the Bloom filter or queried against the set.
A mapping function M is used to map each ID of each element from the
set to a subset of k bits. In the original proposal by Bloom, M is
non-injective and can thus map the same element multiple times to the
same bit. The type of the mapping function can thus be described by
the following mathematical notation:
Schanzenbach, et al. Expires 20 August 2024 [Page 48]
Internet-Draft The R5N Distributed Hash Table February 2024
------------------------------------
# M: E->B^k
------------------------------------
# L = Number of bits
# B = 0,1,2,3,4,...L-1 (the bits)
# k = Number of bits per element
# E = Set of elements
------------------------------------
Example: L=256, k=3
M('element-data') = {4,6,255}
Figure 18: Bloom filter mapping function.
When adding an element to the Bloom filter bf using BF-SET(bf,e),
each integer n of the mapping M(e) is interpreted as a bit offset n
mod L within bf and set to 1.
When testing if an element may be in the Bloom filter bf using BF-
TEST(bf,e), each bit offset n mod L within bf MUST have been set to
1. Otherwise, the element is not considered to be in the Bloom
filter.
Appendix B. Overlay Operations
An implementation of this specification commonly exposes the two
overlay operations "GET" and "PUT". The following are non-normative
examples of APIs for those operations. Their behaviour is described
prosaically in order to give implementers a fuller picture of the
protocol.
B.1. GET operation
A basic GET operation interface may be exposed as:
GET(Query-Key, Block-Type) -> Results as List
The procedure typically takes at least two arguments to initiate a
lookup:
QueryKey: is the 512-bit key to look for in the DHT.
Block-Type: is the type of block to look for, possibly "any".
The GET procedure may allow a set of optional parameters in order to
control or modify the query:
Replication-Level: is an integer which controls how many nearest
Schanzenbach, et al. Expires 20 August 2024 [Page 49]
Internet-Draft The R5N Distributed Hash Table February 2024
peers the request should reach.
Flags: is a 16-bit vector which indicates certain processing
requirements for messages. Any combination of flags as defined in
Section 7.1.1 may be specified.
eXtended-Query (XQuery): is medatadata which may be required
depending on the respective Block-Type. A Block-Type must define
if the XQuery can or must be used and what the specific format of
its contents should be. Extended queries are in general used to
implement domain-specific filters. These might be particularly
useful in combination with FindApproximate to add a well-defined
filter by an application-specific distance. Regardless, the DHT
does not define any particular semantics for an XQuery. See also
Section 8.
Result-Filter: is data for a Block-type-specific filter which allows
applications to indicate results which are not relevant anymore to
the caller (see Section 7.4.2).
The GET procedure should be implemented as an asynchronous operation
that returns individual results as they are found in the DHT. It
should terminate only once the application explicitly cancels the
operation. A single result commonly consists of:
Block-Type: is the desired type of block in the result.
Block-Data: is the application-specific block payload. Contents are
specific to the Block-Type.
Block-Expiration: is the expiration time of the block. After this
time, the result should no longer be used.
Key: is the key under which the block was stored. This may be
different from the key that was queried if the flag
FindApproximate was set.
GET-Path: is a signed path of the public keys of peers which the
query traversed through the network. The DHT will try to make the
path available if the RecordRoute flag was set by the application
calling the PUT procedure. The reported path may have been
silently truncated from the beginning.
PUT-Path: is a signed path of the public keys of peers which the
Schanzenbach, et al. Expires 20 August 2024 [Page 50]
Internet-Draft The R5N Distributed Hash Table February 2024
result message traversed. The DHT will try to make the path
available if the RecordRoute flag was set for the GET procedure.
The reported path may have been silently truncated from the
beginning. As the block was cached by the node at the end of this
path, this path is more likely to be stale compared to the GET-
Path.
B.2. PUT operation
A PUT operation interface may be exposed as:
PUT(Key, Block-Type, Block-Expiration, Block-Data)
The procedure typically takes at least four parameters:
Key: is the key under which to store the block.
Block-Type: is the type of the block to store.
Block-Expiration: specifies when the block should expire.
Block-Data: is the application-specific payload of the block to
store.
The PUT procedure may allow a set of optional parameters in order to
control or modify the query:
Replication-Level: is an integer which controls how many nearest
peers the request should reach.
Flags: is a bit-vector which indicates certain processing
requirements for messages. Any combination of flags as defined in
Section 7.1.1 may be specified.
The PUT procedure does not necessarily yield any information.
Appendix C. HELLO URLs
The general format of a HELLO URL uses "gnunet://" as the scheme,
followed by "hello/" for the name of the GNUnet subsystem, followed
by "/"-separated values with the GNS Base32 encoding ([RFC9498]) of
the peer public key, a Base32-encoded EdDSA signature, and an
expiration time in seconds since the UNIX Epoch in decimal format.
After this a "?" begins a list of key-value pairs where the key is
the URI scheme of one of the peer's addresses and the value is the
URL-escaped payload of the address URI without the "://".
Schanzenbach, et al. Expires 20 August 2024 [Page 51]
Internet-Draft The R5N Distributed Hash Table February 2024
The general syntax of HELLO URLs specified using Augmented Backus-
Naur Form (ABNF) of [RFC5234] is:
hello-URL = "gnunet://hello[:version]/" meta [ "?" addrs ]
version = *(DIGIT)
meta = pid "/" sig "/" exp
pid = *bchar
sig = *bchar
exp = *DIGIT
addrs = addr *( "&" addr )
addr = addr-name "=" addr-value
addr-name = scheme
addr-value = *pchar
bchar = *(ALPHA / DIGIT)
Figure 19
'scheme' is defined in [RFC3986] in Section 3.1. 'pchar' is defined
in [RFC3986], Appendix A.
For example, consider the following URL:
gnunet://hello/1MVZC83SFHXMADVJ5F4
S7BSM7CCGFNVJ1SMQPGW9Z7ZQBZ689ECG/
CFJD9SY1NY5VM9X8RC5G2X2TAA7BCVCE16
726H4JEGTAEB26JNCZKDHBPSN5JD3D60J5
GJMHFJ5YGRGY4EYBP0E2FJJ3KFEYN6HYM0G/
1708333757?foo=example.com&bar+baz=1.2.3.4%3A5678%2Ffoo
Figure 20
It specifies that the peer with the _pid_ "1MVZ..." is reachable via
"foo" at "example.com" and "bar+baz" at "1.2.3.4" on port 5678 until
1708333757 seconds after the Epoch. Note that "foo" and "bar+baz"
here are underspecified and just used as a simple example. In
practice, the _addr-name_ refers to a scheme supported by a DHT
underlay.
Authors' Addresses
Martin Schanzenbach
Fraunhofer AISEC
Lichtenbergstrasse 11
85748 Garching
Germany
Email: martin.schanzenbach@aisec.fraunhofer.de
Schanzenbach, et al. Expires 20 August 2024 [Page 52]
Internet-Draft The R5N Distributed Hash Table February 2024
Christian Grothoff
Berner Fachhochschule
Hoeheweg 80
CH-2501 Biel/Bienne
Switzerland
Email: grothoff@gnunet.org
Bernd Fix
GNUnet e.V.
Boltzmannstrasse 3
85748 Garching
Germany
Email: fix@gnunet.org
Schanzenbach, et al. Expires 20 August 2024 [Page 53]