This memo describes the crash recovery mechanism for the Open Digital Asset Protocol (ODAP), called ODAP-2PC. The goal is to assure that gateways running ODAP can recover from crashes and thus preserve the consistency of an asset across ledgers (i.e., no double spend occurs). This draft includes the description of the messaging and logging flow necessary for the correct functioning of ODAP-2PC.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 26 November 2021.¶
Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
Gateway systems that perform virtual asset transfers among DLTs must possess a degree of resiliency and fault tolerance in the face of possible crashes. Accounting for the possibility of crashes is particularly important to guarantee asset consistency across DLTs.¶
ODAP-2PC [BVCH21] uses 2PC, an atomic commitment protocol (ACP). 2PC considers two roles: a Coordinator that manages the execution of the protocol and Participants that manage the resources that must be kept consistent. The source gateway plays the ACP role of Coordinator, and the recipient gateway plays the Participant role in relay mode. Gateways exchange messages corresponding to the protocol execution, generating log entries for each one.¶
Log entries are organized into logs. Logs enable either the same or other backup gateways to resume any phase of ODAP. This log can also serve as an accountability tool in case of disputes. Another key component is an atomic commit protocol (ACP) that guarantees that the source and target DLTs are modified consistently (atomicity) and permanently (durability), e.g., that assets that are taken from the source DLT are persisted into the recipient DLT.¶
Log entries are thus the basis for satisfying one of the key deployment requirements of gateways for asset transfers: a high degree of availability. In this document, we consider two common strategies to increase availability: (1) supporting the recovery of the gateways and (2) employing backup gateways with the ability to resume a stalled transfer.¶
This memo proposes: (i) the logging model of ODAP-2PC; (ii) the log storage types; (iii) the log storage API; (iv) the log entry format; and (v) the recovery and rollback procedures.¶
The following terminology is used in this document:¶
Gateways store logs to map state. There are two types of logs: a private log that stores the current state, and a shared log that stores the joint state between two gateways. Using a shared, decentralized log can alleviate trust assumptions between gateways by providing an agreed-upon source of truth.¶
We consider the log file to be a stack of log entries. Each time a log entry is added, it goes to the top of the stack (the highest index).¶
To manipulate the log, we define a set of log primitives that translate log entry requests from a process into log entries, realized by the log storage API (in the context of ODAP, Section 3.5):¶
From these primitives, other functions can be built:¶
Example 2.1 shows a simplified log referring to the Transfer Initiation flow of ODAP. Each log entry (simplified here; the full definition is in Section 3) is composed of metadata (phase, sequence number) and one attribute from the payload (operation). Operations map behavior to state (see Section 3).¶
The following table illustrates the log storage API. The Function column describes the primitive supported by the log storage API. The Parameters column specifies the parameters given to the endpoint as query parameters. The Endpoint column specifies the endpoint mapping a certain log primitive. The Returns column specifies what the contents of "response_data" mean. This last field is illustrated by the Response Example column.¶
Example 2.1 shows the sequence of logging operations over part of the first phase of ODAP (simplified):¶
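A possible rendering of Example 2.1 is sketched below; the concrete operation identifiers (init-validate, exec-validate, and so on) are illustrative assumptions rather than normative values:¶

   [
     {"phase": "transfer-initiation", "sequence_number": 1,
      "operation": "init-validate"},
     {"phase": "transfer-initiation", "sequence_number": 2,
      "operation": "exec-validate"},
     {"phase": "transfer-initiation", "sequence_number": 3,
      "operation": "done-validate"},
     {"phase": "transfer-initiation", "sequence_number": 4,
      "operation": "ack-validate"}
   ]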
Different log storage types (or log support) exist.¶
The private log can be kept in several supports: 1) off-chain storage (with the possibility of a hash of the logs being stored on-chain), where logs are stored on the hard drive of the computer system performing the role of a gateway; 2) cloud storage; 3) on-chain storage, i.e., using a DLT. Shared logs can use supports 2 and 3.¶
Saving logs locally is faster than saving them on the respective ledger but delivers weaker integrity and availability guarantees. Saving log entries on a DLT may slow down the protocol because issuing a transaction is several orders of magnitude slower than writing on disk or accessing a cloud service.¶
We assume the storage service used provides the means necessary to assure the logs' confidentiality and integrity, both at rest and in transit. The service must provide an authentication and authorization scheme, e.g., based on OAuth and OIDC [OIDC], and use secure channels based on TLS/HTTPS [TLS].¶
The log storage API allows developers to abstract the log storage support, providing a standardized way to interact with logs (e.g., relational vs. non-relational, local vs. on-chain). It also handles access control if needed.¶
The following table maps the respective return values and response examples:¶
The log storage API MUST respond with return codes indicating the failure (error 5XX) or success (200) of the operation. The application may carry out further operations in the future to determine the ultimate status of the operation.¶
The log storage API response is in JSON format and contains two fields: 1) success: true if the operation was successful, and 2) response_data: contains the payload of the response generated by the log storage API.¶
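For instance, a successful write of a log entry could yield a response such as the following (the response_data value shown is merely illustrative):¶

   {
     "success": true,
     "response_data": "log entry written at index 5"
   }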
The log entries are stored by a gateway in its log, and they capture gateway operations. Entries account for the current status of one of the three ODAP flows: Transfer Initiation flow, Lock-Evidence flow, and Commitment Establishment flow.¶
The recommended format for log entries is JSON [xxx], with protocol-specific mandatory fields and support for a free-format field carrying plaintext or encrypted payloads directed at the DLT gateway or an underlying DLT. Although the recommended format is JSON, other formats can be used (e.g., XML).¶
The mandatory fields of a log entry, generated by ODAP, are:¶
In addition to the attributes that belong to ODAP's schema, each log entry REQUIRES the following attributes:¶
Optional field entries are:¶
Example of a log entry created by G1, corresponding to locking an asset (phase 2.3 of the ODAP protocol):¶
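A sketch of what such an entry could look like is shown below; the field names and values are illustrative assumptions, not normative:¶

   {
     "session_id": "4eb424c7",
     "seq_number": 7,
     "phase": "lock-evidence",
     "operation": "exec-lock",
     "creator": "G1",
     "timestamp": "2021-05-26T10:15:30Z",
     "payload_hash": "0x3b2a..."
   }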
Example of a log entry created by G2, acknowledging G1 locking an asset (phase 2.4 of the ODAP protocol):¶
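Similarly, the acknowledgment entry could be sketched as follows (field names and values are, again, illustrative):¶

   {
     "session_id": "4eb424c7",
     "seq_number": 8,
     "phase": "lock-evidence",
     "operation": "ack-lock",
     "creator": "G2",
     "timestamp": "2021-05-26T10:15:31Z",
     "payload_hash": "0x9c41..."
   }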
This section defines general considerations about crash recovery. ODAP-2PC is the application of the gateway crash recovery mechanism to asset transfers, across all ODAP phases.¶
We assume gateways fail by crashing, i.e., by becoming silent; we do not consider arbitrary or Byzantine faults. We assume authenticated reliable channels obtained using TLS/HTTPS [TLS]. To recover from these crashes, gateways store data about the current step of the protocol in persistent storage. This allows the system to recover by retrieving from the log the first step that may have failed. We consider two recovery models:¶
In Self-healing mode, when a gateway restarts after a crash, it reads the state from the log and continues executing the protocol from that point on. We assume the gateway does not lose its long-term keys (public-private key pair) and can reestablish all TLS connections.¶
In Primary-backup mode, we assume that after a period T of the primary gateway failure, a backup gateway detects that failure unequivocally and takes the role of the primary gateway. The failure is detected using heartbeat messages and a conservative value for T. The backup gateway does virtually the same as the gateway in self-healing mode: reads the log and continues the process. The difference is that the log must be shared between the primary and the backup gateways. If there is more than one backup, a leader-election protocol may be executed to decide which backup will take the primary role.¶
Gateways can crash at several points of the protocol.¶
In 2PC and 3PC, recovery requires that the protocol steps are recorded in a log immediately before sending a message and immediately after receiving a message. When a node crashes:¶
Upon recovery, the recovered node attempts to retrieve the most recent log of operations. Two situations might occur. If a gateway keeps both a local log and a shared log, the crashed gateway attempts to update its local log using getLogDiff against the shared log.¶
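For instance, a call to a hypothetical getLogDiff endpoint of the log storage API (e.g., GET /logs/{session_id}/diff with the last known sequence number as a query parameter) could return the missing entries as follows; the endpoint, parameter, and field names are assumptions:¶

   {
     "success": true,
     "response_data": [
       {"seq_number": 8, "phase": "lock-evidence",
        "operation": "ack-lock"},
       {"seq_number": 9, "phase": "lock-evidence",
        "operation": "done-lock"}
     ]
   }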
If there is no shared log, the crashed gateway needs to synchronize itself with the counterparty gateway by querying it with a recovery message containing the latest log entry held before the crash. This message allows the non-crashed gateway to collect the log entries potentially missing from the crashed gateway's log. After that, the non-crashed gateway shares those entries with the now-recovered gateway.¶
The recovered gateway can now reconstruct the updated log and derive the current state of the asset transfer. For each phase:¶
For every step of this phase, a log entry is written immediately before an operation is executed, and another log entry is written when the operation finishes its execution. If a gateway crashes, upon recovery it sends a special RECOVER message to the counterparty gateway. The counterparty gateway derives the latest log entry the recovered gateway holds and calculates the difference against its own log (RECOVER-UPDATE). It then sends that difference back to the recovered gateway, which updates its own log. After that, the recovered gateway sends a recovery confirmation message (the recover-update ack), and the counterparty gateway sends the respective acknowledgment (RECOVER-ACK). The gateways now share the same log and can proceed with their operation. Note that if the shared log is blockchain or cloud based, the same flow applies, but the recovered gateway derives the new log rather than the counterparty gateway.¶
If a crash occurs during the lock-evidence flow, the procedure is the same as the transfer initiation flow. However¶
This flow requires changes in the distributed ledgers, which implies issuing transactions against them. As transactions cannot be undone on blockchains, we use a rollback list, keeping a history of the issued transactions. If a crash occurs and requires reverting state, transactions with the opposite effect of those present in the rollback list are issued.¶
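As an illustration, each rollback list entry could pair an issued transaction with the compensating transaction that reverts its effect; the structure below is an assumption, not a normative format:¶

   {
     "session_id": "4eb424c7",
     "rollback_list": [
       {
         "issued_tx": "lock asset A1 on source DLT",
         "compensating_tx": "unlock asset A1 on source DLT"
       }
     ]
   }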
ODAP-2PC messages are used to recover from crashes at the several ODAP phases. These messages inform gateways of the current state of a recovery procedure. ODAP-2PC messages follow the log format from Section 4.¶
A recover message is sent from the crashed gateway to the counterparty gateway, carrying its most recent state. This message type is encoded in the recovery message field of an ODAP log.¶
The parameters of the recovery message payload consist of the following:¶
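A sketch of a RECOVER message payload is given below; the parameter names are illustrative assumptions:¶

   {
     "message_type": "RECOVER",
     "session_id": "4eb424c7",
     "odap_phase": "lock-evidence",
     "sequence_number": 7,
     "is_backup": false,
     "signature": "..."
   }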
The recover update message is sent by the counterparty gateway after receiving a recover message from a recovered gateway. The recovered gateway informs the counterparty of its current state (via the current state of its log). The counterparty gateway then calculates the difference between the log entry corresponding to the sequence number received from the recovered gateway and its own latest sequence number (corresponding to its latest log entry). This state is sent to the recovered gateway.¶
The parameters of the recover update payload consist of the following:¶
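An illustrative RECOVER-UPDATE payload, with assumed parameter names, could be:¶

   {
     "message_type": "RECOVER-UPDATE",
     "session_id": "4eb424c7",
     "recovered_sequence_number": 7,
     "current_sequence_number": 9,
     "log_diff": ["<log entry 8>", "<log entry 9>"],
     "signature": "..."
   }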
The recover-update ack message (response to RECOVER-UPDATE) states whether the recovered gateway's log has been successfully updated. If inconsistencies are detected, the recovered gateway instead initiates a dispute (RECOVER-DISPUTE message).¶
The parameters of this message consist of the following:¶
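An illustrative payload for this message, with assumed parameter names, could be:¶

   {
     "message_type": "RECOVER-UPDATE-ACK",
     "session_id": "4eb424c7",
     "success": true,
     "received_sequence_numbers": [8, 9],
     "signature": "..."
   }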
The recover-ack message is sent by the counterparty gateway to the recovered gateway acknowledging that the state is synchronized.¶
The parameters of this message consist of the following:¶
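An illustrative RECOVER-ACK payload, with assumed parameter names, could be:¶

   {
     "message_type": "RECOVER-ACK",
     "session_id": "4eb424c7",
     "synchronized_log_hash": "0x7f2c...",
     "signature": "..."
   }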
There are several situations when a crash may occur.¶
The following figure represents the source gateway (G1) crashing before it issued an init command to the recipient gateway (G2).¶
The second scenario requires further synchronization (figure below). At the retrieval of the latest log entry, G1 notices its log is outdated. It updates its log after the necessary validation and then communicates its recovery to G2. The process then continues as defined.¶
We assume a trusted, secure communication channel between gateways (i.e., messages cannot be spoofed and/or altered by an adversary) using TLS 1.3 or higher. Clients support "acceptable" credential schemes such as OAuth 2.0.¶
The present protocol is crash fault-tolerant, meaning that it handles gateways that crash for several reasons (e.g., power outage). The present protocol does not support Byzantine faults, where gateways can behave arbitrarily (including being malicious). This implies that both gateways are considered trusted. We assume logs are not tampered with or lost.¶
Log entries need integrity, availability, and confidentiality guarantees, as they are an attractive point of attack [BVC19]. Every log entry contains a hash of its payload to guarantee integrity. If extra guarantees are needed (e.g., non-repudiation), a log entry might be signed by its creator. Availability is guaranteed by the usage of the log storage API that connects a gateway to a dependable storage (local, external, or DLT-based). Each underlying storage provides different guarantees. Access control can be enforced via the access control profile that can be associated with each log, i.e., the profile can be resolved, indicating who can access the log entry and under which conditions. Access control profiles can be implemented with access control lists for simple authorization. The authentication of the entities accessing the logs is done at the log storage API level (e.g., username+password authentication in local storage vs. blockchain-based access control in a DLT).¶
For extra guarantees, the nodes running the log storage API (or the gateway nodes themselves) can be protected by hardening technologies such as Intel SGX [CD16].¶