Network Working Group D. Lou
Internet-Draft L. Iannone
Intended status: Experimental Y. Zhou
Expires: 26 August 2023 J. Yang
C. Zhang
Huawei
22 February 2023
Signaling In-Network Computing operations (SINC)
draft-zhou-rtgwg-sinc-00
Abstract
This memo introduces "Signaling In-Network Computing operations"
(SINC), a mechanism to enable signaling in-network computing
operations on data packets in specific scenarios like NetReduce,
NetDistributedLock, NetSequencer, etc. In particular, this solution
allows computational parameters, to be used in conjunction with the
payload, to be flexibly communicated to in-network SINC-enabled
devices in order to perform computing operations.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 26 August 2023.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
Lou, et al. Expires 26 August 2023 [Page 1]
Internet-Draft SINC February 2023
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
2. Requirements Language
3. SINC Relevant Use Cases
  3.1. NetReduce
  3.2. NetDistributedLock
  3.3. NetSequencer
4. In-Network Generic Operations
5. SINC Framework Overview
6. Data Operation Mode
  6.1. Individual Computing Mode
  6.2. Batch Computing Mode
7. SINC Header
8. SINC Control Plane Requirements
9. Security Considerations
10. IANA Considerations
11. Acknowledgements
12. References
  12.1. Normative References
  12.2. Informative References
Appendix A. Computing Capability Operation abstraction
Authors' Addresses
1. Introduction
According to the original design, the Internet performs just "store
and forward" of packets, and leaves more complex operations at the
end-points. However, new emerging applications could benefit from
in-network computing to improve the overall system efficiency
([GOBATTO], [ZENG]).
The formation of the COmputing In-Network (COIN) Research Group
[COIN], in the IRTF, encourages people to explore this emerging
technology and its impact on the Internet architecture. The "Use
Cases for In-Network Computing" document [I-D.irtf-coinrg-use-cases]
introduces some use cases to demonstrate how real applications can
benefit from COIN and show essential requirements demanded by COIN
applications.
Recent research has shown that network devices undertaking some
computing tasks can greatly improve network and application
performance in some scenarios, such as aggregating path-computing
[NetReduce], key-value (K-V) cache [NetLock], and strong consistency
[GTM]. Their implementations mainly rely on programmable network
devices, using P4 [P4] or other languages. Given such heterogeneity
of scenarios, it is desirable to have a generic and flexible
framework that can explicitly signal the computing operation to be
performed by network devices and that is applicable to many use
cases, enabling easier deployment.
This document specifies such a Signaling In-Network Computing (SINC)
framework for, as the name states, in-network computing operations.
The computing functions are hosted on network devices, which, in this
memo, are generally referred to as SINC switches/routers.
2. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] and [RFC8174] when, and only when, they appear in all
capitals, as shown here.
3. SINC Relevant Use Cases
A few relevant use cases, namely NetReduce, NetDistributedLock, and
NetSequencer, are described hereafter in order to help understand
the requirements for a framework. Such a framework should be
generic enough to accommodate a large variety of use cases besides
the ones described in this document.
3.1. NetReduce
Over the last decade, the rapid development of Deep Neural Networks
(DNN) has greatly improved the performance of many Artificial
Intelligence (AI) applications like computer vision and natural
language processing. However, DNN training is a compute-intensive
and time-consuming task whose computational demand has been
increasing exponentially over the past 10 years (doubling every 3.4
months according to [OPENAI]). Scale-up techniques concentrating on
the computing capability of a single device cannot meet this
demand. Distributed DNN training approaches with synchronous data
parallelism, like Parameter Server [PARAHUB] and All-Reduce
[MGWFBP], are commonly employed in practice. These approaches, in
turn, become increasingly network-bound, since communication
becomes a bottleneck at scale.
Compared to host-oriented solutions, in-network aggregation
approaches like SwitchML [SwitchML] and SHARP [SHARP] could
potentially reduce the bandwidth needed for data aggregation to
nearly half, by offloading gradient aggregation from the host to
network switches. The SwitchML solution uses UDP for network
transport. The system relies solely on application-layer logic to
trigger retransmission on packet loss, which leads to extra latency
and reduces the training performance. The SHARP solution, on the
contrary, uses Remote Direct Memory Access (RDMA) to provide reliable
transmission [ROCEv2]. As the InfiniBand (IB) technology requires
specific hardware support, this solution is not very cost-effective.
NetReduce [NetReduce] does not depend on dedicated hardware and
provides a general in-network aggregation solution that is suitable
for Ethernet networks.
3.2. NetDistributedLock
In the majority of distributed systems, the lock primitive is a
widely used concurrency control mechanism. For large distributed
systems, there is usually a dedicated lock manager that nodes contact
to gain read and/or write permissions for a resource. The lock
manager is often abstracted as Compare And Swap (CAS) or
Fetch-and-Add (FA) operations.
The lock manager typically runs on a server, so its performance is
limited by the speed of disk I/O transactions. When the load
increases, for instance in the case of database transactions
processed on a single node, the lock manager becomes a major
performance bottleneck, consuming nearly 75% of transaction time
[OLTP]. Multi-node distributed lock processing adds the
communication latency between nodes, which makes the performance
even worse. Therefore, offloading the lock manager function from the
server to the network switch might be a better choice, since the
switch is capable of managing the lock function efficiently.
Meanwhile, it frees the server for other computation tasks.
The test results in NetLock [NetLock] show that the lock manager
running on a switch is able to answer 100 million requests per
second, nearly 10 times more than what a lock server can do.
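As a rough illustration of how a lock manager can be reduced to a CAS
operation, the following Python sketch models a switch-resident lock
table. The class name LockTable, the value 0 for "free", and the use
of non-zero host IDs as owner values are assumptions made for this
example only; they are not defined by this document or by [NetLock].

```python
class LockTable:
    """Toy model of a switch-resident lock table (hypothetical)."""

    FREE = 0  # assumed encoding: 0 means "unlocked"

    def __init__(self):
        self._state = {}  # lock key -> owner id (FREE if unlocked)

    def cas(self, key, expected, new):
        """Conventional compare-and-swap; always returns the old value."""
        old = self._state.get(key, self.FREE)
        if old == expected:
            self._state[key] = new
        return old

    def acquire(self, key, host_id):
        # The lock is granted only if it was FREE before the swap.
        return self.cas(key, self.FREE, host_id) == self.FREE

    def release(self, key, host_id):
        # Only the current owner can release the lock.
        return self.cas(key, host_id, self.FREE) == host_id


table = LockTable()
granted_a = table.acquire("row-42", host_id=1)        # lock is free
granted_b = table.acquire("row-42", host_id=2)        # already held by host 1
released = table.release("row-42", host_id=1)
granted_b_retry = table.acquire("row-42", host_id=2)  # free again
```

Because each request is answered from a single read-modify-write on
one table entry, this kind of logic maps naturally onto a switch
pipeline.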
3.3. NetSequencer
Transaction managers are centralized solutions to guarantee
consistency for distributed transactions, such as GTM in Postgres-XL
([GTM], [CALVIN]). However, as a centralized module, transaction
managers have become a bottleneck in large-scale high-performance
distributed systems. The work by Kalia et al. [HPRDMA] introduces a
server-based networked sequencer, which is a kind of task manager
assigning monotonically increasing sequence numbers to transactions.
In [HPRDMA], the authors show that the maximum throughput is 122
million requests per second (Mrps), at the cost of an increased
average latency. This bounded throughput will impact the scalability
of distributed systems. The authors also examine the bottlenecks
for various optimization methods, including CPU, DMA bandwidth, and
PCIe RTT, which are introduced by the CPU-centric architecture.
For a programmable switch, a sequencer is a rather simple operation,
and the pipeline architecture can avoid such bottlenecks. It is worth
implementing a switch-based sequencer with a performance goal of
hundreds of Mrps and latency on the order of microseconds.
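The sequencer described above boils down to a single Fetch-and-Add on
one counter. A minimal Python sketch is given below; the class name
Sequencer and the starting value 0 are illustrative assumptions, and
counter width and wrap-around are deliberately left out.

```python
class Sequencer:
    """Toy model of a switch-based sequencer (hypothetical)."""

    def __init__(self, start=0):
        self._counter = start

    def fetch_and_add(self, increment=1):
        # Return the current value, then advance the counter.
        old = self._counter
        self._counter += increment
        return old


seq = Sequencer()
# Five transaction requests, possibly from different hosts.
numbers = [seq.fetch_and_add() for _ in range(5)]
```

Each request receives a distinct, monotonically increasing number,
whichever host issued it.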
4. In-Network Generic Operations
The COIN use case draft [I-D.irtf-coinrg-use-cases] illustrates some
general requirements for scenarios like in-network control and
distributed AI, to which the aforementioned use cases belong. One of
the requirements is that any in-network computing system must provide
means to specify the constraints for placing execution logic in
certain logical execution points (and their associated physical
locations). In the case of NetReduce, NetDistributedLock, and
NetSequencer, the data aggregation, lock management, and sequence
number generation functions can be offloaded onto the in-network
device. It can be observed that those functions are based on "simple"
and "generic" operators, as shown in Table 1. Programmable switches are
capable of performing basic operations by executing one or more
operators, without impacting the forwarding performance ([NetChain],
[ERIS]).
+==============+===============+=================================+
| Use Case     | Operation     | Description                     |
+==============+===============+=================================+
| NetReduce    | Sum value     | The in-network device sums the  |
|              | (SUM)         | data together and outputs the   |
|              |               | resulting value.                |
+--------------+---------------+---------------------------------+
| NetLock      | Compare And   | By comparing the request with   |
|              | Swap or       | the status of its own lock, the |
|              | Fetch-and-Add | in-network device indicates     |
|              | (CAS or FA)   | whether the host has acquired   |
|              |               | the lock. Through CAS and FA,   |
|              |               | hosts can implement shared and  |
|              |               | exclusive locks.                |
+--------------+---------------+---------------------------------+
| NetSequencer | Fetch-and-Add | The in-network device provides  |
|              | (FA)          | a monotonically increasing      |
|              |               | counter number for the host.    |
+--------------+---------------+---------------------------------+
Table 1: Example of in-network operations.
5. SINC Framework Overview
This section describes the various elements of the SINC framework and
explains how they work together.
The SINC protocol and extensions are designed for deployment in
limited domains, such as a data center network, rather than
deployment across the open Internet. The requirements and semantics
are specifically limited, as defined in the previous sections.
Figure 1 shows the overall SINC framework, consisting of Hosts, the
SINC Ingress Proxy, SINC switch/router (SW/R), the SINC Egress Proxy
and normal switches/routers (if any).
 +---------+                                            +---------+
 | Host A  |                                            | Host B  |
 +---------+                                            +---------+
      |                                                      |
      |                                                      |
+------------+   +------+   +-----------+   +------+   +-----------+
|SINC Ingress|   |      |   |           |   |      |   |SINC Egress|
|Proxy       |-->| SW/R |-->| SINC SW/R |-->| SW/R |-->|Proxy      |
+------------+   +------+   +-----------+   +------+   +-----------+
Figure 1: General SINC deployment.
In the SINC domain, a host MUST be SINC-aware. It defines the data
operation to be executed. However, it does not need to be aware of
where the operation will be executed and how the traffic will be
steered in the network. The host sends out packets with a SINC
header containing the definition and parameters of data operations.
The SINC header could be placed directly after the transport layer,
before the computing data as part of the payload. However, the SINC
header can also potentially be positioned at layer 4, layer 3, or
even layer 2, depending on the network context of the applications
and the deployment considerations. This will be discussed in further
detail in [I-D.zhou-sinc-deployment-considerations].
The SINC proxies are responsible for encapsulating/decapsulating
packets in order to steer them through the right network path and
nodes. The SINC proxies may or may not be collocated with hosts.
The SINC Ingress Proxy encapsulates and forwards packets containing a
SINC header, to the right node(s) with SINC operation capabilities.
Such an operation may involve the use of protocols like Service
Function Chaining (SFC [RFC7665]), LISP [RFC9300], Geneve [RFC8926],
or even MPLS [RFC3031]. Based on the definition of the required data
processing and the network capabilities, the SINC ingress proxy can
determine whether the data processing defined in the SINC header
could be executed in a single node or in multiple nodes. The SINC
Egress Proxy is responsible for decapsulating packets before
forwarding them to the destination host.
The SINC switch/router is the node equipped with in-network computing
capabilities. Upon receiving a SINC packet, the SINC switch/router
data-plane processes the SINC header, executes required operations,
updates the payload with results (if necessary) and forwards the
packet to the destination.
The SINC workflow is as follows:
1. Host A transmits a packet with the SINC header and data to the
SINC Ingress proxy.
2. The SINC Ingress proxy encapsulates and forwards the original
packet to one or more SINC switches/routers.
3. The SINC switch/router verifies the source, checks the
integrity of the data and performs the required data processing
defined in the SINC header. When the computing is done, the
payload is updated with the result, if necessary, and the packet
is then forwarded to the SINC Egress proxy.
4. When the packet reaches the SINC Egress Proxy, the encapsulation
will be removed and the inner packet will be forwarded to the
final destination (Host B).
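The workflow steps can be sketched end to end in a few lines of
Python. This is only a toy model under stated assumptions: the
classes SincPacket and Encapsulated, the node name "sinc-sw1", and
the dictionary of operations are hypothetical, the encapsulation is
reduced to a wrapper object, and source verification and integrity
checks are omitted.

```python
from dataclasses import dataclass


@dataclass
class SincPacket:
    op: str          # data operation named in the SINC header
    done: bool       # Done flag (D)
    payload: list    # computing data
    dst: str         # final destination host


@dataclass
class Encapsulated:
    outer_dst: str   # SINC switch/router chosen by the ingress proxy
    inner: SincPacket


def ingress_proxy(pkt, sinc_node="sinc-sw1"):
    # Step 2: encapsulate and steer towards a SINC-capable node.
    return Encapsulated(outer_dst=sinc_node, inner=pkt)


def sinc_switch(enc, operations):
    # Step 3: execute the requested operation and set the Done flag.
    pkt = enc.inner
    if not pkt.done and pkt.op in operations:
        pkt.payload = [operations[pkt.op](pkt.payload)]
        pkt.done = True
    return enc


def egress_proxy(enc):
    # Step 4: remove the encapsulation before delivery to the host.
    return enc.inner


ops = {"SUM": sum, "MAX": max}
# Step 1: Host A builds a packet with the SINC header and data.
pkt = SincPacket(op="SUM", done=False, payload=[1, 2, 3], dst="host-b")
delivered = egress_proxy(sinc_switch(ingress_proxy(pkt), ops))
```

Note that the host only names the operation; the choice of the node
that executes it is left to the ingress proxy.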
6. Data Operation Mode
According to the SINC scenarios, the SINC processing can be divided
into two modes: individual computing mode and batch computing mode.
Individual operations include all operations that can be performed on
data coming from a single packet (e.g., NetLock). Conversely, batch
operations include all operations that require collecting data from
multiple packets (e.g., NetReduce data aggregation).
6.1. Individual Computing Mode
NetLock is a typical scenario involving individual operations,
where the SINC switch/router acts as a lock server, generating a lock
for a packet coming from one host.
This kind of operation has some general aspects to be considered:
* Initialization of the context on the computing device. The
context is the information necessary to perform operations on the
packets. For instance, the context for a lock operation includes
selected keys, lock states (values) for granting locks.
* Error conditions. Operations may fail and, as a consequence,
sometimes actions need to be taken, e.g., sending a message to the
source host. However, errors are not necessarily handled
by the SINC switch/router, which could simply roll back the
operation and forward the packet unchanged to the destination
host. The destination host will in this case perform the
operation. If the operation fails again, the destination host
will handle the error condition and may send a message back to the
source host. In this way, SINC switch/router operation remains
relatively simple.
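The fallback behaviour described in the error-condition item can be
sketched as follows. All names (switch_process, host_process,
add_to_key) and the dictionary-based packet and context are
hypothetical; the sketch only assumes that a failed in-network
operation leaves the Done flag at 0 so that the destination host
knows it still has to run the operation itself.

```python
def switch_process(packet, context, operation):
    """Try the operation in-network; on failure roll back and forward as-is."""
    snapshot = dict(context)           # saved so the context can be restored
    try:
        result = operation(context, packet["data"])
    except Exception:
        context.clear()
        context.update(snapshot)       # roll back any partial update
        return packet                  # forwarded unchanged, Done stays 0
    return {**packet, "data": result, "done": True}


def host_process(packet, context, operation):
    """Destination host: run the operation only if the network did not."""
    if packet["done"]:
        return packet["data"]
    return operation(context, packet["data"])  # errors are handled by the host


def add_to_key(context, data):
    key, value = data
    context["touched"] = True          # partial update, undone on rollback
    context[key] += value              # KeyError if the key was never initialized
    return context[key]


switch_ctx = {}                        # context was not initialized on the switch
host_ctx = {"k": 10}
pkt = {"data": ("k", 5), "done": False}
forwarded = switch_process(pkt, switch_ctx, add_to_key)
final = host_process(forwarded, host_ctx, add_to_key)
```

Here the switch fails because its context was never initialized,
restores its state, and the destination host completes the operation.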
6.2. Batch Computing Mode
Batch operations require collecting data from multiple packets
before the required operations can actually be performed. For
instance, in the NetReduce scenario, the gradient aggregation
requires packets carrying gradient arrays from each host to generate
the desired result array.
In this scenario, besides the general issues mentioned for the
individual operations, the batch operation may fail because some
packets do not arrive (or arrive too late). The time during which
packets are temporarily cached on the SINC switch/router should be
carefully configured. On the one hand, it has to be sufficiently long so that
there is enough time to receive all required packets. On the other
hand, it has to be sufficiently short so that no retransmissions are
triggered at the transport or application layers on the end hosts.
Similarly to the error condition for the individual operations, if
the SINC switch/router does not receive all required packets in the
configured time interval, it can simply forward the packets to the
end hosts so that they can deal with packet losses and
retransmissions if necessary.
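A minimal Python sketch of batch-mode aggregation with a caching
timeout is shown below. The class BatchAggregator, its method names,
and the use of (Group ID, SeqNum) as the batch key are assumptions
made for this example; timestamps are passed in explicitly so that
the behaviour is easy to follow.

```python
class BatchAggregator:
    """Toy model of batch computing on a SINC switch/router (hypothetical)."""

    def __init__(self, num_sources, timeout):
        self.num_sources = num_sources   # packets expected per batch
        self.timeout = timeout           # maximum caching time, in seconds
        self.pending = {}                # (group_id, seq_num) -> cached batch

    def on_packet(self, group_id, seq_num, source_id, array, now):
        """Cache one array; return the element-wise sum once all sources arrived."""
        key = (group_id, seq_num)
        slot = self.pending.setdefault(key, {"first_seen": now, "arrays": {}})
        slot["arrays"][source_id] = array
        if len(slot["arrays"]) == self.num_sources:
            del self.pending[key]
            total = [sum(column) for column in zip(*slot["arrays"].values())]
            return "aggregate", total
        return "wait", None

    def expire(self, now):
        """Return batches whose timer ran out, to be forwarded unaggregated."""
        expired = [key for key, slot in self.pending.items()
                   if now - slot["first_seen"] >= self.timeout]
        return {key: self.pending.pop(key)["arrays"] for key in expired}


agg = BatchAggregator(num_sources=2, timeout=0.5)
r1 = agg.on_packet(1, 7, source_id=0, array=[1.0, 2.0], now=0.0)
r2 = agg.on_packet(1, 7, source_id=1, array=[3.0, 4.0], now=0.1)
agg.on_packet(1, 8, source_id=0, array=[5.0, 6.0], now=0.2)
late = agg.expire(now=1.0)   # second source of batch (1, 8) never arrived
```

The timeout value is the tuning knob discussed above: long enough to
collect every source, short enough to stay below the retransmission
timers of the end hosts.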
7. SINC Header
The SINC header carries the data operation information and it has a
fixed length of 16 octets, as shown in Figure 2.
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved |D|L| Group ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| No. of Data Sources | Data Source ID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SeqNum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data Operation | Data Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 2: SINC Header.
* Reserved: Flags field reserved for future use. MUST be set to
zero on transmission and MUST be ignored on reception.
* Done flag (D): Zero (0) indicates that the requested operation has
not yet been performed. One (1) indicates the operation has been done.
The source host MUST set this bit to 0. The in-network switch/
router performing the operation MUST set this flag to 1 after the
operation is executed.
* Loopback flag (L): Zero (0) indicates that the packet SHOULD be
sent to the destination after the data operation. One (1)
indicates that the packet SHOULD be sent back to the source node
after the data operation.
* Group ID: The group ID identifies different groups. Each group is
associated with one task.
* Number of Data Sources: Total number of data source nodes that are
part of the group.
* Data Source ID: Unique identifier of the data source node of the
packet.
* Sequence Number (SeqNum): The SeqNum is used to identify different
requests within one group.
* Data Operation: The operation to be executed by the SINC switch/
router. Appendix A briefly discusses possible operations.
* Data Offset: The in-packet offset from the SINC header to the data
required by the operation. This field is useful in cases where
the data is not right after the SINC header: the offset indicates
directly where, in the packet, the data is located.
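To make the field list more concrete, the sketch below packs and
parses a 16-octet header in Python. The field widths are assumptions
made purely for illustration (14 Reserved bits followed by D and L,
16 bits each for Group ID, Number of Data Sources, Data Source ID,
Data Operation and Data Offset, and 32 bits for SeqNum); only the
field order and the total length of 16 octets are taken from the
text above. The function names are hypothetical.

```python
import struct

# Assumed layout, network byte order:
# flags(16) group(16) | sources(16) source_id(16) | seq(32) | op(16) offset(16)
SINC_HEADER = struct.Struct("!HHHHIHH")


def pack_sinc_header(group_id, num_sources, source_id, seq_num,
                     operation, data_offset, done=False, loopback=False):
    # Reserved bits are sent as zero; D and L are the two lowest flag bits.
    flags = (int(done) << 1) | int(loopback)
    return SINC_HEADER.pack(flags, group_id, num_sources, source_id,
                            seq_num, operation, data_offset)


def unpack_sinc_header(raw):
    (flags, group_id, num_sources, source_id,
     seq_num, operation, data_offset) = SINC_HEADER.unpack(raw[:SINC_HEADER.size])
    return {
        "done": bool(flags & 0b10),       # Reserved bits are ignored
        "loopback": bool(flags & 0b01),
        "group_id": group_id,
        "num_sources": num_sources,
        "source_id": source_id,
        "seq_num": seq_num,
        "operation": operation,
        "data_offset": data_offset,
    }


header = pack_sinc_header(group_id=3, num_sources=4, source_id=1,
                          seq_num=1000, operation=2, data_offset=0)
fields = unpack_sinc_header(header)
```

A source host would call pack_sinc_header with done left at its
default, matching the rule that the D flag is 0 on transmission.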
8. SINC Control Plane Requirements
The SINC control plane has to configure SINC network elements to
ensure the proper execution of the computing task. The SINC
framework can work with either centralized or distributed control
planes. However, this document does not assume any specific control
plane design. The basic requirements of the control plane shall
include the following:
* The SINC control plane should be aware of the switch resources.
This may be achieved by regularly querying the devices.
* The SINC control plane should be able to select the switches/
routers based on certain constraints, for instance selecting
switches/routers that are able to perform specific, more complex
operations, or being able to distribute the load on various
alternative switches/routers without increasing the transmission
delay.
* The SINC control plane should be able to provide the necessary
configuration so that packets flow to the right place and
encapsulation/decapsulation operations are performed correctly.
This means for instance configuring the parameters of the selected
transport and its forwarding rules.
* The SINC control plane should provide monitoring and failover
mechanisms in order to handle errors and failures.
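The selection requirement can be illustrated with a small Python
sketch. The inventory format, the attribute names (operations, load,
delay_us) and the function select_switch are hypothetical; a real
control plane would obtain this information by querying the devices,
as noted in the first requirement.

```python
def select_switch(inventory, operation, max_delay_us):
    """Pick the least-loaded switch that supports `operation` within the delay budget."""
    candidates = [
        sw for sw in inventory
        if operation in sw["operations"] and sw["delay_us"] <= max_delay_us
    ]
    if not candidates:
        return None  # no suitable in-network placement
    return min(candidates, key=lambda sw: sw["load"])["name"]


inventory = [
    {"name": "sw1", "operations": {"SUM", "FA"}, "load": 0.7, "delay_us": 5},
    {"name": "sw2", "operations": {"SUM"}, "load": 0.2, "delay_us": 8},
    {"name": "sw3", "operations": {"SUM", "CAS"}, "load": 0.1, "delay_us": 40},
]
chosen_sum = select_switch(inventory, "SUM", max_delay_us=10)
chosen_cas = select_switch(inventory, "CAS", max_delay_us=10)
```

In this example sw3 is the least loaded but is excluded by the delay
constraint, so SUM is placed on sw2 and no node qualifies for CAS.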
9. Security Considerations
In-network computing exposes computing data to network devices, which
inevitably raises security and privacy considerations. The security
problems faced by in-network computing include, but are not limited
to:
* Trustworthiness of participating devices
* Data hijacking and tampering
* Private data exposure
This document assumes that the deployment is done in a trusted
environment, for example, in a data center network or a private
network.
A detailed security analysis will be provided in future revisions of
this memo.
10. IANA Considerations
This document makes no requests to IANA.
11. Acknowledgements
Dirk Trossen's feedback was of great help in improving this document.
12. References
12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
12.2. Informative References
[CALVIN] Thomson, A., Diamond, T., Weng, S., Ren, K., Shao, P., and
D. Abadi, "Calvin: fast distributed transactions for
partitioned database systems", Proceedings of the 2012 ACM
SIGMOD International Conference on Management of Data,
DOI 10.1145/2213836.2213838, May 2012,
<https://doi.org/10.1145/2213836.2213838>.
[COIN] "Computing in the Network, COIN, proposed IRTF group",
n.d., <https://datatracker.ietf.org/rg/coinrg/about/>.
[ERIS] Li, J., Michael, E., and D. R. K. Ports, "Eris:
Coordination-Free Consistent Transactions Using In-Network
Concurrency Control", SOSP '17: Proceedings of the 26th
Symposium on Operating Systems Principles, 2017.
[GOBATTO] Reinehr Gobatto, L., Rodrigues, P., Tirone, M., Cordeiro,
W., and J. Azambuja, "Programmable Data Planes meets In-
Network Computing: A Review of the State of the Art and
Prospective Directions", Journal of Integrated Circuits
and Systems vol. 16, no. 2, pp. 1-8,
DOI 10.29292/jics.v16i2.497, August 2021,
<https://doi.org/10.29292/jics.v16i2.497>.
[GTM] "GTM and Global Transaction Management", n.d.,
<https://www.postgres-xl.org/documentation/xc-overview-
gtm.html>.
[HPRDMA] Kalia, A., Kaminsky, M., and D. G. Andersen, "Design
Guidelines for High Performance RDMA Systems", 2016 USENIX
Annual Technical Conference (USENIX ATC 16), 2016,
<https://www.usenix.org/conference/atc16/technical-
sessions/presentation/kalia>.
[I-D.irtf-coinrg-use-cases]
Kunze, I., Wehrle, K., Trossen, D., Montpetit, M., de Foy,
X., Griffin, D., and M. Rio, "Use Cases for In-Network
Computing", Work in Progress, Internet-Draft, draft-irtf-
coinrg-use-cases-02, 7 March 2022,
<https://datatracker.ietf.org/doc/html/draft-irtf-coinrg-
use-cases-02>.
[I-D.zhou-sinc-deployment-considerations]
"*** BROKEN REFERENCE ***".
[MGWFBP] Shi, S., Chu, X., and B. Li, "MG-WFBP: Efficient data
communication for distributed synchronous SGD algorithms",
IEEE INFOCOM 2019-IEEE Conference on Computer
Communications, IEEE, 2019.
[NetChain] Jin, X., Li, X., and H. Zhang, "NetChain: Scale-free sub-
RTT coordination", 2018.
[NetLock] Z, Y., Y, Z., and B. V, "Netlock: Fast, centralized lock
management using programmable switches", Proceedings of
the Annual conference of the ACM Special Interest Group on
Data Communication on the applications, technologies,
architectures, and protocols for computer communication,
2020.
[NetReduce]
Liu, S., Wang, Q., and J. Zhang, "NetReduce: RDMA-
compatible in-network reduction for distributed DNN
training acceleration", 2020.
[OLTP] R, J., I, P., and A. A, "Improving OLTP scalability using
speculative lock inheritance", Proceedings of the VLDB
Endowment, 2009.
[OPENAI] "OpenAI. AI and compute", 2018,
<https://openai.com/blog/ai-and-compute/>.
[P4] Bosshart, P., Daly, D., Gibb, G., Izzard, M., McKeown, N.,
Rexford, J., Schlesinger, C., Talayco, D., Vahdat, A.,
Varghese, G., and D. Walker, "P4: programming protocol-
independent packet processors", ACM SIGCOMM Computer
Communication Review vol. 44, no. 3, pp. 87-95,
DOI 10.1145/2656877.2656890, July 2014,
<https://doi.org/10.1145/2656877.2656890>.
[PARAHUB] L, L., J, N., and C. L, "Parameter hub: a rack-scale
parameter server for distributed deep neural network
training", Proceedings of the ACM Symposium on Cloud
Computing, 2018.
[RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
Label Switching Architecture", RFC 3031,
DOI 10.17487/RFC3031, January 2001,
<https://www.rfc-editor.org/rfc/rfc3031>.
[RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function
Chaining (SFC) Architecture", RFC 7665,
DOI 10.17487/RFC7665, October 2015,
<https://www.rfc-editor.org/rfc/rfc7665>.
[RFC8926] Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed.,
"Geneve: Generic Network Virtualization Encapsulation",
RFC 8926, DOI 10.17487/RFC8926, November 2020,
<https://www.rfc-editor.org/rfc/rfc8926>.
[RFC9300] Farinacci, D., Fuller, V., Meyer, D., Lewis, D., and A.
Cabellos, Ed., "The Locator/ID Separation Protocol
(LISP)", RFC 9300, DOI 10.17487/RFC9300, October 2022,
<https://www.rfc-editor.org/rfc/rfc9300>.
[ROCEv2] "InfiniBand Architecture Specification Release 1.2.1 Annex
A17 RoCEv2", InfiniBand Trade Association, September
2014, <https://cw.infinibandta.org/document/dl/7781>.
[SHARP] L, G. R., L, L., and B. D, "Scalable hierarchical
aggregation and reduction protocol (SHARP) TM streaming-
aggregation hardware design and evaluation", International
Conference on High Performance Computing, 2020.
[SwitchML] A, S., M, C., and C. Ho, "Scaling distributed machine
learning with in-network aggregation", 2019.
[ZENG] Zeng, D., Ansari, N., Montpetit, M., Schooler, E., and D.
Tarchi, "Guest Editorial: In-Network Computing: Emerging
Trends for the Edge-Cloud Continuum", IEEE Network vol.
35, no. 5, pp. 12-13, DOI 10.1109/mnet.2021.9606835,
September 2021,
<https://doi.org/10.1109/mnet.2021.9606835>.
Appendix A. Computing Capability Operation abstraction
In-network computing can greatly help distributed applications that
make intensive use of the network. Yet, not all operations can be
performed in-network, since the computational resources are usually
very limited. Disassembling complex tasks into basic calculation
operations, such as addition, subtraction, maximum, etc., is the most
appropriate approach for offloading these operations onto in-network
devices at line rate.
SINC aims at providing a general way of signaling the operation to
be performed on the data. As such, the definition of the operations
is orthogonal to the SINC proposal itself, as long as it is
possible to identify the different operations via a code point.
Examples of basic operations that may be performed in-network are
listed in Table 2.
+========+================================================+
| OpName | Operation Explanation                          |
+========+================================================+
| MAX    | Maximum value of several parameters            |
+--------+------------------------------------------------+
| MIN    | Minimum value                                  |
+--------+------------------------------------------------+
| SUM    | Sum value                                      |
+--------+------------------------------------------------+
| PROD   | Product value                                  |
+--------+------------------------------------------------+
| LAND   | Logical and                                    |
+--------+------------------------------------------------+
| BAND   | Bit-wise and                                   |
+--------+------------------------------------------------+
| LOR    | Logical or                                     |
+--------+------------------------------------------------+
| BOR    | Bit-wise or                                    |
+--------+------------------------------------------------+
| LXOR   | Logical xor                                    |
+--------+------------------------------------------------+
| BXOR   | Bit-wise xor                                   |
+--------+------------------------------------------------+
| WRITE  | Write value according to key                   |
+--------+------------------------------------------------+
| READ   | Read value according to key                    |
+--------+------------------------------------------------+
| DELETE | Delete value according to key                  |
+--------+------------------------------------------------+
| CAS    | Compare and swap. Compare the value of the key |
|        | and old value. If not same, swap old value to  |
|        | key value. Return old key value.               |
+--------+------------------------------------------------+
| CAADD  | Compare and add. Compare the value of the key  |
|        | and expected value. If same, add add-value to  |
|        | key value. Return old key value.               |
+--------+------------------------------------------------+
| CASUB  | Compare and subtract. Compare the value of the |
|        | key and expected value. If same, subtract      |
|        | sub-value from key value. Return old key       |
|        | value.                                         |
+--------+------------------------------------------------+
| FA     | Fetch and add. Fetch value according to key.   |
|        | Add add-value to key value. Return old key     |
|        | value.                                         |
+--------+------------------------------------------------+
| FASUB  | Fetch and subtract. Fetch value according to   |
|        | key. Subtract sub-value from key value.        |
|        | Return old key value.                          |
+--------+------------------------------------------------+
| FAOR   | Fetch and OR. Fetch value according to key.    |
|        | Key value gets logical OR operation with       |
|        | parameter. Return old key value.               |
+--------+------------------------------------------------+
| FAADD  | Fetch and ADD. Fetch value according to key.   |
|        | Key value gets logical add operation with      |
|        | parameter. Return old key value.               |
+--------+------------------------------------------------+
| FANAND | Fetch and NAND. Fetch value according to key.  |
|        | Key value gets logical NAND operation with     |
|        | parameter. Return old key value.               |
+--------+------------------------------------------------+
| FAXOR  | Fetch and XOR. Fetch value according to key.   |
|        | Key value gets logical XOR operation with      |
|        | parameter. Return old key value.               |
+--------+------------------------------------------------+
Table 2: Example of in-network operations.
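The idea of identifying operations via a code point can be sketched
as a simple dispatch table. The numeric code points below are
hypothetical, since this document does not assign any values, and
only a subset of the reduction-style operations of Table 2 is
covered.

```python
from functools import reduce
import operator

# Hypothetical code points; this document does not assign any values.
OPERATIONS = {
    0x01: ("MAX", max),
    0x02: ("MIN", min),
    0x03: ("SUM", operator.add),
    0x04: ("PROD", operator.mul),
    0x05: ("BAND", operator.and_),
    0x06: ("BOR", operator.or_),
    0x07: ("BXOR", operator.xor),
}


def execute(code_point, values):
    """Fold `values` with the binary operator registered for `code_point`."""
    name, func = OPERATIONS[code_point]
    return name, reduce(func, values)


result_sum = execute(0x03, [1, 2, 3])
result_max = execute(0x01, [3, 9, 4])
result_xor = execute(0x07, [0b1100, 0b1010])
```

Adding a new operation only requires a new table entry, which
reflects the point made above that the operation definitions are
orthogonal to the signaling itself.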
Authors' Addresses
Zhe Lou
Huawei Technologies
Riesstrasse 25
80992 Munich
Germany
Email: zhe.lou@huawei.com
Luigi Iannone
Huawei Technologies France S.A.S.U.
18, Quai du Point du Jour
92100 Boulogne-Billancourt
France
Email: luigi.iannone@huawei.com
Yujing Zhou
Huawei Technologies
Beiqing Road, Haidian District
Beijing
100095
China
Email: zhouyujing3@huawei.com
Jinze Yang
Huawei Technologies
Beiqing Road, Haidian District
Beijing
100095
China
Email: yangjinze@huawei.com
Cuimin Zhang
Huawei Technologies
Huawei base in Bantian, Longgang District
Shenzhen
China
Email: zhangcuimin@huawei.com