<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!-- generated by https://github.com/cabo/kramdown-rfc version 1.7.31 (Ruby 3.2.3) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-intellinode-ai-semantic-contract-00" category="std" consensus="true" submissionType="IETF" tocInclude="true" sortRefs="true" symRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.31.0 -->
  <front>
    <title abbrev="Semantic Shaping Contract">Semantic-Driven Traffic Shaping Contract for AI Networks</title>
    <seriesInfo name="Internet-Draft" value="draft-intellinode-ai-semantic-contract-00"/>
    <author fullname="Teng Gao">
      <organization>Peng Cheng Laboratory</organization>
      <address>
        <email>gaot@pcl.ac.cn</email>
      </address>
    </author>
    <date year="2026" month="March" day="01"/>
    <area>Routing</area>
    <workgroup>Computing-Aware Traffic Steering</workgroup>
    <keyword>Semantic-Driven</keyword>
    <keyword>Traffic Shaping</keyword>
    <keyword>AI Networks</keyword>
    <keyword>QoS</keyword>
    <abstract>
      <?line 38?>

<t>This document defines a "Semantic-Driven Shaping Contract". Traditional network protocols treat AI training and inference traffic as opaque byte streams, which leads to inefficient scheduling. The contract allows applications or distributed training frameworks to explicitly pass the "minimum necessary semantics" to the underlying network. In exchange, the network commits to fine-grained, differentiated forwarding and resource allocation for tensor flows with diverse semantics, based on predefined rules and real-time global state. This model is intended to improve overall resource utilization and task completion times in heterogeneous computing networks, cross-domain intelligent computing centers, and integrated training-inference scenarios.</t>
    </abstract>
  </front>
  <middle>
    <?line 42?>

<section anchor="problem-statement-limitations-of-existing-network-mechanisms">
      <name>Problem Statement: Limitations of Existing Network Mechanisms</name>
      <t>In the era of large AI models, the "importance" of traffic shifts dynamically with the model's phase and is, to a large degree, computable rather than opaque. Existing traffic control and Quality of Service (QoS) mechanisms have fundamental flaws in this context:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Coarse QoS Granularity and Invalid Implicit Assumptions:</strong> Traditional QoS assumes that traffic within the same class has negligible variance and that its importance remains stable at the session level. In AI scenarios, however, QoS can distinguish neither "early-layer activations" from "late-layer activations" nor the "KV Cache of early tokens" from that of "tail tokens".</t>
        </li>
        <li>
          <t><strong>Static and Incomputable DiffServ Semantics:</strong> Differentiated Services (DiffServ/DSCP) relies on static markings that the network executes blindly. DiffServ cannot express dynamic computing semantics such as "this flow is quantizable during congestion", "this flow tolerates a 5 ms store-and-forward delay", or "this flow requires absolute preemption".</t>
        </li>
        <li>
          <t><strong>Passive and Dimensionless ECN Feedback:</strong> Existing Explicit Congestion Notification (ECN) mechanisms operate on the assumptions that "end-systems know best how to respond to congestion" and that "rate reduction is the only correct response". ECN has no understanding of computing semantics and treats activations, gradients, blocking operations, and non-blocking operations equally. In AI inference, the correct response to congestion is often "precision degradation (quantization)" or "prioritizing the draining of critical tensors" rather than blind rate reduction.</t>
        </li>
      </ul>
    </section>
    <section anchor="cross-domain-amplification-of-challenges">
      <name>Cross-Domain Amplification of Challenges</name>
      <t>In cross-domain intelligent computing networks characterized by multi-tasking, multi-tenancy, and integrated training and inference, the aforementioned flaws are severely amplified:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Time-scale Mismatch:</strong> Cross-domain Round-Trip Times (RTTs) reach the millisecond level, easily exceeding the "effective value window" of sensitive tensors such as late-layer activations. The network must therefore make differentiation and routing decisions at forwarding time; post-facto congestion control feedback arrives too late to be effective.</t>
        </li>
        <li>
          <t><strong>Resource &amp; Path Asymmetry:</strong> Cross-domain links are scarce, high-cost resources. Delay-tolerant, compressible intermediate activations must not compete on equal terms for cross-domain bandwidth with critical gradients that require immediate delivery.</t>
        </li>
        <li>
          <t><strong>Tight Compute-Network Coupling:</strong> Traffic steering is no longer merely about "delivery to a fixed destination." It requires dynamic selection based on compute heterogeneity (e.g., local GPU vs. remote FPGA). A lack of semantic understanding leads to a severe mismatch between computing power and network resources.</t>
        </li>
      </ul>
    </section>
    <section anchor="the-semantic-driven-mapping-loop-the-contract">
      <name>The Semantic-Driven Mapping Loop: The Contract</name>
      <t>The core of this document is a closed-loop mapping from "application-layer semantic input" to "network-side action commitment".</t>
      <section anchor="semantic-information-model-metadata-model">
        <name>Semantic Information Model (Metadata Model)</name>
        <t>The application layer MUST expose "exchangeable Semantic Metadata" to the network. Based on what training and inference tasks have in common and where they differ, this metadata is categorized as follows:</t>
        <ul spacing="normal">
          <li>
            <t><strong>Traffic Class:</strong> Explicitly identifies the data type (e.g., Activation, Gradient, KV Cache, Parameter, Collaborative State Synchronization).</t>
          </li>
          <li>
            <t><strong>Urgency &amp; Dependency:</strong> Provides coarse-grained dependency hints (e.g., Early-token vs. Late-token) and the current layer or stage of the model (Layer ID / Pipeline Stage).</t>
          </li>
          <li>
            <t><strong>Tolerance &amp; Sensitivity:</strong>
            </t>
            <ul spacing="normal">
              <li>
                <t><strong>Fidelity/Accuracy Sensitivity:</strong> Indicates whether in-network low-precision quantization is permitted.</t>
              </li>
              <li>
                <t><strong>Loss/Latency Tolerance:</strong> Indicates whether the flow permits buffering (store-and-forward) or dropping.</t>
              </li>
            </ul>
          </li>
          <li>
            <t><strong>Compute Affinity:</strong> Indicates the preferred characteristics of the underlying computing power (e.g., GPU, FPGA, CPU, or specific operator acceleration hardware).</t>
          </li>
        </ul>
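        <t>The categories above can be pictured as a single record attached to each tensor flow. The following is a minimal sketch, not a normative encoding: the type names (TrafficClass, TokenHint, SemanticMetadata), the field names, and the set of affinity strings are assumptions introduced for illustration, and only the categories themselves come from this document.</t>
        <sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from enum import Enum


class TrafficClass(Enum):
    """Traffic classes listed under Traffic Class."""
    ACTIVATION = "activation"
    GRADIENT = "gradient"
    KV_CACHE = "kv_cache"
    PARAMETER = "parameter"
    STATE_SYNC = "state_sync"


class TokenHint(Enum):
    """Coarse-grained dependency hint (early vs. late token)."""
    EARLY_TOKEN = "early"
    LATE_TOKEN = "late"


# Hypothetical value set; the contract only lists examples.
KNOWN_AFFINITIES = {"GPU", "FPGA", "CPU", "ANY"}


@dataclass(frozen=True)
class SemanticMetadata:
    traffic_class: TrafficClass
    token_hint: TokenHint
    layer_id: int           # current layer or pipeline stage
    quantizable: bool       # in-network quantization permitted
    bufferable: bool        # store-and-forward permitted
    droppable: bool         # dropping permitted
    compute_affinity: str   # preferred compute type

    def __post_init__(self) -> None:
        # Reject values that no network node could interpret.
        if self.layer_id < 0:
            raise ValueError("layer_id must be non-negative")
        if self.compute_affinity not in KNOWN_AFFINITIES:
            raise ValueError("unknown compute affinity")


# A late-layer activation that tolerates quantization but
# neither buffering nor dropping.
late_activation = SemanticMetadata(
    traffic_class=TrafficClass.ACTIVATION,
    token_hint=TokenHint.LATE_TOKEN,
    layer_id=47,
    quantizable=True,
    bufferable=False,
    droppable=False,
    compute_affinity="GPU",
)
]]></sourcecode>
        <t>Keeping the record immutable reflects the idea that the metadata is a statement made once by the application; a network node reads it but does not rewrite it.</t>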
      </section>
      <section anchor="network-policy-action-set">
        <name>Network Policy / Action Set</name>
        <t>Upon receiving the aforementioned semantics, network nodes with global state awareness can execute a set of policies that transcend traditional routing:</t>
        <ul spacing="normal">
          <li>
            <t><strong>Queueing / Scheduling:</strong> Identifies flow states to guarantee absolute preemption for highly time-sensitive traffic.</t>
          </li>
          <li>
            <t><strong>Buffering / Store-and-forward:</strong> Uses the storage resources of network devices to temporarily delay flows with high latency tolerance (e.g., large-block parameter pulls). The same storage can also multiplex cached data across inference requests from different users, improving hardware throughput without altering the model structure.</t>
          </li>
          <li>
            <t><strong>Shaping &amp; In-network Quantization:</strong> Triggers in-network low-precision quantization and sparsity strategies during congestion, rather than relying on simple packet dropping.</t>
          </li>
          <li>
            <t><strong>Steering:</strong> Directs task flows to the most appropriate heterogeneous computing nodes based on Compute Affinity.</t>
          </li>
        </ul>
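        <t>One way to read the action set is as a small decision function evaluated per flow at a network node. The sketch below is illustrative only: the FlowHints record, the Action names, and in particular the order in which the rules are tried are assumptions, since this document does not specify a precedence among actions.</t>
        <sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PREEMPT = "preempt"      # Queueing / Scheduling
    BUFFER = "buffer"        # Buffering / Store-and-forward
    QUANTIZE = "quantize"    # Shaping & In-network Quantization
    FORWARD = "forward"      # default forwarding, no special action


@dataclass(frozen=True)
class FlowHints:
    # Hypothetical per-flow view of the semantic metadata.
    time_critical: bool
    quantizable: bool
    bufferable: bool


def choose_action(hints: FlowHints, congested: bool) -> Action:
    """Pick one action; the rule order is an assumption."""
    if hints.time_critical:
        # Highly time-sensitive traffic always preempts.
        return Action.PREEMPT
    if not congested:
        return Action.FORWARD
    if hints.bufferable:
        # Prefer the lossless option when delay is tolerated.
        return Action.BUFFER
    if hints.quantizable:
        return Action.QUANTIZE
    return Action.FORWARD
]]></sourcecode>
        <t>For example, under congestion a flow that is quantizable but not bufferable yields Action.QUANTIZE, while the same flow on an uncongested path is simply forwarded. Steering is left out of this sketch because it selects a destination rather than a per-flow treatment.</t>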
      </section>
    </section>
    <section anchor="extended-use-case-top-k-routing-semantics-for-moe-architecture">
      <name>Extended Use Case: Top-K Routing Semantics for MoE Architecture</name>
      <t>For dynamic computing architectures like Mixture-of-Experts (MoE), this contract supports the definition of more complex routing metadata for intelligent scheduling in the network data plane:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Model Router Metadata:</strong> Carries Token ID / Query vector summaries, Top-K candidate expert lists, weights/confidence levels, and positional markers (Token_pos).</t>
        </li>
        <li>
          <t><strong>System State Semantics:</strong> Network nodes maintain real-time metrics for each expert node, including backlog queues, computing load, network latency, and bandwidth utilization.</t>
        </li>
      </ul>
      <t>By matching these two kinds of semantics at forwarding time, the network can determine which of the candidate expert nodes, typically the most lightly loaded one, should receive the token flow.</t>
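      <t>The matching step can be sketched as a selection over the Top-K candidate list using the per-expert state kept by the network node. This is a minimal sketch under stated assumptions: the ExpertState fields, the load_score formula with its weights, and the tie-break on router weight are all hypothetical, as this document does not define how the metrics are combined.</t>
      <sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ExpertState:
    backlog: int          # tokens queued at the expert node
    compute_load: float   # utilization in the range 0.0 to 1.0
    latency_ms: float     # network latency towards the node


def load_score(state: ExpertState) -> float:
    """Combine metrics into one number; weights are arbitrary."""
    return (state.backlog
            + 100.0 * state.compute_load
            + state.latency_ms)


def pick_expert(candidates: Dict[str, float],
                states: Dict[str, ExpertState]) -> str:
    """Return the lightest-loaded Top-K candidate.

    candidates maps expert id to router weight; states maps
    expert id to the node's current view of that expert.
    """
    known = [e for e in candidates if e in states]
    if not known:
        raise ValueError("no candidate expert has known state")
    # Lowest load wins; a higher router weight breaks ties.
    return min(known,
               key=lambda e: (load_score(states[e]),
                              -candidates[e]))
]]></sourcecode>
      <t>Experts outside the Top-K list are never chosen, however idle they are, so the decision stays within the set that the model's router proposed.</t>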
    </section>
    <section anchor="deployment-considerations">
      <name>Deployment Considerations</name>
      <section anchor="decision-location-why-in-network">
        <name>Decision Location: Why In-Network?</name>
        <t>Compared to edge devices (GPUs/NICs) that only possess local queuing information, in-network nodes (e.g., Core/Spine Switches) maintain a global perspective. The network can perceive concurrent multi-tenant tasks and real-time multipath congestion states. Crucially, it can make immediate decisions to buffer, slice, or reroute cross-domain traffic before it enters high-cost bottleneck links.</t>
      </section>
      <section anchor="rdma-rocev2-integration">
        <name>RDMA / RoCEv2 Integration</name>
        <t>Intelligent computing centers rely heavily on RDMA. The Semantic Header that carries this contract is intended to be encoded as an extension header for RoCEv2/UDP packets or carried in specific reserved fields. This enables supporting hardware (such as the FPGA and parsing pipelines in the IntelliNode architecture) to extract metadata and execute policies at line rate (e.g., 400 Gbps).</t>
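        <t>To make the idea of a fixed-size header that hardware can parse at line rate more concrete, the sketch below packs a few metadata fields into four bytes. The layout is purely hypothetical: this document does not define a wire format, and the field widths, flag bits, and function names are assumptions chosen for illustration.</t>
        <sourcecode type="python"><![CDATA[
import struct
from typing import Tuple

# Hypothetical 4-byte layout, network byte order:
#   byte 0     traffic class code
#   byte 1     tolerance flags
#   bytes 2-3  layer id / pipeline stage
_HEADER = struct.Struct("!BBH")

FLAG_QUANTIZABLE = 0x01
FLAG_BUFFERABLE = 0x02
FLAG_DROPPABLE = 0x04


def pack_header(traffic_class: int, flags: int,
                layer_id: int) -> bytes:
    """Encode the fields into the fixed 4-byte layout."""
    return _HEADER.pack(traffic_class, flags, layer_id)


def unpack_header(data: bytes) -> Tuple[int, int, int]:
    """Decode (traffic_class, flags, layer_id) from a packet."""
    if len(data) < _HEADER.size:
        raise ValueError("truncated semantic header")
    return _HEADER.unpack_from(data)
]]></sourcecode>
        <t>A fixed offset for every field is what lets a parsing pipeline extract the metadata without walking variable-length options; an actual encoding would also need versioning and alignment with RoCEv2 framing.</t>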
      </section>
    </section>
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>To ensure the integrity of the Semantic-Driven Shaping Contract, the system MUST:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Authentication and Anti-Spoofing:</strong> Prevent malicious tenants from tampering with the Urgency level or forging network states to unfairly preempt high-priority queues.</t>
        </li>
      </ul>
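      <t>One possible way to detect tampering with fields such as the Urgency level is a keyed message authentication code over the semantic header. The sketch below assumes a per-tenant shared key and a truncated HMAC-SHA-256 tag; the key distribution model, the tag length, and the function names are assumptions, and the sketch does not address replay or the forging of network state by on-path nodes.</t>
      <sourcecode type="python"><![CDATA[
import hashlib
import hmac
from typing import Optional

TAG_LEN = 16  # truncated tag length in bytes (illustrative)


def _tag(tenant_key: bytes, header: bytes) -> bytes:
    digest = hmac.new(tenant_key, header, hashlib.sha256).digest()
    return digest[:TAG_LEN]


def sign_header(tenant_key: bytes, header: bytes) -> bytes:
    """Append an authentication tag to the semantic header."""
    return header + _tag(tenant_key, header)


def verify_header(tenant_key: bytes,
                  signed: bytes) -> Optional[bytes]:
    """Return the header if the tag is valid, otherwise None."""
    if len(signed) < TAG_LEN:
        return None
    header, tag = signed[:-TAG_LEN], signed[-TAG_LEN:]
    # Constant-time comparison avoids leaking tag prefixes.
    if hmac.compare_digest(tag, _tag(tenant_key, header)):
        return header
    return None
]]></sourcecode>
      <t>A node that receives a header whose tag does not verify would treat the flow as unclassified rather than honor its claimed urgency.</t>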
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>This document requests that IANA allocate specific protocol numbers or RoCEv2 option type spaces for the AI Semantic Header to facilitate standardized deployment.</t>
    </section>
  </middle>
  <back>








  </back>

</rfc>
