<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!-- generated by https://github.com/cabo/kramdown-rfc version 1.7.31 (Ruby 3.2.3) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-li-cats-aisemantic-contract-00" category="std" consensus="true" submissionType="IETF" tocInclude="true" sortRefs="true" symRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.31.0 -->
  <front>
    <title abbrev="Semantic Shaping Contract">Semantic-Driven Traffic Shaping Contract for AI Networks</title>
    <seriesInfo name="Internet-Draft" value="draft-li-cats-aisemantic-contract-00"/>
    <author fullname="Qing Li">
      <organization>Pengcheng Laboratory</organization>
      <address>
        <email>liq@pcl.ac.cn</email>
      </address>
    </author>
    <author fullname="Teng Gao">
      <organization>Pengcheng Laboratory</organization>
      <address>
        <email>gaot@pcl.ac.cn</email>
      </address>
    </author>
    <author fullname="Yong Jiang">
      <organization>Tsinghua Shenzhen International Graduate School &amp; Pengcheng Laboratory</organization>
      <address>
        <email>jiangy@sz.tsinghua.edu.cn</email>
      </address>
    </author>
    <date year="2026" month="March" day="01"/>
    <area>Routing</area>
    <workgroup>Computing-Aware Traffic Steering</workgroup>
    <keyword>Semantic-Driven</keyword>
    <keyword>Traffic Shaping</keyword>
    <keyword>AI Networks</keyword>
    <keyword>QoS</keyword>
    <abstract>
      <?line 48?>

<t>This document defines a "Semantic-Driven Shaping Contract". Traditional network protocols treat AI training and inference traffic as opaque byte streams, which leads to inefficient scheduling. The contract lets applications or distributed training frameworks explicitly pass the "minimum necessary semantics" to the underlying network. In exchange, the network commits to fine-grained, differentiated forwarding and resource allocation for tensor flows with different semantics, based on predefined rules and real-time global state. This model is intended to improve overall resource utilization and task completion times in heterogeneous computing networks, cross-domain intelligent computing centers, and integrated training-inference scenarios.</t>
    </abstract>
  </front>
  <middle>
    <?line 52?>

<section anchor="problem-statement-limitations-of-existing-network-mechanisms">
      <name>Problem Statement: Limitations of Existing Network Mechanisms</name>
      <t>In the era of large AI models, the "importance" of traffic shifts dynamically with the model's phase and is to a large extent computable. Existing traffic control and Quality of Service (QoS) mechanisms have fundamental flaws in this context:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Coarse QoS Granularity and Invalid Implicit Assumptions:</strong> Traditional QoS assumes that traffic within the same class has negligible variance and that its importance remains stable at the session level. In AI scenarios, however, QoS cannot differentiate between "early-layer activations" and "late-layer activations," nor can it distinguish the "KV Cache of early tokens" from that of "tail tokens."</t>
        </li>
        <li>
          <t><strong>Static and Incomputable DiffServ Semantics:</strong> Differentiated Services (DiffServ/DSCP) relies on static markings that the network executes blindly. These markings cannot express dynamic computing semantics, such as "this flow is quantizable during congestion," "this flow tolerates a 5 ms store-and-forward delay," or "this flow requires absolute preemption."</t>
        </li>
        <li>
          <t><strong>Passive and Dimensionless ECN Feedback:</strong> Existing Explicit Congestion Notification (ECN) mechanisms operate on the assumption that "end-systems know best how to respond to congestion" and that "rate reduction is the only correct response." ECN carries no computing semantics: it treats activations and gradients, and blocking and non-blocking operations, identically. In AI inference, the correct response to congestion is often "precision degradation (quantization)" or "prioritizing the draining of critical tensors," rather than blind rate reduction.</t>
        </li>
      </ul>
    </section>
    <section anchor="cross-domain-amplification-of-challenges">
      <name>Cross-Domain Amplification of Challenges</name>
      <t>In cross-domain intelligent computing networks characterized by multi-tasking, multi-tenancy, and integrated training and inference, the aforementioned flaws are severely amplified:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Time-scale Mismatch:</strong> Cross-domain round-trip times (RTTs) reach the millisecond level, easily exceeding the "effective value window" of sensitive tensors such as late-layer activations. The network MUST make differentiation and routing decisions at forwarding time; after-the-fact congestion control feedback arrives too late to be effective.</t>
        </li>
        <li>
          <t><strong>Resource &amp; Path Asymmetry:</strong> Cross-domain links are scarce, high-cost resources. Delay-tolerant, compressible intermediate activations MUST NOT compete on equal terms for cross-domain bandwidth with critical gradients that require immediate delivery.</t>
        </li>
        <li>
          <t><strong>Tight Compute-Network Coupling:</strong> Traffic steering is no longer merely about "delivery to a fixed destination." It requires dynamic selection based on compute heterogeneity (e.g., local GPU vs. remote FPGA). A lack of semantic understanding leads to a severe mismatch between computing power and network resources.</t>
        </li>
      </ul>
    </section>
    <section anchor="the-semantic-driven-mapping-loop-the-contract">
      <name>The Semantic-Driven Mapping Loop: The Contract</name>
      <t>The core of this draft is to establish a closed-loop mapping mechanism from "application-layer semantic input" to "network-side action commitment."</t>
      <section anchor="semantic-information-model-metadata-model">
        <name>Semantic Information Model (Metadata Model)</name>
        <t>The application layer MUST expose exchangeable "Semantic Metadata" to the network. Based on what training and inference tasks have in common and where they differ, the metadata is categorized as follows:</t>
        <ul spacing="normal">
          <li>
            <t><strong>Traffic Class:</strong> Explicitly identifies the data type (e.g., Activation, Gradient, KV Cache, Parameter, Collaborative State Synchronization).</t>
          </li>
          <li>
            <t><strong>Urgency &amp; Dependency:</strong> Provides coarse-grained dependency hints (e.g., Early-token vs. Late-token) and the current layer or stage of the model (Layer ID / Pipeline Stage).</t>
          </li>
          <li>
            <t><strong>Tolerance &amp; Sensitivity:</strong>
            </t>
            <ul spacing="normal">
              <li>
                <t><strong>Fidelity/Accuracy Sensitivity:</strong> Indicates whether in-network low-precision quantization is permitted.</t>
              </li>
              <li>
                <t><strong>Loss/Latency Tolerance:</strong> Indicates whether the flow permits buffering (store-and-forward) or dropping.</t>
              </li>
            </ul>
          </li>
          <li>
            <t><strong>Compute Affinity:</strong> Indicates the preferred characteristics of the underlying computing power (e.g., GPU, FPGA, CPU, or specific operator acceleration hardware).</t>
          </li>
        </ul>
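        <t>As an illustration only, the metadata categories above could be modeled as a small record. The following Python sketch is not part of the contract: every type name, field name, and value in it is a hypothetical choice made for readability, and any real encoding would be defined by the header format.</t>
        <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass
from enum import Enum


class TrafficClass(Enum):
    """Traffic Class values listed in this section (names are hypothetical)."""
    ACTIVATION = "activation"
    GRADIENT = "gradient"
    KV_CACHE = "kv_cache"
    PARAMETER = "parameter"
    STATE_SYNC = "state_sync"  # Collaborative State Synchronization


class Compute(Enum):
    """Compute Affinity targets named in this section."""
    GPU = "gpu"
    FPGA = "fpga"
    CPU = "cpu"


@dataclass(frozen=True)
class SemanticMetadata:
    """Hypothetical record holding the four metadata categories."""
    traffic_class: TrafficClass
    early_token: bool    # Urgency & Dependency: early- vs. late-token hint
    layer_id: int        # current layer or pipeline stage
    quantizable: bool    # Fidelity/Accuracy Sensitivity
    bufferable: bool     # Latency Tolerance: store-and-forward permitted
    droppable: bool      # Loss Tolerance: dropping permitted
    compute_affinity: Compute


# Example: a gradient flow that tolerates neither loss, delay, nor quantization.
gradient_flow = SemanticMetadata(
    traffic_class=TrafficClass.GRADIENT,
    early_token=False,
    layer_id=31,
    quantizable=False,
    bufferable=False,
    droppable=False,
    compute_affinity=Compute.GPU,
)
```
]]></sourcecode>
        <t>The record is frozen so that a node cannot alter the hints after the application has declared them; whether the contract needs that property is an open design question.</t>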
      </section>
      <section anchor="network-policy-action-set">
        <name>Network Policy / Action Set</name>
        <t>Upon receiving the aforementioned semantics, network nodes with global state awareness can execute a set of policies that transcend traditional routing:</t>
        <ul spacing="normal">
          <li>
            <t><strong>Queueing / Scheduling:</strong> Identifies flow states to guarantee absolute preemption for highly time-sensitive traffic.</t>
          </li>
          <li>
            <t><strong>Buffering / Store-and-forward:</strong> Uses the storage resources of network devices to temporarily delay flows with high latency tolerance (e.g., large-block parameter pulls). It also supports cache multiplexing across inference requests from different users, which improves hardware throughput without altering the model structure.</t>
          </li>
          <li>
            <t><strong>Shaping &amp; In-network Quantization:</strong> Triggers in-network low-precision quantization and sparsity strategies during congestion, rather than relying on simple packet dropping.</t>
          </li>
          <li>
            <t><strong>Steering:</strong> Intelligently guides task flows to the most appropriate heterogeneous computing nodes based on Compute Affinity.</t>
          </li>
        </ul>
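        <t>The mapping from declared semantics to one of the actions above can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the names FlowHints, Action, and choose_action are hypothetical, the precedence among actions is an illustrative guess rather than something this document specifies, and steering is left out because it depends on compute state that is not modeled here.</t>
        <sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    PREEMPT = "queue with absolute preemption"
    QUANTIZE = "in-network quantization"
    BUFFER = "store-and-forward"
    FORWARD = "default forwarding"


@dataclass(frozen=True)
class FlowHints:
    """Hypothetical subset of the semantic metadata used for the decision."""
    time_critical: bool
    quantizable: bool
    bufferable: bool


def choose_action(hints: FlowHints, congested: bool) -> Action:
    """Pick one action for a flow.

    Assumed precedence (not defined by the contract): time-critical flows
    always preempt; under congestion, quantize before buffering.
    """
    if hints.time_critical:
        return Action.PREEMPT
    if congested and hints.quantizable:
        return Action.QUANTIZE
    if congested and hints.bufferable:
        return Action.BUFFER
    return Action.FORWARD


late_activation = FlowHints(time_critical=True, quantizable=False, bufferable=False)
parameter_pull = FlowHints(time_critical=False, quantizable=False, bufferable=True)
```
]]></sourcecode>
        <t>With these hints, choose_action(late_activation, congested=True) selects preemption, while choose_action(parameter_pull, congested=True) selects store-and-forward.</t>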
      </section>
    </section>
    <section anchor="extended-use-case-top-k-routing-semantics-for-moe-architecture">
      <name>Extended Use Case: Top-K Routing Semantics for MoE Architecture</name>
      <t>For dynamic computing architectures like Mixture-of-Experts (MoE), this contract supports the definition of more complex routing metadata for intelligent scheduling in the network data plane:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Model Router Metadata:</strong> Carries Token ID / Query vector summaries, Top-K candidate expert lists, weights/confidence levels, and positional markers (Token_pos).</t>
        </li>
        <li>
          <t><strong>System State Semantics:</strong> Network nodes maintain real-time metrics for each expert node, including backlog queues, computing load, network latency, and bandwidth utilization.</t>
        </li>
      </ul>
      <t>By matching these two kinds of semantics at forwarding time, the network can determine which candidate expert node has the lightest load and should receive the token flow.</t>
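      <t>The matching step can be sketched in a few lines of Python. This is an illustration only: the function name pick_expert, the expert identifiers, and the use of backlog length as the sole load metric are assumptions, and the candidate list is assumed to arrive ordered by router weight.</t>
      <sourcecode type="python"><![CDATA[
```python
from typing import Mapping, Sequence


def pick_expert(candidates: Sequence[str], backlog: Mapping[str, int]) -> str:
    """Return the Top-K candidate with the smallest backlog.

    min() keeps the first minimum it sees, so ties go to the earlier
    candidate, which is assumed to be the one the router ranked higher.
    """
    return min(candidates, key=lambda expert: backlog[expert])


# Top-K list from the Model Router Metadata (hypothetical identifiers).
top_k = ["expert-3", "expert-7"]
# System State Semantics kept by the node: queued tokens per expert.
backlog = {"expert-3": 12, "expert-7": 4, "expert-9": 0}

chosen = pick_expert(top_k, backlog)
```
]]></sourcecode>
      <t>Here chosen is "expert-7": "expert-9" is idle but is not among the router's candidates, so it is never considered.</t>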
    </section>
    <section anchor="deployment-considerations">
      <name>Deployment Considerations</name>
      <section anchor="decision-location-why-in-network">
        <name>Decision Location: Why In-Network?</name>
        <t>Unlike edge devices (GPUs/NICs), which possess only local queuing information, in-network nodes (e.g., core/spine switches) maintain a global perspective. The network can observe concurrent multi-tenant tasks and real-time multipath congestion states. Crucially, it can make immediate decisions to buffer, slice, or reroute cross-domain traffic before that traffic enters high-cost bottleneck links.</t>
      </section>
      <section anchor="rdma-rocev2-integration">
        <name>RDMA / RoCEv2 Integration</name>
        <t>Intelligent computing centers rely heavily on RDMA. The Semantic Header defined in this contract is intended to be carried as an Extension Header for RoCEv2/UDP packets, or in specific reserved fields. This enables supporting hardware (such as the FPGA and parsing pipelines in the IntelliNode architecture) to extract metadata and execute policies at line rate (e.g., 400 Gbps).</t>
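        <t>This document does not yet define the wire format of the Semantic Header. Purely to illustrate why a fixed-size, fixed-offset layout suits a hardware parsing pipeline, the sketch below packs a hypothetical 4-octet header. The field order, field widths, flag bits, and all names are assumptions made for this example and are not a proposal for the actual encoding.</t>
        <sourcecode type="python"><![CDATA[
```python
import struct

# Hypothetical layout, network byte order:
#   1 octet traffic class, 1 octet flags, 2 octets layer id.
HEADER_FMT = "!BBH"
HEADER_LEN = struct.calcsize(HEADER_FMT)  # 4 octets

# Hypothetical flag bits.
FLAG_QUANTIZABLE = 0x01
FLAG_BUFFERABLE = 0x02


def pack_header(traffic_class: int, flags: int, layer_id: int) -> bytes:
    """Serialize the three illustrative fields into HEADER_LEN octets."""
    return struct.pack(HEADER_FMT, traffic_class, flags, layer_id)


def unpack_header(data: bytes) -> tuple:
    """Parse the leading HEADER_LEN octets; any payload after them is ignored."""
    return struct.unpack(HEADER_FMT, data[:HEADER_LEN])


wire = pack_header(2, FLAG_QUANTIZABLE | FLAG_BUFFERABLE, 31)
fields = unpack_header(wire + b"payload")
```
]]></sourcecode>
        <t>Because every field sits at a constant offset, a parser can extract the metadata without walking a variable-length structure.</t>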
      </section>
    </section>
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>To ensure the integrity of the Semantic-Driven Shaping Contract, the system MUST:</t>
      <ul spacing="normal">
        <li>
          <t><strong>Authentication and Anti-Spoofing:</strong> Prevent malicious tenants from tampering with the Urgency level or forging network states to unfairly preempt high-priority queues.</t>
        </li>
      </ul>
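      <t>One conventional way to detect tampering with declared metadata is a keyed message authentication code. The sketch below uses HMAC-SHA-256 from the Python standard library. It is an assumption, not a mechanism specified by this document: key distribution, tag placement, and whether such a check is feasible at line rate are all left open, and the function names and sample values are hypothetical.</t>
      <sourcecode type="python"><![CDATA[
```python
import hashlib
import hmac


def sign(tenant_key: bytes, header: bytes) -> bytes:
    """Compute an authentication tag over the serialized semantic header."""
    return hmac.new(tenant_key, header, hashlib.sha256).digest()


def verify(tenant_key: bytes, header: bytes, tag: bytes) -> bool:
    """Check the tag in constant time to avoid leaking comparison timing."""
    return hmac.compare_digest(sign(tenant_key, header), tag)


key = b"per-tenant secret (hypothetical)"
header = b"\x02\x03\x00\x1f"           # some serialized metadata
tag = sign(key, header)

forged = b"\x02\x03\x00\x00"           # same header with an altered field
```
]]></sourcecode>
      <t>A node holding the tenant key accepts the original header and rejects the forged one, because the tag no longer matches the altered octets.</t>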
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>This document requests that IANA allocate specific protocol numbers or RoCEv2 option type spaces for the AI Semantic Header to facilitate standardized deployment.</t>
    </section>
  </middle>
  <back>








  </back>

</rfc>
