<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" submissionType="IETF" docName="draft-hss-bgp-srv6-routing-planes-00" category="info" ipr="trust200902" obsoletes="" updates="" xml:lang="en" symRefs="true" sortRefs="true" tocInclude="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.16.0 -->
  <!-- Generated by id2xml 1.5.0 on 2023-03-06T17:34:20Z -->
	<front>
    <title abbrev="BGP based SRv6 Routing Planes for DC network">BGP based SRv6 Routing Planes for DC network</title>
    <seriesInfo name="Internet-Draft" value="draft-hss-bgp-srv6-routing-planes-00"/>
    <author initials="S." surname="Sangli" fullname="Srihari Sangli">
      <organization>HPE</organization>
      <address>
        <postal>
          <street>Mahadevapura</street>
          <street>Bangalore, KA  560048</street>
          <street>India</street>
        </postal>
        <email>srihari.sangli@hpe.com</email>
      </address>
    </author>
    <author initials="S." surname="Hegde" fullname="Shraddha Hegde">
      <organization>HPE</organization>
      <address>
        <postal>
          <street>Mahadevapura</street>
          <street>Bangalore, KA  560048</street>
          <street>India</street>
        </postal>
        <email>shraddha.hegde@hpe.com</email>
      </address>
    </author>
    <author initials="M." surname="Styszynski" fullname="Michal Styszynski">
      <organization>HPE</organization>
      <address>
        <postal>
          <street>France</street>
        </postal>
        <email>mlstyszynski@juniper.net</email>
      </address>
    </author>
    <date year="2026" month="February" day="28"/>
    <abstract>
        <t>This document introduces a BGP-based multi-planar routing 
    architecture for modern data center networks, with a particular focus on 
    environments running AI/ML workloads that demand traffic segregation.
    The proposed solution enables deterministic 
    routing for workloads with characteristics such as collective communication
    and multi-tenancy. It allows the creation of multiple logical routing 
    planes over a shared physical infrastructure. Each plane is defined by 
    three key elements: constraints (e.g., fabric color inclusion/exclusion),
    a calculation type (e.g., shortest path), and a metric type (e.g., cost, 
    delay, or bandwidth).</t>
    </abstract>
  </front>
  <middle>
  <section anchor="Sec-1" numbered="true" toc="default">
      <name>Introduction</name>
        <t>Modern Data Center (DC) networks are typically built using Clos 
    topologies, which provide an n-hop path (commonly 3, 5, or 7 hops) between 
    ingress and egress with a minimal number of intermediate nodes. This design
    offers straightforward scalability as traffic demands increase. Several 
    factors influence DC network buildout, including traffic characteristics, 
    AI workload requirements, data generation rates, user distribution, and the
    placement of compute, storage, and application resources. DC networks 
    generally operate as pure IP fabrics, using the BGP routing paradigm. 
    Nodes (switches or routers) establish single-hop eBGP sessions with 
    their neighbors <xref target="RFC7938"/>.</t>
        <t>When hosting AI workloads, DC networks are optimized to maximize 
    bandwidth usage and handle low-entropy traffic. AI 
    models have grown dramatically, with parameter counts reaching billions or 
    even trillions. This scale requires distributing workloads across multiple 
    datacenters, where inter-DC networks must support mixed traffic types. 
    Consequently, AI workloads share the same physical infrastructure with 
    other applications, such as storage, each with distinct bandwidth and 
    latency requirements.</t>
        <t>Logical routing 
    planes provide strict separation between traffic types while leveraging the
    same physical infrastructure. This ensures predictable performance across 
    different types of applications. BGP is widely deployed in datacenters 
    and often serves as the 
    routing protocol for interconnecting regional datacenters located within 
    close proximity (e.g., 100–120 km). While mechanisms for logical routing 
    planes have been defined for IGPs <xref target="RFC9350"/>, a 
    comparable capability is required for BGP.</t>
  </section>

  <section anchor="Sec-2" numbered="true" toc="default">
      <name>Requirements Language</name>
   <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, 
   they appear in all capitals, as shown here.</t>
  </section>

  <section anchor="Sec-3" numbered="true" toc="default">  
    <name>BGP based Routing Planes</name>
      
      <figure anchor="dc-clos-network">
        <name>Data Center Clos Network</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[



                                      +-----+
      +-------------------------------|     |------------------------------+
      |          +--------------------| Sn  |-----------------+            |
      |          |            +-------|     |------+          |            |
      |          |            |       +-----+      |          |            |
      |          |            |          .         |          |            |
      |          |            |          .         |          |            |
      |          |            |          .         |          |            |
      |          |            |       +-----+      |          |            |
      | +--------^------------^-------|     |------^----------^----------+ |
      | |        | +----------^-------| S2  |------^--------+ |          | |
      | |        | |          | +-----|     |----+ |        | |          | |
      | |        | |          | |     +-----+    | |        | |          | |            
      | |        | |          | |                | |        | |          | |
      | |        | |          | |     +-----+    | |        | |          | |
      | | +------^-^----------^-^-----|     |----^-^--------^-^--------+ | |
      | | |      | | +--------^-^-----| S1  |----^-^------+ | |        | | |
      | | |      | | |        | | +---|     |--+ | |      | | |        | | |
      | | |      | | |        | | |   +-----+  | | |      | | |        | | |
     +-----+    +-----+      +-----+          +-----+    +-----+      +-----+
     | L1  |    | L2  | ...  | L4  |          | L5  |    | L6  | ...  | L8  |
     +-----+    +-----+      +-----+          +-----+    +-----+      +-----+
      ||||       ||||         ||||             ||||       ||||          ||||
      OOOO       OOOO         OOOO             OOOO       OOOO          OOOO
     

      Legend: S: Spine, L: Leaf, O: Compute Server NIC
 
]]></artwork>
      </figure>
  
        <t>This document proposes a BGP-based multi-planar architecture that 
    enables the creation of multiple routing planes within a data center 
    fabric. The key characteristics are as follows:</t>
        <t>* Routing Plane Definition:</t>
        <t>Each routing plane is defined by a set of constraints, a calculation
    type, and a metric type. These parameters, applied to the physical topology
    of nodes and links, form a logical routing plane.</t>
        <t>* Fabric Colors, Metrics and Calculation-type:</t>
        <t>Physical links can be tagged with Fabric Colors. Routing planes may 
    include or exclude specific colors, and/or be differentiated by metric 
    types such as cost, delay, or bandwidth. The calculation-type is the
    best path selection method that is applied consistently within a routing
    plane. For example, all routers in a routing plane apply the same criteria,
    expressed via BGP import policy, for best path computation.</t>
         <t>* Expressing constraints:</t>
         <t>The Routing Plane configuration in conjunction with BGP policy can 
    combine multiple characteristics (e.g., exclude a fabric color while 
    optimizing for delay) thereby providing flexibility.</t>
         <t>* Pre-Built Configuration:</t>
         <t>Routing planes are provisioned via configuration. The BGP routing 
    protocol builds routes and next hops according to the defined constraints. 
    Application traffic is mapped to a routing plane based on 
    application intent.</t>
         <t>* Application Intent Expression:</t>
         <t>Application intent is conveyed using BGP extended color 
    communities, which are associated with prefix advertisements.</t>
         <t>* Failure Handling:</t>
         <t>In the event of link or node failures, a routing plane may become 
    partitioned. Traffic can fall back to alternate planes.</t>
         <t>* Policy-Based Control:</t>
         <t>Routing plane definitions are applied as import/export policies in 
    BGP advertisements. Importantly, this framework does not require new BGP 
    protocol extensions.</t>
         <t>Motivated by the deterministic path forwarding mechanism described 
    in <xref target="I-D.wang-idr-dpf"/>, the approach outlined here provides 
    a generic and extensible framework for defining routing planes. The goal 
    is to demonstrate how routing planes can be constructed in SRv6 networks by 
    leveraging existing segment routing constructs.</t>
  </section>

  <section anchor="Sec-4" numbered="true" toc="default">
      <name>BGP Routing Planes applied to SRv6 network</name>
        <t>This section describes the BGP Routing Plane solution
    applied to SRv6 networks.</t>

        <t><xref target="dc-clos-network"/> illustrates a multi-planar 
    data center fabric in which nodes L1 and L2 and spines S1 and S2 belong
    to the Green routing plane, while nodes L5 and L6 and spines S3 and S4
    belong to the Blue routing plane. Servers (e.g., Server1 and Server2)
    are dual-homed, with connections to both planes.</t>
        <t>The requirement is to construct distinct Green and Blue routing 
    planes across the fabric. Routing plane definitions can be consistently 
    applied across the network, ensuring that each plane enforces its 
    constraints and provides deterministic forwarding paths for application 
    traffic.</t>
        <t>Routing Plane Definition:</t>

        <t>The routing planes for the fabric shown in
    <xref target="dc-clos-network"/> are defined as follows.</t>

        <t>Green Routing Plane:</t>
        <t>Calculation type: BGP best path</t>
        <t>Metric type: standard metric</t>
        <t>Set of constraints: Exclude Blue</t>

        <t>Blue Routing Plane:</t>
        <t>Calculation type: BGP best path</t>
        <t>Metric type: standard metric</t>
        <t>Set of constraints: Exclude Green</t>
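
        <t>As an illustration only, the two definitions above can be modeled 
    as a small data structure together with a link-admission check. The 
    sketch below is not part of the proposed solution; the class name 
    RoutingPlane, its field names, and the sample links are assumptions made 
    for this example.</t>
        <sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RoutingPlane:
    """Hypothetical model of a routing plane definition."""
    name: str
    calc_type: str = "bgp-best-path"
    metric_type: str = "standard"
    include: FrozenSet[str] = frozenset()
    exclude: FrozenSet[str] = frozenset()

    def admits(self, link_colors: FrozenSet[str]) -> bool:
        """Return True if a link with these colors is in the plane."""
        if link_colors & self.exclude:
            return False
        # An empty include set means "no inclusion constraint".
        return not self.include or bool(link_colors & self.include)


GREEN = RoutingPlane("green", exclude=frozenset({"blue"}))
BLUE = RoutingPlane("blue", exclude=frozenset({"green"}))

# Hypothetical links, each tagged with a set of fabric colors.
links = {
    ("L1", "S1"): frozenset({"green"}),
    ("L5", "S3"): frozenset({"blue"}),
}
green_links = sorted(k for k, c in links.items() if GREEN.admits(c))
]]></sourcecode>
        <t>With these sample links, only the L1-S1 link is admitted to the 
    Green plane, because the L5-S3 link carries the excluded Blue color.</t>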

        <t>Each node in the fabric is provisioned with SRv6 locators along 
    with the corresponding uN and uA SIDs derived from those locators. Nodes 
    that belong to the Green routing plane are additionally configured with 
    Green-specific locators, while nodes in the Blue routing plane are 
    provisioned with Blue-specific locators.</t>

      <figure anchor="SRv6-sids">
        <name>SRv6 SID</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
          SRv6 block for the fabric  2100:db8::/32

         L1 instantiates the SID 2100:db8:0100::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD)        
         L2 instantiates the SID 2100:db8:0200::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD)         
         L5 instantiates the SID 2100:db8:0500::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD)
         L6 instantiates the SID 2100:db8:0600::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD)
         S1 instantiates the SID 2100:db8:0900::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD)
         S2 instantiates the SID 2100:db8:0a00::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) 
         S3 instantiates the SID 2100:db8:0b00::/48 associated with the uN 
   instruction (End with NEXT-CSID, PSP & USD)
         S4 instantiates the SID 2100:db8:0c00::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD)
   
    Green Routing Plane:

         L1 instantiates the SID 2100:db8:1100::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Green Routing
   Plane.
         L2 instantiates the SID 2100:db8:1200::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Green Routing
   Plane.
         L5 instantiates the SID 2100:db8:1500::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Green Routing
   Plane.
         L6 instantiates the SID 2100:db8:1600::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Green Routing
   Plane.
         S1 instantiates the SID 2100:db8:1900::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Green Routing
   Plane.
         S2 instantiates the SID 2100:db8:1a00::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Green Routing
   Plane.

    Blue Routing Plane:

         L1 instantiates the SID 2100:db8:2100::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Blue Routing
   Plane.
         L2 instantiates the SID 2100:db8:2200::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Blue Routing
   Plane.
         L5 instantiates the SID 2100:db8:2500::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Blue Routing
   Plane.
         L6 instantiates the SID 2100:db8:2600::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Blue Routing
   Plane.
         S3 instantiates the SID 2100:db8:2b00::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Blue Routing
   Plane.
         S4 instantiates the SID 2100:db8:2c00::/48 associated with the uN
   instruction (End with NEXT-CSID, PSP & USD) corresponding to Blue Routing
   Plane.
]]></artwork>
      </figure>
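
        <t>The locators in <xref target="SRv6-sids"/> follow a visible pattern: 
   within the third 16-bit group, the first hexadecimal digit varies per 
   plane and the second varies per node. The following sketch reproduces 
   the listed values under that reading. The encoding is inferred from the 
   example values and is an assumption of this sketch, not a requirement 
   of the solution; the table and function names are likewise 
   hypothetical.</t>
        <sourcecode type="python"><![CDATA[
import ipaddress

FABRIC_BLOCK = ipaddress.IPv6Network("2100:db8::/32")

# Assumed encoding, inferred from the example values: in the third
# 16-bit group, the high nibble selects the plane and the next
# nibble identifies the node.
PLANE_NIBBLE = {"default": 0x0, "green": 0x1, "blue": 0x2}
NODE_NIBBLE = {"L1": 0x1, "L2": 0x2, "L5": 0x5, "L6": 0x6,
               "S1": 0x9, "S2": 0xA, "S3": 0xB, "S4": 0xC}


def locator(node: str, plane: str = "default") -> ipaddress.IPv6Network:
    """Build the /48 uN locator of a node in the given plane."""
    group = (PLANE_NIBBLE[plane] << 12) | (NODE_NIBBLE[node] << 8)
    # The third 16-bit group occupies bits 95..80 of the address.
    base = int(FABRIC_BLOCK.network_address) | (group << 80)
    return ipaddress.IPv6Network((base, 48))
]]></sourcecode>
        <t>For example, locator("S3", "blue") yields 2100:db8:2b00::/48 and 
   locator("L1", "green") yields 2100:db8:1100::/48, matching the figure.</t>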

        <t>The BGP sessions in the Green routing plane are associated with 
   Green admin-group <xref target="RFC5305"/> and the BGP sessions in the Blue 
   routing plane are associated with Blue admin-group.</t>

  <section anchor="Sec-4.1" numbered="true" toc="default">
      <name>BGP Procedures for Building SRv6-Based Routing Planes</name>
        <t>The network is provisioned with the initial configuration described 
   in <xref target="SRv6-sids"/>. This configuration is performed once per 
   routing plane and does not require modification based on changing 
   traffic demands.</t>

        <t>* Locator Advertisements:</t>

        <t>Green locators are advertised as standard IPv6 unicast prefixes 
   (AFI 2, SAFI 1) and are tagged with the extended color community 
   <xref target="RFC4360"/> corresponding to Green.</t>
        <t>Blue locators are advertised similarly, with the extended color 
   community corresponding to Blue.</t>

        <t>* BGP Policy Mapping:</t>
        <t>Each node is configured with BGP policies that map incoming extended
   color communities to the appropriate routing plane. When a policy maps to a 
   routing plane definition, the routing plane’s characteristics are applied to
   the incoming advertisement to determine acceptance or rejection.</t>
        <t>A locator advertisement tagged for the Green plane is accepted only 
   if received on a BGP session associated with the Green admin-group.</t>
        <t>Similarly, a locator advertisement tagged for the Blue plane is 
   accepted only if received on a BGP session associated with the Blue 
   admin-group.</t>
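        <t>The acceptance rule above can be sketched as a simple import 
   check. This is a minimal illustration rather than an implementation: 
   the numeric color values, the Advertisement type, and the treatment of 
   advertisements that carry no known color are all assumptions of this 
   sketch.</t>
        <sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from typing import Optional

# Hypothetical color values; this document does not assign numbers.
COLOR_TO_PLANE = {100: "green", 200: "blue"}


@dataclass(frozen=True)
class Advertisement:
    prefix: str
    color: Optional[int]  # value carried in the color community


def accept(adv: Advertisement, session_admin_group: str) -> bool:
    """Accept a colored locator only on a session whose admin-group
    matches the routing plane that the color maps to."""
    plane = COLOR_TO_PLANE.get(adv.color)
    if plane is None:
        # Assumption: advertisements without a known color are not
        # accepted by the plane policy in this sketch.
        return False
    return plane == session_admin_group
]]></sourcecode>
        <t>Under these assumptions, a Green locator received on a session in 
   the Blue admin-group is rejected, which keeps Green locators out of the 
   Blue plane.</t>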
        <t>Once the control plane has been established for multiple routing 
   planes, collective communications can leverage the data plane mechanisms 
   described in <xref target="Sec-7"/> to forward traffic across the
   appropriate planes. The BGP Routing Planes solution builds deterministic
   paths inside a fabric purely based on routing. It does not require any
   controller-based or out-of-band path calculation or path provisioning.</t>
   
   <figure anchor="BGP-Routing-based-deterministic-paths">
        <name>BGP Routing based deterministic paths</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
     collective1 uses Blue routing plane:
            SRv6 encapsulated data packet load-balanced across L5 & L6,
                assuming destination prefix is associated with L5/L6:
                 2100:db8:2500
                 2100:db8:2600
     collective2 uses Green routing plane:
            SRv6 encapsulated data packet load-balanced across L1 & L2,
                assuming destination prefix is associated with L1/L2:
                 2100:db8:1100
                 2100:db8:1200
]]></artwork>
      </figure>
   
  </section>
  </section>
  
  <section anchor="Sec-5" numbered="true" toc="default">
      <name>Multi-Tenancy</name>
        <t>Cloud providers often face the requirement of supporting multiple 
   customer AI/ML workloads simultaneously within the same data center. To 
   ensure isolation, customer traffic must be carried on separate paths, 
   preventing one workload from impacting another.</t>
        <t>This separation can be achieved by constructing source-routed paths 
   within the routing planes, using mechanisms described in 
   <xref target="I-D.filsfils-srv6ops-srv6-ai-backend"/>. For example:</t>

        <t>A source-routed path for Customer A, Collective Type 1 may be built 
   using uA and uN SIDs defined for the Blue routing plane on node S3.</t>
        <t>A source-routed path for Customer B, Collective Type 1 may be built 
   using uA and uN SIDs defined for the Blue routing plane on node S4.</t>

        <t>This approach ensures that each customer’s workload traffic remains 
   isolated within its designated routing plane while still leveraging the 
   shared physical infrastructure. Such per-customer isolation within a plane
   is possible only with source-based routing.</t>
        <t>Source-routing-based solutions require a controller or another
   out-of-band mechanism. Such a mechanism is used to learn the fabric network 
   topology and the details of each host’s network attachment. It is also
   essential to collect the current operational state of the nodes and 
   links as input to the source-based path computation.</t>

   <figure anchor="source-routed-paths">
        <name>Source routed paths</name>
        <artwork name="" type="" align="left" alt=""><![CDATA[
     Source Routed path for customer A collective1 uses Blue routing plane S3:
                 2100:db8:2b00:2500
     Source Routed path for customer B collective1 uses Blue routing plane S4:
                 2100:db8:2c00:2600
     Source Routed path for customer A collective2 uses Green routing plane S1:
                 2100:db8:1900:2100
     Source Routed path for customer B collective2 uses Green routing plane S2:
                 2100:db8:1a00:2200
]]></artwork>
      </figure>
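
        <t>Each path in <xref target="source-routed-paths"/> is written as the 
   fabric block followed by a sequence of 16-bit values. As a sketch of how 
   a controller might assemble such a sequence into a single IPv6 
   destination address, the function below packs the values after the 
   block. The 16-bit length is an assumption that matches the /48 locators 
   in this document, and the function name is hypothetical.</t>
        <sourcecode type="python"><![CDATA[
import ipaddress
from typing import List

BLOCK = ipaddress.IPv6Network("2100:db8::/32")
USID_BITS = 16  # assumed uSID length, matching the /48 locators


def usid_carrier(usids: List[int]) -> ipaddress.IPv6Address:
    """Pack 16-bit uSIDs after the block into one IPv6 address."""
    free_bits = 128 - BLOCK.prefixlen
    if len(usids) * USID_BITS > free_bits:
        raise ValueError("too many uSIDs for a single carrier")
    value = int(BLOCK.network_address)
    shift = free_bits
    for usid in usids:
        if not 0 < usid < (1 << USID_BITS):
            raise ValueError(f"invalid uSID: {usid:#x}")
        shift -= USID_BITS
        value |= usid << shift
    return ipaddress.IPv6Address(value)
]]></sourcecode>
        <t>For the first path in the figure, usid_carrier([0x2b00, 0x2500]) 
   yields 2100:db8:2b00:2500::, with the unused trailing bits left as 
   zero.</t>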
   
  </section>

  <section anchor="Sec-6" numbered="true" toc="default">
      <name>Scaling across multiple data-centers</name>

        <t>AI/ML training models continue to grow in size and complexity, often
   requiring deployment across multiple datacenters. In such scenarios, the 
   Data Center Interconnect (DCI) network must be designed to optimize for the 
   lowest delay metric, ensuring efficient distribution of workloads.</t>
        <t>Operators may deploy either IGP or BGP for DCI routing; in many 
   cases, BGP is preferred due to its flexibility and widespread use. The 
   mechanism for advertising delay metrics in BGP is defined in 
   <xref target="I-D.ietf-idr-bgp-generic-metric"/>. Delay values may be 
   configured statically or measured dynamically using protocols such as TWAMP 
   <xref target="RFC5357"/>.</t>
         <t>To construct a routing plane based on delay:</t>

         <t>* The metric-type in the routing plane definition
   (<xref target="Sec-4"/>) is set to delay.</t>

         <t>* When multiple BGP advertisements exist for the same prefix, best 
   path selection is performed using the delay metric carried in the 
   generic-metric attribute.</t>
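
         <t>The delay comparison can be sketched as follows. This is a 
   deliberately reduced illustration: it compares only the delay value and 
   omits the remaining steps of the BGP decision process. The Path type, 
   the microsecond unit, the handling of paths without a delay metric, and 
   the tie-break are assumptions of this sketch.</t>
         <sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Path:
    next_hop: str
    delay_us: Optional[int]  # None if no delay metric was received


def best_by_delay(paths: Iterable[Path]) -> Optional[Path]:
    """Pick the path with the lowest delay metric."""
    # Assumption: paths without a delay metric are not candidates.
    candidates = [p for p in paths if p.delay_us is not None]
    if not candidates:
        return None
    # Tie-break on next hop only to keep the sketch deterministic.
    return min(candidates, key=lambda p: (p.delay_us, p.next_hop))
]]></sourcecode>
         <t>Given three advertisements for the same prefix with delays of 
   900, 450, and no metric, the sketch selects the path with a delay of 
   450.</t>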

         <t>This framework is generic and extensible, allowing operators to 
   define multi-planar networks using a variety of metric types (e.g., cost, 
   bandwidth, delay) and constraints, depending on operational requirements.</t>
  </section>
  
  <section anchor="Sec-7" numbered="true" toc="default">
      <name>Data Plane Considerations</name>

           <t>Traffic in data center and interconnect networks typically
   consists of two patterns: bandwidth-intensive “elephant flows” and
   short-lived “mice flows.” These traffic patterns exhibit low entropy, and
   because AI computations are highly sensitive to latency, any congestion in
   the network can significantly degrade performance. Coping with congestion
   requires a combination of strategies: avoidance, detection, notification,
   and reaction.</t>
           <t>* Congestion Avoidance:</t>
           <t>Mechanisms such as strategic traffic segregation via routing
   planes and packet spraying across available links are employed to reduce the
   likelihood of congestion.</t>
           <t>* Congestion Detection and Notification:</t>
           <t>Techniques like Explicit Congestion Notification (ECN) and
   latency measurements can be scoped to individual routing planes. This allows
   congestion signals to be delivered to the sender with plane-specific
   granularity.</t>
           <t>* Congestion Reaction:</t>
            <t>Within a routing plane, BGP can select multiple paths to a
   destination, designating one or more as primary and others as backup. Backup
   paths can be pre-programmed, enabling traffic to switch at millisecond
   granularity when congestion occurs.</t>
            <t>* Policy Enforcement:</t>
            <t>Routing plane policies can reflect customer intent. For example,
   links experiencing quality degradation may be excluded, and traffic can be
   redirected to an alternate routing plane designated as backup.</t>
            <t>Traffic can be classified based on its DSCP marking to
   identify the collective it belongs to.</t>
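            <t>As a sketch of how DSCP-based classification and a designated 
   backup plane could fit together, the function below maps a DSCP value 
   to a routing plane and falls back to the backup plane when the primary 
   is not usable. The DSCP values, the backup assignments, and the notion 
   of a "usable" set are hypothetical and chosen only for illustration.</t>
            <sourcecode type="python"><![CDATA[
from typing import Dict, Optional, Set

# Hypothetical DSCP-to-plane mapping and backup designations.
DSCP_TO_PLANE: Dict[int, str] = {26: "green", 34: "blue"}
BACKUP_PLANE: Dict[str, str] = {"green": "blue", "blue": "green"}


def select_plane(dscp: int, usable: Set[str]) -> Optional[str]:
    """Map a DSCP value to a plane, using the backup if needed."""
    primary = DSCP_TO_PLANE.get(dscp)
    if primary is None:
        return None  # unclassified traffic is outside this sketch
    if primary in usable:
        return primary
    backup = BACKUP_PLANE.get(primary)
    return backup if backup in usable else None
]]></sourcecode>
            <t>With both planes usable, DSCP 26 maps to the Green plane in 
   this sketch; if only the Blue plane is usable, the same traffic is 
   redirected to Blue.</t>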
  </section>

  <section anchor="Sec-8" numbered="true" toc="default">
      <name>IANA Considerations</name>
      <t>TBD</t>
  </section>

  <section anchor="Sec-9" numbered="true" toc="default">
      <name>Acknowledgements</name>
      <t>The authors would like to thank Jeffrey Haas, Zhaohui (Jeffrey) Zhang, 
   Kevin Wang and Ron Bonica for their valuable feedback.</t>
    </section>
  </middle>
  <back>
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <?rfc include='reference.RFC.4271'?>
		<?rfc include='reference.RFC.9350'?>
      </references>
      <references>
        <name>Informative References</name>
        <?rfc include='reference.RFC.2119'?>
        <?rfc include='reference.RFC.8174'?>
        <?rfc include='reference.RFC.4360'?>
        <?rfc include='reference.RFC.5305'?>
        <?rfc include='reference.RFC.5357'?>
		<?rfc include='reference.RFC.7938'?>
        <?rfc include='reference.I-D.ietf-idr-bgp-generic-metric'?> 
        <?rfc include='reference.I-D.wang-idr-dpf'?> 
        <?rfc include='reference.I-D.filsfils-srv6ops-srv6-ai-backend'?> 		

      </references>
    </references>
  </back>
</rfc>
