pim                                                             Z. Zhang
Internet-Draft                                          Juniper Networks
Intended status: Informational                                 R. Parekh
Expires: 27 April 2023                                             Cisco
                                                              H. Bidgoli
                                                                   Nokia
                                                                Z. Zhang
                                                                     ZTE
                                                         24 October 2022


                    Multicast Scaling Considerations
          draft-zzhang-pim-multicast-scaling-considerations-00

Abstract

   This informational document discusses various multicast scaling
   aspects, compares different multicast technologies with respect to
   scaling, and suggests a general approach of combined solutions to
   scale multicast.  This discussion is independent of IPv4/IPv6 or
   MPLS/SRv6 data planes.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 April 2023.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Different Scaling Aspects . . . . . . . . . . . . . . . . . .   2
     1.1.  Scaling in the number of receivers  . . . . . . . . . . .   3
     1.2.  Scaling in the number of trees  . . . . . . . . . . . . .   3
   2.  Multicast Tunnels . . . . . . . . . . . . . . . . . . . . . .   3
   3.  New Multicast Technologies  . . . . . . . . . . . . . . . . .   4
     3.1.  BIER  . . . . . . . . . . . . . . . . . . . . . . . . . .   4
     3.2.  BIER-TE . . . . . . . . . . . . . . . . . . . . . . . . .   5
     3.3.  CGM2  . . . . . . . . . . . . . . . . . . . . . . . . . .   5
   4.  Multicast Tunnel Segmentation . . . . . . . . . . . . . . . .   6
   5.  Signaling for Tunneling and Segmentation  . . . . . . . . . .   7
     5.1.  MVPN as Flow Overlay Signaling for Tunneling  . . . . . .   7
     5.2.  PIM/mLDP as Flow Overlay Signaling over BIER  . . . . . .   8
     5.3.  Flow Overlay Signaling and Tunnel Segmentation  . . . . .   8
   6.  Overall Considerations for Multicast Scaling  . . . . . . . .   8
     6.1.  Observations  . . . . . . . . . . . . . . . . . . . . . .   9
     6.2.  Considerations  . . . . . . . . . . . . . . . . . . . . .   9
       6.2.1.  Reduce Number of Underlay Tunnels or Amount of Tunnel
               Binding Signaling . . . . . . . . . . . . . . . . . .   9
       6.2.2.  Scale Up Segmentation Points  . . . . . . . . . . . .  10
       6.2.3.  Scale out Segmentation Points . . . . . . . . . . . .  11
   7.  Summary . . . . . . . . . . . . . . . . . . . . . . . . . . .  12
   8.  Informative References  . . . . . . . . . . . . . . . . . . .  12
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  15

1.  Different Scaling Aspects

   The IP Multicast architecture [RFC1112] involves an IP multicast
   group/destination address that, together with the source address,
   identifies a multicast tree rooted at the First Hop Router (FHR)
   that the source is attached to.  Multicast traffic is forwarded
   along the tree, which is typically set up by Protocol Independent
   Multicast (PIM) [RFC7761].

   In this document, each (source, group) pair is referred to as a flow
   or a tree.  Typically, each flow is for a "piece of content", e.g.,
   an audio or video stream.

   While a bidirectional tree [RFC5015] can be used for multicast among
   a set of sources and receivers using only the group/destination
   address, it is typically used only in a small domain and is not
   considered in this document.

1.1.  Scaling in the number of receivers

   With the above-mentioned multicast tree, each tree node only needs
   to replicate to a minimal number of downstream nodes, which could be
   transit or leaf nodes.  Except in the case of a Broadband Network
   Gateway (BNG) connecting to a huge number of home subscribers or a
   spine switch connecting to a large number of leaf switches in a Data
   Center, the number of replications is small, yet the total number of
   receivers can be unlimited as the tree grows in length and width.

1.2.  Scaling in the number of trees

   The number of (source, group) pairs could be huge.  Each (source,
   group) pair would need a tree, with each replication point being a
   tree node.  Each tree node needs to have state specifically for each
   tree, in both the forwarding plane and the control plane (which is
   used to set up the forwarding plane state).

   The number of multicast trees that a network can support may have to
   be limited because of the tree state, yet the number of multicast
   trees transiting a network may be huge (simply because of the huge
   number of multicast flows).

   Chances are, many of the flows have a common set of ingress and
   egress points in the transit network.  In that case, instead of
   setting up individual multicast (source, group) trees across this
   network, tunnels can be used to transport those individual trees.
   For example, a single mLDP [RFC6388] tunnel can be used to transport
   thousands of IP multicast flows, greatly reducing the number of
   trees in the network.

2.  Multicast Tunnels

   A multicast tunnel also corresponds to a multicast tree and requires
   corresponding state on tree nodes, though it is not identified by a
   (source, group) pair.

   In the case of an mLDP [RFC6388] or RSVP-TE P2MP [RFC4875] tunnel,
   the tree is identified by an mLDP FEC or an RSVP P2MP Session Object
   in the control plane, and by a label in the forwarding plane.  The
   tree will likely have transit (non-root/leaf) tree nodes.

   An MPLS SR-P2MP [I-D.ietf-pim-sr-p2mp-policy] tunnel is identical to
   mLDP/RSVP-TE in the forwarding plane.  It differs from the latter
   only in the control plane - a different control plane identifier and
   different setup procedures - but it still requires per-tree state on
   all tree nodes.

   An SRv6-P2MP tunnel is also similar to an mLDP/RSVP-TE/SR-MPLS P2MP
   tunnel even in the data plane - it is just that the data plane
   identifier is now part of the IPv6 destination address.

   An Ingress Replication (IR) [RFC7988] tunnel is a special/degenerate
   multicast tree in that it does not have transit tree nodes.  The
   tree root (ingress) replicates traffic and tunnels it directly to
   the tree leaves using unicast, without the need for tree state on
   transit routers.  Obviously, it does not provide efficient
   replication - the ingress may need to send multiple copies through
   common downstream nodes - but it may still be desirable in certain
   situations.

   We refer to these tunnels as underlay tunnels and to the multicast
   traffic (e.g., IP Multicast) that the underlay tunnels carry as
   overlay flows.  Notice that an underlay tunnel could itself be
   carried by another underlay tunnel (e.g., part of an mLDP tunnel
   transported by an RSVP-TE P2MP tunnel).

3.  New Multicast Technologies

   The IP multicast and non-IR tunnels described above require per-
   tree/tunnel state on transit tree nodes.  While the use of tunnels
   removes individual IP Multicast trees from the tunnels' region, the
   number of tunnels may still be large.  New multicast technologies
   have been developed to remove the per-tree/tunnel state while still
   allowing efficient replication, as summarized below.

3.1.  BIER

   Bit Index Explicit Replication (BIER) [RFC8279] is a multicast
   architecture in which forwarding is based on a BitString in the BIER
   header preceding the payload.  Each Bit Position (BP) that is set in
   the BitString represents a Bit-Forwarding Egress Router (BFER) that
   needs to receive the packet.  A Bit-Forwarding Router (BFR) checks
   the set BPs to determine the set of next-hop BFRs to replicate the
   packet to, by repeatedly using a BP as the lookup key in a Bit Index
   Forwarding Table (BIFT).  A clever forwarding procedure ensures that
   the number of lookups is bounded by the number of replications, no
   matter how many or which BPs are set.
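
   As an informal illustration (not the normative forwarding procedure
   of [RFC8279]), the following Python sketch shows how a BFR can walk
   a BitString such that the number of BIFT lookups is bounded by the
   number of replications.  The BIFT contents and helper names are
   hypothetical.

      # Hypothetical BIFT: for each BP, the BFR neighbor toward that
      # BFER and a Forwarding Bit Mask (F-BM) of all BPs reached via
      # that neighbor.
      BIFT = {
          1: {"nbr": "B", "fbm": 0b0011},   # BFERs 1 and 2 via B
          2: {"nbr": "B", "fbm": 0b0011},
          3: {"nbr": "C", "fbm": 0b1100},   # BFERs 3 and 4 via C
          4: {"nbr": "C", "fbm": 0b1100},
      }

      def forward(bitstring):
          """Replicate once per neighbor; one lookup per copy."""
          remaining = bitstring
          while remaining:
              bp = (remaining & -remaining).bit_length()  # lowest BP
              entry = BIFT[bp]
              copy = bitstring & entry["fbm"]  # keep BPs via this nbr
              print("copy with BitString %s to %s"
                    % (format(copy, "04b"), entry["nbr"]))
              remaining &= ~entry["fbm"]       # BPs already covered

      forward(0b1011)   # BFERs 1, 2 and 4: one copy to B, one to C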

   The length of the BitString is limited by the maximum transmission
   unit (MTU) of the BIER domain, the space that must be left in a
   packet for the payload, and, more importantly, implementation
   tradeoffs in the forwarding plane.  A typical value is 256.  While
   larger BitString lengths could be used, they may be suboptimal if
   the BFERs for a packet are sparsely distributed.

   If the number of BFERs is larger than the BitString length, there are
   two ways to make it work:

   *  Send multiple copies - each copy is for a different set of BFERs.
      The same BP in different copies corresponds to different BFERs.
      This not only requires multiple copies but also multiple BIFTs -
      one BIFT for each set (a sketch of this split follows the list
      below).

   *  Divide the BIER domain into per-region sub-domains and use tunnel
      segmentation (Section 4).  A segmentation point decapsulates the
      BIER packets received from an upstream region and re-encapsulates
      BIER packets for downstream regions.
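
   The following is a minimal sketch of the first option, assuming a
   hypothetical contiguous assignment of BFER identifiers to sets; in
   [RFC8279] this split is expressed with Set Identifiers and per-set
   BIFTs.

      BITSTRING_LEN = 4                 # deliberately tiny

      def split_into_sets(bfer_ids):
          """Group BFERs into sets of at most BITSTRING_LEN and
          return one (set_index, bitstring) copy per set."""
          copies = []
          for bfer in sorted(bfer_ids):
              set_index, bp = divmod(bfer - 1, BITSTRING_LEN)
              if not copies or copies[-1][0] != set_index:
                  copies.append([set_index, 0])
              copies[-1][1] |= 1 << bp
          return [tuple(c) for c in copies]

      # BFERs 2, 3, 6 and 7 need two copies: one for set 0 (BFERs
      # 1-4) and one for set 1 (BFERs 5-8); the same BP value means
      # a different BFER in each set.
      print(split_into_sets({2, 3, 6, 7}))   # [(0, 6), (1, 6)]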

   BIER can be considered a tunneling technology - overlay flows (e.g.,
   IP multicast) are tunneled over BIER, even though there is no per-
   tunnel state in a BIER domain.

   The BIER architecture includes a flow overlay on top of the BIER
   layer, which does BIER signaling and forwarding and is in turn on
   top of a routing underlay that forwards BIER traffic from Bit-
   Forwarding Ingress Routers (BFIRs) to BFERs.  The purpose of the
   flow overlay is for BFERs to signal to BFIRs that they are
   interested in certain overlay flows, so that the BFIRs can derive
   the BitString to use for an overlay flow.
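
   The following sketch, with hypothetical names, shows the essence of
   what the flow overlay provides to a BFIR: a mapping from the BFERs
   interested in a flow (learned via overlay signaling) to the
   BitString used for that flow.

      # Hypothetical BFR-id to Bit Position mapping in one sub-domain.
      BFR_ID_TO_BP = {"PE1": 1, "PE2": 2, "PE3": 3}

      # Overlay signaling (e.g., MVPN routes or PIM joins over BIER)
      # tells the BFIR which BFERs want each (source, group) flow.
      interested_bfers = {("10.0.0.1", "232.1.1.1"): {"PE1", "PE3"}}

      def bitstring_for(flow):
          bs = 0
          for bfer in interested_bfers.get(flow, ()):
              bs |= 1 << (BFR_ID_TO_BP[bfer] - 1)
          return bs

      # 0b101: the bits for PE1 and PE3 are set
      print(bin(bitstring_for(("10.0.0.1", "232.1.1.1"))))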

3.2.  BIER-TE

   BIER does not provide per-flow traffic engineering capability.  BIER
   Traffic Engineering (BIER-TE) [I-D.ietf-bier-te-arch] does so without
   requiring per-flow state inside the BIER domain.

   With BIER-TE, a BP may indicate a replication branch - not just a
   BFER anymore.  Because of that, with the same BitString length, fewer
   BFERs will be covered by a packet - whether we send multiple copies
   or use tunnel segmentation.

3.3.  CGM2

   With BIER and BIER-TE, the BPs in the BitString have "global"
   ("subdomain-wide" to be strict) significance and all BFRs in a
   subdomain must have entries for all of them in the BIFTs.  However,
   the BIER-TE concept can be augmented to work with BPs of local
   significance.  This is referred to as Carrier Grade Minimalist
   Multicast (CGM2) [I-D.eckert-bier-cgm2-rbs].

   The CGM2 concept could be more easily understood through
   [I-D.chen-pim-srv6-p2mp-path], which uses SRv6 SIDs to encode
   replication branches.  Replacing the SIDs with BPs yields CGM2.

   Obviously, CGM2 scales better than [I-D.chen-pim-srv6-p2mp-path] in
   that many fewer bits are needed in the packet header, so this
   document focuses on CGM2.

   While CGM2 does use fewer BIFT entries, it does not save on the
   number of bits needed to encode all replication branches.  In fact,
   it uses more bits to turn the flat-structured BitString of BIER-TE
   into recursive/hierarchical constructs, and that also leads to more
   complicated forwarding behavior.

   Given this limitation, while CGM2 can be used for underlay tunnels,
   it is not really a good fit for Carrier Grade multicast when overlay
   IP multicast is not used (i.e., when the end receivers are directly
   encoded as BPs).

4.  Multicast Tunnel Segmentation

   An end-to-end (E2E) multicast flow may need to go through a vast
   network consisting of many ASes and/or IGP areas (referred to as
   regions).  When tunneling overlay multicast flows, different types/
   instances of tunnels may need to be used in different regions.  This
   is referred to as tunnel segmentation [RFC6514] [RFC7524], for the
   following reasons:

   *  Due to technological or administrative reasons, different tunnel
      technologies may have to be used in different regions.  For
      example, one region may use mLDP while another may use IR.

   *  For the same reasons, different tunnel instances of the same type
      may have to be used in different regions.

   *  Even if the same tunnel could be established across multiple
      regions, different tunnel instances in different regions may be
      desired for better optimization.  For example, if a set of flows
      have very different sets of egress points, it is not desirable to
      carry them in a single cross-region tunnel, but they may be
      carried by a single intra-region tunnel in one region and by
      different intra-region tunnels in other regions.

   *  In the case of IR, instead of directly replicating to all egress
      points, it is more efficient to ingress replicate to segmentation
      points, which in turn ingress replicate to the next set of
      segmentation and/or egress points.

   *  In the case of BIER, when the number of BFERs is larger than the
      BitString length, segmentation may be used together with smaller
      sub-domains.

   At the segmentation points, overlay state is needed to stitch the
   upstream segments and downstream segments together.
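
   As a rough illustration (the names are hypothetical and not tied to
   any particular protocol), the overlay state on a segmentation point
   can be thought of as a stitching table from an upstream segment to
   one or more downstream segments, keyed by overlay flow or by a
   wildcard/aggregate (see Section 5.3):

      # The segmentation point decapsulates traffic received from the
      # upstream segment and re-encapsulates it onto each downstream
      # segment listed for the overlay flow.
      stitching_table = {
          ("10.0.0.1", "232.1.1.1"): {
              "upstream":   ("region-1", "mLDP", "tunnel-17"),
              "downstream": [("region-2", "BIER", "sub-domain-1"),
                             ("region-3", "IR",   "leaf-set-A")],
          },
      }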

5.  Signaling for Tunneling and Segmentation

   To carry multicast traffic over a tunnel, the tunnel ingress needs
   to know which flows to put onto which tunnel.  Depending on the
   tunnel type, it may also need to know the tunnel egresses.  The
   tunnel egresses may also need to know that they need to join the
   tunnel.  This requires signaling - note that this is not for setting
   up the underlay tunnel, but for the "flow overlay".

5.1.  MVPN as Flow Overlay Signaling for Tunneling

   Consider some E2E IP multicast trees of a customer, parts of which
   go over a VPN provider network.  The PE routers tunnel the traffic
   across the provider network based on PIM-MVPN or BGP-MVPN signaling
   specified in [RFC6037] [RFC6513] and [RFC6514].  In particular, Data
   MDT or I/S-PMSI A-D routes are used to announce the binding of
   overlay flows to underlay tunnels.  The binding could be inclusive
   (one tunnel for all flows in a VPN and to all egress PEs - referred
   to as an inclusive tunnel) or selective (one tunnel for one or more
   flows and only to the egress PEs that need to receive the traffic -
   referred to as a selective tunnel).  While selective binding prevents
   multicast traffic from being sent to PEs that do not need to receive
   traffic, inclusive binding can significantly reduce the number of
   tunnels needed (only one tunnel is used for each VPN but traffic is
   sent to all the PEs of the VPN).
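
   A back-of-the-envelope comparison with made-up numbers illustrates
   the trade-off on the number of tunnels:

      vpns, flows_per_vpn = 100, 1000

      inclusive_tunnels = vpns                  # one per VPN, all PEs
      selective_tunnels = vpns * flows_per_vpn  # worst case: one/flow
      print(inclusive_tunnels, selective_tunnels)   # 100 vs 100000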

   Two other features of BGP-MVPN can further reduce the number of
   underlay tunnels.  One is to use the same tunnel for flows in
   different VPNs (referred to as tunnel aggregation in this document).
   This can be done for all flows or just for some selective flows in
   the VPNs, achieved by advertising a de-multiplex VPN label in the
   PMSI Tunnel Attribute (PTA) attached to the PMSI A-D routes and
   imposing the de-multiplex label before imposing the tunnel label to
   the traffic.  The other is inter-AS aggregation, where per-PE
   inclusive tunnels are confined to the local AS while per-AS
   inclusive tunnels are used outside the local AS.  This is achieved
   with Inter-
   AS I-PMSI A-D routes [RFC6514] and the concept is further explained
   and extended to per-Region Aggregation (Section 6.2 of
   [I-D.ietf-bess-evpn-bum-procedure-updates]).
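
   The following sketch (with hypothetical label values) shows what
   tunnel aggregation means on the wire for an MPLS tunnel: the ingress
   imposes the de-multiplex VPN label learned from the PTA before the
   tunnel label, so one tunnel can carry flows of several VPNs.

      def encapsulate(payload, tunnel_label, vpn_demux_label):
          """Outer-to-inner label stack for an aggregated tunnel."""
          return [tunnel_label, vpn_demux_label, payload]

      # Two VPNs share tunnel label 300; the egress PEs use labels
      # 1001/1002 to de-multiplex packets into the right VPN context.
      pkt_a = encapsulate("c-multicast packet of VPN A", 300, 1001)
      pkt_b = encapsulate("c-multicast packet of VPN B", 300, 1002)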

   The same BGP-MVPN procedures can be applied to Global Table Multicast
   (GTM) for multicast in the default/master routing instance over an
   underlay network, as specified in [RFC7716].

   Note that this section is about MVPN as Flow Overlay Signaling for
   any kind of tunneling technology including BIER [RFC8556], and the
   next section is about two flow overlay signaling methods specifically
   for BIER.

5.2.  PIM/mLDP as Flow Overlay Signaling over BIER

   Consider an E2E IP multicast tree with part of it tunneled over BIER.
   The flow overlay signaling can be PIM, which is both the signaling
   protocol for the E2E tree and the overlay signaling protocol for
   tunneling over BIER [I-D.ietf-bier-pim-signaling].

   The above applies to IP multicast in both the default/global routing
   instance and in VRFs in case of MVPN, and it replaces PIM-MVPN
   [RFC6037] [RFC6513] in this context.

   Similarly, an E2E mLDP tree can have part of it tunneled over BIER.
   In this case, the flow overlay signaling is mLDP as specified in
   [I-D.ietf-bier-mldp-signaling-over-bier].

   Note that "tunnel segmentation" was originally documented for BGP-
   MVPN (Section 4) and refers to a PE-PE multicast provider tunnel
   being instantiated by different types/instances of tunnels in
   different ASes/regions.  Strictly speaking, an IP multicast tree
   going over different tunnels in different regions is different from
   MVPN "tunnel segmentation".  However, in the rest of this document
   we use "tunnel segmentation" for both situations.

5.3.  Flow Overlay Signaling and Tunnel Segmentation

   In the case of BGP-MVPN as overlay signaling (for any kind of
   tunnels - see Section 5.1) with tunnel segmentation, the
   segmentation points maintain overlay state in the form of I-PMSI A-D
   routes (for all flows) or S-PMSI A-D routes (for specific flows),
   instead of overlay (e.g., IP) multicast tree state.

   In the case of PIM/mLDP as BIER flow overlay signaling
   (Section 5.2), when the BIER domain is divided into multiple sub-
   domains for segmentation purposes, the overlay state that the
   segmentation points maintain is the overlay multicast tree state
   itself (i.e., IP multicast tree state or mLDP tree state).

6.  Overall Considerations for Multicast Scaling

   With all the background laid out above, we have the following
   observations and considerations.

6.1.  Observations

   For a massive number of flows to reach a massive number of receivers,
   the best solution is IP multicast (Section 1.1) carried over tunnels
   (Section 1.2).

   With a massive number of receivers and a large E2E network span, an
   E2E multicast overlay flow may need to be carried by multiple
   tunnels/segments of different types or instances in different parts
   of the E2E network (Section 1.2, Section 4).

   Even if BIER/BIER-TE/CGM2 is supported on all devices E2E, it may be
   impractical to encode all BFERs or BIER-TE/CGM2 replication branches
   in a single BitString (flat or recursive), and sending multiple
   copies is not efficient and hence undesirable or impractical.
   Tunnel segmentation needs to be used so that different sub-domains
   can be used in different regions of the E2E network, and that
   requires overlay state/signaling at the segmentation points.

   However, the segmentation points may become the scaling bottleneck
   due to the overlay state/signaling.  That may be mitigated by
   solutions described below.

6.2.  Considerations

   As observed above, to massively scale multicast in both dimensions
   (number of receivers and number of flows) and over a vast network, IP
   multicast over underlay tunnel segments is needed.  This section
   focuses on how to reduce the number of underlay tunnels and how to
   scale segmentation points.

6.2.1.  Reduce Number of Underlay Tunnels or Amount of Tunnel Binding
        Signaling

   As discussed in Section 5.1, the number of underlay tunnels can be
   reduced by the use of the following:

   *  Inclusive tunnels

   *  Tunnel aggregation

   *  Per-region aggregation

   Notice that while they all reduce the number of underlay tunnels
   (which is useful; otherwise the underlay network needs to maintain
   more tunnel state, unless BIER/BIER-TE/CGM2 is used), tunnel
   aggregation does not reduce the overlay signaling that binds overlay
   flows to tunnels.

   Therefore, if segmented selective tunnels are used, or if there are
   just too many VPNs to support (such that even the number of I-PMSI
   A-D routes is too large), segmentation point scaling is still needed
   as discussed below.

   On the other hand, the trend of multicast applications seems to be
   mainly large-scale delivery of real-time, high-rate data that most
   likely needs to be sent everywhere, e.g., broadcasting of World Cup
   Soccer or the Chinese Spring Festival Gala.  In this case, inclusive
   tunnels in a few VPNs may very well suffice.

   Note that, while [RFC6514] and [RFC6625] specify that the source/
   group length fields of S-PMSI A-D routes be either 0 (wildcard) or
   32/128 (IPv4/IPv6 host address), they could be extended to allow
   lengths between 0 and 32/128.  With that, a set of overlay flows can
   not only share the same underlay tunnel but also share the same
   S-PMSI A-D route.
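
   A minimal sketch of this suggested extension (not specified in
   [RFC6514] or [RFC6625]) follows: with prefix lengths between 0 and
   32, one S-PMSI A-D route can bind a whole range of overlay flows to
   a tunnel.

      import ipaddress

      # Hypothetical extended S-PMSI A-D routes:
      # (source prefix, group prefix) -> bound tunnel
      spmsi_routes = {
          (ipaddress.ip_network("10.1.0.0/16"),
           ipaddress.ip_network("232.1.0.0/16")): "tunnel-42",
      }

      def tunnel_for(source, group):
          for (s_pfx, g_pfx), tunnel in spmsi_routes.items():
              if (ipaddress.ip_address(source) in s_pfx and
                      ipaddress.ip_address(group) in g_pfx):
                  return tunnel
          return None

      print(tunnel_for("10.1.2.3", "232.1.7.7"))   # tunnel-42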

   Part of an underlay tunnel itself can be stacked on top of another
   multicast tunnel.  When multiple upper tunnels stack on top of the
   same lower tunnel, the lower tunnel's domain does not need to
   maintain state for the upper tunnels.  Examples include mLDP over
   RSVP-TE/mLDP/other P2MP tunnels [RFC7060], mLDP over MVPN [RFC6514]
   where mLDP is the C-multicast protocol, and mLDP over BIER
   [I-D.ietf-bier-mldp-signaling-over-bier].

6.2.2.  Scale Up Segmentation Points

   The scaling burden on a segmentation point falls on both the control
   plane (for overlay signaling) and the forwarding plane.  While a
   typical router implementation supports a much lower number of
   multicast flows than unicast routes, that does not mean the
   multicast scale cannot be increased.

6.2.2.1.  Forwarding Plane Scaling

   A modern edge router can handle hundreds of thousands if not
   millions of (unicast) routes in the forwarding plane.  For multicast
   forwarding, the scaling property is actually similar to the unicast
   case - the only difference is that the lookup key in the case of IP
   multicast is the (source, group/destination) pair instead of just a
   destination address.  The forwarding action is based on the
   forwarding instructions (e.g., a set of replication branches),
   referred to as the forwarding nexthop, associated with the route
   that is looked up.
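
   Conceptually (this is not a router implementation), the multicast
   FIB differs from the unicast FIB mainly in the lookup key, and the
   forwarding nexthop is a set of replication branches rather than a
   single branch:

      unicast_fib = {"203.0.113.0/24": ("if-1", "nexthop-A")}

      # Key: (source, group).  Value: the forwarding nexthop, i.e., a
      # set of replication branches (interface, encapsulation).
      multicast_fib = {
          ("10.0.0.1", "232.1.1.1"): [("if-2", "encap-X"),
                                      ("if-3", "encap-Y")],
      }

      def mfib_lookup(source, group):
          return multicast_fib.get((source, group), [])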

   On the other hand, while many multicast routes can have the same
   forwarding nexthop, just as in the unicast case, on segmentation
   points multicast forwarding nexthop sharing is limited, as explained
   in Section 2.1 of [I-D.zzhang-bess-mvpn-evpn-segmented-forwarding].

   Therefore, a modern router designed with multicast scaling in mind
   (e.g., with enough memory for multicast state, especially the larger
   number of unshared forwarding nexthops) should be able to handle
   hundreds of thousands if not millions of multicast flows.

   Of course, scaling must always be considered together with
   convergence - for example, when something changes, how fast can the
   FIB state be updated?  This is further discussed below.

6.2.2.2.  Control Plane Scaling

   It might be challenging for some PIM [RFC7761] implementations to
   handle hundreds of thousands of multicast flows, for the following
   reasons:

   *  Periodic refreshes due to soft state nature

   *  Potentially huge number of impacted flows when an interface goes
      up/down

   However, overlay multicast scaling does not have to rely on PIM (so
   there is no soft-state refresh problem) and is less prone to
   topology changes.

   Taking the example of BGP-MVPN [RFC6514] as the overlay protocol,
   overlay multicast interest is signaled with BGP-MVPN type-6/7
   (C-Multicast) routes, augmented with type-4 (Leaf A-D) routes in the
   case of selective IR/BIER tunnels.  None of the above-mentioned
   scaling concerns apply here.

   It is possible that an underlay topology change could impact some
   underlay tunnels.  A good implementation should be able to minimize
   the impact to the overlay forwarding state.  In the BIER case, no
   overlay forwarding state needs to be changed at all.

6.2.3.  Scale out Segmentation Points

   Consider that between two segmentation regions there are 2N
   segmentation points (Regional Border Routers, or RBRs) in parallel.
   They are divided into N pairs, where each pair is responsible for
   1/N of the overlay flows (a pair is used for redundancy purposes).
   By increasing N, we can scale out the segmentation points (whether
   or not scale-up is done at the same time).
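
   One possible (hypothetical) way to divide the overlay flows among N
   RBR pairs is a deterministic hash on the (source, group) key, so
   that all routers independently pick the same pair for a given flow:

      import hashlib

      def rbr_pair_for(source, group, n_pairs):
          """Map an overlay flow to one of N RBR pairs."""
          key = ("%s,%s" % (source, group)).encode()
          digest = hashlib.sha256(key).digest()
          return int.from_bytes(digest[:4], "big") % n_pairs

      # With N = 4 pairs (8 RBRs), each pair handles ~1/4 of flows.
      print(rbr_pair_for("10.0.0.1", "232.1.1.1", 4))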

7.  Summary

   As discussed above, existing and in-development multicast solutions,
   when implemented properly and deployed in appropriate combinations,
   can scale very well to a massive number of multicast flows with a
   massive number of receivers over a vast network:

   *  Use IP multicast to scale to a massive number of receivers
      (Section 1.1)

   *  Use tunneling to scale to a massive number of flows (Section 1.2)

   *  Use inclusive tunnels and/or aggregation to reduce the number of
      underlay tunnels needed (Section 5.1)

   *  Use BIER to achieve efficient replication without requiring per-
      tree state (Section 3.1)

   *  Use BIER-TE/CGM2 to achieve per-flow traffic steering without
      requiring per-tree state (Section 3.2, Section 3.3)

   *  Use tunnel segmentation to scale for a vast network that may
      deploy different types/instances of tunnels in different regions
      (Section 4)

   *  Scale up (Section 6.2.2) and scale out (Section 6.2.3) tunnel
      segmentation points

   It's worth pointing out that the above is independent of IPv4/IPv6 or
   MPLS/SRv6 data planes.

8.  Informative References

   [I-D.chen-pim-srv6-p2mp-path]
              Chen, H., McBride, M., Fan, Y., Li, Z., Geng, X., Toy, M.,
              Gyan Mishra, S., Wang, A., Liu, L., and X. Liu, "Stateless
              SRv6 Point-to-Multipoint Path", Work in Progress,
              Internet-Draft, draft-chen-pim-srv6-p2mp-path-06, 30 April
              2022, <https://www.ietf.org/archive/id/draft-chen-pim-
              srv6-p2mp-path-06.txt>.

   [I-D.eckert-bier-cgm2-rbs]
               Eckert, T. and B. Xu, "Carrier Grade Minimalist
              Multicast (CGM2) using Bit Index Explicit Replication
              (BIER) with Recursive BitString Structure (RBS)
              Addresses", Work in Progress, Internet-Draft, draft-
              eckert-bier-cgm2-rbs-01, 9 February 2022,
              <https://www.ietf.org/archive/id/draft-eckert-bier-cgm2-
              rbs-01.txt>.

   [I-D.ietf-bess-evpn-bum-procedure-updates]
              Zhang, Z., Lin, W., Rabadan, J., Patel, K., and A.
              Sajassi, "Updates on EVPN BUM Procedures", Work in
              Progress, Internet-Draft, draft-ietf-bess-evpn-bum-
              procedure-updates-14, 18 November 2021,
              <https://www.ietf.org/archive/id/draft-ietf-bess-evpn-bum-
              procedure-updates-14.txt>.

   [I-D.ietf-bier-mldp-signaling-over-bier]
              Bidgoli, H., Kotalwar, J., Wijnands, I., Mishra, M.,
              Zhang, Z., and E. Leyton, "M-LDP Signaling Through BIER
              Core", Work in Progress, Internet-Draft, draft-ietf-bier-
              mldp-signaling-over-bier-01, 12 November 2021,
              <https://www.ietf.org/archive/id/draft-ietf-bier-mldp-
              signaling-over-bier-01.txt>.

   [I-D.ietf-bier-pim-signaling]
              Bidgoli, H., Xu, F., Kotalwar, J., Wijnands, I., Mishra,
              M., and Z. Zhang, "PIM Signaling Through BIER Core", Work
              in Progress, Internet-Draft, draft-ietf-bier-pim-
              signaling-12, 25 July 2021,
              <https://www.ietf.org/archive/id/draft-ietf-bier-pim-
              signaling-12.txt>.

   [I-D.ietf-bier-te-arch]
              Eckert, T., Menth, M., and G. Cauchie, "Tree Engineering
              for Bit Index Explicit Replication (BIER-TE)", Work in
              Progress, Internet-Draft, draft-ietf-bier-te-arch-13, 25
              April 2022, <https://www.ietf.org/archive/id/draft-ietf-
              bier-te-arch-13.txt>.

   [I-D.ietf-pim-sr-p2mp-policy]
               Voyer, D., Ed., Filsfils, C., Parekh, R., Bidgoli, H.,
              and Z. Zhang, "Segment Routing Point-to-Multipoint
              Policy", Work in Progress, Internet-Draft, draft-ietf-pim-
              sr-p2mp-policy-05, 2 July 2022,
              <https://www.ietf.org/archive/id/draft-ietf-pim-sr-p2mp-
              policy-05.txt>.

   [I-D.zzhang-bess-mvpn-evpn-segmented-forwarding]
              Zhang, Z. and J. Xie, "MVPN/EVPN Segmentated Forwarding
              Options", Work in Progress, Internet-Draft, draft-zzhang-
              bess-mvpn-evpn-segmented-forwarding-00, 20 December 2018,
              <https://www.ietf.org/archive/id/draft-zzhang-bess-mvpn-
              evpn-segmented-forwarding-00.txt>.

   [RFC1112]  Deering, S., "Host extensions for IP multicasting", STD 5,
              RFC 1112, DOI 10.17487/RFC1112, August 1989,
              <https://www.rfc-editor.org/info/rfc1112>.

   [RFC4875]  Aggarwal, R., Ed., Papadimitriou, D., Ed., and S.
              Yasukawa, Ed., "Extensions to Resource Reservation
              Protocol - Traffic Engineering (RSVP-TE) for Point-to-
              Multipoint TE Label Switched Paths (LSPs)", RFC 4875,
              DOI 10.17487/RFC4875, May 2007,
              <https://www.rfc-editor.org/info/rfc4875>.

   [RFC5015]  Handley, M., Kouvelas, I., Speakman, T., and L. Vicisano,
              "Bidirectional Protocol Independent Multicast (BIDIR-
              PIM)", RFC 5015, DOI 10.17487/RFC5015, October 2007,
              <https://www.rfc-editor.org/info/rfc5015>.

   [RFC6037]  Rosen, E., Ed., Cai, Y., Ed., and IJ. Wijnands, "Cisco
              Systems' Solution for Multicast in BGP/MPLS IP VPNs",
              RFC 6037, DOI 10.17487/RFC6037, October 2010,
              <https://www.rfc-editor.org/info/rfc6037>.

   [RFC6388]  Wijnands, IJ., Ed., Minei, I., Ed., Kompella, K., and B.
              Thomas, "Label Distribution Protocol Extensions for Point-
              to-Multipoint and Multipoint-to-Multipoint Label Switched
              Paths", RFC 6388, DOI 10.17487/RFC6388, November 2011,
              <https://www.rfc-editor.org/info/rfc6388>.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
              BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
              2012, <https://www.rfc-editor.org/info/rfc6513>.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012,
              <https://www.rfc-editor.org/info/rfc6514>.

   [RFC6625]  Rosen, E., Ed., Rekhter, Y., Ed., Hendrickx, W., and R.
              Qiu, "Wildcards in Multicast VPN Auto-Discovery Routes",
              RFC 6625, DOI 10.17487/RFC6625, May 2012,
              <https://www.rfc-editor.org/info/rfc6625>.

   [RFC7060]  Napierala, M., Rosen, E., and IJ. Wijnands, "Using LDP
              Multipoint Extensions on Targeted LDP Sessions", RFC 7060,
              DOI 10.17487/RFC7060, November 2013,
              <https://www.rfc-editor.org/info/rfc7060>.

   [RFC7524]  Rekhter, Y., Rosen, E., Aggarwal, R., Morin, T.,
              Grosclaude, I., Leymann, N., and S. Saad, "Inter-Area
              Point-to-Multipoint (P2MP) Segmented Label Switched Paths
              (LSPs)", RFC 7524, DOI 10.17487/RFC7524, May 2015,
              <https://www.rfc-editor.org/info/rfc7524>.

   [RFC7716]  Zhang, J., Giuliano, L., Rosen, E., Ed., Subramanian, K.,
              and D. Pacella, "Global Table Multicast with BGP Multicast
              VPN (BGP-MVPN) Procedures", RFC 7716,
              DOI 10.17487/RFC7716, December 2015,
              <https://www.rfc-editor.org/info/rfc7716>.

   [RFC7761]  Fenner, B., Handley, M., Holbrook, H., Kouvelas, I.,
              Parekh, R., Zhang, Z., and L. Zheng, "Protocol Independent
              Multicast - Sparse Mode (PIM-SM): Protocol Specification
              (Revised)", STD 83, RFC 7761, DOI 10.17487/RFC7761, March
              2016, <https://www.rfc-editor.org/info/rfc7761>.

   [RFC7988]  Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress
              Replication Tunnels in Multicast VPN", RFC 7988,
              DOI 10.17487/RFC7988, October 2016,
              <https://www.rfc-editor.org/info/rfc7988>.

   [RFC8279]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
              Przygienda, T., and S. Aldrin, "Multicast Using Bit Index
              Explicit Replication (BIER)", RFC 8279,
              DOI 10.17487/RFC8279, November 2017,
              <https://www.rfc-editor.org/info/rfc8279>.

   [RFC8556]  Rosen, E., Ed., Sivakumar, M., Przygienda, T., Aldrin, S.,
              and A. Dolganow, "Multicast VPN Using Bit Index Explicit
              Replication (BIER)", RFC 8556, DOI 10.17487/RFC8556, April
              2019, <https://www.rfc-editor.org/info/rfc8556>.

Authors' Addresses

   Zhaohui Zhang
   Juniper Networks
   Email: zzhang@juniper.net


   Rishabh Parekh
   Cisco
   Email: riparekh@cisco.com


   Hooman Bidgoli
   Nokia
   Email: hooman.bidgoli@nokia.com


   Zheng Zhang
   ZTE
   Email: zhang.zheng@zte.com.cn