Internet DRAFT - draft-yang-alto-multi-domain

draft-yang-alto-multi-domain







ALTO Working Group                                             D. Lachos
Internet-Draft                                                  I. Poese
Intended status: Standards Track                                  Benocs
Expires: 11 January 2024                                      M. Lassnig
                                                                    CERN
                                                                   A. Gu
                                                                 Y. Yang
                                                         Yale University
                                                           J. Ros Giralt
                                                                Qualcomm
                                                            10 July 2023


                ALTO Multi-Domain Use Cases and Services
                    draft-yang-alto-multi-domain-03

Abstract

   Application-Layer Traffic Optimization (ALTO) provides means for
   network applications to obtain network information.  Although ALTO is
   inherently multi-domain, in that the ALTO server representing the
   network and the ALTO client requesting the network information belong
   to different trust domains, there are more general cases where the
   path from the source and the destination spans multiple autonomous
   networks, which we call multi-domain settings.  This document first
   gives three multi-domain use cases, and the challenges to address the
   challenges.  It then gives a brief update on the implementation
   solutions that we explored to address the challenges.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.





Lachos, et al.           Expires 11 January 2024                [Page 1]

Internet-Draft              ALTO Multi-Domain                  July 2023


   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 11 January 2024.

Copyright Notice

   Copyright (c) 2023 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
   2.  Multi-domain Use Cases  . . . . . . . . . . . . . . . . . . .   3
     2.1.  Multi-domain Path Distance/Ranking for Rucio  . . . . . .   3
     2.2.  Multi-domain Path -> Link Mapping for FTS . . . . . . . .   5
     2.3.  Multi-domain Resource Discovery for NOTED/SENSE . . . . .   6
   3.  Multi-domain Challenges . . . . . . . . . . . . . . . . . . .   6
     3.1.  Challenge: Distributed Information  . . . . . . . . . . .   6
     3.2.  Challenge: Partial Deployment . . . . . . . . . . . . . .   7
   4.  Solution Space Exploration  . . . . . . . . . . . . . . . . .   7
     4.1.  Solution Space to Obtain Forwarding Information Bases
           (FIBs)  . . . . . . . . . . . . . . . . . . . . . . . . .   7
     4.2.  FIB-Compute Based Design  . . . . . . . . . . . . . . . .   8
     4.3.  FIB-Apply Based Design  . . . . . . . . . . . . . . . . .   8
     4.4.  Multi-Domain Cascading ALTO . . . . . . . . . . . . . . .   8
     4.5.  General-Path Model Supporting Partial Deployment  . . . .   9
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  10
   6.  Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  10
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  10
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .  10
     7.2.  Informative References  . . . . . . . . . . . . . . . . .  11
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  11







Lachos, et al.           Expires 11 January 2024                [Page 2]

Internet-Draft              ALTO Multi-Domain                  July 2023


1.  Introduction

   Application-Layer Traffic Optimization (ALTO) provides means for
   network applications to obtain network information.  For example, the
   Endpoint Cost Service (ECS) and the Cost Map Service (CMS) defined by
   ALTO in [RFC7285] can provide the network-path properties (e.g.,
   routing costs, or relative ranking) of data transmissions from a set
   of source network locations to a set of destination network
   locations.  The Path Vector Service [RFC9275] allows ALTO to provide
   bandwidth availability (bottlenecks) for a given set of flows defined
   by a set of source-destination pairs.

   Although such services can provide values when the network path from
   a given source to a given destination spans only a single network,
   called a single-domain setting, provides these services, there are
   many use cases where the path may span multiple networks, called
   multi-domains settings.  [RFC7971] states that "The ALTO protocol is
   designed for use cases where the ALTO server and client can be
   located in different organizations or trust domains.  ALTO is
   inherently designed for use in multi-domain environments.  Most
   importantly, ALTO is designed to enable deployments in which the ALTO
   server and the ALTO client are not located within the same
   administrative domain.”

   This document first specifies three multi-domain use cases.  It then
   discusses the two challenges when deploying ALTO in multi-domain
   settings.  At the end, this document provides initial designs based
   on current implementation experiences to start the design
   conversation.

2.  Multi-domain Use Cases

   To be concrete, this document uses three use cases from the context
   of data-intensive sciences for CERN/WLCG.

2.1.  Multi-domain Path Distance/Ranking for Rucio

   The data orchestration system of CERN/WLCG is Rucio, which selects a
   source for a given downloading client when multiple sources provide
   the dataset.  Rucio also conducts destination selection, when
   multiple destinations can be the target locations for replication.










Lachos, et al.           Expires 11 January 2024                [Page 3]

Internet-Draft              ALTO Multi-Domain                  July 2023


   The main mechanism of source/destination selection in Rucio is
   distance, which is a perfect match for ALTO cost services; see
   appendix for a review of the full source/destination mecahnisms of
   Rucio.  Specifically, consider the following ECS query realizing ALTO
   ECS for Rucio to do destination selection.  The source is located at
   CERN and the destination candidates are at multiple locations of the
   LHCONE network (AGLT, BNL, and KIT, for example).


     POST /endpointcost/lookup HTTP/1.1
     Host: alto.example.com
     Content-Length: 248
     Content-Type: application/alto-endpointcostparams+json
     Accept:
        application/alto-endpointcost+json,application/alto-error+json

     {
       "cost-type": {"cost-mode" : "numerical",
                     "cost-metric" : "routingcost"},
       "endpoints" : {
         "srcs": [ "ipv4:188.184.9.234" ],
         "dsts": [
           "ipv4:192.41.231.81",
           "ipv4:104.18.24.74",
           "ipv4:141.3.128.6"
         ]
       }
     }


     HTTP/1.1 200 OK
     Content-Length: 274
     Content-Type: application/alto-endpointcost+json

     {
       "meta" : {
         "cost-type": {"cost-mode" : "numerical",
                       "cost-metric" : "routingcost"
         }
       },
       "endpoint-cost-map" : {
         "ipv4:188.184.9.234": {
           "ipv4:192.41.231.81" : 20,
           "ipv4:104.18.24.74" : 30,
           "ipv4:141.3.128.6"  : 10
         }
       }
     }



Lachos, et al.           Expires 11 January 2024                [Page 4]

Internet-Draft              ALTO Multi-Domain                  July 2023


   The use case provides an example of a multi-domain setting, because
   the source and the destinations are located at different networks and
   the network paths span multiple autonomous networks.  For example,
   the path from CERN (188.184.9.234) to AGLT2 (192.41.231.81) spans
   CERN, ESNet and U.  Michigan to reach the destination.

   Figure 1 below illustrates a network path from src to dst spanning
   four networks.


         AS S              AS A        AS B         AS D
   +-------------+se1  +---------+   +-----+   +------------+
   | src       --|-----|ai1   ae1|---|     |---|di1     dst |
   |+--+    --/  |     |         |   |     |   |       +--+ |
   ||  | --/     |     |         |   |     |   |       |  | |
   |+--+    \    |se2  |         |   |     |   |       +--+ |
   |         \__ |_____|ai2   ae2|---|     |---|di2         |
   +-------------+     +---------+   +-----+   +------------+

                 Figure 1. Multi-domain Network.


2.2.  Multi-domain Path -> Link Mapping for FTS

   The data transport scheduling system of CERN/WLCG is FTS, which
   schedules given transfers, where a transfer is compromised of a fixed
   dataset, a source and a destination (for example, chosen by Rucio).

   One main function of FTS is to satisfy operator resource constraints.
   As an example, FTS may schedule concurrent transfers from CERN to
   CalTech and KIT, under the constraint that the total transmission
   rate for a given link should be within a certain threshold.

   Consider Figure 1.  Assume S is CERN and D is CalTech.  The network
   path from CERN to CalTech uses the top inter-AS link (se1->ai1), and
   the network path from CERN to KIT uses the lower inter-AS link
   (se2->ai2).  For FTS to know how much demand it is placing on each
   one of these two links so that it can stay within the constraint of
   each link, FTS needs to know the network paths, and the paths span
   multiple autonomous networks.











Lachos, et al.           Expires 11 January 2024                [Page 5]

Internet-Draft              ALTO Multi-Domain                  July 2023


2.3.  Multi-domain Resource Discovery for NOTED/SENSE

   An emerging capacity for LHCONE/WLCG is the ability to create new
   paths to obtain additional capacity when there is high demand from a
   given source storage element (SE) to a destination element (SE).  In
   particular, the NOTED project monitors the backlog at Rucio/FTS for
   each source SE, destination SE pair, and may decide to create a path
   if one pair has a large backlog.

   To identify available networking capacity, ALTO Path Vector Service
   (PVS) is an ideal service.  However, as we see from Figure 1, the
   setting can be a multi-domain setting.

3.  Multi-domain Challenges

   Although implementing ECS/CMS/PVS is relatively straightforward in a
   single domain, there are challenges implementing these services in
   multi-domain settings.

3.1.  Challenge: Distributed Information

   Consider Figure 1, to compute the total routing cost of the path from
   the src to the dst.  The routing costs will consist of multiple
   segments spanning four autonomous networks: (1) from src to the link
   connecting the egress of AS S (se1) and the ingress of AS A (ia1);
   (2) from the ingress of A to the ingress of B; (3) from the ingress
   of B to the ingress of D; ( 4) from the ingress of D to dst.

   One may think that BGP collects information from multiple autonomous
   networks through back propagation from the destination, and hence
   includes information for all four segments.  But BGP information is
   distributed, coarse-grained, and incomplete.

   Source: The BGP router at AS S knows that the path from src to dst
   consists of the AS-PATH [S A B D].  Combining BGP and intradomain
   routing, AS S will also know which one of the two egress routers
   (se1, se2) that it will use to forward traffic to dst.  However, AS S
   does not know more details downstream: for example, it does not know
   whether the packet will use ae1 or ae2 as the egress router at AS A
   to enter AS B; neither does it know the internal routing inside AS A.
   Hence, an ALTO server provided by AS S cannot provide all of the
   information for the example ECS query.









Lachos, et al.           Expires 11 January 2024                [Page 6]

Internet-Draft              ALTO Multi-Domain                  July 2023


   Non-Source AS: A non-source AS knows the AS-PATH starting from itself
   to dst.  But it may not know the ingress point.  For example, AS A
   does not know whether the packet will come in from ai1 or ai2.
   Hence, an ALTO server provided by AS A may consider the example ECS
   query as an ambiguous query (because it gives only source (src) and
   destination (dst), but it does not in general know the ingress
   point).

3.2.  Challenge: Partial Deployment

   It is possible to design protocol extensions to collect the
   aforementioned distributed information to provide complete
   information (see below), but one challenge is that the deployment may
   be only incremental and hence is partially deployed during the
   process.

4.  Solution Space Exploration

   During the process of integrating ALTO to support the Rucio and FTS
   use cases, multiple potential solutions are implemented.  Below we
   discuss them briefly to motivate discussions.

4.1.  Solution Space to Obtain Forwarding Information Bases (FIBs)

   Obtaining FIBs is a basic building block of implementing ALTO route-
   related visibility services.  At a high-level, FIBs have the
   following life cycle:

   FIB-Configure/input:
      Consider a network as a distributed system.  Then the input to the
      distributed system is the input to the components: the routers.
      The input to each router is its configuration, and external input
      such as BGP routes from peers outside the network.

   FIB-Compute:
      The distributed system computes the FIBs using a distributed
      protocol (or configured by a logically centralized control plane).
      During the computing process, the vendor specific configuration
      may be turned into standard format such as IGP information model.

   FIB-Apply:
      The FIBs are computed and then applied.  When FIBs are applied to
      data packets, the application may be observed by NetFlow/sFlow or
      similar capturing capabilities; it may also be applied to control
      mechanisms such as traceoute, which can be observed.






Lachos, et al.           Expires 11 January 2024                [Page 7]

Internet-Draft              ALTO Multi-Domain                  July 2023


4.2.  FIB-Compute Based Design

   This is a type of solution that makes it possible to extends FIB-
   Compute to compute all needed network information at a single
   autonomous network, addressing the distributed information challenge.
   Assume that use an ALTO server at the source network to abstract and
   expose the information.  One natural candidate is to modify the
   routing control plane itself: BGP extensions, which can be extended
   to collect needed information and propagate upstream.  For example,
   when a BGP router at AS A (e.g., ai1) propagates BGP info to its peer
   at AS S (se1), it includes not only the AS-PATH [A, B, D], but also
   additional information so that the upstream can construct the
   complete path cost (distance) metrics.

   The upside of this design is that it integrates with routing systems
   and hence may even extend routing capabilities.  However, routing
   protocol extensions can be complex in deployment.  Further, it
   provides a different trust model: the original ALTO model is a star
   trust model, with the application (e.g., Rucio/FTS) at the hub and
   each AS needs to trust the application.  The BGP extension model
   requires the trust of peers and recursive peers (BGP community may be
   used to impose policies).

4.3.  FIB-Apply Based Design

   This is a type of solution that allows data path to collect control
   plane information.  For example, a traceroute based system called
   PerfSonar is widely deployed in our setting.  It is also natural to
   use NetFlow/sFlow to identify ingress points.  Such a system can
   collect other network information such as delay and loss naturally as
   measurements.

   However, this solution type can have many issues.  For example,
   traceoute has issues including: anonymous routers, uncertain router
   IP resulting in node aliasing; load balancing routing resulting in
   link aliasing.

4.4.  Multi-Domain Cascading ALTO

   This is a model that is presented by Ingmar Poese as Cascading ALTO
   at IETF 116.  For Cascading ALTO by Poese, please see his IETF 116
   slides.

   In the ALTO base model, a network is a container, which we call a big
   switch, with endpoints attached to the big switch.  In the multi-
   domain model, each network (represented by an ALTO server) has a set
   of ingress points (in-1 to in-m) and a set of egress points (e-1 to
   e-n).  An endpoint belonging to the network will be attached to an



Lachos, et al.           Expires 11 January 2024                [Page 8]

Internet-Draft              ALTO Multi-Domain                  July 2023


   ingress point and an egress point.  Hence, a single-domain ALTO query
   will specify ingress and egress directly attached to an ingress point
   and an egress point.  A source network, to a destination that is not
   in the same network, however, will only return the egress point; a
   destination network, when the source is from a different network,
   will need an ingress point.  A general transit network will need an
   ingress point and return egress point.  For consistency, the egress
   point must be a valid ingress point, represented by a unique address,
   of the peer.


               in-1   +-------------+  e-1
                  ----|             |----
                      |             |
                  ----|             |----
                      |             |
                  ----|             |----
                      |             |
                  ----|             |----
               in-m   +-------------+  e-n

   ALTO Server Multi-domain Query Model: Each ECS query, if the src is
   not in the home domain of the ALTO server, should include an ingress
   point, where the ingress point is returned by the ALTO server of the
   previous domain.  If the domain of the ALTO server is not the home
   domain of the destination, the ALTO server should return the egress
   point of the home domain and the ingress of the network domain.

   As an optional feature, the query should allow indication of
   iterative or recursive queries.

   To support incremental deployment, an ALTO server may respond to a
   query without specifying an ingress point and the source is not in
   the domain of the ALTO server.  In this case, the ALTO server will
   return the results from each potential ingress point.  For each
   ingress point indicated, the server indicates information of the
   previous hop (e.g., peer AS number and potential address).

4.5.  General-Path Model Supporting Partial Deployment

   Complementing the cascading model, we introduced a generic-path model
   at ALTO clients so that they can use the acquired information to
   gradually refine network information.

   In particular, it allows the path from a src to a dst to be a
   directed acyclic graph, with the following components:





Lachos, et al.           Expires 11 January 2024                [Page 9]

Internet-Draft              ALTO Multi-Domain                  July 2023


   A set of nodes, where each node has both a type and attributes, where
   the type can be (1) host: such as src/dst, with attributes such as IP
   address; (2) AS: which is a group of nodes, i.e., subgraph, with
   attributes including ASN; (3) router, with subtypes such as BGP-
   router, with attributes such as IP address.

   A set of links, where each link has a head and a tail; hence the
   types of links will be the unique combinations of head-type x tail
   type.  A link can have its attributes as well.

   Now, some examples of this representation in our deployment use case:

   For the geo-distance ALTO cost derived from geo-ip: the src is a host
   and the dst is also a host, and the metric is the geo distance.

   For CERN looking glass ALTO server, from a src host in CERN to a dst
   host in another network, say KIT, the src is a host, with two links,
   one for each of the two looking glass BGP routers from cern; each of
   these BGP routers links to its BGP peer, and each such BGP peer links
   to the next AS, in the AS-PATH exposed by CERN.

5.  IANA Considerations

   Some of the solutions will need IANA registrations.

6.  Acknowledgments

   The authors of this document would also like to thank many for the
   reviews and comments.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC7285]  Alimi, R., Ed., Penno, R., Ed., Yang, Y., Ed., Kiesel, S.,
              Previdi, S., Roome, W., Shalunov, S., and R. Woundy,
              "Application-Layer Traffic Optimization (ALTO) Protocol",
              RFC 7285, DOI 10.17487/RFC7285, September 2014,
              <https://www.rfc-editor.org/info/rfc7285>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.



Lachos, et al.           Expires 11 January 2024               [Page 10]

Internet-Draft              ALTO Multi-Domain                  July 2023


7.2.  Informative References

   [RFC7971]  Stiemerling, M., Kiesel, S., Scharf, M., Seidel, H., and
              S. Previdi, "Application-Layer Traffic Optimization (ALTO)
              Deployment Considerations", RFC 7971,
              DOI 10.17487/RFC7971, October 2016,
              <https://www.rfc-editor.org/info/rfc7971>.

   [RFC9275]  Gao, K., Lee, Y., Randriamasy, S., Yang, Y., and J. Zhang,
              "An Extension for Application-Layer Traffic Optimization
              (ALTO): Path Vector", RFC 7971, DOI 10.17487/RFC7971,
              September 2022, <https://www.rfc-editor.org/info/rfc9275>.

Authors' Addresses

   Danny Lachos
   Benocs
   Berlin
   Germany
   Email: dlachos@benocs.com


   Ingmar Poese
   Benocs
   Berlin
   Germany
   Email: ingmar@benocs.com


   Mario Lassnig
   CERN
   CH-1211 Geneva 23
   Switzerland
   Email: mario.lassnig@cern.ch


   Annie Gu
   Yale University
   51 Prospect St
   New Haven, CT 06520
   United States of America
   Email: annie.gu@yale.edu









Lachos, et al.           Expires 11 January 2024               [Page 11]

Internet-Draft              ALTO Multi-Domain                  July 2023


   Y. Richard Yang
   Yale University
   51 Prospect St
   New Haven, CT 06520
   United States of America
   Email: yry@cs.yale.edu


   Jordi Ros Giralt
   Qualcomm
   Madrid
   Spain
   Email: jros@qti.qualcomm.com






































Lachos, et al.           Expires 11 January 2024               [Page 12]