Internet DRAFT - draft-bishop-httpbis-distributed-origin

draft-bishop-httpbis-distributed-origin







HTTP Working Group                                             M. Bishop
Internet-Draft                                       Akamai Technologies
Intended status: Informational                            4 October 2021
Expires: 7 April 2022


          Distributed HTTP Origins: Solution Space Exploration
               draft-bishop-httpbis-distributed-origin-00

Abstract

   Certain content libraries are logically a single origin, but too
   large to be practically served by a single origin server.  This
   document discusses existing solutions and explores possible
   directions for future protocol development.

Discussion Venues

   This note is to be removed before publishing as an RFC.

   Discussion of this document takes place on the mailing list
   (httpbis@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/browse/httpbis/.

   Source for this draft and an issue tracker can be found at
   https://github.com/MikeBishop/alt-svc-bis.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 7 April 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.



Bishop                    Expires 7 April 2022                  [Page 1]

Internet-Draft          Distributed HTTP Origins            October 2021


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Conventions and Definitions . . . . . . . . . . . . . . .   3
   2.  Existing Solutions  . . . . . . . . . . . . . . . . . . . . .   3
     2.1.  Content-Specific Hostnames  . . . . . . . . . . . . . . .   3
     2.2.  Internal Load-Balancing . . . . . . . . . . . . . . . . .   4
   3.  Previous Standards Efforts  . . . . . . . . . . . . . . . . .   4
     3.1.  Out-of-Band Encoding  . . . . . . . . . . . . . . . . . .   4
       3.1.1.  Resource Map  . . . . . . . . . . . . . . . . . . . .   5
     3.2.  Alternative Services  . . . . . . . . . . . . . . . . . .   5
   4.  Possible Future Directions  . . . . . . . . . . . . . . . . .   5
     4.1.  Scope-Restricted Alt-Svc Entries  . . . . . . . . . . . .   6
     4.2.  Indicating Support for Alt-Svc Parameters . . . . . . . .   6
     4.3.  Incremental Alt-Svc Advertisements  . . . . . . . . . . .   7
     4.4.  The 3NN (Use Alternative) Status Code . . . . . . . . . .   7
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .   8
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   8
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   8
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .   8
     7.2.  Informative References  . . . . . . . . . . . . . . . . .   8
   Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .   9
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .   9

1.  Introduction

   With increasingly large content deployments, certain origins become
   too large to contain all the data which is logically connected on the
   same server.  A similar issue exists on CDNs, where an origin being
   served through a reverse-proxy contains too many large resources for
   a single instance to cache effectively.

   Examples of this abound in the real world -- consider the video
   libraries of Netflix or YouTube, the photo library of Facebook, or
   the software library of any large software publisher which must make
   available multiple full and patch versions of multiple editions of
   multiple software products.





Bishop                    Expires 7 April 2022                  [Page 2]

Internet-Draft          Distributed HTTP Origins            October 2021


   While there are existing ways to address this problem, they are
   suboptimal in various ways.  This document discusses existing
   approaches (Section 2), previous standards efforts which may provide
   solutions (Section 3), and possible directions for future development
   (Section 4).

1.1.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Existing Solutions

   In the real world, the origin users initially visit in a browser is
   typically one that a human can remember and type.  This user-facing
   origin serves HTML that references content, which may be on other
   origins.  A similar approach exists in non-browser cases, where a
   user-locatable front-end indicates the actual location of the desired
   content.

2.1.  Content-Specific Hostnames

   One solution, visible in multiple services, uses granular hostnames
   to identify the server or servers with the particular content in
   question, such as r2---sn-jpocxaa-j8bl.googlevideo.com.  This
   hostname, with its own HTTP origin, controls a particular slice of
   the media available on YouTube.com.  The YouTube service indicates to
   a player loading a video which origin contains or caches the
   requested content.

   Note that there are several ways of providing these hostnames to
   clients, depending on the interaction model between the client and
   the server.  For example:

   *  The server might generate HTML or JSON content in response to an
      initial request, providing absolute URIs for each dependent
      resource which indicate the specific host from which the resource
      can be retrieved

   *  The server might return a 3XX (Redirect) response to a client's
      query for a resource, directing the client to the resource at a
      different hostname

   *  An API might enable a client to query for the location of a
      resource before requesting it



Bishop                    Expires 7 April 2022                  [Page 3]

Internet-Draft          Distributed HTTP Origins            October 2021


   One drawback of this approach is that the content belongs to a
   different origin than the primary origin of the page.  While this is
   less of an issue in APIs or bulk data transfer, this limits the types
   of requests that can be made and the access to the data from scripts
   loaded by the primary origin without first making CORS preflight
   requests [CORS], which introduce additional latency.

   This approach can also complicate certain protocol features which
   rely on previous contact with the server.  The primary server
   typically cannot provide Alt-Svc entries for the secondary, though
   the targeting of the specific hostname may avoid the need for Alt-
   Svc.  TLS session resumption and 0-RTT will typically not be usable,
   adding latency to the request.

2.2.  Internal Load-Balancing

   A second solution, which is generally not visible to the client, is
   to have all requests terminated by a front-end which does not cache
   or serve any content directly.  Rather, this front-end is responsible
   for inspecting the request, identifying the server which can actually
   respond, and forwarding the request to that server.

   This solution has its own challenges.  While the data access and
   storage requirements can be distributed amongst back-end machines,
   throughput on the front-end load balancer becomes a bottleneck.  For
   certain protocols, direct server return (DSR) avoids this bottleneck
   by sending response packets back to the client instead of sending
   them via the load balancer.  However, DSR is challenging with
   reliable and encrypted protocols, and even moreso with multiplexed
   protocols like HTTP/2 or HTTP/3.

3.  Previous Standards Efforts

   Several previous drafts in the IETF have offered partial solutions
   for this problem, but have not been published as RFCs or achieved
   widespread adoption.

3.1.  Out-of-Band Encoding

   [OOB] describes an HTTP content coding that can be used to describe
   the location of a secondary resource that contains the payload.  The
   origin returns an HTTP field set which describes the content,
   including a Content-Encoding header which indicates the content can
   be fetched from a different URL, typically hosted on a different
   origin server.






Bishop                    Expires 7 April 2022                  [Page 4]

Internet-Draft          Distributed HTTP Origins            October 2021


   This approach is similar in spirit to Content-Specific Hostnames as
   described in Section 2.1, except that the resources continue to
   belong to a single origin regardless of which origin server actually
   delivers the bytes.  Unlike Content-Specific Hostnames, however, a
   separate request must be made for each resource -- first to the
   origin server to receive the headers, then to the secondary server to
   retrieve the content of the response.

3.1.1.  Resource Map

   [SCD] references a possible extension to this idea, where the origin
   server would indicate to a client that a particular set of resources
   would all be available from a particular secondary server.  However,
   the specifics of this interaction were not identified in that draft.

   One drawback to this approach is that an origin might prefer not to
   distribute the full set of endpoints or resources, either because
   this information is considered proprietary or because the set itself
   is large enough to be prohibitive.

3.2.  Alternative Services

   [AltSvc] describes a way in which an origin server can delegate
   authority over the origin to another host which might be preferable
   in some way.  However, this mechanism delegates the entire origin and
   cannot be subdivided.

   A 421 response being used to work around this dramatically reduces
   efficiency, as the client has no insight into which paths the
   alternative might or might not support.

4.  Possible Future Directions

   Any new solution should fit within the following constraints:

   *  No new feature to address this scenario can expect to entirely
      replace the existing approaches given client upgrade and hardware
      replacement schedules, so the solution needs to be easily layered
      on top of current approaches.  This likely implies a client-
      advertised extension.

   *  Unlike Alt-Svc ([AltSvc]), the solution should permit delegation
      of portions of the origin's URI space to one or more secondary
      servers.

   *  Unlike resource maps (Section 3.1.1), the solution should permit
      incremental new information about secondary server(s) and
      delegated ranges of resources.



Bishop                    Expires 7 April 2022                  [Page 5]

Internet-Draft          Distributed HTTP Origins            October 2021


   This section describes one possible solution in this vein, based on
   HTTP Alternative Services [AltSvc].  The components of this solution
   might be generally useful and incorporated into various
   specifications, or might be tightly coupled and belong in a single
   specification.

   Other solutions within these constraints should also be considered.

4.1.  Scope-Restricted Alt-Svc Entries

   When an alternative service is advertised by an origin, by default
   the indicated server is authoritative for all resources in the
   origin.  The scope parameter can be used to adjust this scope.

   The scope parameter contains the path portion of a URI; see
   Section 3.3 of [RFC3986].  The indicated alternative is authoritative
   only for resources where the path begins with the indicated prefix.

   scope = DQUOTE path DQUOTE ; see [RFC3986], Section 3.3

   For example:

   Alt-Svc: h2=":443"; ma=3600; scope="/sn-jpocxaa-j8bl/",
            h2=":443"; ma=3600; scope="/sn-5ualdn7s"

   A scope-restricted alternative SHOULD NOT be sent requests for
   resources unless the path portion of the URI is a prefix match with
   the indicated scope.

   [AltSvc] indicates that parameters are optional to understand.
   Therefore, origin servers SHOULD NOT send an alternative service
   advertisement to a client which has not indicated support for this
   extension (Section 4.2).  Alternatives MUST be prepared to receive
   requests for any resource in the origin.  However, the alternative
   MAY respond with a 421 (Misdirected Request) to any request it is
   unable to serve.

4.2.  Indicating Support for Alt-Svc Parameters

   Certain origins might prefer to take different actions based on
   whether the client supports HTTP Alternative Services or not.  For
   example, many clients are unable to implement the persist parameter
   defined in [AltSvc].  Servers that offer alternatives based on the
   client's current network connection might choose not to send Alt-Svc
   entries to such a client.






Bishop                    Expires 7 April 2022                  [Page 6]

Internet-Draft          Distributed HTTP Origins            October 2021


   The client can optionally send an Accept-Alt-Svc request header field
   indicating which Alt-Svc parameters it is able to understand.  The
   content of this field is an sf-list [RFC8941] of Alt-Svc parameter
   names.  To reduce fingerprinting surface, the contents of the list
   SHOULD be sorted alphabetically.

   For example:

   Accept-Alt-Svc: host, ma, persist, scope

   A server MAY publish alternative services containing parameters which
   are not understood by the client, since unknown parameters are
   ignored per [AltSvc].

   While [AltSvc] enables an alternative to reside on a different host
   than the origin server, not all clients implement this behavior.
   This draft registers the "host" parameter for Alt-Svc to enable
   clients to indicate support for Alt-Svc entries which provide a
   different hostname from the origin.  The "host" parameter MUST NOT be
   used in Alt-Svc field generation and MUST be ignored if present.

   The presence of this header can be assumed to indicate support for
   Alt-Svc, even if empty.

4.3.  Incremental Alt-Svc Advertisements

   [AltSvc] says that when an Alt-Svc response header field is received
   from an origin, its value invalidates and replaces all cached
   alternative services for that origin.

   In certain circumstances, a server might prefer not to publish the
   full list of alternatives, but instead incrementally add to them.
   For example, a server might provide scope-restricted alternatives as
   a client makes requests for resources in various scopes.

   This draft defines the Additional-Alt-Svc header field.  The parsing
   and semantics of this field are identical to that of Alt-Svc, with
   the following modifications:

   *  The value MUST NOT be "clear"

   *  The entries presented augment, rather than replace, any cached
      alternatives already known to the client.

4.4.  The 3NN (Use Alternative) Status Code

   This document defines a new status code directing that a client
   attempt to satisfy the request from an alternative.



Bishop                    Expires 7 April 2022                  [Page 7]

Internet-Draft          Distributed HTTP Origins            October 2021


   A server MUST include an Alt-Svc or Additional-Alt-Svc header field
   in the response indicating which alternative(s) the client can use to
   satisfy the given request.  A server MUST NOT send the 3NN status
   code in response to a request which did not contain the Accept-Alt-
   Svc header field.

   Upon receipt of this status code, a client SHOULD choose an
   alternative service and retry the request with that alternative.  If
   all configured alternatives are unsuccessful, or the client chooses
   not to use an alternative, the client MAY retry the request with the
   origin server, omitting the Accept-Alt-Svc header field.

5.  Security Considerations

   TODO Security

6.  IANA Considerations

   Lots of stuff to register later.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, DOI 10.17487/RFC3986, January 2005,
              <https://www.rfc-editor.org/rfc/rfc3986>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

   [RFC8941]  Nottingham, M. and P-H. Kamp, "Structured Field Values for
              HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021,
              <https://www.rfc-editor.org/rfc/rfc8941>.

7.2.  Informative References

   [AltSvc]   Nottingham, M., McManus, P., and J. Reschke, "HTTP
              Alternative Services", RFC 7838, DOI 10.17487/RFC7838,
              April 2016, <https://www.rfc-editor.org/rfc/rfc7838>.




Bishop                    Expires 7 April 2022                  [Page 8]

Internet-Draft          Distributed HTTP Origins            October 2021


   [CORS]     "Cross-Origin Resource Sharing (CORS)", n.d.,
              <https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS>.

   [OOB]      Reschke, J. F. and S. Loreto, "'Out-Of-Band' Content
              Coding for HTTP", Work in Progress, Internet-Draft, draft-
              reschke-http-oob-encoding-12, 24 June 2017,
              <https://datatracker.ietf.org/doc/html/draft-reschke-http-
              oob-encoding-12>.

   [SCD]      Thomson, M., Eriksson, G. A., and C. Holmberg, "An
              Architecture for Secure Content Delegation using HTTP",
              Work in Progress, Internet-Draft, draft-thomson-http-scd-
              02, 30 October 2016,
              <https://datatracker.ietf.org/doc/html/draft-thomson-http-
              scd-02>.

Acknowledgments

   TODO acknowledge.

Author's Address

   Mike Bishop
   Akamai Technologies

   Email: mbishop@evequefou.be

























Bishop                    Expires 7 April 2022                  [Page 9]