Internet DRAFT - draft-bishop-httpbis-distributed-origin
HTTP Working Group M. Bishop
Internet-Draft Akamai Technologies
Intended status: Informational 4 October 2021
Expires: 7 April 2022
Distributed HTTP Origins: Solution Space Exploration
draft-bishop-httpbis-distributed-origin-00
Abstract
Certain content libraries are logically a single origin, but too
large to be practically served by a single origin server. This
document discusses existing solutions and explores possible
directions for future protocol development.
Discussion Venues
This note is to be removed before publishing as an RFC.
Discussion of this document takes place on the mailing list
(httpbis@ietf.org), which is archived at
https://mailarchive.ietf.org/arch/browse/httpbis/.
Source for this draft and an issue tracker can be found at
https://github.com/MikeBishop/alt-svc-bis.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 7 April 2022.
Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.
Bishop Expires 7 April 2022 [Page 1]
Internet-Draft Distributed HTTP Origins October 2021
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Simplified BSD License text
as described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
1.1. Conventions and Definitions
2. Existing Solutions
2.1. Content-Specific Hostnames
2.2. Internal Load-Balancing
3. Previous Standards Efforts
3.1. Out-of-Band Encoding
3.1.1. Resource Map
3.2. Alternative Services
4. Possible Future Directions
4.1. Scope-Restricted Alt-Svc Entries
4.2. Indicating Support for Alt-Svc Parameters
4.3. Incremental Alt-Svc Advertisements
4.4. The 3NN (Use Alternative) Status Code
5. Security Considerations
6. IANA Considerations
7. References
7.1. Normative References
7.2. Informative References
Acknowledgments
Author's Address
1. Introduction
With increasingly large content deployments, certain origins become
too large for a single server to hold all of the data that logically
belongs to them. A similar issue exists on CDNs, where an origin being
served through a reverse-proxy contains too many large resources for
a single instance to cache effectively.
Examples of this abound in the real world -- consider the video
libraries of Netflix or YouTube, the photo library of Facebook, or
the software library of any large software publisher which must make
available multiple full and patch versions of multiple editions of
multiple software products.
While there are existing ways to address this problem, they are
suboptimal in various ways. This document discusses existing
approaches (Section 2), previous standards efforts which may provide
solutions (Section 3), and possible directions for future development
(Section 4).
1.1. Conventions and Definitions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
2. Existing Solutions
In the real world, the origin users initially visit in a browser is
typically one that a human can remember and type. This user-facing
origin serves HTML that references content, which may be on other
origins. A similar approach exists in non-browser cases, where a
user-locatable front-end indicates the actual location of the desired
content.
2.1. Content-Specific Hostnames
One solution, visible in multiple services, uses granular hostnames
to identify the server or servers with the particular content in
question, such as r2---sn-jpocxaa-j8bl.googlevideo.com. This
hostname, with its own HTTP origin, controls a particular slice of
the media available on YouTube.com. The YouTube service indicates to
a player loading a video which origin contains or caches the
requested content.
Note that there are several ways of providing these hostnames to
clients, depending on the interaction model between the client and
the server. For example:
* The server might generate HTML or JSON content in response to an
initial request, providing absolute URIs for each dependent
resource which indicate the specific host from which the resource
can be retrieved
* The server might return a 3xx (Redirection) response to a client's
query for a resource, directing the client to the resource at a
different hostname
* An API might enable a client to query for the location of a
resource before requesting it
One drawback of this approach is that the content belongs to a
different origin than the primary origin of the page. While this is
less of an issue in APIs or bulk data transfer, this limits the types
of requests that can be made and the access to the data from scripts
loaded by the primary origin without first making CORS preflight
requests [CORS], which introduce additional latency.
This approach can also complicate certain protocol features which
rely on previous contact with the server. The primary server
typically cannot provide Alt-Svc entries for the secondary, though
the targeting of the specific hostname may avoid the need for Alt-
Svc. TLS session resumption and 0-RTT will typically not be usable,
adding latency to the request.
2.2. Internal Load-Balancing
A second solution, which is generally not visible to the client, is
to have all requests terminated by a front-end which does not cache
or serve any content directly. Rather, this front-end is responsible
for inspecting the request, identifying the server which can actually
respond, and forwarding the request to that server.
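The front-end's routing decision can be sketched as follows. This is only an illustration, assuming content is partitioned by the first path segment; the mapping table, the host names, and the function name are hypothetical and do not come from any deployed system.

```python
from urllib.parse import urlsplit

# Hypothetical mapping from the first path segment to the back-end
# server that holds that slice of the origin's content.
BACKENDS = {
    "video": "backend-a.internal.example",
    "photos": "backend-b.internal.example",
}
DEFAULT_BACKEND = "backend-default.internal.example"


def select_backend(request_target: str) -> str:
    """Return the back-end host that should receive the forwarded request."""
    path = urlsplit(request_target).path
    # The first non-empty path segment identifies the content slice.
    segments = [segment for segment in path.split("/") if segment]
    first = segments[0] if segments else ""
    return BACKENDS.get(first, DEFAULT_BACKEND)
```

The front-end itself holds no content; it only needs enough of the request to make this lookup before forwarding.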
This solution has its own challenges. While the data access and
storage requirements can be distributed amongst back-end machines,
throughput on the front-end load balancer becomes a bottleneck. For
certain protocols, direct server return (DSR) avoids this bottleneck
by having the back-end send response packets directly to the client
rather than back through the load balancer. However, DSR is
challenging with reliable and encrypted protocols, and even more so
with multiplexed protocols like HTTP/2 or HTTP/3.
3. Previous Standards Efforts
Several previous drafts in the IETF have offered partial solutions
for this problem, but have not been published as RFCs or achieved
widespread adoption.
3.1. Out-of-Band Encoding
[OOB] describes an HTTP content coding that can be used to describe
the location of a secondary resource that contains the payload. The
origin returns an HTTP field set which describes the content,
including a Content-Encoding header which indicates the content can
be fetched from a different URL, typically hosted on a different
origin server.
This approach is similar in spirit to Content-Specific Hostnames as
described in Section 2.1, except that the resources continue to
belong to a single origin regardless of which origin server actually
delivers the bytes. Unlike Content-Specific Hostnames, however, two
requests must be made for each resource -- first to the origin
server to receive the headers, then to the secondary server to
retrieve the content of the response.
3.1.1. Resource Map
[SCD] references a possible extension to this idea, where the origin
server would indicate to a client that a particular set of resources
would all be available from a particular secondary server. However,
the specifics of this interaction were not identified in that draft.
One drawback to this approach is that an origin might prefer not to
distribute the full set of endpoints or resources, either because
this information is considered proprietary or because the set itself
is large enough to be prohibitive.
3.2. Alternative Services
[AltSvc] describes a way in which an origin server can delegate
authority over the origin to another host which might be preferable
in some way. However, this mechanism delegates the entire origin and
cannot be subdivided.
Using 421 (Misdirected Request) responses to work around this
limitation dramatically reduces efficiency, as the client has no
insight into which paths the alternative might or might not support.
4. Possible Future Directions
Any new solution should fit within the following constraints:
* No new feature to address this scenario can expect to entirely
replace the existing approaches given client upgrade and hardware
replacement schedules, so the solution needs to be easily layered
on top of current approaches. This likely implies a client-
advertised extension.
* Unlike Alt-Svc ([AltSvc]), the solution should permit delegation
of portions of the origin's URI space to one or more secondary
servers.
* Unlike resource maps (Section 3.1.1), the solution should permit
incremental new information about secondary server(s) and
delegated ranges of resources.
This section describes one possible solution in this vein, based on
HTTP Alternative Services [AltSvc]. The components of this solution
might be generally useful and incorporated into various
specifications, or might be tightly coupled and belong in a single
specification.
Other solutions within these constraints should also be considered.
4.1. Scope-Restricted Alt-Svc Entries
When an alternative service is advertised by an origin, by default
the indicated server is authoritative for all resources in the
origin. The scope parameter can be used to adjust this scope.
The scope parameter contains the path portion of a URI; see
Section 3.3 of [RFC3986]. The indicated alternative is authoritative
only for resources where the path begins with the indicated prefix.
scope = DQUOTE path DQUOTE ; see [RFC3986], Section 3.3
For example:
Alt-Svc: h2=":443"; ma=3600; scope="/sn-jpocxaa-j8bl/",
h2=":443"; ma=3600; scope="/sn-5ualdn7s"
A scope-restricted alternative SHOULD NOT be sent requests for
resources unless the indicated scope is a prefix of the path portion
of the request URI.
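A client-side check for this rule might look like the sketch below. It assumes the scope value has already been extracted from a parsed Alt-Svc entry and that matching is a plain string-prefix comparison, as the text above describes; the function name is hypothetical.

```python
from typing import Optional


def in_scope(request_path: str, scope: Optional[str]) -> bool:
    """Return True if an alternative with this scope may be sent the request."""
    if scope is None:
        # No scope parameter: the alternative covers the entire origin.
        return True
    # The alternative is authoritative only for paths that begin with
    # the advertised prefix.
    return request_path.startswith(scope)
```

Because the comparison is a plain string prefix, a scope without a trailing slash such as "/sn-5ualdn7s" also matches sibling paths such as "/sn-5ualdn7s-other/". An origin that wants segment-aligned delegation would need to end the scope with "/".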
[AltSvc] indicates that parameters are optional to understand, so a
client that does not recognize the scope parameter will ignore it and
treat the alternative as authoritative for the entire origin.
Therefore, origin servers SHOULD NOT send a scope-restricted
alternative service advertisement to a client which has not indicated
support for this extension (Section 4.2). Alternatives MUST be
prepared to receive requests for any resource in the origin. However,
the alternative MAY respond with a 421 (Misdirected Request) to any
request it is unable to serve.
4.2. Indicating Support for Alt-Svc Parameters
Certain origins might prefer to take different actions based on
whether the client supports HTTP Alternative Services or not. For
example, many clients are unable to implement the persist parameter
defined in [AltSvc]. Servers that offer alternatives based on the
client's current network connection might choose not to send Alt-Svc
entries to such a client.
The client can optionally send an Accept-Alt-Svc request header field
indicating which Alt-Svc parameters it is able to understand. The
content of this field is an sf-list [RFC8941] of Alt-Svc parameter
names. To reduce fingerprinting surface, the contents of the list
SHOULD be sorted alphabetically.
For example:
Accept-Alt-Svc: host, ma, persist, scope
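A client could produce such a field value as sketched below. The function name is hypothetical, and the sketch assumes every parameter name is a simple lowercase token that needs no escaping.

```python
from typing import Iterable


def accept_alt_svc_value(supported_params: Iterable[str]) -> str:
    """Build an Accept-Alt-Svc field value from supported parameter names."""
    # Deduplicate and sort alphabetically so that clients with the same
    # capabilities emit byte-identical values.
    return ", ".join(sorted(set(supported_params)))
```

Sorting means the order in which an implementation happens to register its parameters does not leak into the request.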
A server MAY publish alternative services containing parameters which
are not understood by the client, since unknown parameters are
ignored per [AltSvc].
While [AltSvc] enables an alternative to reside on a different host
than the origin server, not all clients implement this behavior.
This draft registers the "host" parameter for Alt-Svc to enable
clients to indicate support for Alt-Svc entries which provide a
different hostname from the origin. The "host" parameter MUST NOT be
used in Alt-Svc field generation and MUST be ignored if present.
The presence of this header can be assumed to indicate support for
Alt-Svc, even if empty.
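On the server side, the decision of whether a scope-restricted entry can be offered might be sketched as follows. This is a deliberately simplified reading of the field, not a complete [RFC8941] list parser: it splits on commas and discards anything after a ";". Both function names are hypothetical.

```python
from typing import Optional, Set


def parse_accept_alt_svc(field_value: Optional[str]) -> Optional[Set[str]]:
    """Return the advertised parameter names, or None if the field was absent."""
    if field_value is None:
        return None
    names = set()
    # Simplified list handling: split on commas and drop any item
    # parameters that follow a ";".
    for item in field_value.split(","):
        token = item.split(";", 1)[0].strip()
        if token:
            names.add(token)
    return names


def may_send_scoped_alternative(field_value: Optional[str]) -> bool:
    """Return True only if the client has indicated support for "scope"."""
    params = parse_accept_alt_svc(field_value)
    return params is not None and "scope" in params
```

An empty field yields an empty set rather than None, which preserves the distinction between a client that supports Alt-Svc with no listed parameters and one that sent no field at all.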
4.3. Incremental Alt-Svc Advertisements
[AltSvc] says that when an Alt-Svc response header field is received
from an origin, its value invalidates and replaces all cached
alternative services for that origin.
In certain circumstances, a server might prefer not to publish the
full list of alternatives, but instead incrementally add to them.
For example, a server might provide scope-restricted alternatives as
a client makes requests for resources in various scopes.
This draft defines the Additional-Alt-Svc header field. The parsing
and semantics of this field are identical to those of Alt-Svc, with
the following modifications:
* The value MUST NOT be "clear".
* The entries presented augment, rather than replace, any cached
alternatives already known to the client.
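The difference between the two fields can be illustrated with a client-side cache update. The sketch below assumes the field values have already been parsed into individual entries, represented here as opaque strings; the cache layout and function names are hypothetical, and handling of "clear" or of expiry is left out.

```python
from typing import Dict, Iterable, List

# Hypothetical cache: maps an origin to its known alternative entries.
AltSvcCache = Dict[str, List[str]]


def apply_alt_svc(cache: AltSvcCache, origin: str, entries: Iterable[str]) -> None:
    """Alt-Svc: invalidate and replace all cached alternatives for the origin."""
    cache[origin] = list(entries)


def apply_additional_alt_svc(
    cache: AltSvcCache, origin: str, entries: Iterable[str]
) -> None:
    """Additional-Alt-Svc: add entries to those already cached for the origin."""
    known = cache.setdefault(origin, [])
    for entry in entries:
        # Skip entries the client already knows about.
        if entry not in known:
            known.append(entry)
```

With this behavior, a server can reveal one scope-restricted alternative per response without repeating the entries it advertised earlier.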
4.4. The 3NN (Use Alternative) Status Code
This document defines a new status code directing that a client
attempt to satisfy the request from an alternative.
A server MUST include an Alt-Svc or Additional-Alt-Svc header field
in the response indicating which alternative(s) the client can use to
satisfy the given request. A server MUST NOT send the 3NN status
code in response to a request which did not contain the Accept-Alt-
Svc header field.
Upon receipt of this status code, a client SHOULD choose an
alternative service and retry the request with that alternative. If
all configured alternatives are unsuccessful, or the client chooses
not to use an alternative, the client MAY retry the request with the
origin server, omitting the Accept-Alt-Svc header field.
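The client behavior described above might be sketched as follows. Several things here are assumptions for illustration only: no numeric value has been assigned to 3NN, so the caller supplies it; the send callable stands in for whatever HTTP machinery the client uses; and an attempt counts as unsuccessful when send returns None.

```python
from typing import Callable, List, Optional, Tuple

# (status code, alternatives advertised in the response)
Response = Tuple[int, List[str]]
# Hypothetical transport hook: send(target, include_accept_alt_svc)
# returns a Response, or None if the attempt was unsuccessful.
Send = Callable[[str, bool], Optional[Response]]


def request_with_alternatives(
    send: Send, origin: str, use_alternative_status: int
) -> Optional[Response]:
    """Issue a request, following a 3NN response to an alternative."""
    first = send(origin, True)
    if first is None or first[0] != use_alternative_status:
        return first
    # Try each advertised alternative in turn.
    for alternative in first[1]:
        result = send(alternative, True)
        if result is not None and result[0] != use_alternative_status:
            return result
    # Every alternative failed: retry with the origin, omitting the
    # Accept-Alt-Svc field so the server does not send 3NN again.
    return send(origin, False)
```

Omitting Accept-Alt-Svc on the final retry relies on the rule that a server does not send 3NN in response to a request without that field.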
5. Security Considerations
TODO Security
6. IANA Considerations
Lots of stuff to register later.
7. References
7.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, DOI 10.17487/RFC3986, January 2005,
<https://www.rfc-editor.org/rfc/rfc3986>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC8941] Nottingham, M. and P-H. Kamp, "Structured Field Values for
HTTP", RFC 8941, DOI 10.17487/RFC8941, February 2021,
<https://www.rfc-editor.org/rfc/rfc8941>.
7.2. Informative References
[AltSvc] Nottingham, M., McManus, P., and J. Reschke, "HTTP
Alternative Services", RFC 7838, DOI 10.17487/RFC7838,
April 2016, <https://www.rfc-editor.org/rfc/rfc7838>.
[CORS] "Cross-Origin Resource Sharing (CORS)", n.d.,
<https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS>.
[OOB] Reschke, J. F. and S. Loreto, "'Out-Of-Band' Content
Coding for HTTP", Work in Progress, Internet-Draft, draft-
reschke-http-oob-encoding-12, 24 June 2017,
<https://datatracker.ietf.org/doc/html/draft-reschke-http-
oob-encoding-12>.
[SCD] Thomson, M., Eriksson, G. A., and C. Holmberg, "An
Architecture for Secure Content Delegation using HTTP",
Work in Progress, Internet-Draft, draft-thomson-http-scd-
02, 30 October 2016,
<https://datatracker.ietf.org/doc/html/draft-thomson-http-
scd-02>.
Acknowledgments
TODO acknowledge.
Author's Address
Mike Bishop
Akamai Technologies
Email: mbishop@evequefou.be