Internet DRAFT - draft-toomim-httpbis-range-patch
Internet-Draft M. Milutinovic
Expires: Mar 16, 2020 UC Berkeley
Intended status: Proposed Standard M. Toomim
Invisible College
B. Bellomy
Invisible College
Nov 18, 2019
Range Patch
draft-toomim-httpbis-range-patch-00
Abstract
A uniform approach for expressing changes to state over HTTP.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts. The list of current Internet-Drafts is at
http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
https://www.ietf.org/1id-abstracts.html
The list of Internet-Draft Shadow Directories can be accessed at
https://www.ietf.org/shadow.html
Table of Contents
1. Introduction
2. Range Patch
2.1. Multiple Range Patches
2.2. Stand-Alone Range Patch
2.3. URI Fragment Identifiers
3. Range Units
3.1. Bytes Range Unit
3.2. JSON Range Unit
3.3. Lines Range Unit
4. IANA Considerations
4.1. Range Unit Registrations
4.2. The +patch Structured Syntax Suffix
5. Checking Capabilities
6. Race Conditions
7. Security Considerations
8. Conventions
9. Copyright Notice
10. References
10.1. Normative References
10.2. Informative References
1. Introduction
This document describes a uniform approach for expressing changes to
state over HTTP. It builds upon [RFC7233] and details how patches
can be defined using range units, ranges, and content. Any patch is
expressed in the form:
"range X in units Y of the data was replaced with content Z"
Range units define how original content (being patched) should be
parsed to obtain a region of the content which is being patched, and
then how that region is replaced with new content.
2. Range Patch
[RFC7233] effectively already defines how a patch operating on byte
units can be represented over HTTP, using Content-Range,
Content-Type, and Content-Length HTTP headers. Example:
HTTP/1.1 206 Partial Content
Date: Wed, 15 Nov 1995 06:25:24 GMT
Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT
Content-Range: bytes 21010-47021/47022
Content-Length: 26012
Content-Type: image/gif
... 26012 bytes of partial image data ...
The same approach can be used to describe a range inside content
interpreted not as bytes, but, for example, as JSON [RFC8259] or
JSON-compatible structure. We define such a JSON range unit in
Section 3.2. For example, given the following JSON document:
{"foo": {"bar": [
{"some": "thing"},
{"no": "thing"},
{"mo": "re"},
{"baz": {"1": {"two": "tree"}}}
]}}
One might make the following request:
GET /api/document/1 HTTP/1.1
Host: example.com
Accept: application/json
Range: json=/foo/bar/3/baz
And receive the following response:
HTTP/1.1 206 Partial Content
Date: Thu, 31 Oct 2019 07:51:08 GMT
Last-Modified: Fri, 18 Oct 2019 17:44:39 GMT
Content-Range: json /foo/bar/3/baz
Content-Length: 22
Content-Type: application/json
{"1": {"two": "tree"}}
[RFC7233] defines and allows a Range header only for the GET request
method. In this document, we define the behavior for other request
methods. Which methods a given resource supports and which
methods accept range patches as defined in this document is left to
the server to define.
When issuing a non-GET request to a resource, a range patch can be
provided using the Range header field:
PATCH /api/image/1 HTTP/1.1
Host: example.com
Range: bytes=21010-47021
Content-Length: 26012
Content-Type: image/gif
... 26012 bytes of new partial image data ...
And for JSON:
PATCH /api/document/1 HTTP/1.1
Host: example.com
Range: json=/foo/bar/3/baz
Content-Length: 25
Content-Type: application/json
{"2": {"three": "flour"}}
A patch with empty contents corresponds to deletion of existing
content at the specified range. A patch with a zero-length range but
non-empty contents corresponds to inserting content immediately
before the location of the zero-length range. A patch with non-empty
contents at a non-zero-length range corresponds to replacing existing
content at the range with new content.
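The three cases above (deletion, insertion, replacement) are all the same splice operation. The following minimal Python sketch illustrates this for byte content. The function name apply_splice and the half-open (start, end) offsets are assumptions of the sketch and not part of this document; HTTP byte ranges themselves use inclusive last positions.

```python
def apply_splice(data: bytes, start: int, end: int, content: bytes) -> bytes:
    """Replace data[start:end] (end exclusive) with content.

    Hypothetical helper for illustration only.
    """
    if not 0 <= start <= end <= len(data):
        raise ValueError("range not satisfiable")
    return data[:start] + content + data[end:]


original = b"hello world"

# Empty contents: deletion of the existing content at the range.
deleted = apply_splice(original, 5, 11, b"")        # b"hello"

# Zero-length range, non-empty contents: insertion before that location.
inserted = apply_splice(original, 5, 5, b",")       # b"hello, world"

# Non-zero-length range, non-empty contents: replacement.
replaced = apply_splice(original, 6, 11, b"there")  # b"hello there"
```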
When a server supports the Range header with non-GET requests, it MUST
NOT ignore the Range header when used with a non-GET request. When a
server does not support the Range header with non-GET requests, it
SHOULD generate a 416 (Range Not Satisfiable) or a 400 (Bad Request)
response when a non-GET request with a Range header is made. Proxies
SHOULD NOT drop the Range header from non-GET requests. To ensure
correct handling of non-GET requests with the Range header, a requester
can check the server's support for it as described in Section 5.
2.1. Multiple Range Patches
Multiple range patches can also be combined in one request. This can
be done by reusing the multipart/byteranges payload for transferring
multiple parts, as described in Section 4.1 of [RFC7233].
When issuing a non-GET request to a resource, multiple range patches
can be provided as well:
PATCH /api/document/1 HTTP/1.1
Host: example.com
Content-Length: 200
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES
--THIS_STRING_SEPARATES
Content-Type: application/json
Range: json=/foo/bar/2/mo
42
--THIS_STRING_SEPARATES
Content-Type: application/json
Range: json=/foo/bar/1/no
"person"
--THIS_STRING_SEPARATES--
2.2. Stand-Alone Range Patch
When range patches are transmitted outside of an HTTP session, a
stand-alone range patch format can be used. For example, in this
format a patch can be stored in a file or sent to a mailing list, or a
code version control system can display the patch in the range patch
format. The format reuses structure from HTTP and consists of
headers separated from the patch body by an empty line. Only the
Content-Range header is required. Example:
Content-Range: json /foo/bar/3/baz
{"1": {"two": "tree"}}
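As an illustration of the format just described, the following Python sketch splits a stand-alone range patch into its headers and body at the first empty line and checks that Content-Range is present. The function name parse_standalone_patch is hypothetical, and accepting both CRLF and bare LF line endings is an assumption of the sketch; this document does not specify which line endings the format uses.

```python
def parse_standalone_patch(raw: bytes) -> tuple[dict[str, str], bytes]:
    """Split a stand-alone range patch into (headers, body).

    Hypothetical helper; header names are lower-cased for lookup.
    """
    # Find the earliest empty line, accepting CRLF or bare LF (assumption).
    candidates = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    candidates = [(index, sep) for index, sep in candidates if index != -1]
    if not candidates:
        raise ValueError("no empty line separating headers from body")
    index, sep = min(candidates)
    head, body = raw[:index], raw[index + len(sep):]

    headers: dict[str, str] = {}
    for line in head.decode("ascii").splitlines():
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"malformed header line: {line!r}")
        headers[name.strip().lower()] = value.strip()

    # Only Content-Range is required by the format.
    if "content-range" not in headers:
        raise ValueError("Content-Range header is required")
    return headers, body


raw_patch = b'Content-Range: json /foo/bar/3/baz\n\n{"1": {"two": "tree"}}'
headers, body = parse_standalone_patch(raw_patch)
unit, _, range_spec = headers["content-range"].partition(" ")
# unit is "json", range_spec is "/foo/bar/3/baz", body is the JSON text
```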
Additional headers can be provided. This can be used even for
multiple range patches. In such a case the patch starts with a
Content-Type header defining the boundary. Example:
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES
--THIS_STRING_SEPARATES
Content-Range: json /foo/bar/2/mo
42
--THIS_STRING_SEPARATES
Content-Range: json /foo/bar/1/no
"person"
--THIS_STRING_SEPARATES--
Stand-alone range patches can be transmitted over HTTP as-is as well.
This can be used to provide the patch which has been used in a
previous non-GET request. A Content-Type with the "+patch" suffix
identifies such a stand-alone range patch. For example, the patch used
in the PATCH request example above could be retrieved as:
HTTP/1.1 200 OK
Date: Thu, 31 Oct 2019 07:51:08 GMT
Last-Modified: Fri, 18 Oct 2019 17:44:39 GMT
Content-Length: 62
Content-Type: application/json+patch
Content-Range: json /foo/bar/3/baz
{"2": {"three": "flour"}}
Stand-alone range patches are binary data.
2.3. URI Fragment Identifiers
For media types which support range patches, ranges can be used as
URI fragment identifiers as well. For example, the URI:
/api/document/1#json=/foo/bar/0
identifies a fragment with the following content:
{"some": "thing"}
Multiple ranges are supported as well and they identify multiple
fragments:
/api/document/1#json=/foo/bar/0,/foo/bar/1
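A client could extract the range unit and the individual ranges from such a URI roughly as in the Python sketch below. The function name parse_range_fragment is hypothetical. Splitting on "," assumes that no range contains a literal comma; this document does not define an escaping rule for that case.

```python
from urllib.parse import urlsplit


def parse_range_fragment(uri: str) -> tuple[str, list[str]]:
    """Return (range unit, list of range specifications) from a URI fragment.

    Hypothetical helper for illustration only.
    """
    fragment = urlsplit(uri).fragment
    unit, sep, specs = fragment.partition("=")
    if not sep or not unit or not specs:
        raise ValueError("fragment is not of the form unit=range[,range...]")
    # Assumption: ranges never contain a literal comma.
    return unit, specs.split(",")


unit, ranges = parse_range_fragment("/api/document/1#json=/foo/bar/0,/foo/bar/1")
# unit is "json", ranges is ["/foo/bar/0", "/foo/bar/1"]
```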
3. Range Units
Range units define how content is parsed into a structure. Each unit
defines a corresponding range specification, which is a string
describing a range under that unit.
Different range units can be compatible with content expressed
through different media types.
3.1. Bytes Range Unit
Bytes range unit is already specified in [RFC7233]. We extend it by
allowing a zero-length range using a zero-length-byte-range-spec.
zero-length-byte-range-spec = 1*DIGIT
A zero-length range is a byte offset used to identify a location
immediately before which new content can be inserted with a patch.
Additionally, we note that the range "-0" is allowed, is a
zero-length range, and identifies a location immediately after the
last byte of data. This allows appending bytes to data.
Note that bytes range unit operates on encoded content as specified
in any Content-Encoding header. That holds both for GET and non-GET
requests.
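The following Python sketch shows one way a server might apply a single bytes range specification to a patch, covering an ordinary first-last range (inclusive, as in [RFC7233]), the zero-length offset form, and "-0" for appending. The function name apply_bytes_patch is hypothetical. Rejecting positions beyond the end of the data is an assumption of the sketch. Open-ended ("N-") and other suffix forms are deliberately left out.

```python
import re


def apply_bytes_patch(data: bytes, spec: str, content: bytes) -> bytes:
    """Apply one bytes range patch; hypothetical helper for illustration."""
    if spec == "-0":
        # Zero-length range immediately after the last byte: append.
        start = end = len(data)
    elif re.fullmatch(r"[0-9]+", spec):
        # zero-length-byte-range-spec: insert before this offset.
        start = end = int(spec)
    else:
        match = re.fullmatch(r"([0-9]+)-([0-9]+)", spec)
        if match is None:
            raise ValueError("unsupported byte range specification")
        first, last = int(match.group(1)), int(match.group(2))
        if last < first:
            raise ValueError("last byte position precedes first")
        # last-byte-pos is inclusive in RFC 7233, so convert to exclusive.
        start, end = first, last + 1
    if end > len(data):
        raise ValueError("range not satisfiable")
    return data[:start] + content + data[end:]


data = b"abcdef"
replaced = apply_bytes_patch(data, "2-3", b"XY")  # b"abXYef"
inserted = apply_bytes_patch(data, "2", b"-")     # b"ab-cdef"
appended = apply_bytes_patch(data, "-0", b"!")    # b"abcdef!"
deleted = apply_bytes_patch(data, "1-4", b"")     # b"af"
```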
3.2. JSON Range Unit
JSON range unit operates on JSON and JSON-compatible data structures.
Its range specification is based on JSON pointer as described in
[RFC6901]. The content of the range MUST always be valid JSON by
itself. JSON range unit is identified with "json".
JSON pointer provides capabilities to identify a single element of a
data structure. Here we extend it to allow a range of elements for
arrays and strings, by extending the rules from Section 4 of
[RFC6901] for how a reference token modifies which value is
referenced:
o If the currently referenced value is a JSON array, the reference
token can be composed of two sets of digits (according to the
ABNF syntax for array indices as specified in Section 4 of
[RFC6901]), delimited by the character "-". Each set of digits
represents an unsigned base-10 integer value. The first integer
value MUST be smaller than the number of elements in the array. The
second integer value MUST be smaller than or equal to the number of
elements in the array. The second integer value MUST be larger than
or equal to the first integer value. If any of these requirements
are violated, an error condition is raised.
The new referenced value is a new array with a subset of elements
starting at the zero-based index of the first integer value, and
ending at the element before the zero-based index of the second
integer value (the first index is inclusive, the second index is
exclusive).
o If the currently referenced value is a JSON array, the reference
token can be the character "-". The new referenced value is a
zero-length array corresponding to the position immediately after
the end of the current array. This design makes such JSON pointer
compatible with the use of JSON pointers in JSON Patch [RFC6902].
This allows appending array elements to an array.
o If the currently referenced value is a JSON string, the scheme for
JSON arrays is used to index into a string and makes the new
referenced value a substring of the currently referenced value.
String indexing is done by code units.
A range of elements can be specified only as the last reference token
in JSON range. It follows that a range of elements can be specified
only once.
For example, given the JSON document:
{
"foo": [
"bar",
"baz",
"bax"
]
}
The following JSON strings evaluate to the accompanying JSON values:
"/foo" ["bar", "baz", "bax"]
"/foo/0" "bar"
"/foo/0-1" ["bar"]
"/foo/1-3" ["baz", "bax"]
"/foo/1-1" []
"/foo/-" []
"/foo/3-3" // error
"/foo/4-4" // error
"/foo/1-0" // error
"/foo/1-4" // error
"/foo/1-3/0" // error
"/foo/0/1-3" "ar"
JSON ranges "/foo/1-1" and "/foo/-" are of little utility on their
own, but serve as zero-length ranges to identify a location
immediately before which new content can be inserted with a patch.
JSON range unit operates always on non-encoded content, ignoring any
Content-Encoding header. That holds both for GET and non-GET
requests.
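The evaluation rules of this section can be sketched in Python as follows. The function name resolve_json_range is hypothetical and the code is an illustration under stated assumptions, not a reference implementation. In particular, Python indexes strings by code point, which may differ from the code units intended by this document for characters outside the Basic Multilingual Plane, and a single index into a string is assumed to yield a one-character string.

```python
import re

# Array index syntax from RFC 6901, Section 4: "0" or digits with no leading zero.
_INDEX = r"(?:0|[1-9][0-9]*)"
_SINGLE = re.compile(_INDEX)
_RANGE = re.compile(rf"({_INDEX})-({_INDEX})")


def resolve_json_range(document, json_range: str):
    """Evaluate a JSON range against parsed JSON; hypothetical helper."""
    if json_range == "":
        return document
    if not json_range.startswith("/"):
        raise ValueError("a JSON range must be empty or start with '/'")
    # Unescape reference tokens as in RFC 6901 ("~1" first, then "~0").
    tokens = [t.replace("~1", "/").replace("~0", "~")
              for t in json_range[1:].split("/")]
    value = document
    for position, token in enumerate(tokens):
        is_last = position == len(tokens) - 1
        if isinstance(value, dict):
            value = value[token]  # raises KeyError if the member is missing
        elif isinstance(value, (list, str)):
            match = _RANGE.fullmatch(token)
            if token == "-" or match:
                if not is_last:
                    raise ValueError("a range must be the last reference token")
                if token == "-":
                    return value[len(value):]  # zero-length, after the end
                first, second = int(match.group(1)), int(match.group(2))
                if not (first < len(value) and first <= second <= len(value)):
                    raise IndexError("range out of bounds")
                return value[first:second]  # first inclusive, second exclusive
            if not _SINGLE.fullmatch(token):
                raise ValueError(f"invalid array index: {token!r}")
            value = value[int(token)]  # raises IndexError if out of bounds
        else:
            raise ValueError("cannot descend into a scalar value")
    return value


doc = {"foo": ["bar", "baz", "bax"]}
resolve_json_range(doc, "/foo/1-3")    # ["baz", "bax"]
resolve_json_range(doc, "/foo/0/1-3")  # "ar"
resolve_json_range(doc, "/foo/-")      # []
```

Syntax problems are reported as ValueError and missing members or out-of-bounds positions as KeyError or IndexError; this split is a choice of the sketch, since this document only says that an error condition is raised.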
3.3. Lines Range Unit
For textual contents, the lines range unit operates on lines. Line
positions are numbered starting with zero (with line position zero
always being identical with character position zero). Ranges
identified by lines include the line endings. If a content does not
contain any line endings, then it consists of a single (the first)
line.
Implementers should be aware of the fact that line endings in textual
contents can be represented by other characters or character
sequences than CR+LF. Besides the CR and LF, there are also NEL and
CR+NEL. In general, the encoding of line endings can also depend on
the character encoding of textual contents, and implementations have
to take this into account where necessary.
Lines range unit is identified with "lines". Lines range
specification is defined by:
lines-range-spec = first-line "-" second-line
first-line = 1*DIGIT
second-line = 1*DIGIT
Each lines range consists of two sets of digits, delimited by the
character "-". Each set of digits represents an unsigned base-10
integer value. The first integer value MUST be smaller than the
number of lines in contents. The second integer value MUST be
smaller than or equal to the number of lines in contents. The second
integer value MUST be larger than or equal to the first integer
value.
The range consists of the lines starting at the line corresponding to the first
integer value, and ending at the line before the line corresponding
to the second integer value (the first integer is inclusive, the
second integer is exclusive).
A lines range where the first and second integer values are equal is
empty and of little utility on its own, but serves as a zero-length
range to identify a location immediately before which new content can
be inserted with a patch.
Additionally, a lines range specification can be the character "-",
representing a zero-length range that identifies a location
immediately after the last line of textual contents. This allows
appending lines to textual contents.
Lines range unit operates always on non-encoded content, ignoring any
Content-Encoding header. That holds both for GET and non-GET
requests.
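A lines range patch can be sketched in Python as below. The function name apply_lines_patch is hypothetical. The sketch relies on str.splitlines, which recognizes CR, LF, CR+LF, and NEL among other boundaries; whether that matches the line-ending rules a given deployment needs is an assumption. Content without any line ending, including empty content, is treated as a single line.

```python
import re


def apply_lines_patch(text: str, spec: str, content: str) -> str:
    """Apply one lines range patch; hypothetical helper for illustration."""
    # Keep line endings, since ranges identified by lines include them.
    # Empty content is treated as a single empty line (assumption).
    lines = text.splitlines(keepends=True) or [""]
    if spec == "-":
        # Zero-length range immediately after the last line: append.
        first = second = len(lines)
    else:
        match = re.fullmatch(r"([0-9]+)-([0-9]+)", spec)
        if match is None:
            raise ValueError("invalid lines range specification")
        first, second = int(match.group(1)), int(match.group(2))
        if not (first < len(lines) and first <= second <= len(lines)):
            raise ValueError("lines range out of bounds")
    # First line is inclusive, second line is exclusive.
    lines[first:second] = [content]
    return "".join(lines)


text = "a\nb\nc\n"
replaced = apply_lines_patch(text, "1-2", "B\n")  # "a\nB\nc\n"
inserted = apply_lines_patch(text, "1-1", "x\n")  # "a\nx\nb\nc\n"
appended = apply_lines_patch(text, "-", "d\n")    # "a\nb\nc\nd\n"
```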
4. IANA Considerations
4.1. Range Unit Registrations
This document registers the following range units:
+-------------+---------------------------------------+-------------+
| Range Unit | Description | Reference |
| Name | | |
+-------------+---------------------------------------+-------------+
| json | a JSON pointer range on JSON and | Section 3.2 |
| | JSON-compatible data structures | |
+-------------+---------------------------------------+-------------+
| lines | a range of lines of textual contents | Section 3.3 |
+-------------+---------------------------------------+-------------+
The change controller is: "IETF (iesg@ietf.org) - Internet
Engineering Task Force".
4.2. The +patch Structured Syntax Suffix
This document registers the following media type structured syntax
suffix:
Name: Range patch
+suffix: +patch
References: See Section 2.2 of this document.
Encoding considerations: Stand-alone range patches are binary data.
Fragment identifier considerations:
The syntax and semantics of fragment identifiers specified for
+patch SHOULD be as specified for range patches themselves. (At
publication of this document, there is no fragment identification
syntax defined for range patches themselves.)
The syntax and semantics for fragment identifiers for a specific
"xxx/yyy+patch" SHOULD be processed as follows:
For cases defined in +patch, where the fragment identifier
resolves per the +patch rules, then process as specified in
+patch.
For cases defined in +patch, where the fragment identifier does
not resolve per the +patch rules, then such fragment SHOULD
identify a fragment which is obtained by intersection of the
fragment identifier and the underlying range patch range
specification for "xxx/yyy+patch".
For cases not defined in +patch, then such fragment SHOULD
identify a fragment which is obtained by intersection of the
fragment identifier and the underlying range patch range
specification for "xxx/yyy+patch".
Interoperability considerations: n/a
Security considerations: See Section 7 of this document.
Contact: IETF HTTP Working Group (ietf-http-wg@w3.org)
Author/Change controller:
IETF (iesg@ietf.org) - Internet Engineering Task Force
5. Checking Capabilities
A server may support or not support non-GET requests with a Range
header. The default behavior of servers is simply to ignore unknown
or unsupported headers. In the case of a range patch, this
implies that a request issuing a patch to a specific subsection of a
resource might be interpreted by a server as a request to overwrite
the entire resource with the patch, leaving the resource in a
corrupted state.
To determine whether or not the server can fulfill such a request
correctly, the requester may first issue an OPTIONS request:
OPTIONS /api/document/1 HTTP/1.1
Host: example.com
Range-Request-Method: PATCH
Range-Request-Units: json,bytes
To which the server may reply in the affirmative:
HTTP/1.1 204 No Content
Connection: keep-alive
Range-Request-Allow-Methods: PATCH
Range-Request-Allow-Units: json,bytes
Version: 33a64df551425fcc55e4d42a148795d9f25f89d4
In the partial negative:
HTTP/1.1 204 No Content
Connection: keep-alive
Range-Request-Allow-Methods: PATCH
Range-Request-Allow-Units: json
Version: 33a64df551425fcc55e4d42a148795d9f25f89d4
Or in the complete negative:
HTTP/1.1 204 No Content
Connection: keep-alive
Range-Request-Allow-Methods:
Range-Request-Allow-Units:
Empty header fields are allowed per [RFC2616], Section 2.1.
Also note the presence of the Version header, discussed in
Section 6. The server may preemptively send this to obviate the need for
another GET prior to a range patch request.
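On the requester side, interpreting the OPTIONS response can be sketched in Python as follows. The function name supports_range_patch is hypothetical, and the response headers are assumed to be already available as a dictionary. Methods are compared exactly because HTTP method tokens are case-sensitive; comparing range unit names case-insensitively is an assumption of the sketch.

```python
def supports_range_patch(headers: dict[str, str], method: str, unit: str) -> bool:
    """Check an OPTIONS response for range patch support; hypothetical helper."""
    lowered = {name.lower(): value for name, value in headers.items()}

    def tokens(name: str) -> set[str]:
        # A missing or empty header field yields an empty set.
        value = lowered.get(name, "")
        return {item.strip() for item in value.split(",") if item.strip()}

    methods = tokens("range-request-allow-methods")
    units = {u.lower() for u in tokens("range-request-allow-units")}
    return method in methods and unit.lower() in units


affirmative = {
    "Range-Request-Allow-Methods": "PATCH",
    "Range-Request-Allow-Units": "json,bytes",
}
partial = {
    "Range-Request-Allow-Methods": "PATCH",
    "Range-Request-Allow-Units": "json",
}
negative = {
    "Range-Request-Allow-Methods": "",
    "Range-Request-Allow-Units": "",
}

supports_range_patch(affirmative, "PATCH", "bytes")  # True
supports_range_patch(partial, "PATCH", "bytes")      # False
supports_range_patch(negative, "PATCH", "json")      # False
```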
6. Race Conditions
As with standard PUT, POST, and PATCH requests, a non-GET request
with a Range header carries the risk of a mid-air collision with
another simultaneous request. If one requester updates a resource,
and another requester, not being aware of that update, issues a
second update, the resource may be left in an unexpected state.
Standard PUT, POST, and PATCH requests handle this with the ETag and
If-Match headers. However, these headers vary based on the
Content-Encoding of the request. Alternatively, requests can use the
Versioning in [Braid-HTTP] to determine the ordering of simultaneous
requests, and can specify consistency guarantees with [Merge-Types].
The server may return a Version header in response to HTTP requests
directed at a given resource. Correspondingly, when issuing a range
patch, the requester may include a Version header containing the
version of the resource it intends to update. If the server cannot
merge the patch at the given version, it must return a 409 Conflict
response.
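The version check described above can be sketched in Python with a small in-memory resource. Everything in the sketch is an assumption made for illustration: the class name Resource, deriving the version from a SHA-1 hash of the body, answering a successful patch with 204, and a server that can only merge patches made against its current version. A server using [Merge-Types] could accept patches against older versions as well.

```python
import hashlib
from typing import Optional


class Resource:
    """Hypothetical in-memory resource with a content-derived version."""

    def __init__(self, body: bytes) -> None:
        self.body = body

    @property
    def version(self) -> str:
        # Assumption: the version is a hash of the current body.
        return hashlib.sha1(self.body).hexdigest()

    def patch(self, start: int, end: int, content: bytes,
              version: Optional[str] = None) -> int:
        """Apply a byte splice and return an HTTP status code."""
        if version is not None and version != self.version:
            # The patch was made against a version this server cannot merge.
            return 409
        self.body = self.body[:start] + content + self.body[end:]
        return 204


resource = Resource(b"hello world")
seen = resource.version

first = resource.patch(6, 11, b"there", version=seen)  # 204, body updated
second = resource.patch(0, 5, b"HELLO", version=seen)  # 409, version is stale
```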
7. Security Considerations
Both GET and non-GET requests with a Range header are potentially
susceptible to denial-of-service attacks because of the effort
required to compute or apply the patch.
8. Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
9. Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
10. References
10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC7233] Fielding, R., Lafon, Y., and J. Reschke, "Hypertext
Transfer Protocol (HTTP/1.1): Range Requests", RFC 7233,
June 2014.
[RFC6901] Bryan, P., Zyp, K., and M. Nottingham, "JavaScript Object
Notation (JSON) Pointer", RFC 6901, April 2013.
[RFC6902] Bryan, P., and M. Nottingham, "JavaScript Object Notation
(JSON) Patch", RFC 6902, April 2013.
10.2. Informative References
[Merge-Types] draft-toomim-httpbis-merge-types-00
[Braid-HTTP] draft-toomim-httpbis-braid-http-00
[RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
Interchange Format", RFC 8259, December 2017.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and T. Berners-Lee,
"Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616,
June 1999.
Authors' Addresses
For more information, the authors of this document are best contacted
via Internet mail:
Mitar Milutinovic
UC Berkeley, EECS Department
775 Soda Hall #1776
Berkeley, CA 94720-1776
EMail: mitar.ietf@tnode.com
Web: https://mitar.tnode.com/
Michael Toomim
Invisible College, Berkeley
2053 Berkeley Way
Berkeley, CA 94704
EMail: toomim@gmail.com
Web: https://invisible.college/@toomim
Bryn Bellomy
Invisible College, Berkeley
2053 Berkeley Way
Berkeley, CA 94704
EMail: bryn@signals.io
Web: https://invisible.college/@bryn