Network Working Group B. Carlyle
Internet-Draft June 30, 2012
Intended status: Experimental
Expires: January 1, 2013
Semantic Delta Encoding with HTTP
draft-carlyle-sem-delta-encoding-00
Abstract
Semantic Delta Encoding with HTTP defines an efficient stateless
mechanism for multiple clients to become synchronised and stay
synchronised with a large resource through the transmission of
semantic deltas (distinct from RFC3229), and supports a long poll
option for timely delivery. It is intended to support web browsers
needing to keep up to date with changes of state on an origin server
as well as to support a wide variety of other realtime and non-
realtime synchronisation requirements between communicating systems.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 1, 2013.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Carlyle Expires January 1, 2013 [Page 1]
Internet-Draft SDE-HTTP June 2012
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Requirements of Semantic Delta Encoding . . . . . . . . . . . 4
3. The Semantic Delta Encoding Model . . . . . . . . . . . . . . 5
4. The Main Resource . . . . . . . . . . . . . . . . . . . . . . 7
5. The Delta Resource . . . . . . . . . . . . . . . . . . . . . . 7
5.1. Choosing Delta Resource Identifiers . . . . . . . . . . . 8
5.2. Filtered Delta Buffers . . . . . . . . . . . . . . . . . . 8
5.3. Summarization . . . . . . . . . . . . . . . . . . . . . . 9
6. Delta Resource Response Types . . . . . . . . . . . . . . . . 10
6.1. 200 OK . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6.2. 204 No Content . . . . . . . . . . . . . . . . . . . . . . 10
6.3. 410 Gone . . . . . . . . . . . . . . . . . . . . . . . . . 10
6.4. Other Status Codes . . . . . . . . . . . . . . . . . . . . 11
7. Link Relation Types . . . . . . . . . . . . . . . . . . . . . 11
7.1. The Delta Link Relation Type . . . . . . . . . . . . . . . 11
7.2. The Next Link Relation Type . . . . . . . . . . . . . . . 11
8. Optional Long Poll . . . . . . . . . . . . . . . . . . . . . . 11
9. Degenerate Behaviours . . . . . . . . . . . . . . . . . . . . 13
10. Synchronising Large Resources . . . . . . . . . . . . . . . . 14
11. Cache Efficiency . . . . . . . . . . . . . . . . . . . . . . . 15
12. Transmission Efficiency . . . . . . . . . . . . . . . . . . . 16
13. Synchronous Synchronisation . . . . . . . . . . . . . . . . . 17
14. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 17
15. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17
15.1. Delta Relation Type . . . . . . . . . . . . . . . . . . . 18
15.2. Request-Timeout Header . . . . . . . . . . . . . . . . . . 18
16. Security Considerations . . . . . . . . . . . . . . . . . . . 18
17. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18
17.1. Normative References . . . . . . . . . . . . . . . . . . . 18
17.2. Informative References . . . . . . . . . . . . . . . . . . 19
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 19
1. Introduction
The Web's traditional fetch interaction has been to transfer a page
and related resources from origin server to browser, to allow the
user to interact with the page, and then to submit a new browser
request that begins the process of loading a new page. As the Web
has become more interactive, interactions between a page and the
origin server between page loads have become more common. Many of
these interactions take the form of synchronising the latest
information to display to the user. A client synchronising very small
quantities of state with undemanding timing constraints can either
fetch the updated data when needed or poll for it periodically. Larger
quantities of state and tighter timing constraints can introduce
additional complexity.
State synchronisation problems such as this one can lead to solutions
that do not scale as well as the traditional Web interaction model
and can lead to reliability problems. Some solutions require a TCP
connection to be held open between the client and the origin server
for the duration of the page's lifetime. Some require other state
information to be stored by the origin server on behalf of each
client. These types of solution can violate the REST stateless
constraint with the result that scalability and reliability of
applications can be reduced, and it can be more difficult for
intermediaries such as caches to involve themselves effectively to
improve performance and deal with other considerations.
The Semantic Delta Encoding protocol is a state synchronisation
mechanism that complies with the REST stateless constraint. An
origin server maintains both the current state to be synchronised and
a buffer of recent changes whose size can be bounded by the server.
Each client synchronises an initial replica of the state, and then
subsequently keeps track of where it is up to in the buffer of recent
changes (the "delta buffer"). At appropriate times the client makes
a request that identifies the position it has reached, asking to be
brought up to date with the current state.
This "delta fetch" interaction transfers only the changes that have
occurred, so allows a large resource to be synchronised with minimal
ongoing data transfer overhead and potentially reduced processing
overhead. This approach also increases the fidelity of information
transfer. By transferring the changes rather than a new state
snapshot the client does not need to compute what has changed in the
resource state but can read this information directly from the
message.
The mechanism is intended for two main use cases:
1. A large set of state that changes in relatively small ways needs
to be synchronised, and transferring the entire state each time a
change occurs would be inefficient
2. The changes to a set of state are more important to the client
than the state itself, meaning that the client would prefer to
obtain changes than to obtain new snapshots of the state
Outside of the Web this mechanism can be used to synchronise
data sets in a standard manner. For example:
o It can be used to synchronise lists of alarms or realtime alerts
between systems with minimal overhead
o It can be used to synchronise the state of a large number of
individual variables or properties between systems in a low
overhead manner that can be explicit about the nature of changes
that have occurred.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
Comments are solicited and should be addressed to the author(s).
2. Requirements of Semantic Delta Encoding
Semantic Delta Encoding is intended to ensure the following
properties:
o The mechanism shall be usable in conjunction with any media type.
It should not define or require any specific media type for
initial state or for deltas
o The mechanism shall use standard HTTP/1.1 ([RFC2616]) requests to
transfer initial state and to transfer deltas
o The mechanism shall allow the origin server to place a constant
bound on the size of the delta buffer
o The mechanism shall support efficient delivery of delta responses
by leveraging existing cache infrastructure
o The mechanism should minimise the number of requests required to
bring a given client up to date
o The mechanism is explicitly intended not to try to update the
bytes of the representation of the main resource, but instead to
provide a semantic delta describing the changes.
3. The Semantic Delta Encoding Model
The Semantic Delta Encoding model consists of one or more clients, a
main resource, a delta buffer, and a set of delta resources. The
main resource captures all of the state that needs to be synchronised
to a client, and the delta resources identify specific locations
within the delta buffer.
Each client will begin by issuing an HTTP GET request with headers of
its choosing to the main resource. This request is intended to bring
the client up to date with the current state of the resource. As
part of the response this main resource will identify the next delta
resource, and the client stores this resource identifier away.
When the client next wishes to be brought up to date with the main
resource state it queries the delta resource using the identifier
that was returned to it. This query will return all of the deltas
that have been added to the delta buffer since the client issued its
request to the main resource. This single query brings the client up
to date, and also informs the client of the identifier of the next
delta resource that the client should use.
If a client queries its delta resource when no new deltas have been
added, the server responds 204 No Content to indicate that nothing
has changed and the client should use the same delta resource
identifier in its next request.
If a client queries its delta resource when the location that this
resource identifies within the buffer has expired or when the delta
buffer itself is no longer valid, the server responds 410 Gone to
indicate that the client will need to query the main resource in
order to reestablish synchronisation.
For example:
GET /activity-feed HTTP/1.1
Host: example.com
Accept: application/activity-feed+xml

HTTP/1.1 200 OK
Link: <http://example.com/activity-feed/delta/233>; rel="delta"
Content-Type: application/activity-feed+xml
Cache-Control: max-age=5
Content-Length: ...

... current activity feed state ...

GET /activity-feed/delta/233 HTTP/1.1
Host: example.com
Accept: application/activity-feed-delta+xml

HTTP/1.1 204 No Content
Cache-Control: max-age=5

(nothing has changed, don't retry for 5 seconds)

GET /activity-feed/delta/233 HTTP/1.1
Host: example.com
Accept: application/activity-feed-delta+xml

HTTP/1.1 200 OK
Link: <http://example.com/activity-feed/delta/238>; rel="next"
Cache-Control: max-age=5
Content-Type: application/activity-feed-delta+xml

... all 5 changes since record 233 ...

GET /activity-feed/delta/238 HTTP/1.1
Host: example.com
Accept: application/activity-feed-delta+xml

HTTP/1.1 410 Gone

(Too many changes have occurred since record 238.
The client will need to fetch the main resource again)
This technique is stateless when a bounded number of buffers are
present in the origin server (the delta buffers are shared by
multiple clients) and the buffer sizes are determined and bounded by
the origin server. The origin server has control over the size of
the delta buffer and can forget about individual clients between
requests. Both main and delta resources can be cached to be returned
to multiple clients.
Note: In the example above two different media types are used to
convey main resource state and delta resource state, respectively.
Because of the semantic nature of the deltas it is likely that in
many circumstances a media type will be able to be constructed that
serves the needs of both the main resource and the delta resources,
so a single media type will often be used instead.
4. The Main Resource
This specification can apply to a wide range of main resources. Any
resource that wishes to publish its ability to support delta
encoding MUST include a "delta" link in response to GET and HEAD
requests. The link is supplied in the HTTP Link header specified in
[RFC5988] with relation type "delta". At the end of processing
the response to its main resource GET request the client will have a
synchronised representation of the resource state, plus a link to
fetch changes that have occurred or will occur to the resource state
after the time of the included representation.
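As a non-normative illustration, a client might locate the "delta" link in a response as sketched below. The helper name find_link is hypothetical, and the parsing is an assumption that covers only the simple quoted form used in this document's examples, not the full [RFC5988] grammar.

```python
import re

# Matches one link-value: <target> followed by its ";name=value" parameters.
# Simplification: parameter values containing "," or ";" are not supported.
_LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')

def find_link(header_value, rel):
    """Return the target URI of the first link with the given relation type."""
    for match in _LINK_RE.finditer(header_value):
        target, params = match.group(1), match.group(2)
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.lower() == "rel":
                # A rel value may list several space-separated relation types
                if rel in value.strip('"').split():
                    return target
    return None
```

A client would store the returned identifier and use it for its next synchronisation request.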
5. The Delta Resource
Every delta resource identifies a particular location within the
delta buffer. These locations correspond to particular states that
the main resource either has or has had at some time in the past.
When a fetch to the main resource returns a specific location in the
delta buffer it becomes possible for the consumer to fetch changes
since that representation to construct a new replica of the resource
state, or to process changes to the resource state since that time
for other purposes.
If a delta resource identifies a location in the delta buffer that
has been expired from the buffer it MUST return a 410 Gone status
code in response to GET and HEAD requests.
If a delta resource identifies a position in the delta buffer after
which no further deltas have been inserted it MUST return a 204 No
Content status code in response to GET and HEAD requests.
If a delta resource identifies a position in the delta buffer after
which further deltas have been inserted it MUST return a 200 OK
status code in response to GET and HEAD requests. Its response MUST
include all deltas up to the end of the delta buffer in a single
response, and it SHOULD include a link to the next delta.
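The three cases above can be sketched with a small in-memory buffer. This is a minimal illustration rather than a prescribed implementation: the class name DeltaBuffer, the use of integer positions, and the choice to answer 410 for a position beyond the end of the buffer are all assumptions.

```python
from collections import deque

class DeltaBuffer:
    """Bounded buffer of recent deltas shared by all clients (illustrative)."""

    def __init__(self, capacity):
        self.deltas = deque(maxlen=capacity)  # oldest entries are discarded
        self.end = 0                          # position just after the newest delta

    def append(self, delta):
        self.deltas.append(delta)
        self.end += 1

    def query(self, position):
        """Return (status, deltas, next_position) for a delta resource."""
        oldest = self.end - len(self.deltas)
        if position < oldest or position > self.end:
            return 410, [], None              # expired (or unknown) location
        if position == self.end:
            return 204, [], position          # nothing new, same identifier
        new = list(self.deltas)[position - oldest:]
        return 200, new, self.end             # all deltas up to the buffer end
```

The server keeps no per-client state: the position arrives with each request as part of the delta resource identifier.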
A delta resource SHOULD include equivalent cache control directives
to the main resource in response to GET and HEAD requests. This
ensures that multiple clients seeking to access the same delta are
able to effectively utilise caches to bring themselves up to date.
5.1. Choosing Delta Resource Identifiers
The exact nature of delta resource identifiers is origin server-
specific and is not defined by this specification. However, delta
resource identifiers should have a number of properties:
o As with most resources a delta resource identifier SHOULD be
unique across space and time, and MUST NOT be reused while clients
may still hold references to them. Servers MAY embed a form of
globally unique identifier within the resource identifier for this
purpose.
o A delta resource identifier MUST identify both the delta buffer
and the point within the delta buffer that the consumer is up to
in its synchronisation process.
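One possible identifier layout that satisfies both properties is sketched below. The path shape "/delta/{buffer}/{position}" and the function names are assumptions made for illustration only; this specification does not define any identifier syntax.

```python
import uuid

def new_buffer_id():
    """Globally unique id, generated whenever a delta buffer is (re)created."""
    return uuid.uuid4().hex

def make_delta_path(main_path, buffer_id, position):
    """Embed both the buffer and the position within it in the identifier."""
    return f"{main_path}/delta/{buffer_id}/{position}"

def parse_delta_path(path):
    """Recover (main_path, buffer_id, position) from an identifier."""
    main_path, _, rest = path.rpartition("/delta/")
    buffer_id, _, position = rest.partition("/")
    return main_path, buffer_id, int(position)
```

Because the buffer id changes when a buffer is discarded and recreated, a stale identifier can be recognised and answered with 410 Gone instead of being misread as a position in the new buffer.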
5.2. Filtered Delta Buffers
It will often be the case that different clients have different
requirements for what must be synchronised. As this is a stateless
protocol, origin servers SHOULD NOT allocate a different delta buffer
for each active consumer. This will compromise scalability and
reliability outcomes. Rather, the number of delta buffers allocated
by an origin server SHOULD be a bounded set.
To deal with the needs of different clients the recommended solution
is to share a common delta buffer between many clients, but to allow
each consumer to select a distinct subset of the delta buffer entries
by identifying a set of filter criteria within the delta resource
identifier. For example:
GET /feed?keywords=ietf,web HTTP/1.1
Host: example.com
Accept: application/activity-feed+xml

HTTP/1.1 200 OK
Link: <http://example.com/feed/delta/233;ietf,web>; rel="delta"
Content-Type: application/activity-feed+xml
Cache-Control: max-age=5
Content-Length: ...

... current activity feed state ...

GET /feed/delta/233;ietf,web HTTP/1.1
Host: example.com
Accept: application/activity-feed-delta+xml

HTTP/1.1 200 OK
Link: <http://example.com/feed/delta/238;ietf,web>; rel="next"
Cache-Control: max-age=5
Content-Type: application/activity-feed-delta+xml

... all 3 changes relating to ietf or web since record 233 ...
This interaction will allow the client that wants to see deltas for
these specific topics to use the same delta buffer as a client that
wants to see a different subset of the available deltas. As the
origin server is able to forget each consumer's place in the delta
buffer and their selected filter criteria at the end of each request
this filtering model complies with the REST stateless constraint.
Since the client has no control over the delta resource identifier,
any filter information present in the delta resource identifier must
be derived by the origin server from the semantics of the request to
the main resource. Header information from the delta request MAY be
taken into account by the origin server for filtering purposes, such
as information about the identity of the user requesting the delta.
Note: This specification does not describe any particular syntax to
use in specifying filter criteria, or any other part of the delta
resource identifier. All resource identifiers for main and delta
resources are part of the specification of individual origin servers
and are outside the scope of this specification.
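The filtering described in this section can be sketched as follows. The "position;keyword,keyword" segment mirrors the example above, while the "tags" field on each delta and both function names are hypothetical.

```python
def parse_filtered_id(segment):
    """Split a segment such as '233;ietf,web' into (position, keywords)."""
    position, _, criteria = segment.partition(";")
    keywords = criteria.split(",") if criteria else []
    return int(position), keywords

def filter_deltas(deltas, keywords):
    """Select the deltas whose tags match at least one requested keyword."""
    wanted = set(keywords)
    return [d for d in deltas if wanted & set(d["tags"])]
```

The "next" link still advances to the end of the shared buffer (238 in the example) even when only some of the intervening deltas match, so every client keeps tracking the same buffer positions.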
5.3. Summarization
The delta buffer held by a server may contain redundant information.
For example, multiple changes may have occurred to the same property
of the main resource and it may not be necessary for the client to
see each of these changes. In this case the server MAY summarise the
delta buffer by removing redundant information. For example, the
server may replace two changes in the buffer with a single change
that communicates the final state to the client. If summarization
occurs the server MUST ensure that the consumer sees a consistent
resource state at the end of each response message.
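A minimal sketch of one summarisation strategy follows. It assumes that every delta is a simple "set property to value" assignment, which will not hold for every media type; the dictionary fields and the function name are hypothetical.

```python
def summarise(deltas):
    """Keep only the last change to each property, ordered by last write."""
    latest = {}
    for delta in deltas:
        # Remove any earlier entry so the key moves to the end of the dict
        latest.pop(delta["property"], None)
        latest[delta["property"]] = delta
    return list(latest.values())
```

Applying either the original or the summarised sequence leaves a client with the same final state, which is the consistency property required above.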
6. Delta Resource Response Types
This section describes additional semantics for status codes that can
be returned from delta resources.
6.1. 200 OK
The 200 OK status code from a delta resource indicates a set of
deltas that are included in the message body. The client SHOULD
process these deltas to identify changes or to update its current
model of the main resource's state. The client SHOULD examine the
response for a "next" link header and process it as indicated below.
If no "next" link is present in the message the client SHOULD treat
the message as having brought it up to date with the current state of
the resource, but for which the delta buffer is now no longer
available. The client SHOULD NOT make any further requests to the
delta resource, and SHOULD instead query the main resource in its
next request if it requires further synchronisation.
6.2. 204 No Content
The 204 No Content status code from a delta resource indicates that
no further deltas have been added, and this delta resource remains
the correct resource to query for additional deltas. The client
SHOULD retain the current resource identifier for its next query.
The origin server SHOULD include a max-age Cache-Control directive in
the 204 No Content response, and the client SHOULD NOT issue a new
request to this resource until the max-age timeout has expired. This
allows the origin server to control the rate at which clients poll
its delta buffer for new deltas.
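A client might derive its earliest permitted retry time from the 204 No Content response as sketched below. The function names and the fallback gap of 5 seconds (used when no max-age directive is present) are assumptions, and only the plain "max-age=N" form is parsed.

```python
def parse_max_age(cache_control):
    """Return max-age in seconds from a Cache-Control value, or None."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip('"')
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None

def earliest_next_request(received_at, cache_control, default_gap=5):
    """Time before which the client should not query the delta resource."""
    max_age = parse_max_age(cache_control or "")
    gap = max_age if max_age is not None else default_gap
    return received_at + gap
```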
6.3. 410 Gone
The 410 Gone status code from a delta resource indicates that the
position it identifies in the delta buffer has been expired out of
the buffer, or that the delta buffer itself is no longer valid. The
delta resource can no longer be used to bring the consumer up to date
with the current state of the main resource. The client SHOULD query
the main resource in its next request if it requires further
synchronisation, and MAY do so immediately on receiving the 410 Gone
response.
6.4. Other Status Codes
Other status codes should be processed according to their respective
specifications. If the client would have to treat the response from
a delta resource as a failure, it SHOULD instead handle the code as
if it were 410 Gone and therefore resynchronise the main resource
instead of triggering failure logic.
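The client-side handling described in this section can be summarised in a small decision function. This is a sketch under stated assumptions: the action names are hypothetical, and redirects and similar non-failure responses are assumed to have been processed by the HTTP library before this function is reached.

```python
def handle_delta_response(status, next_link, current_delta_url, main_url):
    """Return (action, url_for_next_request) for a delta resource response."""
    if status == 200:
        # No "next" link means the delta buffer is no longer available:
        # apply the deltas, then return to the main resource next time.
        return "apply", next_link if next_link else main_url
    if status == 204:
        # Nothing new: keep the same delta resource identifier.
        return "wait", current_delta_url
    # 410 Gone, and any response that would otherwise count as a failure
    return "resync", main_url
```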
7. Link Relation Types
This section describes the new "delta" link relation type, and
additional semantics for the "next" link relation type for a delta
resource.
7.1. The Delta Link Relation Type
The "delta" link relation is used by a main resource to indicate the
identifier of the next delta resource. This delta resource will
always indicate the current end of the delta buffer, so will return a
204 No Content response if queried before any new deltas are added to
the buffer. The client SHOULD retain this resource identifier to use
to query for deltas when it next needs to bring its state up to date
with that of the main resource. The client SHOULD NOT immediately
issue a request to the delta resource unless it is using a long poll
(described below), but instead SHOULD wait until it next needs to
access the main resource state.
7.2. The Next Link Relation Type
The "next" link relation is used by a delta resource to indicate the
identifier of the next delta resource. This delta resource will
always indicate the current end of the delta buffer, so will return a
204 No Content response if queried before any new deltas are added to
the buffer. The client SHOULD retain this resource identifier to use
to query for deltas when it next needs to bring its state up to date
with that of the main resource. The client SHOULD NOT immediately
issue a request to the delta resource unless it is using a long poll
(described below), but instead SHOULD wait until it next needs to
access the main resource state.
8. Optional Long Poll
The default mechanisms for a consumer that wishes to stay
synchronised with the state of the main resource are either to fetch
the current delta whenever they need to access the state of the main
resource, or to poll the delta resource periodically if they need to
trigger processing based on changes to the resource. If the polling
mechanism is selected then the polling period must be carefully
controlled to balance the need for timely processing when the state
changes against the network and processing overhead associated with
issuing delta requests.
If the processing must be triggered within a short period of the
change then either a rapid polling rate is required or an alternative
technique is warranted. This specification defines a long poll
option that allows the server to hold onto the most recent delta
request from a given client until either a delta is generated by the
origin server or a timeout occurs. When either of these events occurs
the server generates a response according to requirements of a normal
request that would have arrived at that time.
The client is able to indicate its preference for a long poll with
the Request-Timeout header. The "Request-Timeout" header is an
end-to-end request header that indicates the maximum time that a client
is prepared to await a response.
Request-Timeout = "Request-Timeout" ":" timeout-value
timeout-value = 1*DIGIT
A client adds a Request-Timeout header to any request for which they
are prepared to await a response. The client sets the header to the
maximum time that they are prepared to wait.
The value of the Request-Timeout header is a single integer value in
seconds.
An origin server interprets this header as the time between receipt
of a complete request and the time that it generates and begins
sending the response. A client will observe a longer time interval
between request and response, as network transit and processing by
intermediaries add delays. If this time is critical, a client SHOULD
allow for delays in setting a value for the header.
An origin server MAY apply a lower value to the timeout based on
local policy. An origin server MAY choose to take longer to produce
a response, at the risk that the client is no longer able to use the
response.
An HTTP intermediary MAY reduce the value of a Request-Timeout header
based on local policy. An intermediary MAY add a Request-Timeout
header if none is present. The value in the Request-Timeout header
MUST NOT be increased or removed.
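The header syntax and the intermediary rules above can be illustrated as follows. The function names are hypothetical, and None is used here to stand for "header absent" or "no local limit".

```python
def parse_request_timeout(value):
    """Parse a Request-Timeout value (1*DIGIT, in seconds); None if invalid."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        return int(value)
    return None

def intermediary_timeout(received, local_limit):
    """Value an intermediary forwards: it may reduce or add, never increase."""
    if received is None:
        return local_limit      # MAY add a header if none is present
    if local_limit is None:
        return received         # no local policy, forward unchanged
    return min(received, local_limit)
```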
If no new delta occurs before the value specified in the Request-
Timeout header expires, the origin server SHOULD return 204 No
Content. If the server is unwilling or unable to keep the long poll
open for the requested Request-Timeout header duration it MAY return
204 No Content at any time before the next delta is added to the
delta buffer.
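A server-side long poll could be sketched with a condition variable, as below. This is an assumption-laden illustration: the class and method names are hypothetical, the buffer is left unbounded for brevity (so 410 Gone never arises here), and a real server would more likely use its framework's asynchronous facilities than one blocked thread per request.

```python
import threading

class LongPollBuffer:
    """Delta buffer whose readers can block until a delta arrives."""

    def __init__(self):
        self._cond = threading.Condition()
        self._deltas = []

    def append(self, delta):
        with self._cond:
            self._deltas.append(delta)
            self._cond.notify_all()     # wake every waiting long poll

    def wait_for_deltas(self, position, timeout):
        """Block for up to `timeout` seconds (the Request-Timeout value)."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._deltas) > position, timeout)
            new = self._deltas[position:]
        if new:
            return 200, new, position + len(new)
        return 204, [], position        # timed out with nothing new
```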
The Semantic Delta Encoding model combined with long poll is able to
emulate real-time publish/subscribe semantics for a given main
resource without requiring an explicit subscription model, allowing
the server to shed long poll clients whenever this may be required
without breaching the contract between itself and its clients.
If a client that specifies a Request-Timeout header sees a 204 No
Content response, it MUST ensure that the time between the previous
request and the next request is consistent with a short polling
period. As with normal 204 No Content handling (Section 6.2) the
server SHOULD include a max-age Cache-Control directive in the 204 No
Content response, and the client SHOULD NOT issue a new request to
this resource until the max-age timeout has expired. These
provisions are intended to avoid entering a degenerate rapid polling
mode when a server refuses to participate in the long poll and
immediately returns 204 No Content.
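One way for a client to honour both provisions is sketched below. The function name and the poll_period parameter (the client's chosen short polling period, in seconds) are assumptions; times are seconds on any monotonic clock.

```python
def delay_after_204(request_sent_at, response_at, max_age, poll_period):
    """Seconds to wait after a 204 before issuing the next long poll."""
    earliest = max(
        request_sent_at + poll_period,    # minimum gap between requests
        response_at + (max_age or 0),     # max-age from the 204 response
    )
    return max(0.0, earliest - response_at)
```

If the server held the request for longer than the polling period and supplied no max-age, the delay is zero and the client can re-issue its long poll at once; if the server answered immediately, the client waits out the remainder of the polling period.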
9. Degenerate Behaviours
If the delta buffer is consistently too short for a given client to
use then the client will see a 410 Gone response each time it
attempts to query its delta resource. This will usually cause the
client to immediately issue a new GET request to the main resource.
In this degenerate case the semantic delta encoding mechanism will
cause the client to alternate between issuing a new GET request to
the main resource and a GET request to a delta resource returning 410
Gone. This case exhibits similar behaviour to the case where the
client polls only the main resource, although latency is effectively
increased.
Because this degenerate case can occur clients SHOULD be able to fall
back to computing the set of changes by comparing the previous
synchronised state to the new state of the main resource whenever it
queries the main resource, or otherwise bring itself up to date by
purging the old synchronised state and replacing it with the new
state from the main resource.
If changes are occurring rapidly to the main resource the number of
bytes added to the delta buffer may exceed the number of bytes that
the client is able to fetch in a given period of time. If this
occurs, each request will fetch a larger set of deltas than the
previous request until either a maximum message size limit is
encountered or an expired delta resource (returning 410 Gone) is
encountered. The 410 Gone response will result in the
resynchronisation case already described, but it is also worth
considering the case when this has not already occurred. As each
message increases in size it will take longer for the message to be
transferred and processed. This will result in increasing latency
between deltas being added to the buffer and those deltas being
processed by a given client.
Clients that are sensitive to latency SHOULD place an upper bound on
the size of the biggest delta they are prepared to process. They
SHOULD treat an over-sized delta as if it were a 410 Gone response.
As a rule of thumb, when clients are sensitive to latency the set of
deltas that a server generates, or that a client is prepared to
process, should not be significantly larger than the main resource.
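A latency-sensitive client could apply this bound as sketched below. The function name and the tolerance factor are hypothetical; sizes are in bytes.

```python
def effective_status(status, body_size, main_resource_size, tolerance=1.0):
    """Treat an over-sized 200 response from a delta resource as 410 Gone."""
    if status == 200 and body_size > main_resource_size * tolerance:
        return 410      # cheaper to refetch the main resource
    return status
```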
If the Request-Timeout header is specified by the client but is not
honoured by the server the client will immediately see a 204 No
Content response to each of its requests. As noted in Section 8, the
client MUST maintain a minimum gap between request messages when it
requests a long poll but sees a 204 No Content response. This will
cause the interaction to degenerate into a standard poll.
10. Synchronising Large Resources
This specification assumes that the state of all resources (including
main resources and all delta resources) will be able to be
transferred in a single message from server to client. If this is
not the case then a slightly more complex mechanism is required,
which is outside the scope of this specification.
The simplest approach to building such a mechanism would be to blur
the distinction between main resources and delta resources to allow
current state and deltas to be transferred side by side in a single
message. Each new request that a client makes should return as many
deltas as possible to the client, plus (if any space is left over) as
much additional state as possible to the client. For example, a
client could fetch an initial subset of the overall resource state.
The next request would fetch any deltas that have been added since
the last request, plus an additional quantity of state. When all
state has been successfully transferred to the client the state can
be said to be synchronised, and it is completely up to date with the
server's resource at that time. This mechanism can be further
optimised by only transferring deltas relating to the synchronised
portion of the main resource state in responses.
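The "deltas first, then state" idea can be sketched as follows. Since such a mechanism is outside the scope of this specification, everything here is an assumption: the function name, and in particular the budget counted in items rather than bytes.

```python
def build_response(pending_deltas, unsent_state, budget):
    """Fill one response with deltas first, then as much state as fits.

    Returns (deltas, state_chunk, state_still_unsent).
    """
    deltas = pending_deltas[:budget]
    room = budget - len(deltas)          # space left over after the deltas
    state = unsent_state[:room]
    return deltas, state, unsent_state[len(state):]
```

The client is synchronised once the third element of the result is empty and no deltas remain pending.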
It would be possible to define a general mechanism that facilitates
this kind of transfer, but it would be more complex than the
mechanism defined in this specification. Moreover, opportunities to
make use of this increased genericity would be limited on the Web
where resources are typically not large enough to require it.
11. Cache Efficiency
One of the implications of using this mechanism is that caches may
find themselves caching both main resources and delta resources
related to that main resource. There is a risk that the cache will
become polluted with redundant information that reduces its
efficiency.
If the max-age caching model is used then this impact can be reduced.
All clients that request the resource within the max-age expiry
period will retrieve the same cached delta, and will therefore obtain
the same next delta link as part of the fetch. This group can
continue to move forwards through the set of deltas as a block that
only needs to interact with their local cache to obtain new deltas.
Moreover, as they will only fetch a new delta when the old one
expires there will only be one cached delta active for the group.
For example, if the max-age on both the main resource and the delta
resources is 10 seconds then the first client to GET the main
resource will cause the cache to become populated with the main
resource state. Any other client that requests that state within the
10 second period will retrieve the local cached representation along
with a link to the same delta resource. The first client to request
the delta will cause the cache to become populated with the delta.
If clients from the same group each fetch within the 10 second window
they will continue to be part of the group. The group as a whole
will continue to progress through the same cached deltas and through
the same "next" links.
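The grouping behaviour described above can be illustrated with a
small simulation. This is a sketch only: the MaxAgeCache class, the
toy_origin function and the "/main" and "/delta/n" URLs are
assumptions made for illustration and are not defined by this
specification. It shows three clients sharing one cached copy of the
main resource and one cached copy of the first delta.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    body: str
    next_link: str
    stored_at: float


class MaxAgeCache:
    """Toy shared cache that applies one max-age to every resource."""

    def __init__(self, origin, max_age=10.0):
        self.origin = origin      # callable: url -> (body, next_link)
        self.max_age = max_age
        self.entries = {}
        self.origin_hits = 0

    def get(self, url, now):
        entry = self.entries.get(url)
        if entry is None or now - entry.stored_at >= self.max_age:
            # Missing or expired: go to the origin and store the result.
            body, next_link = self.origin(url)
            self.origin_hits += 1
            entry = Entry(body, next_link, now)
            self.entries[url] = entry
        return entry.body, entry.next_link


def toy_origin(url):
    # Hypothetical origin: "/main" links to "/delta/1",
    # and "/delta/n" links to "/delta/n+1".
    if url == "/main":
        return "full state", "/delta/1"
    n = int(url.rsplit("/", 1)[1])
    return f"delta {n}", f"/delta/{n + 1}"


cache = MaxAgeCache(toy_origin, max_age=10.0)

# Three clients fetch the main resource within one 10 second window.
links = [cache.get("/main", now=t)[1] for t in (0.0, 3.0, 9.0)]

# Each client then follows the link it was given.
times = (10.0, 12.0, 15.0)
deltas = [cache.get(link, now=t)[0] for link, t in zip(links, times)]

print(links, deltas, cache.origin_hits)
```

In this run the origin is contacted twice in total, once for the main
resource and once for the delta, regardless of the number of clients
in the group.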
The group can become fragmented if they take longer than 10 seconds
to fetch their deltas. In this case a new delta will be cached with
a new next link that may have no relationship to the earlier "next"
link. In this case the group will split into the set that fetched
the delta within the first 10 second period and the group that
fetched the delta within a later 10 second period. If all clients
fell into a unique 10 second period then the group would split up
into a set of single member groups.
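The fragmentation described above can be sketched as a function that
partitions client fetch times by the cached copy each client would
have received. The function name and the sample times are
illustrative assumptions. The sketch also assumes the worst case, in
which every refresh of the cache yields a different next link.

```python
def split_into_groups(fetch_times, max_age=10.0):
    """Group fetch times by the cached copy that would serve them.

    A cached copy is stored by the first fetch after the previous
    copy expired, and serves every fetch in the following max_age
    seconds.
    """
    groups = []
    window_start = None
    for t in sorted(fetch_times):
        if window_start is None or t - window_start >= max_age:
            # Cache miss or expiry: this fetch refreshes the cache
            # and starts a new group.
            window_start = t
            groups.append([])
        groups[-1].append(t)
    return groups


same_window = split_into_groups([0, 4, 9])
fragmented = split_into_groups([0, 4, 9, 12, 25, 31])
all_unique = split_into_groups([0, 11, 22])
print(same_window, fragmented, all_unique)
```

Clients that fetch at 0, 4 and 9 seconds remain one group, while
fetches at 0, 11 and 22 seconds produce three single member groups.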
Groups can merge again when the rate of change to the main resource
slows enough that independent groups are directed to the same "next"
link, which joins them back into a single group. In general
the mechanism will work most effectively with the least group
fragmentation and highest cache efficiency when the period between
client GET requests on average is less than the max-age period, and
less than the average period between main resource changes.
With the long poll mechanism in place a cache must be able to hold
off on issuing the redundant GET requests in order to operate with
high efficiency. If the cache lets a single GET request through to
the resource while blocking any redundant requests then it will be
able to wait for a response and return that response to all blocked
clients for whom the response is appropriate, i.e. the blocked clients
whose requests are equivalent to the first request for caching
purposes. The cache can then allow further blocked requests through
that may have request headers that prevent the first response from
being used, for example because they specified a different Accept
header.
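One way a cache might hold back redundant long poll requests is
sketched below. The CoalescingCache class is hypothetical, and
equivalence for caching purposes is reduced here to the pair of URL
and Accept header, which is a simplification. Requests with a
different Accept header are let through immediately under their own
key rather than after the first response arrives.

```python
import threading
import time
from concurrent.futures import Future


class CoalescingCache:
    """Lets one request per (url, accept) through; others wait on it."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: (url, accept) -> body
        self._lock = threading.Lock()
        self._in_flight = {}             # (url, accept) -> Future
        self.origin_requests = 0
        self.coalesced = 0

    def get(self, url, accept):
        key = (url, accept)
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                future = Future()
                self._in_flight[key] = future
                self.origin_requests += 1
            else:
                self.coalesced += 1
        if leader:
            try:
                future.set_result(self._fetch(url, accept))
            except Exception as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        # Followers block here until the leader's response arrives.
        return future.result()


release = threading.Event()


def slow_origin(url, accept):
    # Stands in for a long poll that the origin holds open.
    release.wait(timeout=5)
    return f"{url} as {accept}"


cache = CoalescingCache(slow_origin)
results = []


def client(accept):
    results.append(cache.get("/delta/7", accept))


threads = [threading.Thread(target=client, args=("application/json",))
           for _ in range(4)]
threads.append(threading.Thread(target=client, args=("text/plain",)))
for t in threads:
    t.start()

# Wait until three requests are blocked behind the first one.
deadline = time.monotonic() + 5
while cache.coalesced < 3 and time.monotonic() < deadline:
    time.sleep(0.01)
release.set()
for t in threads:
    t.join()

print(sorted(results), cache.origin_requests)
```

Five client requests result in two requests to the origin, one for
each distinct Accept header.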
If the max-age caching model is inappropriate and Last-Modified or
ETags are used instead, the groups of clients accessing a particular
cache can readily become fragmented, leading to inefficiency.
12. Transmission Efficiency
When the rate at which delta bytes are generated approaches the
available network bandwidth the most efficient transfer mechanism
would be a continuous stream, without any breaks in transmission.
The semantic delta encoding mechanism enforces a break in the
transmission that begins when the last byte of the previous delta is
transmitted.
The normal processing of a response might wait for the response to be
generated and processed before issuing a new request. Under this
model the next delta cannot begin to be transmitted until the
previous delta has been transmitted and processed, and a new client
GET request to the new delta has been transmitted to the server.
In theory an efficient implementation could reduce this transmission
delay by reading the link header from the previous response message
before it is completely transmitted. This link could be used to
issue a new long poll GET request to the next delta before the
previous response has been completely transmitted or processed.
However, a client SHOULD NOT request the next delta until the
previous response has been completely received. This is due to the
problem of multiple deltas potentially ending up on multiple TCP
connections and competing for the available bandwidth during their
concurrent transfers. Even if a client uses pipelining to ensure
that the next delta transmission begins only when the previous
response message has been completely transmitted, an intermediary such
as a caching proxy may employ its own connection management strategy
that causes the different deltas to compete for available bandwidth
at that layer.
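A client that follows the recommendation above requests each delta
only after the previous response has been completely received. The
sketch below illustrates this with a hypothetical fetch callable
that returns once the whole response has been read. The find_link
helper is a deliberately minimal Link header parser that does not
handle commas or semicolons inside quoted values, and the relation
name is passed as a parameter because the sample headers are
assumptions rather than examples taken from this specification.

```python
import re

# Matches "<target>" followed by any number of ";param" sections.
_LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def find_link(link_header, rel):
    """Return the first link target carrying the given relation."""
    for match in _LINK_RE.finditer(link_header):
        target, params = match.group(1), match.group(2)
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.lower() == "rel" and rel in value.strip('"').split():
                return target
    return None


def follow_deltas(fetch, first_url, rel, limit):
    """Fetch deltas one at a time, never overlapping two requests."""
    received = []
    url = first_url
    while url is not None and len(received) < limit:
        # fetch() returns only once the whole response has arrived,
        # so the next request cannot compete with this transfer.
        headers, body = fetch(url)
        received.append(body)
        url = find_link(headers.get("Link", ""), rel)
    return received


# Hypothetical canned responses standing in for a server.
RESPONSES = {
    "/delta/1": ({"Link": '</main>; rel="up", </delta/2>; rel="next"'}, "d1"),
    "/delta/2": ({"Link": '</delta/3>; rel="next"'}, "d2"),
    "/delta/3": ({}, "d3"),
}

received = follow_deltas(RESPONSES.__getitem__, "/delta/1", "next", limit=10)
print(received)
```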
13. Synchronous Synchronisation
This specification MAY be combined with the LOCK method defined in
[RFC4918] to synchronously synchronise state; however, this is not
recommended unless it is specifically required. In general methods
such as LOCK that would affect the state of the main resource SHOULD
be applied to the main resource rather than a delta resource.
14. Acknowledgements
The Request-Timeout header definition in this specification is based
on the Timeout header in draft-loreto-http-timeout-00.
"Request-Timeout" is used instead to avoid conflict with [RFC4918].
The term summarization comes from [ARRESTref]. The mechanism
described in this specification treats the state of the main resource
as an estimated variable (consensus-free, master/slave); however, only
the null estimation function is supported at intermediaries such as
caches. Clients MAY apply their own estimation functions if they see
fit to do so. This protocol allows the client to select between an
efficient version of the REST style (fetching deltas when required),
the REST with Polling style (fetching deltas periodically), or an
approximation of the Asynchronous REST style (deltas are returned
immediately if a long poll is used).
[RFC3229] defines a delta encoding mechanism for HTTP, but one that
is quite different from that which is included in this specification.
RFC3229 is intended for use by caches to update the bytes of a given
representation rather than to transfer a semantic delta to client
programs. The RFC3229 mechanism is more difficult, restrictive, and
inefficient to use for semantic delta encoding purposes.
15. IANA Considerations
15.1. Delta Relation Type
This specification updates the Link Relation Type Registry with the
following new entry:
Relation Name: delta
Description: Refers to a resource that contains semantic differences
between the current version of this resource and a future version of
this resource.
Reference: [this document]
15.2. Request-Timeout Header
This specification updates the Message Header registry with the
following new entry:
Header field: Request-Timeout
Applicable protocol: http
Status: experimental
Author/Change controller: IETF (iesg@ietf.org) Internet Engineering
Task Force
Specification document(s): [this document]
16. Security Considerations
The use of a single delta buffer between different clients requires
care to ensure that clients only see information they are entitled to
see. The delta buffer must have the same access control mechanisms
in place as the main resource, otherwise it will become a backdoor
mechanism for accessing the data.
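One way to keep the two sets of access controls aligned is to derive
the decision for a delta resource from the rule for its main
resource, so that the two cannot drift apart. The sketch below
assumes a URL layout in which delta resources live under
"<main>/deltas/". That layout, the ACL table and the function names
are illustrative assumptions and are not defined by this
specification.

```python
# Hypothetical access control list, keyed by main resource URL.
ACL = {"/stock/levels": {"alice", "bob"}}

# Assumed URL layout for delta resources.
DELTA_MARKER = "/deltas/"


def main_resource_of(url):
    """Map a delta URL to its main resource; other URLs are unchanged."""
    return url.split(DELTA_MARKER, 1)[0]


def is_authorised(user, url):
    """Apply the main resource's rule to main and delta URLs alike."""
    return user in ACL.get(main_resource_of(url), set())


print(is_authorised("alice", "/stock/levels"))
print(is_authorised("alice", "/stock/levels/deltas/42"))
print(is_authorised("mallory", "/stock/levels/deltas/42"))
```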
17. References
17.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[RFC5988] Nottingham, M., "Web Linking", RFC 5988, October 2010.
17.2. Informative References
[ARRESTref]
Khare, R., "Extending the REpresentational State Transfer
Architectural Style for Decentralized Systems", 2003,
<http://www.ics.uci.edu/~rohit/Khare-Thesis-FINAL.pdf>.
[RFC3229] Mogul, J., Krishnamurthy, B., Douglis, F., Feldmann, A.,
Goland, Y., van Hoff, A., and D. Hellerstein, "Delta
encoding in HTTP", RFC 3229, January 2002.
[RFC4918] Dusseault, L., "HTTP Extensions for Web Distributed
Authoring and Versioning (WebDAV)", RFC 4918, June 2007.
Author's Address
Benjamin Carlyle
Email: benjamincarlyle@soundadvice.id.au