HTTPbis K. Morgan
Internet-Draft C. Brunhuber
Intended status: Standards Track IAEA
Expires: December 4, 2014 June 2, 2014
H2EZ: HTTP/2 Header Compression
draft-morgan-http2-header-compression-00
Abstract
This specification defines the format and compression of HTTP header
fields in HTTP/2. The compression is based on EZFLATE, which is a
token-based DEFLATE algorithm for secure compression within encrypted
communication channels.
Editorial Note (To be removed by RFC Editor)
Discussion of this draft takes place on the HTTPBIS working group
mailing list (ietf-http-wg@w3.org), which is archived at
<http://lists.w3.org/Archives/Public/ietf-http-wg/>.
Working Group information can be found at
<http://tools.ietf.org/wg/httpbis/>.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on December 4, 2014.
Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Morgan & Brunhuber Expires December 4, 2014 [Page 1]
Internet-Draft H2EZ June 2014
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Detailed Header Format . . . . . . . . . . . . . . . . . . . . 3
2.1. ASN.1 Octet String Examples . . . . . . . . . . . . . . . 4
2.1.1. Example 1 (length 15) . . . . . . . . . . . . . . . . 4
2.1.2. Example 2 (length 33) . . . . . . . . . . . . . . . . 4
2.1.3. Example 3 (length 128) . . . . . . . . . . . . . . . . 5
2.1.4. Example 4 (length 258) . . . . . . . . . . . . . . . . 5
3. EZFLATE Tokenization Rules . . . . . . . . . . . . . . . . . . 6
3.1. Header Name Tokenization . . . . . . . . . . . . . . . . . 6
3.2. Header Value Tokenization . . . . . . . . . . . . . . . . 6
3.3. Tokenization Example . . . . . . . . . . . . . . . . . . . 7
3.4. Sub-Tokenization . . . . . . . . . . . . . . . . . . . . . 9
3.4.1. Sub-Tokenization Example 1 (1292 octet token) . . . . 9
3.4.2. Sub-Tokenization Example 2 (516 octet token) . . . . . 9
3.4.3. Sub-Tokenization Example 3 (259 octet token) . . . . . 10
4. Security Considerations . . . . . . . . . . . . . . . . . . . 10
5. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 10
6. References . . . . . . . . . . . . . . . . . . . . . . . . . . 10
6.1. Normative References . . . . . . . . . . . . . . . . . . . 10
6.2. Informative References . . . . . . . . . . . . . . . . . . 11
1. Introduction
[TODO: A significant portion of the text in the 'Introduction' and
'Security Considerations' sections is lifted verbatim from the HPACK
Internet-Draft; rewrite it or give credit.]
In HTTP/1.1 (see [HTTP-p1]), header fields are encoded without any
form of compression. As web pages have grown to include dozens to
hundreds of requests, the redundant header fields in these requests
now measurably increase latency and unnecessarily consume bandwidth
(see [PERF1] and [PERF2]).
SPDY [SPDY] initially addressed this redundancy by compressing header
fields using the DEFLATE format [DEFLATE], which proved very
effective at efficiently representing the redundant header fields.
However, that approach exposed a security risk as demonstrated by the
CRIME attack (see [CRIME]).
A key observation of the CRIME and BREACH attacks is that they
exploited the DEFLATE _algorithm_ as described in RFC 1951 [DEFLATE],
not the DEFLATE _format_ (also described in RFC 1951).
EZFLATE [EZFLATE] is a token-based DEFLATE compression algorithm for
secure compression within encrypted communication channels. The key
feature of the EZFLATE algorithm (and its primary difference from the
algorithm described in RFC 1951) is that an LZ77 <length, backward
distance> tuple may only reference a duplicated token (octet string)
occurring in a previous block, up to 32K input bytes before, and only
if the current token exactly matches it in length _and_ octet values.
As shown in Section 3.1, "Take a Bite out of CRIME", of [EZFLATE],
this forces potential CRIME-like attackers to guess n-character
secrets n characters at a time rather than one character at a time,
so that the attack is no more efficient than a full brute-force
search.
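The whole-token matching rule described above can be illustrated
with a minimal Python sketch. It is not the EZFLATE reference
algorithm: the function name plan_tokens, the "literal"/"match"
tuples, and the decision to ignore DEFLATE block boundaries are
assumptions made for illustration only. The 32K window and the
3..258 octet match lengths are the limits of the DEFLATE format.

```python
# Illustrative sketch only; names and structure are hypothetical and
# DEFLATE block boundaries are ignored for simplicity.
MAX_DISTANCE = 32 * 1024  # DEFLATE backward-distance limit
MIN_MATCH = 3             # shortest match DEFLATE can represent
MAX_MATCH = 258           # longest match DEFLATE can represent


def plan_tokens(tokens):
    """Classify each token as a literal run or a whole-token match.

    Returns a list of ("literal", token) or ("match", length, distance).
    A match is only planned when an identical token (same length and
    same octets) was seen earlier within the 32K window.
    """
    last_seen = {}  # token octets -> input offset of latest occurrence
    plan = []
    offset = 0
    for token in tokens:
        previous = last_seen.get(token)
        usable = (
            previous is not None
            and MIN_MATCH <= len(token) <= MAX_MATCH
            and offset - previous <= MAX_DISTANCE
        )
        if usable:
            plan.append(("match", len(token), offset - previous))
        else:
            plan.append(("literal", token))
        last_seen[token] = offset
        offset += len(token)
    return plan
```

In this sketch a guess such as "secrex" never matches a previously
seen "secret", not even partially, so the compressed size reveals
nothing about how many leading characters of the guess were correct.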
This document defines 1) the formatting of HTTP/2 header fields, and
2) how EZFLATE is used to compress HTTP/2 header fields. Finally,
this document also includes implementation recommendations.
2. Detailed Header Format
In HTTP/1.1 [HTTP-p1], headers are defined as field-name, field-value
pairs (see Section 3.2, "Header Fields" of [HTTP-p1]). The field-name
and field-value are separated from each other by a single ':'
character followed by optional whitespace. Consecutive pairs are
separated by the two-octet sequence CRLF (carriage return, line
feed). The end of the header block is marked by an empty line, i.e.
a CRLF that immediately follows the CRLF ending the last pair.
In HTTP/2, the headers are still field-name, field-value pairs except
that each is a length-prefixed octet string to simplify parsing. In
other words, the field-name and field-value are separately
length-prefixed. In this way no separator is necessary between the
field-name and field-value and no separator is necessary between
consecutive pairs. The end of the header block is naturally defined
as the end of an HTTP/2 HEADERS or PUSH_PROMISE frame with the
END_HEADERS flag set [HTTP2].
Each field-name and each field-value is encoded as an ASN.1 octet
string [X.208-88][X.209-88]. This simply means the octets which
compose a field-name or field-value are length-prefixed with a
variable-length integer which indicates the exact number of octets
composing the original field-name or field-value.
For ASN.1 encoded octet strings, the length prefix is composed of at
least a single octet and, optionally, N additional octets, where N is
specified by the first octet as follows. If the length of the octets
to be length-prefixed is less than 0x80 (128), then the length prefix
is a single octet whose value is exactly that length. If the length
is greater than or equal to 0x80 (128), then the most significant bit
of the first length-prefix octet is set (i.e. 0x80) and the remaining
7 bits specify the number N of additional octets that follow. Those
N octets encode the length of the octets to be length-prefixed as an
unsigned big-endian integer.
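The length-prefix rule above can be sketched in Python as follows.
This is a minimal illustration, not normative text; the function
names length_prefix and octet_string are hypothetical. Only the
length prefix and the content octets are produced, matching the
examples in Section 2.1.

```python
def length_prefix(length: int) -> bytes:
    """Return the length prefix for an octet string of the given length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length < 0x80:
        # Short form: a single octet holding the length itself.
        return bytes([length])
    # Long form: 0x80 | N, followed by the length in N big-endian octets.
    n = (length.bit_length() + 7) // 8
    if n > 0x7E:
        raise ValueError("length does not fit in a length prefix")
    return bytes([0x80 | n]) + length.to_bytes(n, "big")


def octet_string(data: bytes) -> bytes:
    """Prepend the length prefix to the given octets."""
    return length_prefix(len(data)) + data
```

For instance, length_prefix(15) yields the single octet 0x0f and
length_prefix(258) yields the three octets 0x82 0x01 0x02.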
2.1. ASN.1 Octet String Examples
The following are examples of computing the length prefix for
converting various octet strings to ASN.1 octet strings.
2.1.1. Example 1 (length 15)
The ASCII string "accept-encoding" is to be converted to an ASN.1
octet string. The length of the string is 15 octets. Since 15 is
less than 128, the length prefix is simply a single octet with the
value 15 (0x0f).
Hex dump of the resulting ASN.1 octet string:
0000000 0f61 6363 6570 742d | 656e 636f 6469 6e67 | .accept-encoding
2.1.2. Example 2 (length 33)
The ASCII string "gzip;q=1.0, identity;q=0.5, *;q=0" is to be
converted to an ASN.1 octet string. The length of the string is 33
octets. Since 33 is less than 128, the length prefix is simply a
single octet with the value 33 (0x21).
Hex dump of the resulting ASN.1 octet string:
0000000 2167 7a69 703b 713d | 312e 302c 2069 6465 | !gzip;q=1.0, ide
0000010 6e74 6974 793b 713d | 302e 352c 202a 3b71 | ntity;q=0.5, *;q
0000020 3d30 | | =0
2.1.3. Example 3 (length 128)
The string of octets 0x00 0x01 ... 0x7e 0x7f is to be converted to an
ASN.1 octet string. The number of octets is 128. Since 128 is not
less than 128, the first octet of the length prefix has the most
significant bit set to indicate the length prefix is longer than one
octet. The remaining seven bits encode the number of additional
octets necessary to represent the value 128 as a big endian unsigned
integer. Since encoding 128 requires one octet, the seven remaining
bits encode the value 1 (0x01), giving a first octet of 0x81. The
second octet encodes the value 128 (0x80).
Hex dump of the resulting ASN.1 octet string:
0000000 8180 0001 0203 0405 | 0607 0809 0a0b 0c0d | ................
0000010 0e0f 1011 1213 1415 | 1617 1819 1a1b 1c1d | ................
0000020 1e1f 2021 2223 2425 | 2627 2829 2a2b 2c2d | .. !"#$%&'()*+,-
0000030 2e2f 3031 3233 3435 | 3637 3839 3a3b 3c3d | ./0123456789:;<=
0000040 3e3f 4041 4243 4445 | 4647 4849 4a4b 4c4d | >?@ABCDEFGHIJKLM
0000050 4e4f 5051 5253 5455 | 5657 5859 5a5b 5c5d | NOPQRSTUVWXYZ[\]
0000060 5e5f 6061 6263 6465 | 6667 6869 6a6b 6c6d | ^_`abcdefghijklm
0000070 6e6f 7071 7273 7475 | 7677 7879 7a7b 7c7d | nopqrstuvwxyz{|}
0000080 7e7f | | ~.
2.1.4. Example 4 (length 258)
The string of octets 0x00 0x01 ... 0xfe 0xff 0x00 0x01 is to be
converted to an ASN.1 octet string. The number of octets is 258.
Since 258 is greater than 128, the first octet of the length prefix
has the most significant bit set to indicate the length prefix is
longer than one octet. The remaining seven bits encode the number of
additional octets necessary to represent the value 258 as a big
endian unsigned integer. Since encoding 258 requires two octets, the
seven remaining bits encode the value 2 (0x02). The second and third
octets encode the value 258 (0x01 0x02).
Hex dump of the resulting ASN.1 octet string:
000000 8201 0200 0102 0304 | 0506 0708 090a 0b0c | ................
000010 0d0e 0f10 1112 1314 | 1516 1718 191a 1b1c | ................
000020 1d1e 1f20 2122 2324 | 2526 2728 292a 2b2c | ... !"#$%&'()*+,
000030 2d2e 2f30 3132 3334 | 3536 3738 393a 3b3c | -./0123456789:;<
000040 3d3e 3f40 4142 4344 | 4546 4748 494a 4b4c | =>?@ABCDEFGHIJKL
000050 4d4e 4f50 5152 5354 | 5556 5758 595a 5b5c | MNOPQRSTUVWXYZ[\
000060 5d5e 5f60 6162 6364 | 6566 6768 696a 6b6c | ]^_`abcdefghijkl
000070 6d6e 6f70 7172 7374 | 7576 7778 797a 7b7c | mnopqrstuvwxyz{|
000080 7d7e 7f80 8182 8384 | 8586 8788 898a 8b8c | ................
000090 8d8e 8f90 9192 9394 | 9596 9798 999a 9b9c | ................
0000a0 9d9e 9fa0 a1a2 a3a4 | a5a6 a7a8 a9aa abac | ................
0000b0 adae afb0 b1b2 b3b4 | b5b6 b7b8 b9ba bbbc | ................
0000c0 bdbe bfc0 c1c2 c3c4 | c5c6 c7c8 c9ca cbcc | ................
0000d0 cdce cfd0 d1d2 d3d4 | d5d6 d7d8 d9da dbdc | ................
0000e0 ddde dfe0 e1e2 e3e4 | e5e6 e7e8 e9ea ebec | ................
0000f0 edee eff0 f1f2 f3f4 | f5f6 f7f8 f9fa fbfc | ................
000100 fdfe ff00 01 | | .....
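The reverse operation, recovering field-name and field-value pairs
from a decompressed header block, might look like the following
Python sketch. The function names read_octet_string and
parse_header_block are hypothetical, and the error handling
(rejecting a first octet of 0x80 and truncated input) is an
assumption, since this document does not specify it.

```python
def read_octet_string(buf: bytes, pos: int):
    """Read one length-prefixed octet string starting at buf[pos].

    Returns (octets, position just past the octet string).
    """
    first = buf[pos]
    pos += 1
    if first < 0x80:
        length = first  # short form
    else:
        n = first & 0x7F  # number of additional length octets
        if n == 0 or pos + n > len(buf):
            raise ValueError("malformed length prefix")
        length = int.from_bytes(buf[pos:pos + n], "big")
        pos += n
    end = pos + length
    if end > len(buf):
        raise ValueError("truncated octet string")
    return buf[pos:end], end


def parse_header_block(buf: bytes):
    """Split a header block into (field-name, field-value) pairs."""
    headers = []
    pos = 0
    while pos < len(buf):
        name, pos = read_octet_string(buf, pos)
        if pos >= len(buf):
            raise ValueError("field-name without field-value")
        value, pos = read_octet_string(buf, pos)
        headers.append((name, value))
    return headers
```

Because every string carries its own length, the loop needs no
separator scanning; it simply stops at the end of the buffer.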
3. EZFLATE Tokenization Rules
EZFLATE is itself agnostic to tokenization methods and delimiter
sets. Appropriate tokenization rules and delimiter sets depend on
the type of data to be compressed. For HTTP/2 header data, each
field-name and field-value are tokenized separately.
3.1. Header Name Tokenization
Since header field names are typically static, it is not beneficial
to tokenize field names. Therefore field names are fed to the
EZFLATE compressor as a single token, including the ASN.1 length
prefix.
3.2. Header Value Tokenization
Field values, on the other hand, vary in how static they are, ranging
from very static (e.g. the values of :scheme and user-agent) to
semi-static (e.g. accept, accept-encoding) to very dynamic (e.g.
:path, date). Field values MAY be tokenized with the following set
of delimiters:
{ ' ', '\t', ';', ',' }
Figure 1
Additionally, the set of headers { ":path", "location", "content-
location" } MAY instead be tokenized with the following set of
delimiters:
Special token delimiters for header values containing a path:
{ '/', '?', '&', '=' }
Figure 2
The ASN.1 length prefix MAY be treated as a delimiter and therefore
treated as a separate token.
Finally, note that per Section 3.2 of [EZFLATE], a sequence of
consecutive delimiters is to be treated as a single unit (i.e. as
one token). For example, see Figure 3 and Figure 4, where the
consecutive delimiter characters ',' and ' ' are treated as a single
token ", ".
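The tokenization rules in this section can be sketched in Python as
follows. The names VALUE_DELIMITERS, PATH_DELIMITERS and tokenize
are hypothetical. The sketch splits raw octets into alternating runs
of non-delimiter and delimiter octets, so that consecutive delimiters
form a single token. Attaching the length prefix to the first token
afterwards is an assumption made here so that a prefix octet which
happens to equal a delimiter (e.g. 0x20 for a 32 octet value) is not
itself split.

```python
VALUE_DELIMITERS = frozenset(b" \t;,")  # delimiter set of Figure 1
PATH_DELIMITERS = frozenset(b"/?&=")    # delimiter set of Figure 2


def tokenize(data: bytes, delimiters=VALUE_DELIMITERS):
    """Split data into runs of delimiter and non-delimiter octets."""
    tokens = []
    start = 0
    for i in range(1, len(data) + 1):
        # Close the current run at the end of the data or whenever the
        # delimiter/non-delimiter class changes.
        at_end = i == len(data)
        if at_end or (data[i] in delimiters) != (data[start] in delimiters):
            tokens.append(data[start:i])
            start = i
    return tokens


value = b"gzip;q=1.0, identity;q=0.5, *;q=0"
tokens = tokenize(value)
# Short-form length prefix (valid because len(value) < 128), attached
# to the first token.
tokens[0] = bytes([len(value)]) + tokens[0]
```

Passing PATH_DELIMITERS as the second argument applies the
path-specific delimiter set instead.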
3.3. Tokenization Example
The following is an example of converting an HTTP header name-value
pair into ASN.1 length-prefixed octet strings (see Section 2) and
then tokenizing the octet strings based on the tokenization rules.
The following header is to be encoded:
accept-encoding: gzip;q=1.0, identity;q=0.5, *;q=0
First, the field-name and field-value are converted to ASN.1 octet
strings by adding the appropriate length prefixes (see Section 2.1.1
and Section 2.1.2).
Hex dump of the header converted to ASN.1 octet strings:
0000000 0f61 6363 6570 742d | 656e 636f 6469 6e67 | .accept-encoding
0000010 2167 7a69 703b 713d | 312e 302c 2069 6465 | !gzip;q=1.0, ide
0000020 6e74 6974 793b 713d | 302e 352c 202a 3b71 | ntity;q=0.5, *;q
0000030 3d30 | | =0
Next, the length-prefixed header name is taken as a single token.
Hex dump of the header name token:
0000000 0f61 6363 6570 742d | 656e 636f 6469 6e67 | .accept-encoding
Finally, the length-prefixed header value is tokenized according to
the rules in Section 3.2, with the length prefix kept as part of the
first token. The following is a hex dump of each of the eleven
tokens.
Token 1:
0000010 2167 7a69 70 | | !gzip
Token 2:
0000010 3b | | ;
Token 3:
0000010 713d | 312e 30 | q=1.0
Token 4:
0000010 | 2c 20 | ,
Figure 3
Token 5:
0000010 | 69 6465 | ide
0000020 6e74 6974 79 | | ntity
Token 6:
0000020 3b | | ;
Token 7:
0000020 713d | 302e 35 | q=0.5
Token 8:
0000020 | 2c 20 | ,
Figure 4
Token 9:
0000020 | 2a | *
Token 10:
0000020 | 3b | ;
Token 11:
0000020 | 71 | q
0000030 3d30 | | =0
3.4. Sub-Tokenization
The DEFLATE [DEFLATE] format has a fixed maximum length of 258 octets
for matches (see Section 3.2.5 of [DEFLATE]). As such, tokens with
length L greater than 258 octets SHOULD be sub-tokenized into
sub-tokens as follows. The number of sub-tokens N MUST be N = (L +
257) / 258 (integer division). If N > 2, the first N - 2 sub-tokens
MUST be 258 octets in length. In all cases, the final two sub-tokens
MUST have length greater than or equal to 129 (258 / 2). Let R = L -
(N - 2) * 258 be the remaining length for the final two sub-tokens.
The length of the first remaining sub-token is L[N - 1] = R - R / 2
and the length of the second remaining sub-token is L[N] = R / 2
(integer division, where the index is 1..N). Alternatively, tokens
with length L greater than 258 octets may simply be emitted as
literals.
Consider the following examples:
3.4.1. Sub-Tokenization Example 1 (1292 octet token)
A token T, 1292 octets in length, is to be sub-tokenized.
N = (1292 + 257) / 258 = 1549 / 258 = 6
L1 = 258
L2 = 258
L3 = 258
L4 = 258
R = 1292 - 4 * 258 = 260
L5 = 260 - 260 / 2 = 130
L6 = 260 / 2 = 130
Thus there are six sub-tokens. The first four sub-tokens are 258
octets in length. The final two sub-tokens are 130 octets in length.
3.4.2. Sub-Tokenization Example 2 (516 octet token)
A token T, 516 octets in length, is to be sub-tokenized.
N = (516 + 257) / 258 = 773 / 258 = 2
R = 516
L1 = R - R / 2 = 258
L2 = R / 2 = 258
Thus there are two sub-tokens. The two sub-tokens are 258 octets in
length.
3.4.3. Sub-Tokenization Example 3 (259 octet token)
A token T, 259 octets in length, is to be sub-tokenized.
N = (259 + 257) / 258 = 516 / 258 = 2
R = 259
L1 = R - R / 2 = 130
L2 = R / 2 = 129
Thus there are two sub-tokens. The length of the first sub-token is
130 octets. The length of the second sub-token is 129 octets.
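The sub-tokenization rule of Section 3.4 and the three examples above
can be reproduced with the following short Python sketch. The
function name sub_token_lengths is hypothetical; returning the token
length unchanged for tokens of at most 258 octets is an assumption,
since the rule only applies to longer tokens.

```python
MAX_MATCH = 258  # maximum DEFLATE match length


def sub_token_lengths(length: int):
    """Return the list of sub-token lengths for a token of this length."""
    if length <= MAX_MATCH:
        return [length]  # no sub-tokenization needed
    n = (length + MAX_MATCH - 1) // MAX_MATCH  # N = (L + 257) / 258
    lengths = [MAX_MATCH] * (n - 2)            # first N - 2 sub-tokens
    r = length - MAX_MATCH * (n - 2)           # R, left for the last two
    lengths.append(r - r // 2)                 # L[N - 1]
    lengths.append(r // 2)                     # L[N]
    return lengths
```

Since R always lies between 259 and 516, both final sub-tokens fall
between 129 and 258 octets, which keeps every sub-token within the
DEFLATE match limit.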
4. Security Considerations
An EZFLATE compressor can act as an oracle to an attacker probing the
compression context. However, the attacker can only find out if the
guess of the entire token is correct or not.
Still, an attacker could take advantage of this limited information
to break low-entropy secrets using a brute-force attack. A server
usually has some protections against such brute-force attacks. Here,
the attack would target the client, where it would be harder to
detect. The attack would be even more dangerous if the attacker is
able to prevent the traffic generated by its brute-force attack from
reaching the server.
To offer protection against this type of attack, an endpoint MUST
use non-compressed DEFLATE blocks (see Section 3.2.4 of [DEFLATE]) to
prevent the compression of any header field whose value contains a
secret which could be put at risk by a brute-force attack.
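As an illustration of the non-compressed block format referenced
above, the following Python sketch builds a single stored block as
laid out in Section 3.2.4 of [DEFLATE]. The function name
stored_block is hypothetical, and the sketch assumes the block
starts on a byte boundary of the DEFLATE stream; a real compressor
would have to handle the bit alignment left by any preceding block.

```python
import zlib


def stored_block(data: bytes, final: bool = False) -> bytes:
    """Build one non-compressed (BTYPE=00) DEFLATE block.

    Assumes the block begins byte-aligned in the output stream.
    """
    if len(data) > 0xFFFF:
        raise ValueError("a stored block holds at most 65535 octets")
    # Bit 0 is BFINAL, bits 1-2 are BTYPE (00); the rest is padding.
    header = bytes([0x01 if final else 0x00])
    length = len(data)
    # LEN and its one's complement NLEN, both little-endian.
    return (
        header
        + length.to_bytes(2, "little")
        + (length ^ 0xFFFF).to_bytes(2, "little")
        + data
    )


# Round trip through zlib as a raw DEFLATE stream (wbits=-15).
secret_field = b"\x06cookie\x0bsid=Tr0ub4d"
raw = stored_block(secret_field, final=True)
restored = zlib.decompress(raw, wbits=-15)
```

Because the octets are copied verbatim, the size of the block depends
only on the length of the field, not on whether its content matches
anything in the compression context.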
5. Acknowledgements
Roberto Peon and Herve Ruellan.
6. References
6.1. Normative References
[EZFLATE] Morgan, K. and C. Brunhuber, "EZFLATE: Token-based
DEFLATE Compression", draft-morgan-ezflate (work in
progress), June 2014.
[X.208-88] CCITT, "Recommendation X.208: Specification of Abstract
Syntax Notation One (ASN.1)", 1988.
[X.209-88] CCITT, "Recommendation X.209: Specification of Basic
Encoding Rules for Abstract Syntax Notation One",
1988.
6.2. Informative References
[CRIME] Rizzo, J. and T. Duong, "The CRIME Attack",
September 2012, <https://docs.google.com/a/twist.com/
presentation/d/
11eBmGiHbYcHR9gL5nDyZChu_-lCa2GizeuOfaLU2HOU/
edit#slide=id.g1eb6c1b5_3_6>.
[DEFLATE] Deutsch, P., "DEFLATE Compressed Data Format
Specification version 1.3", RFC 1951, May 1996.
[HTTP-p1] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext
Transfer Protocol (HTTP/1.1): Message Syntax and
Routing", draft-ietf-httpbis-p1-messaging-26 (work in
progress), February 2014.
[HTTP2] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
Transfer Protocol version 2", draft-ietf-httpbis-http2-10
(work in progress), February 2014.
[SPDY] Belshe, M. and R. Peon, "SPDY Protocol",
draft-mbelshe-httpbis-spdy-00 (work in progress),
February 2012.
Authors' Addresses
Keith Shearl Morgan
International Atomic Energy Agency
EMail: k.morgan@iaea.org
Christoph Brunhuber
International Atomic Energy Agency
EMail: c.brunhuber@iaea.org