Network Working Group                                          A. Newton
Internet-Draft                                                      ARIN
Intended status: Standards Track                         August 28, 2014
Expires: March 1, 2015
A Language for Rules Describing JSON Content
draft-newton-json-content-rules-02
This document describes a language useful for documenting the expected content of JSON structures found in specifications using JSON.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on March 1, 2015.
Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
The goal of this document is to provide a way to document the expected content of data expressed in JSON [RFC4627] format. That is, the primary purpose of this document is to specify a means for one person to communicate to another person the expected nature of a JSON data structure in a manner more concise than prose. The programmatic validation of a JSON data structure against content rules is a lesser goal of this document, though such a practice is useful both in the writing of specifications and in communication between programs.
Unlike JSON Schema, this language is not itself JSON, though the syntax described here is "JSON-like" (a comparison with JSON Schema can be found in Appendix A and a "real world" example can be found in Appendix B). A specialized syntax is used to reduce the tedium of reading and writing rules, as describing allowable content is often more involved than most of the actual content. Figure 2 is an example of this language describing the JSON of Figure 1.
Example JSON lifted from RFC 4627
    [
       {
          "precision": "zip",
          "Latitude":  37.7668,
          "Longitude": -122.3959,
          "Address":   "",
          "City":      "SAN FRANCISCO",
          "State":     "CA",
          "Zip":       "94107",
          "Country":   "US"
       },
       {
          "precision": "zip",
          "Latitude":  37.371991,
          "Longitude": -122.026020,
          "Address":   "",
          "City":      "SUNNYVALE",
          "State":     "CA",
          "Zip":       "94085",
          "Country":   "US"
       }
    ]
Figure 1
Rules describing Figure 1
    root [
        2*2{ "precision" : string,
             "Latitude" : float,
             "Longitude" : float,
             "Address" : string,
             "City" : string,
             "State" : string,
             "Zip" : string,
             "Country" : string
        }
    ]
Figure 2
The JSON Content Rules are of five types: value rules, member rules, object rules, array rules, and group rules.
Each rule has two components, a rule name and a rule definition. Anywhere in a rule definition where a rule name is allowed, another rule definition may be used.
This is an example of a value rule:

    v1 : integer 0..3
It specifies a rule named "v1" that has a definition of ": integer 0..3" (value rule definitions begin with a ':' character). This defines values of type "v1" to be integers in the range 0 to 3 (minimum value of 0, maximum value of 3). Value rules can define the limits of JSON values, such as stating that numbers must fall into a certain range or that strings must be formatted according to certain patterns or standards (e.g., URIs or phone numbers).
Member rules specify JSON object members. The following example member rule states that the rule's name is 'm1' with a value defined by the 'v1' value rule:
Since rule names can be substituted by rule definitions, this member rule can also be written as follows:
Object rules are composed of member rules, since JSON objects are composed of members. Object rules can specify members that are mandatory, optional, and even choices between members. In this example, the rule 'o1' defines an object that must contain a member as defined by member rule 'm1' and optionally a member defined by the rule 'm2':
Finally, array rules are composed of value and object rules. Like object rules, array rules can specify the cardinality of the contents of an array. The following array rule defines an array that must contain value rule 'v1' and zero or more objects as defined by rule 'o1':
Putting it all together, Figure 4 describes the JSON in Figure 3.
Example JSON shamelessly lifted from RFC 4627
    {
       "Image": {
          "Width":  800,
          "Height": 600,
          "Title":  "View from 15th Floor",
          "Thumbnail": {
             "Url":    "http://www.example.com/image/481989943",
             "Height": 125,
             "Width":  "100"
          },
          "IDs": [116, 943, 234, 38793]
       }
    }
Figure 3
Rules describing Figure 3
    width_v : integer 0..1280
    height_v : integer 0..1024

    width "Width" width_v
    height "Height" height_v

    thumbnail "Thumbnail" {
        width, height, "Url" : uri
    }

    image "Image" {
        width, height, "Title" : string,
        thumbnail, "IDs" [ *: integer ]
    }

    root {
        image
    }
Figure 4
The rules from Figure 4 can be written more compactly (see Figure 5).
Compact rules describing Figure 3
    width "Width" : integer 0..1280
    height "Height" : integer 0..1024

    root {
        "Image" {
            width, height, "Title" :string,
            "Thumbnail" { width, height, "Url" :uri },
            "IDs" [ *:integer ]
        }
    }
Figure 5
There is no statement terminator and therefore no need for a line continuation syntax. Blank lines are allowed.
Comments are very similar to comments in ABNF [RFC4234]. They start with a semi-colon (';') and continue to the end of the line.
Rules are composed of two parts, a rule name and a rule definition. Rule names allow a rule definition to be referenced easily by a name. With the exception of value rules, rule definitions refer to other rules using the rule names of other appropriate types of rules. Because of this, it is also possible to use a rule definition of the appropriate type where a rule name of that type would be appropriate.
The type of rule to use in a rule definition, either directly or by reference to a name, depends on the type of rule being defined and falls along the structure of the allowable JSON grammar:
A fifth rule type, the group rule, exists to help reference a collection of rules.
Rule names must start with an alphabetic character (a-z,A-Z) and must contain only alphabetic characters, numeric characters, the hyphen character ('-') and the underscore character ('_'). Rule names must not be used more than once.
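The naming constraints above can be sketched as a small check. This is only an illustration in Python, not part of the draft; the helper names `is_valid_rule_name` and `find_duplicate_rule_names` are hypothetical.

```python
import re

# Hypothetical helper mirroring the prose: a letter first, then letters,
# digits, '-' or '_'.
_RULE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_valid_rule_name(name):
    """Return True if name satisfies the rule-name character constraints."""
    return _RULE_NAME.fullmatch(name) is not None


def find_duplicate_rule_names(names):
    """Return rule names used more than once, in first-seen order."""
    seen, dupes = set(), []
    for name in names:
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes
```

A rule set would be acceptable on this point when every name passes `is_valid_rule_name` and `find_duplicate_rule_names` returns an empty list.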
Value rules define content for JSON values. JSON allows values to be objects, arrays, numbers, booleans, strings, and null. Arrays and objects are handled by the array and object rules, and the value rules define the rest.
The rules for booleans and null are the simplest and take the following forms:
Rules for numbers can specify the number as either an integer or floating point number and may specify a range:
where n is the minimum allowable value of the number and m is the maximum allowable value of the number. The range doesn't have to be given, but if it is given either the minimum, maximum, or both are required. If the minimum is not given then the minimum is considered to be the minimum number value possible to represent in JSON. Likewise, if the maximum is not given then the maximum is considered to be the maximum number value possible to represent in JSON.
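As a rough illustration of these range semantics, the following Python sketch treats an omitted bound as unbounded. That is an assumption: JSON itself places no limit on number size, so "the minimum (or maximum) number value possible to represent in JSON" is modeled here as negative (or positive) infinity. The function names are hypothetical.

```python
import math


def in_range(value, minimum=None, maximum=None):
    """Check value against optional bounds; a missing bound is unbounded."""
    low = -math.inf if minimum is None else minimum
    high = math.inf if maximum is None else maximum
    return low <= value <= high


def matches_integer_rule(value, minimum=None, maximum=None):
    """Sketch of ': integer n..m' applied to an already-parsed JSON value."""
    # bool is a subclass of int in Python, but JSON true/false are not numbers.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return in_range(value, minimum, maximum)
```

With this sketch, `matches_integer_rule(2, 0, 3)` corresponds to checking the value 2 against a rule such as ": integer 0..3".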
String values may be specified generically as:
However, the content of strings can be narrowed in the following ways:
URIs can also be scoped further by providing the literals 'full' or 'relative' to indicate that the URI must be either a full URI or a relative URI:
And the scheme of the URI can also be specified:
Neither the scheme nor the full/relative literals need to be specified, and neither need to be specified together.
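The optional scope and scheme can be illustrated with a short Python sketch built on the standard library's `urllib.parse.urlsplit`. It assumes that a "full" URI is one that carries a scheme and a "relative" URI is one that does not; the draft does not spell this out here, and the function name `matches_uri_rule` is hypothetical.

```python
from urllib.parse import urlsplit


def matches_uri_rule(value, scope=None, scheme=None):
    """Sketch of ': uri [full|relative] [scheme]' for a parsed JSON value."""
    if not isinstance(value, str):
        return False
    parts = urlsplit(value)
    # Assumption: "full" means a scheme is present, "relative" means it is not.
    if scope == "full" and not parts.scheme:
        return False
    if scope == "relative" and parts.scheme:
        return False
    # urlsplit lowercases the scheme, so compare case-insensitively.
    if scheme is not None and parts.scheme != scheme.lower():
        return False
    return True
```

Both parameters default to `None`, reflecting that neither the scope nor the scheme has to be given.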
For email addresses, conformance levels are specified with the literal '2822', signifying RFC 2822 [RFC2822] conformance, or '5322', signifying RFC 5322 [RFC5322] conformance.
Member rules are the simplest of the rules and define members of JSON objects. Member rules follow the format:
where rule_name is the name of the rule being defined, member_name (in quotes) is the name of the JSON object member, and target_rule_name is a reference to a value rule, array rule, or object rule specifying the allowable content of the JSON object member.
Since rule names in rule definitions may be substituted for rule definitions, member rules may also be written in this form:
The following is an example:
Object rules define the allowable members of a JSON object. Their rule definitions are composed of member rules and group rules. They take the following form:
The following rule example defines an object composed of two member rules:
Given the general rule that where a rule name is found a rule definition of the appropriate type may be used, the above example might also be written:
Rules given in the rule definition of an object rule do not imply order. Given the example object rule above both
and
are JSON objects that match the rule.
Member rules or member rule definitions may not be repeated in the rule definition of an object rule. However, a member of an object can be marked as optional if the member rule defining it is preceded by the question mark ('?') character. In the following example, the location_uri member is optional while the status_code member is required to be in the defined object:
An object rule can also define the choice between members by placing the forward slash ('/') character between two member rules. In the following example, the object being defined can have either a location_uri member or content_type member and must have a status_code member:
Finally, the specification of a member of an object can be conditioned upon the specification of another member of that object by placing the ampersand ('&') character between two member rules. Using this syntax, the member defined by the second rule is only allowed in the object if the member defined by the first rule is given. In other words, the appearance of the second member depends upon the appearance of the first member. In the following example, the object defined can have a referrer_uri so long as location_uri is also present:
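Separately from the draft's own examples, the choice and dependency semantics described in this section can be sketched in Python. Several things here are assumptions: the sketch uses the rule names from the prose (location_uri, content_type, referrer_uri) as if they were the JSON member names, it reads '/' as "exactly one of the two", and it checks only the condition stated for '&'. The function names are hypothetical.

```python
def choice(obj, first, second):
    """'/' between members: assumed to mean exactly one of the two appears."""
    return (first in obj) != (second in obj)


def depends(obj, first, second):
    """'&' between members: second may appear only if first is present."""
    return first in obj or second not in obj


def is_optional_ok(obj, name):
    """'?' before a member: presence and absence are both acceptable."""
    return True
```

For instance, `depends({"referrer_uri": "x"}, "location_uri", "referrer_uri")` is false because the dependent member appears without the member it depends on.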
Array rules define the allowable content of JSON arrays. Their rule definitions are composed of value rules, object rules, group rules, and other array rules and have the following form:
The following example defines an array where element 1 is defined by the width_value rule and element 2 is defined by the height_value rule:
Unlike object rules, order is implied by the array rule definition. That is, the first rule referenced or defined within an array rule specifies that the first element of the array will match that rule, the second rule given with the array rule specifies that the second element of the array will match that rule, and so on.
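The positional matching just described can be sketched in Python as follows. This is a simplified illustration that ignores choice and repetition; the predicate functions stand in for value rules and all names are hypothetical.

```python
def is_string(value):
    """Stand-in for a ': string' value rule."""
    return isinstance(value, str)


def is_integer(value):
    """Stand-in for a ': integer' value rule (JSON booleans excluded)."""
    return isinstance(value, int) and not isinstance(value, bool)


def matches_in_order(array, rules):
    """Element i of the array must satisfy rule i; lengths must agree."""
    if len(array) != len(rules):
        return False
    return all(rule(value) for rule, value in zip(rules, array))
```

Under this sketch an array of an integer followed by a string satisfies `[is_integer, is_string]`, while the same two elements in the opposite order do not.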
Take for example the following array rule definition:
This JSON array matches the above rule:
while this one does not:
As with object rules, the forward slash character ('/') can be used to indicate a choice between two elements. Take for example the following rules:
which would validate
or
Repetition of array values may also be specified by preceding a rule with an asterisk ('*') character, optionally preceded by the lower bound and followed by the upper bound of the repetition (e.g. "0*1"). The following rules define an array that has between one and three strings:
Both the lower bound and the upper bound are optional. If the lower bound is not given then it is assumed to be zero. If the upper bound is not given then it is assumed to be infinity. The following example defines an array with any number of strings defined by child_value:
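Separately from the draft's own examples, the handling of repetition bounds can be sketched in Python. The parser below is deliberately loose (for example, it accepts leading zeros, which the ABNF's int production does not) and the function names are hypothetical.

```python
import math
import re

# Optional digits, an asterisk, optional digits: "1*3", "*", "2*", "*4".
_REPETITION = re.compile(r"(\d*)\*(\d*)")


def parse_repetition(text):
    """Return (lower, upper); a missing lower is 0, a missing upper is infinity."""
    match = _REPETITION.fullmatch(text)
    if match is None:
        raise ValueError("not a repetition: %r" % (text,))
    lower = int(match.group(1)) if match.group(1) else 0
    upper = int(match.group(2)) if match.group(2) else math.inf
    return lower, upper


def count_allowed(count, text):
    """Check whether count elements satisfy the repetition written as text."""
    lower, upper = parse_repetition(text)
    return lower <= count <= upper
```

A bare "*" therefore admits any number of elements, including none.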
Unlike the other types of rules, group rules have no direct tie with JSON syntax. Group rules simply group together other rules. They take the form:
Group rule definitions, and any nesting of group rule definitions, must conform to the allowable set of rules of the rule containing them. A group rule referenced inside of an array rule may not contain a member rule, since member rules are not allowed in array rules directly. Likewise, a group rule referenced inside an object rule must only contain member rules, and once group rules used in an object rule are fully dereferenced there must be no duplicate member rules, as member rules in object rules are required to be unique.
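The "fully dereferenced" check for object rules can be sketched in Python. The data model is an assumption made only for illustration: an object rule is a list whose items are either member names or `("group", name)` references, and `groups` maps group names to such lists. The sketch does not guard against groups that reference themselves.

```python
def flatten(items, groups):
    """Expand group references, recursively, into a flat list of member names."""
    flat = []
    for item in items:
        if isinstance(item, tuple) and item[0] == "group":
            flat.extend(flatten(groups[item[1]], groups))
        else:
            flat.append(item)
    return flat


def duplicate_members(items, groups):
    """Return the member names that occur more than once after expansion."""
    flat = flatten(items, groups)
    return sorted({name for name in flat if flat.count(name) > 1})
```

An object rule would violate the uniqueness requirement whenever `duplicate_members` returns a non-empty list.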
Take for example the following rules:
These rules describe a JSON object that might look like this:
Groups can also be used with the choice and dependency syntax in member rules. Here the object can either have first_two_children or second_two_children:
and here the object can have second_two_children only if first_two_children are given:
It is possible to specify that a value can be of any type allowable by JSON using the any value rule. This is done with the 'any' literal in a value rule:
However, unlike other value rules which define primitive data types, this rule defines a value of any kind, either primitive (null, boolean, number, string), object, or array.
The any value rule can be combined with repetition in an array rule to define arrays that may contain any value:
Any object member name can be specified in a member rule with the any member rule, which is written by prepending a caret character ('^') to an empty member name (that is, ^"" signifies any member name). This has the following form:
As an example, the following defines an object member with any name that has a value that is a string:
Usage of the any member rule must still satisfy the criteria that all member names of an object be unique.
Constructing an object member of any name with any type would therefore take the form:
Unlike other types of member rules, it is possible to use repetition with the any member rule in an object rule. The repetition syntax and semantics are the same as the repetition syntax and semantics of repetition with array rules. The following example rules define an object that may contain any number of members where each member may have any value.
Use of the repetition of any member rules must satisfy the criteria that all member names of an object be unique.
In some contexts it is necessary that there be a rule that defines the outermost JSON object or array or, if thought of as an inverted object tree, the structure at the very top. If in a collection of rules there is no rule explicitly specified for this purpose and a rule named "root" is given, it can be assumed to be the outermost JSON structure or the root of an object/array tree. If a rule other than "root" is explicitly specified and there exists a rule named "root", that rule name holds no special meaning.
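The selection logic for the outermost rule can be sketched in Python. How a rule is "explicitly specified" is outside this document, so the `explicit_root` parameter is an assumption, as is the function name.

```python
def select_root(rule_names, explicit_root=None):
    """Pick the rule describing the outermost JSON structure, if any."""
    if explicit_root is not None:
        if explicit_root not in rule_names:
            raise KeyError(explicit_root)
        # An explicit choice wins; a rule named "root" then has no special meaning.
        return explicit_root
    # With no explicit choice, a rule named "root" is assumed to be the top.
    return "root" if "root" in rule_names else None
```

When neither an explicit rule nor a rule named "root" exists, the sketch returns `None` to signal that no outermost rule can be inferred.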
Directives change the interpretation of a collection of rules. They begin with a hash character ('#') and are terminated by the end of a line. They take the following form:
The ignore-unknown-members directive specifies that any member of any object which has not been specified should be ignored. Ignored object members may have a value of any type. This directive cannot be used in any collection of rules that has an any member rule.
The language-compatible-members directive specifies that every member name of every object, whether explicitly defined or specified via an any member rule or the ignore-unknown-members directive, must be a name compatible with programming languages. The intent is to specify object member names that may be promoted to first-order object attributes or methods in an API. The following ABNF describes the restrictions upon the member names:
ABNF for programming language compatible JSON names
name = ALPHA *( ALPHA / DIGIT / "_" )
Figure 6
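The ABNF in Figure 6 translates directly into a regular expression. The following Python sketch walks an already-parsed JSON value and reports member names that do not fit; the function name is hypothetical.

```python
import re

# name = ALPHA *( ALPHA / DIGIT / "_" )
_COMPATIBLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def incompatible_member_names(value):
    """Return member names, at any depth, that do not match Figure 6."""
    found = []
    if isinstance(value, dict):
        for name, member in value.items():
            if _COMPATIBLE_NAME.fullmatch(name) is None:
                found.append(name)
            found.extend(incompatible_member_names(member))
    elif isinstance(value, list):
        for element in value:
            found.extend(incompatible_member_names(element))
    return found
```

An empty result means every member name in the structure could be promoted to an attribute or method name under this restriction.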
The all-members-optional directive specifies that no member of any object is required. This directive effectively prepends a '?' to every member rule in every object rule.
The following ABNF describes the syntax for JSON Content Rules.
    grammar         = 1*(rule / directive) *c-wsp

    rule            = rulename definition

    definition      = *c-wsp ( value-rule / member-rule / array-rule /
                      object-rule / group-rule )

    ; rulenames must be unique, and may not be a reserved word
    rulename        = *c-wsp ALPHA *(ALPHA / DIGIT / "-" / "_")

    ; Adapted from the ABNF for JSON, RFC 4627 s 2.4
    float           = [ "-" ] int [ frac ] [ exp ]
    integer         = [ "-" ] int [ exp ]
    exp             = ( "e" / "E" ) [ "+" / "-" ] 1*DIGIT
    frac            = "." 1*DIGIT
    int             = "0" / ( %x31-39 *DIGIT )

    ; The regex-char rule allows for any sequence of characters, including
    ; whitespace and newlines, with backslash only allowed before either
    ; a forward slash or a backslash.
    regex-char      = %x21-2E / %x30-5D / %x5E-7E / WSP /
                      CR / LF / "\/" / "\\"

    ; uri-scheme from RFC 3986
    uri-scope       = *c-wsp ( "relative" / "full" )
    uri-scheme      = *c-wsp ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

    boolean-type    = "boolean"
    null-type       = "null"
    integer-type    = "integer" [ 1*c-wsp integer ".." integer ]
    float-type      = "float" [ 1*c-wsp float ".." float ]
    string-type     = "string" [ *c-wsp "/" *regex-char "/" ]
    uri-type        = "uri" [ uri-scope ] [ uri-scheme ]
    ip-type         = "ip4" / "ip6"
    dns-type        = "fqdn" / "idn"
    date-type       = "date-time" / "full-date" / "full-time"
    email-type      = "email" [ *c-wsp ( "2822" / "5322" ) ]
    phone-type      = "phone"
    base64-type     = "base64"
    any-type        = "any"

    value-rule      = ":" *c-wsp type-rule
    type-rule       = boolean-type / null-type / integer-type /
                      float-type / string-type / uri-type /
                      ip-type / dns-type / date-type / email-type /
                      phone-type / base64-type / any-type

    inline-rule     = *c-wsp ( rulename / definition )

    ; The definition of a JSON string, from RFC 4627 s 2
    json-name       = %x20-21 / %x23-5B / %x5D-10FFFF / "\" (
                         %x22 /      ; "  u+0022
                         %x5C /      ; \  u+005C
                         %x2F /      ; /  u+002F
                         %x62 /      ; BS u+0008
                         %x66 /      ; FF u+000C
                         %x6E /      ; LF u+000A
                         %x72 /      ; CR u+000D
                         %x74 /      ; HT u+0009
                         ( %x75 4HEXDIG ) ) ; uXXXX u+XXXX

    member-rule     = ( ( "^" %x22.22 ) /
                      ( %x22 *json-name %x22 ) ) inline-rule

    object-rule     = "{" [ object-member
                      *( *c-wsp ( "," / "/" / "&" ) object-member ) ]
                      *c-wsp "}"
    object-member   = *c-wsp ["?"]
                      ( rulename / member-rule / group-rule )

    array-rule      = "[" [ array-member
                      *( *c-wsp "," array-member ) ] *c-wsp "]"
    array-count     = *c-wsp [ [int] "*" [int] *c-wsp ]
    array-member    = array-count
                      ( rulename / value-rule / object-rule / group-rule )
                      [ *c-wsp "/" array-member ]

    group-rule      = "(" [ group-member
                      *( *c-wsp "," group-member) ] *c-wsp ")"
    group-member    = ["?"] inline-rule
                      [ *c-wsp ( "/" / "&" ) group-member ]

    directive       = *c-wsp "#" *( VCHAR / WSP / %x7F-10FFFF ) EOL

    ; Taken from the ABNF for ABNF (RFC 4234 section 4) and slightly adapted
    ; newlines in a c-wsp do not need whitespace at the start of a newline
    ; to form a valid continuation line, and EOL might not be a full CRLF
    c-wsp           = WSP / c-nl
    c-nl            = comment / EOL
    comment         = ";" *(WSP / VCHAR) EOL
    EOL             = 1*( CR / LF )

    ; core rules
    ALPHA           = %x41-5A / %x61-7A   ; A-Z / a-z
    CR              = %x0D
    DIGIT           = %x30-39
    HEXDIG          = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
    LF              = %x0A
    VCHAR           = %x21-7E
    WSP             = SP / HTAB
    SP              = %x20
    HTAB            = %x09
JSON Content Rules ABNF
Many thanks to Byron Ellacott for providing the ABNF in Section 5.
[RFC1166]  Kirkpatrick, S., Stahl, M., and M. Recker, "Internet numbers", RFC 1166, July 1990.
[RFC2822]  Resnick, P., "Internet Message Format", RFC 2822, April 2001.
[RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, July 2002.
[RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, January 2005.
[RFC4234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 4234, October 2005.
[RFC4627]  Crockford, D., "The application/json Media Type for JavaScript Object Notation (JSON)", RFC 4627, July 2006.
[RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, October 2006.
[RFC5322]  Resnick, P., "Internet Message Format", RFC 5322, October 2008.
[RFC5952]  Kawamura, S. and M. Kawashima, "A Recommendation for IPv6 Address Text Representation", RFC 5952, August 2010.
This section compares this specification, JSON Content Rules, with JSON Schema using examples.
Example JSON lifted from RFC 4627
    [
       {
          "precision": "zip",
          "Latitude":  37.7668,
          "Longitude": -122.3959,
          "Address":   "",
          "City":      "SAN FRANCISCO",
          "State":     "CA",
          "Zip":       "94107",
          "Country":   "US"
       },
       {
          "precision": "zip",
          "Latitude":  37.371991,
          "Longitude": -122.026020,
          "Address":   "",
          "City":      "SUNNYVALE",
          "State":     "CA",
          "Zip":       "94085",
          "Country":   "US"
       }
    ]
JSON Content Rules
    root [
        2*2{ "precision" : string,
             "Latitude" : float,
             "Longitude" : float,
             "Address" : string,
             "City" : string,
             "State" : string,
             "Zip" : string,
             "Country" : string
        }
    ]
JSON Schema
    {
        "type": "array",
        "items": [
            {
                "type": "object",
                "properties": {
                    "precision": { "type": "string", "required": "true" },
                    "Latitude":  { "type": "number", "required": "true" },
                    "Longitude": { "type": "number", "required": "true" },
                    "Address":   { "type": "string", "required": "true" },
                    "City":      { "type": "string", "required": "true" },
                    "State":     { "type": "string", "required": "true" },
                    "Zip":       { "type": "string", "required": "true" },
                    "Country":   { "type": "string", "required": "true" }
                }
            }
        ],
        "minItems" : 2,
        "maxItems" : 2
    }
Example JSON shamelessly lifted from RFC 4627
    {
       "Image": {
          "Width":  800,
          "Height": 600,
          "Title":  "View from 15th Floor",
          "Thumbnail": {
             "Url":    "http://www.example.com/image/481989943",
             "Height": 125,
             "Width":  "100"
          },
          "IDs": [116, 943, 234, 38793]
       }
    }
JSON Content Rules
    width "Width" : integer 0..1280
    height "Height" : integer 0..1024

    root {
        "Image" {
            width, height, "Title" :string,
            "Thumbnail" { width, height, "Url" :uri },
            "IDs" [ *:integer ]
        }
    }
JSON Schema
    {
        "type" : "object",
        "properties" : {
            "Image": {
                "type" : "object",
                "properties" : {
                    "Width" : {
                        "type" : "integer",
                        "minimum" : 0,
                        "maximum" : 1280,
                        "required" : "true"
                    },
                    "Height" : {
                        "type" : "integer",
                        "minimum" : 0,
                        "maximum" : 1024,
                        "required" : "true"
                    },
                    "Title" : { "type": "string" },
                    "Thumbnail" : {
                        "type" : "object",
                        "properties" : {
                            "Url" : {
                                "type" : "string",
                                "format" : "uri",
                                "required" : "true"
                            },
                            "Width" : {
                                "type" : "integer",
                                "minimum" : 0,
                                "maximum" : 1280,
                                "required" : "true"
                            },
                            "Height" : {
                                "type" : "integer",
                                "minimum" : 0,
                                "maximum" : 1024,
                                "required" : "true"
                            }
                        }
                    },
                    "IDs" : {
                        "type" : "array",
                        "items" : [ { "type": "integer" } ],
                        "required" : "true"
                    }
                }
            }
        }
    }
The following example is taken from draft-ietf-weirds-json-response-00. It describes the entity object (Section 4), the nameserver object (Section 5) and many of the other sub-structures used in objects defined in other sections of that draft.
JSON Content Rules for nameserver and entity from draft-ietf-weirds-json-response
    # all-members-optional
    # ignore-unknown-members
    # language-compatible-members

    ; the nameserver object
    ; models nameserver host information
    ; this is often referred to as the 'host' object too
    nameserver {

        ; the host name of the name server
        "name" : fqdn,

        ; the ip addresses of the nameserver
        "ipAddresses" [ *( :ip4 / :ip6 ) ],

        common
    }

    ; the entity object
    ; This object represents the information of organizations,
    ; corporations, governments, non-profits, clubs, individual persons,
    ; and informal groups of people.
    entity {

        ; the names by which the entity is commonly known
        "names" [ *:string ],

        ; the roles this entity has with any containing object
        "roles" [ *:string ],

        ; the place where the person, org, etc... receives postal mail
        ; THIS IS NOT LOCATION
        "postalAddress" [ *:string ],

        ; electronic mailboxes where the person, org, etc...
        ; receives messages
        "emails" [ *:email 2822 ],

        ; phones where the person, org, etc... receives
        ; telephonic communication
        "phones" {
            "office" [ *:phone ], ; office phones
            "fax" [ *:phone ],    ; facsimile machines
            "mobile" [ *:phone ]  ; cell phones and the like
        },

        common
    }

    ; The members "handle", "status", "remarks", "uris", "port43",
    ; "sponsoredBy", "resoldBy", "registrationBy", "registrationDate",
    ; "lastChangedDate", and "lastChangedBy" are used in many objects
    common (

        ; a registry-unique identifier
        "handle" : string,

        ; an array of status values
        "status" [ *:string ],

        ; an array of strings, each containing comments about the object
        "remarks" [ *:string ],

        ; an array of uri objects
        ; "type" refers to the application of the URI
        ; "uri" is the uri
        "uris" [
            *{ "type" : string, "uri" : uri }
        ],

        ; a string containing the fully-qualified host name of the
        ; WHOIS [RFC3912] server where the object instance may be found
        "port43" : fqdn,

        ; a string containing an identifier of the party
        ; through which the registration was made, such as an IANA approved
        ; registrar
        "sponsoredBy" : string,

        ; a string containing an identifier of the party
        ; originating the registration of the object.
        "resoldBy" : string,

        ; a string containing an identifier of the party
        ; responsible for the registration of the object
        "registrationBy" : string,

        ; the date the object was registered
        "registrationDate" : date-time,

        ; the date of last change made to the object
        "lastChangedDate" : date-time,

        ; a string containing an identifier of the party
        ; responsible for the last change made to the registration
        "lastChangedBy" : string
    )
JSON does not disallow non-unique object member names, but it strongly advises against their use. Many JSON implementations use hash-indexed maps to represent JSON objects, where the object's member names are the keys of the hash index. Non-uniqueness would break such implementations or result in the value of the last member given overwriting the value of all previous members of the same name.
Therefore, allowing non-unique object member names would be bad practice. For this reason, this specification does not accommodate the need for non-unique object member names.
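The overwriting behavior mentioned above is easy to observe. The following Python sketch uses the standard `json` module, which by default keeps only the last value for a repeated name, and shows one way to reject duplicates with `object_pairs_hook`. The function name `reject_duplicates` is hypothetical, and nothing here is mandated by this specification.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises if a member name repeats in one object."""
    names = [name for name, _ in pairs]
    repeated = sorted({name for name in names if names.count(name) > 1})
    if repeated:
        raise ValueError("duplicate member names: %s" % ", ".join(repeated))
    return dict(pairs)


text = '{"a": 1, "a": 2}'

# Default behavior: the last member silently overwrites the earlier one.
last_wins = json.loads(text)

# Strict behavior: the same input is rejected.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_accepted = True
except ValueError:
    strict_accepted = False
```

After running, `last_wins` holds a single member "a" and `strict_accepted` is false.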
JSON gives awkward guidance regarding ordering of object member names. However, many JSON implementations use hash-indexed maps to represent JSON objects, where the object's member names are the key of the hash index. Though it is possible, usually these maps have no explicit order as the only index is the hash.
Therefore, this specification does not provide a means to imply order of object member names.
It is possible to create a separate group syntax for array rules vs object rules, since allowable group rule content is determined by the containing rule. For instance, while the syntax for groups in objects could have been "( blah blah )", the syntax for groups in arrays could have been "< blah blah >". That may be more distinctive and would allow the formal syntax parser to handle rule content validity, but the extra syntax appeared to hurt readability. There are only so many enclosure characters a person should reasonably be required to know, and adding yet another did not seem prudent.
The original approach to this problem was to find a concise way to describe JSON data structures; to do for JSON what RelaxNG compact syntax does for XML. The syntax itself hopefully has a JSON-ness or a JSON feel to it. And a good bit of inspiration came from ABNF.
From -00 to -01
From -01 to -02
The following is a list of possible modifications and additions to this specification.