Internet Engineering Task Force | A. Wright, Ed. |
Internet-Draft | |
Intended status: Informational | G. Luff |
Expires: April 16, 2017 | October 13, 2016 |
JSON Schema Validation: A Vocabulary for Structural Validation of JSON
draft-wright-json-schema-validation-00
JSON Schema (application/schema+json) has several purposes, one of which is JSON instance validation. This document specifies a vocabulary for JSON Schema to describe the meaning of JSON documents, to provide hints for user interfaces working with JSON data, and to make assertions about what a valid document must look like.
The issues list for this draft can be found at <https://github.com/json-schema-org/json-schema-spec/issues>.
For additional information, see <http://json-schema.org/>.
To provide feedback, use this issue tracker, the communication methods listed on the homepage, or email the document editors.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on April 16, 2017.
Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
JSON Schema can be used to require that a given JSON document (an instance) satisfies a certain number of criteria. These criteria are asserted by using keywords described in this specification. In addition, a set of keywords is also defined to assist in interactive, user interface instance generation.
This specification will use the terminology defined by the JSON Schema core [json-schema] specification.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This specification uses the term "container instance" to refer to both array and object instances. It uses the term "children instances" to refer to array elements or object member values.
This specification uses the term "property set" to refer to the set of an object's member names; for instance, the property set of JSON Object { "a": 1, "b": 2 } is [ "a", "b" ].
Elements in an array value are said to be unique if no two elements of this array are equal [json-schema].
It should be noted that the nul character (\u0000) is valid in a JSON string. An instance to validate may contain a string value with this character, regardless of the ability of the underlying programming language to deal with such data.
The JSON specification allows numbers with arbitrary precision, and JSON Schema does not add any such bounds. This means that numeric instances processed by JSON Schema can be arbitrarily large and/or have an arbitrarily long decimal part, regardless of the ability of the underlying programming language to deal with such data.
Two validation keywords, "pattern" and "patternProperties", use regular expressions to express constraints. These regular expressions SHOULD be valid according to the ECMA 262 [ecma262] regular expression dialect.
Furthermore, given the high disparity in regular expression constructs support, schema authors SHOULD limit themselves to the following regular expression tokens:
Finally, implementations MUST NOT take regular expressions to be anchored, neither at the beginning nor at the end. This means, for instance, the pattern "es" matches "expression".
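The unanchored matching rule can be illustrated with a minimal Python sketch. Python's "re" module is not the ECMA 262 dialect, so this only approximates the behavior for simple patterns, and the helper name pattern_matches is hypothetical.

```python
import re


def pattern_matches(pattern: str, instance: str) -> bool:
    """Return True if the pattern matches anywhere in the instance.

    re.search (rather than re.match or re.fullmatch) is used so that the
    pattern is not implicitly anchored at either end.
    """
    return re.search(pattern, instance) is not None


# Unanchored: "es" is found inside "expression".
print(pattern_matches("es", "expression"))
# Explicit anchor supplied by the schema author: "expression" does not start with "es".
print(pattern_matches("^es", "expression"))
```

A schema author who wants whole-string matching has to write the "^" and "$" anchors explicitly.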
Most validation keywords only limit the range of values within a certain primitive type. When the primitive type of the instance is not of the type targeted by the keyword, the validation succeeds.
For example, the "maxLength" keyword will only restrict certain strings (that are too long) from being valid. If the instance is a number, boolean, null, array, or object, the keyword passes validation.
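As a rough illustration of this rule, the following Python sketch applies "maxLength" to an already-parsed JSON instance. The function name validate_max_length is hypothetical, and the mapping of JSON strings to Python str is an assumption of the sketch.

```python
def validate_max_length(instance, max_length: int) -> bool:
    """Apply "maxLength" to a parsed JSON instance."""
    if not isinstance(instance, str):
        # Not a string: the keyword does not target this type,
        # so validation succeeds.
        return True
    return len(instance) <= max_length


print(validate_max_length("abc", 2))   # a string that is too long
print(validate_max_length(12345, 2))   # a number is not restricted
```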
Validation keywords that are missing never restrict validation. In some cases, this no-op behavior is identical to a keyword that exists with certain values, and these values are noted where known.
Validation keywords typically operate independently, without affecting each other.
For author convenience, there are some exceptions:
Validation keywords in a schema impose requirements for successfully validating an instance.
The value of "multipleOf" MUST be a number, strictly greater than 0.
A numeric instance is only valid if division by this keyword's value results in an integer.
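One way to sketch the "multipleOf" check in Python is shown below. It assumes the instance has been parsed into a Python int or float, and it uses exact rational arithmetic so that decimal values such as 0.1 are not distorted by binary floating point. The function name is_multiple_of is hypothetical.

```python
from fractions import Fraction


def is_multiple_of(instance, multiple_of) -> bool:
    """Return True if instance / multiple_of is an integer."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(instance, bool) or not isinstance(instance, (int, float)):
        return True  # non-numeric instances are not restricted
    # Going through str() reads 0.1 as the decimal 1/10 rather than as
    # its binary floating-point approximation.
    quotient = Fraction(str(instance)) / Fraction(str(multiple_of))
    return quotient.denominator == 1


print(is_multiple_of(10, 5))
print(is_multiple_of(0.3, 0.1))
print(is_multiple_of(7, 2))
```

A naive "instance % multiple_of == 0" on floats would reject 0.3 against 0.1, which is why the sketch avoids it.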
The value of "maximum" MUST be a number, representing an upper limit for a numeric instance.
If the instance is a number, then this keyword validates if "exclusiveMaximum" is true and instance is less than the provided value, or else if the instance is less than or exactly equal to the provided value.
The value of "exclusiveMaximum" MUST be a boolean, representing whether the limit in "maximum" is exclusive or not. An undefined value is the same as false.
If "exclusiveMaximum" is true, then a numeric instance SHOULD NOT be equal to the value specified in "maximum". If "exclusiveMaximum" is false (or not specified), then a numeric instance MAY be equal to the value of "maximum".
The value of "minimum" MUST be a number, representing a lower limit for a numeric instance.
If the instance is a number, then this keyword validates if "exclusiveMinimum" is true and instance is greater than the provided value, or else if the instance is greater than or exactly equal to the provided value.
The value of "exclusiveMinimum" MUST be a boolean, representing whether the limit in "minimum" is exclusive or not. An undefined value is the same as false.
If "exclusiveMinimum" is true, then a numeric instance SHOULD NOT be equal to the value specified in "minimum". If "exclusiveMinimum" is false (or not specified), then a numeric instance MAY be equal to the value of "minimum".
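The interaction of "maximum", "minimum", and their exclusive flags can be sketched as follows. This is a minimal illustration rather than a complete validator; the function name check_bounds is hypothetical, and the schema is assumed to be a Python dict obtained from parsing the schema document.

```python
def check_bounds(instance, schema: dict) -> bool:
    """Check "maximum"/"minimum" with their exclusive flags."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(instance, bool) or not isinstance(instance, (int, float)):
        return True  # only numeric instances are restricted

    if "maximum" in schema:
        limit = schema["maximum"]
        if schema.get("exclusiveMaximum", False):
            if not instance < limit:
                return False
        elif not instance <= limit:
            return False

    if "minimum" in schema:
        limit = schema["minimum"]
        if schema.get("exclusiveMinimum", False):
            if not instance > limit:
                return False
        elif not instance >= limit:
            return False

    return True


print(check_bounds(10, {"maximum": 10}))
print(check_bounds(10, {"maximum": 10, "exclusiveMaximum": True}))
```

An absent exclusive flag is treated as false, matching the stated default.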
The value of "maxLength" MUST be an integer. This integer MUST be greater than, or equal to, 0.
A string instance is valid against this keyword if its length is less than, or equal to, the value of this keyword.
The length of a string instance is defined as the number of its characters as defined by RFC 7159 [RFC7159].
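The difference between counting characters and counting encoding units matters for strings outside the Basic Multilingual Plane. The Python sketch below assumes that one Unicode code point corresponds to one character for the purposes of this keyword; both helper names are hypothetical.

```python
def string_length(instance: str) -> int:
    """Length in Unicode code points, which is what Python's len() reports."""
    return len(instance)


def utf16_length(instance: str) -> int:
    """Length in UTF-16 code units, shown only for contrast."""
    return len(instance.encode("utf-16-le")) // 2


# One ASCII letter followed by one character outside the BMP.
sample = "a\U0001F600"
print(string_length(sample), utf16_length(sample))
```

Implementations in languages whose native string length counts UTF-16 code units would need an extra step to arrive at the code point count.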
A string instance is valid against this keyword if its length is greater than, or equal to, the value of this keyword.
The length of a string instance is defined as the number of its characters as defined by RFC 7159 [RFC7159].
The value of this keyword MUST be an integer. This integer MUST be greater than, or equal to, 0.
"minLength", if absent, may be considered as being present with integer value 0.
The value of this keyword MUST be a string. This string SHOULD be a valid regular expression, according to the ECMA 262 regular expression dialect.
A string instance is considered valid if the regular expression matches the instance successfully. Recall: regular expressions are not implicitly anchored.
The value of "additionalItems" MUST be either a boolean or an object. If it is an object, this object MUST be a valid JSON Schema.
The value of "items" MUST be either a schema or array of schemas.
Successful validation of an array instance with regards to these two keywords is determined as follows:
If either keyword is absent, it may be considered present with an empty schema.
The value of this keyword MUST be an integer. This integer MUST be greater than, or equal to, 0.
An array instance is valid against "maxItems" if its size is less than, or equal to, the value of this keyword.
The value of this keyword MUST be an integer. This integer MUST be greater than, or equal to, 0.
An array instance is valid against "minItems" if its size is greater than, or equal to, the value of this keyword.
If this keyword is not present, it may be considered present with a value of 0.
The value of this keyword MUST be a boolean.
If this keyword has boolean value false, the instance validates successfully. If it has boolean value true, the instance validates successfully if all of its elements are unique.
If not present, this keyword may be considered present with boolean value false.
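A uniqueness check needs a notion of JSON equality that differs from Python's built-in "==", since Python treats True and 1 as equal. The sketch below is one possible approximation of the equality defined by the core specification; it assumes that numbers compare by mathematical value and that booleans never equal numbers. The names json_equal and has_unique_items are hypothetical.

```python
def json_equal(a, b) -> bool:
    """Compare two parsed JSON values without conflating bool and int."""
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    # Remaining cases: strings and null, which must agree in type.
    return type(a) is type(b) and a == b


def has_unique_items(instance, unique_items: bool = False) -> bool:
    """Apply "uniqueItems" to a parsed JSON instance."""
    if not unique_items or not isinstance(instance, list):
        return True
    # Pairwise comparison; quadratic, but independent of hashability.
    for index, left in enumerate(instance):
        for right in instance[index + 1:]:
            if json_equal(left, right):
                return False
    return True


print(has_unique_items([1, 2, 1], True))
print(has_unique_items([1, True], True))
```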
The value of this keyword MUST be an integer. This integer MUST be greater than, or equal to, 0.
An object instance is valid against "maxProperties" if its number of properties is less than, or equal to, the value of this keyword.
The value of this keyword MUST be an integer. This integer MUST be greater than, or equal to, 0.
An object instance is valid against "minProperties" if its number of properties is greater than, or equal to, the value of this keyword.
If this keyword is not present, it may be considered present with a value of 0.
The value of this keyword MUST be an array. This array MUST have at least one element. Elements of this array MUST be strings, and MUST be unique.
An object instance is valid against this keyword if its property set contains all elements in this keyword's array value.
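In terms of the property set defined earlier, this check is a subset test. A minimal Python sketch, with the hypothetical name check_required and JSON objects assumed to be parsed into dicts:

```python
def check_required(instance, required: list) -> bool:
    """Return True if every required name is in the instance's property set."""
    if not isinstance(instance, dict):
        return True  # only object instances are restricted
    property_set = set(instance.keys())
    return set(required) <= property_set


print(check_required({"a": 1, "b": 2}, ["a", "b"]))
print(check_required({"a": 1}, ["a", "b"]))
```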
The value of "properties" MUST be an object. Each value of this object MUST be an object, and each object MUST be a valid JSON Schema.
If absent, it can be considered the same as an empty object.
The value of "patternProperties" MUST be an object. Each property name of this object SHOULD be a valid regular expression, according to the ECMA 262 regular expression dialect. Each property value of this object MUST be an object, and each object MUST be a valid JSON Schema.
If absent, it can be considered the same as an empty object.
The value of "additionalProperties" MUST be a boolean or a schema.
If "additionalProperties" is absent, it may be considered present with an empty schema as a value.
If "additionalProperties" is true, validation always succeeds.
If "additionalProperties" is false, validation succeeds only if the instance is an object and all properties on the instance were covered by "properties" and/or "patternProperties".
If "additionalProperties" is an object, the value of every property that was not matched by "properties" or "patternProperties" is validated against that object as a schema.
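The set of properties left over after "properties" and "patternProperties" have been applied can be computed as in the sketch below. It covers object instances and the boolean forms of "additionalProperties" only; the subschema form needs a full validator and is deliberately left out. The function names are hypothetical, and Python's "re" module stands in for an ECMA 262 engine.

```python
import re


def uncovered_properties(instance: dict, schema: dict) -> list:
    """Names not covered by "properties" or "patternProperties"."""
    declared = schema.get("properties", {})
    patterns = schema.get("patternProperties", {})
    return [
        name
        for name in instance
        if name not in declared
        # re.search keeps the patterns unanchored.
        and not any(re.search(pattern, name) for pattern in patterns)
    ]


def allows_object(instance: dict, schema: dict) -> bool:
    """Handle only the boolean forms of "additionalProperties"."""
    # An absent keyword behaves like an empty schema, which accepts
    # everything, so True is used as the default here.
    additional = schema.get("additionalProperties", True)
    if additional is False:
        return not uncovered_properties(instance, schema)
    if additional is True:
        return True
    raise NotImplementedError("subschema form requires a full validator")


schema = {
    "properties": {"id": {}},
    "patternProperties": {"^x-": {}},
    "additionalProperties": False,
}
print(uncovered_properties({"id": 1, "x-trace": "a", "extra": 2}, schema))
print(allows_object({"id": 1, "x-trace": "a"}, schema))
```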
This keyword specifies rules that are evaluated if the instance is an object and contains a certain property.
This keyword's value MUST be an object. Each property specifies a dependency. Each dependency value MUST be an object or an array.
If the dependency value is an object, it MUST be a valid JSON Schema. If the dependency key is a property in the instance, the entire instance must validate against the dependency value.
If the dependency value is an array, it MUST have at least one element, each element MUST be a string, and elements in the array MUST be unique. If the dependency key is a property in the instance, each of the items in the dependency value must be a property that exists in the instance.
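The array form of a dependency can be sketched in a few lines of Python. The sketch below skips the schema form entirely, since that requires a full validator; the function name and the sample property names are hypothetical.

```python
def check_property_dependencies(instance, dependencies: dict) -> bool:
    """Check array-form dependencies against an object instance."""
    if not isinstance(instance, dict):
        return True  # only object instances are restricted
    for key, dependency in dependencies.items():
        if key not in instance:
            continue  # the dependency is not triggered
        if isinstance(dependency, list):
            if not all(name in instance for name in dependency):
                return False
        # Schema-form dependencies (objects) are not evaluated in this sketch.
    return True


dependencies = {"credit_card": ["billing_address"]}
print(check_property_dependencies({"credit_card": "1234"}, dependencies))
print(check_property_dependencies({"name": "x"}, dependencies))
```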
The value of this keyword MUST be an array. This array SHOULD have at least one element. Elements in the array SHOULD be unique.
Elements in the array MAY be of any type, including null.
An instance validates successfully against this keyword if its value is equal to one of the elements in this keyword's array value.
The value of this keyword MUST be either a string or an array. If it is an array, elements of the array MUST be strings and MUST be unique.
String values MUST be one of the seven primitive types defined by the core specification.
An instance matches successfully if its primitive type is one of the types defined by the keyword. Recall: "number" includes "integer".
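A possible mapping from parsed Python values to primitive type names is sketched below. It assumes that a Python int (but not bool) represents a JSON integer and that a Python float is only ever a "number"; other implementations may classify values such as 1.0 differently. Both function names are hypothetical.

```python
def primitive_types(instance) -> set:
    """Return the primitive type names that a parsed instance satisfies."""
    if instance is None:
        return {"null"}
    # bool must be tested before int, since bool is a subclass of int.
    if isinstance(instance, bool):
        return {"boolean"}
    if isinstance(instance, int):
        return {"integer", "number"}  # "number" includes "integer"
    if isinstance(instance, float):
        return {"number"}
    if isinstance(instance, str):
        return {"string"}
    if isinstance(instance, list):
        return {"array"}
    if isinstance(instance, dict):
        return {"object"}
    raise TypeError("not a parsed JSON value: %r" % (instance,))


def check_type(instance, type_value) -> bool:
    """Apply "type", whose value is a string or an array of strings."""
    allowed = [type_value] if isinstance(type_value, str) else type_value
    satisfied = primitive_types(instance)
    return any(name in satisfied for name in allowed)


print(check_type(1, "number"))
print(check_type(None, ["string", "null"]))
```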
This keyword's value MUST be an array. This array MUST have at least one element.
Elements of the array MUST be objects. Each object MUST be a valid JSON Schema.
An instance validates successfully against this keyword if it validates successfully against all schemas defined by this keyword's value.
This keyword's value MUST be an array. This array MUST have at least one element.
Elements of the array MUST be objects. Each object MUST be a valid JSON Schema.
An instance validates successfully against this keyword if it validates successfully against at least one schema defined by this keyword's value.
This keyword's value MUST be an array. This array MUST have at least one element.
Elements of the array MUST be objects. Each object MUST be a valid JSON Schema.
An instance validates successfully against this keyword if it validates successfully against exactly one schema defined by this keyword's value.
This keyword's value MUST be an object. This object MUST be a valid JSON Schema.
An instance is valid against this keyword if it fails to validate successfully against the schema defined by this keyword.
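The four combining keywords differ only in how they aggregate the results of validating against their subschemas. The sketch below makes that explicit by taking the validation function as a parameter; toy_validate is a hypothetical stand-in that understands only "minimum" on numeric instances, and the check_* names are also hypothetical.

```python
from typing import Any, Callable, List

Validator = Callable[[Any, dict], bool]


def check_all_of(instance, subschemas: List[dict], validate: Validator) -> bool:
    return all(validate(instance, s) for s in subschemas)


def check_any_of(instance, subschemas: List[dict], validate: Validator) -> bool:
    return any(validate(instance, s) for s in subschemas)


def check_one_of(instance, subschemas: List[dict], validate: Validator) -> bool:
    # Exactly one subschema may accept the instance.
    return sum(1 for s in subschemas if validate(instance, s)) == 1


def check_not(instance, subschema: dict, validate: Validator) -> bool:
    return not validate(instance, subschema)


def toy_validate(instance, schema: dict) -> bool:
    # Hypothetical stand-in validator: only "minimum", only numbers.
    return instance >= schema.get("minimum", instance)


subschemas = [{"minimum": 0}, {"minimum": 10}]
print(check_all_of(5, subschemas, toy_validate))
print(check_any_of(5, subschemas, toy_validate))
print(check_one_of(5, subschemas, toy_validate))
```

With the instance 15, both subschemas accept, so "allOf" and "anyOf" would succeed while "oneOf" would fail.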
This keyword's value MUST be an object. Each member value of this object MUST be a valid JSON Schema.
This keyword plays no role in validation per se. Its role is to provide a standardized location for schema authors to inline JSON Schemas into a more general schema.
As an example, here is a schema describing an array of positive integers, where the positive integer constraint is a subschema in "definitions":
{
    "type": "array",
    "items": { "$ref": "#/definitions/positiveInteger" },
    "definitions": {
        "positiveInteger": {
            "type": "integer",
            "minimum": 0,
            "exclusiveMinimum": true
        }
    }
}
The value of both of these keywords MUST be a string.
Both of these keywords can be used to decorate a user interface with information about the data produced by this user interface. A title will preferably be short, whereas a description will provide explanation about the purpose of the instance described by this schema.
Both of these keywords MAY be used in root schemas, and in any subschemas.
There are no restrictions placed on the value of this keyword.
This keyword can be used to supply a default JSON value associated with a particular schema. It is RECOMMENDED that a default value be valid against the associated schema.
This keyword MAY be used in root schemas, and in any subschemas.
Structural validation alone may be insufficient to validate that an instance meets all the requirements of an application. The "format" keyword is defined to allow interoperable semantic validation for a fixed subset of values which are accurately described by authoritative resources, be they RFCs or other external specifications.
The value of this keyword is called a format attribute. It MUST be a string. A format attribute can generally only validate a given set of instance types. If the type of the instance to validate is not in this set, validation for this format attribute and instance SHOULD succeed.
Implementations MAY support the "format" keyword. Should they choose to do so:
Implementations MAY add custom format attributes. Save for agreement between parties, schema authors SHALL NOT expect a peer implementation to support this keyword and/or custom format attributes.
This attribute applies to string instances.
A string instance is valid against this attribute if it is a valid date representation as defined by RFC 3339, section 5.6 [RFC3339].
This attribute applies to string instances.
A string instance is valid against this attribute if it is a valid Internet email address as defined by RFC 5322, section 3.4.1 [RFC5322].
This attribute applies to string instances.
A string instance is valid against this attribute if it is a valid representation for an Internet host name, as defined by RFC 1034, section 3.1 [RFC1034].
This attribute applies to string instances.
A string instance is valid against this attribute if it is a valid representation of an IPv4 address according to the "dotted-quad" ABNF syntax as defined in RFC 2673, section 3.2 [RFC2673].
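A format attribute check for dotted-quad IPv4 addresses might look like the Python sketch below. The syntax test accepts four groups of one to three decimal digits; the additional limit of 255 per group is an assumption of this sketch, and the function name check_ipv4_format is hypothetical. Non-string instances succeed, in line with the general rule for format attributes.

```python
import re

# Four groups of one to three ASCII digits separated by dots.
_DOTTED_QUAD = re.compile(r"[0-9]{1,3}(\.[0-9]{1,3}){3}")


def check_ipv4_format(instance) -> bool:
    """Sketch of an "ipv4" format check."""
    if not isinstance(instance, str):
        return True  # the attribute applies to string instances only
    # fullmatch is used because this is a syntax check on the whole
    # string, not a schema-supplied "pattern".
    if _DOTTED_QUAD.fullmatch(instance) is None:
        return False
    # Assumption: each decimal byte must fit in the range 0..255.
    return all(int(part) <= 255 for part in instance.split("."))


print(check_ipv4_format("192.168.0.1"))
print(check_ipv4_format("256.1.1.1"))
```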
This attribute applies to string instances.
A string instance is valid against this attribute if it is a valid representation of an IPv6 address as defined in RFC 2373, section 2.2 [RFC2373].
This attribute applies to string instances.
A string instance is valid against this attribute if it is a valid URI, according to [RFC3986].
This attribute applies to string instances.
A string instance is valid against this attribute if it is a valid URI Reference (either a URI or a relative-reference), according to [RFC3986].
JSON Schema validation defines a vocabulary for JSON Schema core and concerns all the security considerations listed there.
JSON Schema validation allows the use of Regular Expressions, which have numerous different (often incompatible) implementations. Some implementations allow the embedding of arbitrary code, which is outside the scope of JSON Schema and MUST NOT be permitted. Regular expressions can often also be crafted to be extremely expensive to compute (with so-called "catastrophic backtracking"), resulting in a denial-of-service attack.
This specification does not have any influence with regards to IANA.
[RFC2119] | Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997. |
[json-schema] | "JSON Schema: A Media Type for Describing JSON Documents", Internet-Draft draft-wright-json-schema-00, October 2016. |
[RFC1034] | Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987. |
[RFC2373] | Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 2373, DOI 10.17487/RFC2373, July 1998. |
[RFC2673] | Crawford, M., "Binary Labels in the Domain Name System", RFC 2673, DOI 10.17487/RFC2673, August 1999. |
[RFC3339] | Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002. |
[RFC3986] | Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005. |
[RFC7159] | Bray, T., "The JavaScript Object Notation (JSON) Data Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March 2014. |
[RFC5322] | Resnick, P., "Internet Message Format", RFC 5322, DOI 10.17487/RFC5322, October 2008. |
[ecma262] | "ECMA 262 specification" |
Thanks to Gary Court, Francis Galiegue, Kris Zyp, and Geraint Luff for their work on the initial drafts of JSON Schema.
Thanks to Jason Desrosiers, Daniel Perrett, Erik Wilde, Ben Hutton, Evgeny Poberezkin, and Henry H. Andrews for their submissions and patches to the document.
[CREF1] This section to be removed before leaving Internet-Draft status.