Understanding JSONSchema
Release 6.0
Michael Droettboom, et al
Space Telescope Science Institute
2 What is a schema?
3 The basics
3.1 Hello, World!
3.2 The type keyword
3.3 Declaring a JSON Schema
3.4 Declaring a unique identifier
4.8 Generic keywords
4.8.1 Metadata
4.8.2 Enumerated values
4.8.3 Constant values
4.9 Combining schemas
4.9.1 allOf
4.9.2 anyOf
4.9.3 oneOf
4.9.4 not
4.10 The $schema keyword
4.10.1 Advanced
4.11 Regular Expressions
4.11.1 Example
6 Acknowledgments
Index
Understanding JSON Schema, Release 6.0
JSON Schema is a powerful tool for validating the structure of JSON data. However, learning to use it by reading its
specification is like learning to drive a car by looking at its blueprints. You don’t need to know how an electric motor
fits together if all you want to do is pick up the groceries. This book, therefore, aims to be the friendly driving instructor
for JSON Schema. It’s for those that want to write it and understand it, but maybe aren’t interested in building their
own car—er, writing their own JSON Schema validator—just yet.
Note: This book describes JSON Schema draft 6. The most recent version is draft 7 — stay tuned, updates are coming!
Earlier and later versions of JSON Schema are not completely compatible with the format described here.
Where to begin?
• This book uses some novel conventions (page 3) for showing schema examples and relating JSON Schema to
your programming language of choice.
• If you’re not sure what a schema is, check out What is a schema? (page 7).
• The basics (page 11) chapter should be enough to get you started with understanding the core JSON Schema
Reference (page 15).
• When you start developing large schemas with many nested and repeated sections, check out Structuring a
complex schema (page 59).
• json-schema.org has a number of resources, including the official specification and tools for working with JSON
Schema from various programming languages.
• jsonschema.net is an online application for running your own JSON schemas against example documents. If you want to
try things out without installing any software, it’s a very handy resource.
CHAPTER 1
Python
In Python, JSON can be read using the json module in the standard library.
C
For C, you may want to consider using Jansson to read and write JSON.
Draft 4
1.3 Examples
There are many examples throughout this book, and they all follow the same format. At the beginning of each example
is a short JSON schema, illustrating a particular principle, followed by short JSON snippets that are either valid or
invalid against that schema. Valid examples are in green, with a checkmark. Invalid examples are in red, with a cross.
Often there are comments in between to explain why something is or isn’t valid.
Note: These examples are tested automatically whenever the book is built, so hopefully they are not just helpful, but
also correct!
For example, here’s a snippet illustrating how to use the number type:
{ json schema }
{ "type": "number" }
!
42
!
-1
%
"42"
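To relate the number example to a programming language, here is a minimal Python sketch of the check a validator might perform for { "type": "number" }. The helper name is_json_number is hypothetical and not part of any library; the sketch assumes instances are decoded with the standard json module.

```python
import json


def is_json_number(value):
    """Return True if a decoded JSON value counts as a JSON Schema "number"."""
    # bool is a subclass of int in Python, so booleans are excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# The instances from the number example, given as JSON text.
for text in ["42", "-1", '"42"']:
    verdict = "valid" if is_json_number(json.loads(text)) else "invalid"
    print(text, "->", verdict)
```

The string "42" is rejected because it decodes to a Python str, even though its content looks numeric.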
What is a schema?
If you’ve ever used XML Schema, RelaxNG or ASN.1 you probably already know what a schema is and you can
happily skip along to the next section. If all that sounds like gobbledygook to you, you’ve come to the right place. To
define what JSON Schema is, we should probably first define what JSON is.
JSON stands for “JavaScript Object Notation”, a simple data interchange format. It began as a notation for the world
wide web. Since JavaScript exists in most web browsers, and JSON is based on JavaScript, it’s very easy to support
there. However, it has proven useful enough and simple enough that it is now used in many other contexts that don’t
involve web surfing.
At its heart, JSON is built on the following data structures:
• object:
• array:
• number:
42
3.1415926
• string:
"This is a string"
• boolean:
true
false
• null:
null
These types have analogs in most programming languages, though they may go by different names.
Python
The following table maps from the names of JavaScript types to their analogous types in Python:
JavaScript    Python
----------    -------------
string        string [4]
number        int/float [5]
object        dict
array         list
boolean       bool
null          None

[4] Since JavaScript strings always support unicode, they are analogous to unicode on Python 2.x and str on Python 3.x.
[5] JavaScript does not have separate types for integer and floating-point.
Ruby
The following table maps from the names of JavaScript types to their analogous types in Ruby:
JavaScript    Ruby
----------    --------------------
string        String
number        Integer/Float [6]
object        Hash
array         Array
boolean       TrueClass/FalseClass
null          NilClass

[6] JavaScript does not have separate types for integer and floating-point.
With these simple data types, all kinds of structured data can be represented. With that great flexibility comes great
responsibility, however, as the same concept could be represented in myriad ways. For example, you could imagine
representing information about a person in JSON in different ways:
{
"name": "George Washington",
"birthday": "February 22, 1732",
"address": "Mount Vernon, Virginia, United States"
}
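{
"first_name": "George",
"last_name": "Washington",
"birthday": "22-02-1732",
"address": {
"street_address": "3200 Mount Vernon Memorial Highway",
"city": "Mount Vernon",
"state": "Virginia",
"country": "United States"
}
}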
Both representations are equally valid, though one is clearly more formal than the other. The design of a record will
largely depend on its intended use within the application, so there’s no right or wrong answer here. However, when
an application says “give me a JSON record for a person”, it’s important to know exactly how that record should be
organized. For example, we need to know what fields are expected, and how the values are represented. That’s where
JSON Schema comes in. The following JSON Schema fragment describes how the second example above is structured.
Don’t worry too much about the details for now. They are explained in subsequent chapters.
{ json schema }
{
"type": "object",
"properties": {
"first_name": { "type": "string" },
"last_name": { "type": "string" },
"birthday": { "type": "string", "format": "date-time" },
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" },
"country": { "type" : "string" }
}
}
}
}
By “validating” the first example against this schema, you can see that it fails:
%
{
"name": "George Washington",
"birthday": "February 22, 1732",
"address": "Mount Vernon, Virginia, United States"
}
The second example, however, passes validation:
!
{
"first_name": "George",
"last_name": "Washington",
"birthday": "22-02-1732",
"address": {
"street_address": "3200 Mount Vernon Memorial Highway",
"city": "Mount Vernon",
"state": "Virginia",
"country": "United States"
}
}
You may have noticed that the JSON Schema itself is written in JSON. It is data itself, not a computer program. It’s just
a declarative format for “describing the structure of other data”. This is both its strength and its weakness (which it
shares with other similar schema languages). It is easy to concisely describe the surface structure of data, and automate
validating data against it. However, since a JSON Schema can’t contain arbitrary code, there are certain constraints
on the relationships between data elements that can’t be expressed. Any “validation tool” for a sufficiently complex
data format, therefore, will likely have two phases of validation: one at the schema (or structural) level, and one at
the semantic level. The latter check will likely need to be implemented using a more general-purpose programming
language.
The basics
In What is a schema? (page 7), we described what a schema is, and hopefully justified the need for schema languages.
Here, we proceed to write a simple JSON Schema.
{ json schema }
{ }
!
"I'm a string"
!
{ "an": [ "arbitrarily", "nested" ], "data": "structure" }
New in draft 6
You can also use true in place of the empty object to represent a schema that matches anything, or false for a schema
that matches nothing.
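As a rough Python illustration of these three trivial schemas (the empty object, true, and false), the sketch below handles only those cases. The function validate_trivial is a hypothetical helper written for this illustration, not an API of any validator library.

```python
def validate_trivial(schema, instance):
    """Validate against {}, true, or false only (hypothetical helper)."""
    if schema is True or schema == {}:
        return True   # accepts any instance
    if schema is False:
        return False  # rejects every instance
    raise NotImplementedError("schemas with keywords are outside this sketch")


print(validate_trivial({}, "I'm a string"))
print(validate_trivial(True, {"an": ["arbitrarily", "nested"], "data": "structure"}))
print(validate_trivial(False, "anything at all"))
```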
{ json schema }
true
!
"I'm a string"
!
{ "an": [ "arbitrarily", "nested" ], "data": "structure" }
{ json schema }
false
%
"Resistance is futile... This will always fail!!!"
Note: When this book refers to JSON Schema “keywords”, it means the “key” part of the key/value pair in an object.
Most of the work of writing a JSON Schema involves mapping a special “keyword” to a value within an object.
{ json schema }
{ "type": "string" }
!
"I'm a string"
%
42
The type keyword is described in more detail in Type-specific keywords (page 15).
Note: For brevity, the $schema keyword isn’t included in most of the examples in this book, but it should always be
used in the real world.
{ json schema }
{ "$schema": "http://json-schema.org/schema#" }
You can also use this keyword to declare which version of the JSON Schema specification the schema is written against.
See The $schema keyword (page 55) for more information.
{ "$id": "http://yourdomain.com/schemas/myschema.json" }
The details of The $id property (page 63) become more apparent when you start Structuring a complex schema (page 59).
New in draft 6
Draft 4
Python
The following table maps from the names of JavaScript types to their analogous types in Python:
JavaScript    Python
----------    -------------
string        string [4]
number        int/float [5]
object        dict
array         list
boolean       bool
null          None

[4] Since JavaScript strings always support unicode, they are analogous to unicode on Python 2.x and str on Python 3.x.
[5] JavaScript does not have separate types for integer and floating-point.
Ruby
The following table maps from the names of JavaScript types to their analogous types in Ruby:
JavaScript    Ruby
----------    --------------------
string        String
number        Integer/Float [6]
object        Hash
array         Array
boolean       TrueClass/FalseClass
null          NilClass

[6] JavaScript does not have separate types for integer and floating-point.
{ json schema }
{ "type": "number" }
!
42
!
42.0
In the following example, we accept strings and numbers, but not structured data types:
{ json schema }
{ "type": ["number", "string"] }
!
42
!
"Life, the universe, and everything"
%
["Life", "the universe", "and everything"]
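A sketch of how a validator might treat type when it names either a single type or a list of types, such as "number" and "string". The names TYPE_CHECKS and check_type are assumptions made for this illustration, and the "integer" type is deliberately left out because its treatment varies between implementations.

```python
# Map JSON Schema type names to checks on values decoded by Python's json module.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it from "number".
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
}


def check_type(schema_type, value):
    """Accept the value if it matches the type name, or any name in a list."""
    names = [schema_type] if isinstance(schema_type, str) else schema_type
    return any(TYPE_CHECKS[name](value) for name in names)


for value in [42, "Life, the universe, and everything",
              ["Life", "the universe", "and everything"]]:
    print(repr(value), check_type(["number", "string"], value))
```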
For each of these types, there are keywords that only apply to those types. For example, numeric types have a way of
specifying a numeric range that would not be applicable to other types. In this reference, these validation keywords are
described along with each of their corresponding types in the following chapters.
4.2 string
The string type is used for strings of text. It may contain Unicode characters.
Python
In Python, "string" is analogous to the unicode type on Python 2.x, and the str type on Python 3.x.
{ json schema }
{ "type": "string" }
!
"This is a string"
Unicode characters:
!
"Déjà vu"
!
""
!
"42"
%
42
4.2.1 Length
The length of a string can be constrained using the minLength and maxLength keywords. For both keywords, the value
must be a non-negative integer.
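The length check can be sketched in a few lines of Python. The function check_length and its parameter names are hypothetical, and the bounds 2 and 3 are just sample values.

```python
def check_length(text, min_length=0, max_length=None):
    """Apply minLength/maxLength to a string; None means no upper bound."""
    length = len(text)  # Python 3 counts Unicode code points, not bytes
    if length < min_length:
        return False
    return max_length is None or length <= max_length


for candidate in ["A", "AB", "ABC", "ABCD"]:
    print(repr(candidate), check_length(candidate, min_length=2, max_length=3))
```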
{ json schema }
{
"type": "string",
"minLength": 2,
"maxLength": 3
}
%
"A"
!
"AB"
!
"ABC"
%
"ABCD"
Note: When defining the regular expressions, it’s important to note that the string is considered valid if the expression
matches anywhere within the string. For example, the regular expression "p" will match any string with a p in it, such
as "apple", not just a string that is simply "p". Therefore, it is usually less confusing, as a matter of course, to surround
the regular expression in ^...$, for example, "^p$", unless there is a good reason not to do so.
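The difference between an unanchored and an anchored expression can be tried out in Python, whose re.search also looks for a match anywhere in the string. This is only an approximation: Python's regular expression dialect is not identical to the one JSON Schema validators use, although simple patterns like these behave the same. The helper name matches is an assumption for this sketch.

```python
import re


def matches(pattern, text):
    """True if the pattern matches anywhere in the text (search, not full match)."""
    return re.search(pattern, text) is not None


for pattern in ["p", "^p$"]:
    for text in ["p", "apple"]:
        print(repr(pattern), repr(text), matches(pattern, text))
```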
The following example matches a simple North American telephone number with an optional area code:
{ json schema }
{
"type": "string",
"pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"
}
!
"555-1212"
!
"(888)555-1212"
%
"(888)555-1212 ext. 532"
%
"(800)FLOWERS"
4.2.3 Format
The format keyword allows for basic semantic validation on certain kinds of string values that are commonly used.
This allows values to be constrained beyond what the other tools in JSON Schema, including Regular Expressions
(page 56), can do.
Note: JSON Schema implementations are not required to implement this part of the specification, and many of them
do not.
There is a bias toward networking-related formats in the JSON Schema specification, most likely due to its heritage in
web technologies. However, custom formats may also be used, as long as the parties exchanging the JSON documents
also exchange information about the custom format types. A JSON Schema validator will ignore any format type that it
does not understand.
Built-in formats
The following is the list of formats specified in the JSON Schema specification.
• "date-time": Date representation, as defined by RFC 3339, section 5.6.
• "email": Internet email address, see RFC 5322, section 3.4.1.
• "hostname": Internet host name, see RFC 1034, section 3.1.
• "ipv4": IPv4 address, according to dotted-quad ABNF syntax as defined in RFC 2673, section 3.2.
• "ipv6": IPv6 address, as defined in RFC 2373, section 2.2.
• "uri": A uniform resource identifier (URI), according to RFC3986.
• "uri-reference": New in draft 6 A URI Reference (either a URI or a relative-reference), according to RFC3986,
section 4.1.
• "json-pointer": New in draft 6 A JSON Pointer, according to RFC6901. There is more discussion on the use
of JSON Pointer within JSON Schema in Structuring a complex schema (page 59). Note that this should be used
only when the entire string contains only JSON Pointer content, e.g. /foo/bar. JSON Pointer URI fragments,
e.g. #/foo/bar/, should use "uri" or "uri-reference".
• "uri-template": New in draft 6 A URI Template (of any level) according to RFC6570. If you don’t already
know what a URI Template is, you probably don’t need this value.
If the values in the schema have the ability to be relative to a particular source path (such as a link from a webpage), it is
generally better practice to use "uri-reference" rather than "uri". "uri" should only be used when the path must
be absolute.
Draft 4
Draft 4 only includes "uri", not "uri-reference". Therefore, there is some ambiguity around whether "uri"
should accept relative paths.
Note: JSON has no standard way to represent complex numbers, so there is no way to test for them in JSON Schema.
4.3.1 integer
The integer type is used for integral numbers.
{ json schema }
{ "type": "integer" }
!
42
!
-1
Warning: The precise treatment of the “integer” type may depend on the implementation of your JSON Schema
validator. JavaScript (and thus also JSON) does not have distinct types for integers and floating-point values.
Therefore, JSON Schema cannot use type alone to distinguish between integers and non-integers. The JSON
Schema specification recommends, but does not require, that validators use the mathematical value to determine
whether a number is an integer, and not the type alone. Therefore, there is some disagreement between validators
on this point. For example, a JavaScript-based validator may accept 1.0 as an integer, whereas the Python-based
jsonschema does not.
The multipleOf keyword (see Multiples (page 23)) can be used to work around this discrepancy. For
example, the following likely has the same behavior on all JSON Schema implementations:
{ json schema }
{ "type": "number", "multipleOf": 1.0 }
!
42
!
42.0
%
3.14156926
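The two interpretations described in the warning can be contrasted in Python. Both functions are hypothetical helpers written for this illustration; they do not reproduce the code of any particular validator.

```python
def is_integer_by_type(value):
    """Strict reading: only values decoded as int count (bool excluded)."""
    return isinstance(value, int) and not isinstance(value, bool)


def is_integer_by_value(value):
    """Mathematical reading: a float with no fractional part also counts."""
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and value.is_integer()


for value in [42, 42.0, 3.14156926]:
    print(value, is_integer_by_type(value), is_integer_by_value(value))
```

The two functions disagree only on 42.0, which is exactly the case where validators differ.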
4.3.2 number
The number type is used for any numeric type, either integers or floating point numbers.
{ json schema }
{ "type": "number" }
!
42
!
-1
4.3.3 Multiples
Numbers can be restricted to a multiple of a given number, using the multipleOf keyword. It may be set to any positive
number.
{ json schema }
{
"type" : "number",
"multipleOf" : 10
}
!
0
!
10
!
20
%
23
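A minimal sketch of a multipleOf check in Python. It assumes values of ordinary magnitude and converts through str and decimal.Decimal so that fractional divisors such as 0.1 are not tripped up by binary floating-point rounding; the function name is_multiple_of is hypothetical.

```python
from decimal import Decimal


def is_multiple_of(value, divisor):
    """True if value is an exact multiple of divisor (divisor must be positive)."""
    # str() gives the shortest decimal representation, which Decimal reads exactly.
    return Decimal(str(value)) % Decimal(str(divisor)) == 0


for value in [0, 10, 20, 23]:
    print(value, is_multiple_of(value, 10))
```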
4.3.4 Range
Ranges of numbers are specified using a combination of the minimum and maximum keywords (or exclusiveMinimum
and exclusiveMaximum for expressing exclusive ranges).
If x is the value being validated, the following must hold true:
• x ≥ minimum
• x > exclusiveMinimum
• x ≤ maximum
• x < exclusiveMaximum
While you can specify both of minimum and exclusiveMinimum or both of maximum and exclusiveMaximum, it doesn’t
really make sense to do so.
{ json schema }
{
"type": "number",
"minimum": 0,
"exclusiveMaximum": 100
}
!
10
!
99
%
100
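The four draft 6 range conditions translate directly into comparisons. The sketch below uses a hypothetical function in_range whose keyword arguments mirror the schema keywords; an argument left as None means the keyword is absent.

```python
def in_range(x, minimum=None, maximum=None,
             exclusive_minimum=None, exclusive_maximum=None):
    """Check x against whichever of the four range keywords are given."""
    if minimum is not None and x < minimum:
        return False
    if exclusive_minimum is not None and x <= exclusive_minimum:
        return False
    if maximum is not None and x > maximum:
        return False
    if exclusive_maximum is not None and x >= exclusive_maximum:
        return False
    return True


# Same bounds as the schema above: minimum 0, exclusiveMaximum 100.
for x in [-1, 0, 10, 99, 100]:
    print(x, in_range(x, minimum=0, exclusive_maximum=100))
```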
Draft 4
In JSON Schema Draft 4, exclusiveMinimum and exclusiveMaximum work differently. There they are boolean
values that indicate whether minimum and maximum are exclusive of the value. For example:
• if exclusiveMinimum is false, x ≥ minimum.
• if exclusiveMinimum is true, x > minimum.
This was changed to have better keyword independence.
Here is an example using the older Draft 4 convention:
{ json schema }
{
"type": "number",
"minimum": 0,
"maximum": 100,
"exclusiveMaximum": true
}
!
10
!
99
4.4 object
Objects are the mapping type in JSON. They map “keys” to “values”. In JSON, the “keys” must always be strings.
Each of these pairs is conventionally referred to as a “property”.
Python
In Python, "objects" are analogous to the dict type. An important difference, however, is that while Python
dictionaries may use anything hashable as a key, in JSON all the keys must be strings.
Try not to be confused by the two uses of the word "object" here: Python uses the word object to mean the
generic base class for everything, whereas in JSON it is used only to mean a mapping from string keys to values.
Ruby
In Ruby, "objects" are analogous to the Hash type. An important difference, however, is that all keys in JSON
must be strings, and therefore any non-string keys are converted over to their string representation.
Try not to be confused by the two uses of the word "object" here: Ruby uses the word Object to mean the generic
base class for everything, whereas in JSON it is used only to mean a mapping from string keys to values.
{ json schema }
{ "type": "object" }
!
{
"key" : "value",
"another_key" : "another_value"
}
!
{
"Sun" : 1.9891e30,
"Jupiter" : 1.8986e27,
"Saturn" : 5.6846e26,
"Neptune" : 10.243e25,
"Uranus" : 8.6810e25,
"Earth" : 5.9736e24,
"Venus" : 4.8685e24,
"Mars" : 6.4185e23,
"Mercury" : 3.3022e23,
"Moon" : 7.349e22,
"Pluto" : 1.25e22
}
%
{
0.01 : "cm",
1 : "m",
1000 : "km"
}
%
"Not an object"
%
["An", "array", "not", "an", "object"]
4.4.1 Properties
The properties (key-value pairs) on an object are defined using the properties keyword. The value of properties is
an object, where each key is the name of a property and each value is a JSON schema used to validate that property.
For example, let’s say we want to define a simple schema for an address made up of a number, street name and street
type:
{ json schema }
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "type": "string",
"enum": ["Street", "Avenue", "Boulevard"]
}
}
}
!
{ "number": 1600, "street_name": "Pennsylvania", "street_type": "Avenue" }
By default, leaving out properties is valid. See Required Properties (page 30).
!
{ "number": 1600, "street_name": "Pennsylvania" }
The additionalProperties keyword is used to control the handling of extra stuff, that is, properties whose names are
not listed in the properties keyword. By default any additional properties are allowed.
The additionalProperties keyword may be either a boolean or an object. If additionalProperties is a boolean
and set to false, no additional properties will be allowed.
Here we reuse the example above, but this time set additionalProperties to false:
{ json schema }
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "type": "string",
"enum": ["Street", "Avenue", "Boulevard"]
}
},
"additionalProperties": false
}
!
{ "number": 1600, "street_name": "Pennsylvania", "street_type": "Avenue" }
Since additionalProperties is false, this extra property “direction” makes the object invalid:
%
{ "number": 1600, "street_name": "Pennsylvania", "street_type": "Avenue",
"direction": "NW" }
If additionalProperties is an object, that object is a schema that will be used to validate any additional properties
not listed in properties.
For example, one can allow additional properties, but only if they are each a string:
{ json schema }
{
"type": "object",
"properties": {
"number": { "type": "number" },
"street_name": { "type": "string" },
"street_type": { "type": "string",
"enum": ["Street", "Avenue", "Boulevard"]
}
},
"additionalProperties": { "type": "string" }
}
!
{ "number": 1600, "street_name": "Pennsylvania", "street_type": "Avenue" }
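The interplay between properties and additionalProperties can be sketched in Python. Everything named here (PROPERTIES, check_address, the additional parameter) is an assumption made for illustration: additional stands in for additionalProperties and may be True (allow anything), False (forbid extras), or a function that checks each extra value.

```python
def is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def is_string(v):
    return isinstance(v, str)


# One check per property listed under "properties" in the address schema.
PROPERTIES = {
    "number": is_number,
    "street_name": is_string,
    "street_type": lambda v: is_string(v) and v in ("Street", "Avenue", "Boulevard"),
}


def check_address(instance, additional=True):
    """Validate listed properties, then apply the additional-properties rule."""
    for name, value in instance.items():
        if name in PROPERTIES:
            if not PROPERTIES[name](value):
                return False
        elif additional is False:
            return False  # an unlisted property is forbidden
        elif callable(additional) and not additional(value):
            return False  # an unlisted property fails its schema
    return True


base = {"number": 1600, "street_name": "Pennsylvania", "street_type": "Avenue"}
print(check_address(base))
print(check_address({**base, "direction": "NW"}, additional=False))
print(check_address({**base, "direction": "NW"}, additional=is_string))
print(check_address({**base, "office_number": 201}, additional=is_string))
```

Missing properties never cause a failure here, which matches the default that leaving out properties is valid.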
Draft 4
In the following example schema defining a user record, we require that each user has a name and e-mail address, but
we don’t mind if they don’t provide their address or telephone number:
{ json schema }
{
"type": "object",
"properties": {
"name": { "type": "string" },
"email": { "type": "string" },
"address": { "type": "string" },
"telephone": { "type": "string" }
},
"required": ["name", "email"]
}
!
{
"name": "William Shakespeare",
"email": "[email protected]"
}
Providing extra properties is fine, even properties not defined in the schema:
!
{
"name": "William Shakespeare",
"email": "[email protected]",
"address": "Henley Street, Stratford-upon-Avon, Warwickshire, England",
"authorship": "in question"
}
Missing the required “email” property makes the JSON document invalid:
%
{
"name": "William Shakespeare",
"address": "Henley Street, Stratford-upon-Avon, Warwickshire, England"
}
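In Python terms, the required check amounts to looking for missing keys in a dict. The helper missing_required is hypothetical, and the email value below is a placeholder chosen for this sketch.

```python
def missing_required(instance, required):
    """Return the required property names that are absent from the instance."""
    return [name for name in required if name not in instance]


REQUIRED = ["name", "email"]

complete = {"name": "William Shakespeare", "email": "bill@example.com"}
no_email = {
    "name": "William Shakespeare",
    "address": "Henley Street, Stratford-upon-Avon, Warwickshire, England",
}

print(missing_required(complete, REQUIRED))
print(missing_required(no_email, REQUIRED))
```

An empty list means the instance satisfies required; a validator would report any names returned as errors.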
{ json schema }
{
"type": "object",
"propertyNames": {
"pattern": "^[A-Za-z_][A-Za-z0-9_]*$"
}
}
!
{
"_a_proper_token_001": "value"
}
%
{
"001 invalid": "value"
}
Since object keys must always be strings anyway, it is implied that the schema given to propertyNames is always at
least:
{ "type": "string" }
4.4.4 Size
The number of properties on an object can be restricted using the minProperties and maxProperties keywords. Each
of these must be a non-negative integer.
{ json schema }
{
"type": "object",
"minProperties": 2,
"maxProperties": 3
}
%
{}
%
{ "a": 0 }
!
{ "a": 0, "b": 1 }
!
{ "a": 0, "b": 1, "c": 2 }
%
{ "a": 0, "b": 1, "c": 2, "d": 3 }
4.4.5 Dependencies
The dependencies keyword allows the schema of the object to change based on the presence of certain special
properties.
There are two forms of dependencies in JSON Schema:
• Property dependencies declare that certain other properties must be present if a given property is present.
• Schema dependencies declare that the schema changes when a given property is present.
Property dependencies
Let’s start with the simpler case of property dependencies. For example, suppose we have a schema representing a
customer. If you have their credit card number, you also want to ensure you have a billing address. If you don’t have
their credit card number, a billing address would not be required. We represent this dependency of one property on
another using the dependencies keyword. The value of the dependencies keyword is an object. Each entry in the
object maps from the name of a property, p, to an array of strings listing properties that are required whenever p is
present.
In the following example, whenever a credit_card property is provided, a billing_address property must also be
present:
{ json schema }
{
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" },
"billing_address": { "type": "string" }
},
"required": ["name"],
"dependencies": {
"credit_card": ["billing_address"]
}
}
!
{
"name": "John Doe",
"credit_card": 5555555555555555,
"billing_address": "555 Debtor's Lane"
}
Note that dependencies are not bidirectional. It’s okay to have a billing address without a credit card number.
!
{
"name": "John Doe",
"billing_address": "555 Debtor's Lane"
}
To fix the last issue above (that dependencies are not bidirectional), you can, of course, define the bidirectional
dependencies explicitly:
{ json schema }
{
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" },
"billing_address": { "type": "string" }
},
"required": ["name"],
"dependencies": {
"credit_card": ["billing_address"],
"billing_address": ["credit_card"]
}
}
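Property dependencies can be sketched as a lookup over the dependencies mapping. The function check_dependencies is a hypothetical helper; it covers only the array form described in this section, not schema dependencies.

```python
def check_dependencies(instance, dependencies):
    """For each property p that is present, require every name listed for p."""
    for prop, needed in dependencies.items():
        if prop in instance and any(name not in instance for name in needed):
            return False
    return True


ONE_WAY = {"credit_card": ["billing_address"]}
TWO_WAY = {
    "credit_card": ["billing_address"],
    "billing_address": ["credit_card"],
}

card_only = {"name": "John Doe", "credit_card": 5555555555555555}
address_only = {"name": "John Doe", "billing_address": "555 Debtor's Lane"}

print(check_dependencies(card_only, ONE_WAY))
print(check_dependencies(address_only, ONE_WAY))
print(check_dependencies(address_only, TWO_WAY))
```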
Schema dependencies
Schema dependencies work like property dependencies, but instead of just specifying other required properties, they
can extend the schema to have other constraints.
For example, here is another way to write the above:
{ json schema }
{
"type": "object",
"properties": {
"name": { "type": "string" },
"credit_card": { "type": "number" }
},
"required": ["name"],
"dependencies": {
"credit_card": {
"properties": {
"billing_address": { "type": "string" }
},
"required": ["billing_address"]
}
}
}
!
{
"name": "John Doe",
"credit_card": 5555555555555555,
"billing_address": "555 Debtor's Lane"
}
This has a billing_address, but is missing a credit_card. This passes, because here billing_address just looks
like an additional property:
!
{
"name": "John Doe",
"billing_address": "555 Debtor's Lane"
}
enough, and you may want to restrict the names of the extra properties, or you may want to say that, given a particular
kind of name, the value should match a particular schema. That’s where patternProperties comes in: it is a new
keyword that maps from regular expressions to schemas. If an additional property matches a given regular expression, it
must also validate against the corresponding schema.
Note: When defining the regular expressions, it’s important to note that the expression may match anywhere within the
property name. For example, the regular expression "p" will match any property name with a p in it, such as "apple",
not just a property whose name is simply "p". It’s therefore usually less confusing to surround the regular expression in
^...$, for example, "^p$".
In this example, any additional properties whose names start with the prefix S_ must be strings, and any with the prefix
I_ must be integers. Any properties explicitly defined in the properties keyword are also accepted, and any additional
properties that do not match either regular expression are forbidden.
{ json schema }
{
"type": "object",
"patternProperties": {
"^S_": { "type": "string" },
"^I_": { "type": "integer" }
},
"additionalProperties": false
}
!
{ "S_25": "This is a string" }
!
{ "I_0": 42 }
%
{ "keyword": "value" }
{ json schema }
{
"type": "object",
"properties": {
"builtin": { "type": "number" }
},
"patternProperties": {
"^S_": { "type": "string" },
"^I_": { "type": "integer" }
},
"additionalProperties": { "type": "string" }
}
!
{ "builtin": 42 }
It must be a string:
%
{ "keyword": 42 }
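The way properties, patternProperties, and additionalProperties divide up an object's keys can be sketched as follows. All names (check_object, PATTERNS, the additional parameter) are assumptions for this illustration, and Python's re.search stands in for the validator's regular expression engine.

```python
import re


def is_string(v):
    return isinstance(v, str)


def is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def is_integer(v):
    return isinstance(v, int) and not isinstance(v, bool)


def check_object(instance, properties, pattern_properties, additional=True):
    """additional is True (allow), False (forbid), or a check for extra values."""
    for name, value in instance.items():
        checks = []
        if name in properties:
            checks.append(properties[name])
        # Every pattern that matches the name contributes its check.
        checks.extend(check for pattern, check in pattern_properties.items()
                      if re.search(pattern, name))
        if not checks:
            # Neither listed nor matched by a pattern: an additional property.
            if additional is False:
                return False
            if callable(additional):
                checks.append(additional)
        if not all(check(value) for check in checks):
            return False
    return True


PATTERNS = {"^S_": is_string, "^I_": is_integer}

print(check_object({"S_25": "This is a string"}, {}, PATTERNS, additional=False))
print(check_object({"keyword": "value"}, {}, PATTERNS, additional=False))
print(check_object({"keyword": 42}, {"builtin": is_number}, PATTERNS, additional=is_string))
```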
4.5 array
Arrays are used for ordered elements. In JSON, each element in an array may be of a different type.
Python
In Python, "array" is analogous to a list or tuple type, depending on usage. However, the json module in the
Python standard library will always use Python lists to represent JSON arrays.
{ json schema }
{ "type": "array" }
!
[1, 2, 3, 4, 5]
!
[3, "different", { "types" : "of values" }]
%
{"Not": "an array"}
4.5.1 Items
By default, the elements of the array may be anything at all. However, it’s often useful to validate the items of the array
against some schema as well. This is done using the items, additionalItems, and contains keywords.
There are two ways in which arrays are generally used in JSON:
• List validation: a sequence of arbitrary length where each item matches the same schema.
• Tuple validation: a sequence of fixed length where each item may have a different schema. In this usage, the
index (or location) of each item is meaningful as to how the value is interpreted. (This usage is often given a
whole separate type in some programming languages, such as Python’s tuple).
List validation
List validation is useful for arrays of arbitrary length where each item matches the same schema. For this kind of array,
set the items keyword to a single schema that will be used to validate all of the items in the array.
Note: When items is a single schema, the additionalItems keyword is meaningless, and it should not be used.
{ json schema }
{
"type": "array",
"items": {
"type": "number"
}
}
!
[1, 2, 3, 4, 5]
New in draft 6
While the items schema must be valid for every item in the array, the contains schema only needs to validate against
one or more items in the array.
{ json schema }
{
"type": "array",
"contains": {
"type": "number"
}
}
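The difference between items (as a single schema) and contains corresponds to Python's all and any. The helper names below are hypothetical.

```python
def is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def items_match(array, check):
    """items as a single schema: every element must pass."""
    return all(check(item) for item in array)


def contains_match(array, check):
    """contains: at least one element must pass."""
    return any(check(item) for item in array)


for array in [[1, 2, 3, 4, 5], ["life", "universe", "everything", 42], ["life"], []]:
    print(array, items_match(array, is_number), contains_match(array, is_number))
```

Note the empty array: it trivially satisfies items, but it cannot satisfy contains because there is no element to match.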
Tuple validation
Tuple validation is useful when the array is a collection of items where each has a different schema and the ordinal
index of each item is meaningful.
For example, you may represent a street address such as:
{ json schema }
{
"type": "array",
"items": [
{
"type": "number"
},
{
"type": "string"
},
{
"type": "string",
"enum": ["Street", "Avenue", "Boulevard"]
},
{
"type": "string",
"enum": ["NW", "NE", "SW", "SE"]
}
]
}
!
[1600, "Pennsylvania", "Avenue", "NW"]
The additionalItems keyword controls whether it’s valid to have additional items in the array beyond what is defined
in items. Here, we’ll reuse the example schema above, but set additionalItems to false, which has the effect of
disallowing extra items in the array.
{ json schema }
{
"type": "array",
"items": [
{
"type": "number"
},
{
"type": "string"
},
{
"type": "string",
"enum": ["Street", "Avenue", "Boulevard"]
},
{
"type": "string",
"enum": ["NW", "NE", "SW", "SE"]
}
],
"additionalItems": false
}
!
[1600, "Pennsylvania", "Avenue", "NW"]
The additionalItems keyword may also be a schema to validate against every additional item in the array. In that
case, we could say that additional items are allowed, as long as they are all strings:
{ json schema }
{
"type": "array",
"items": [
{
"type": "number"
},
{
"type": "string"
},
{
"type": "string",
"enum": ["Street", "Avenue", "Boulevard"]
},
{
"type": "string",
"enum": ["NW", "NE", "SW", "SE"]
}
],
"additionalItems": { "type": "string" }
}
Note: additionalItems has no meaning when you’re doing “list validation” (items is an object), and is ignored in that case.
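The three forms of additionalItems described above (absent or true, false, or a schema) can be sketched as follows. This is a simplified model in which each schema is replaced by a plain Python predicate; the names POSITION_CHECKS and validate_address are hypothetical and chosen only for this illustration.

```python
STREET_TYPES = {"Street", "Avenue", "Boulevard"}
DIRECTIONS = {"NW", "NE", "SW", "SE"}


def is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def is_string(v):
    return isinstance(v, str)


# One check per tuple position, mirroring the four schemas in "items".
POSITION_CHECKS = [
    is_number,
    is_string,
    lambda v: is_string(v) and v in STREET_TYPES,
    lambda v: is_string(v) and v in DIRECTIONS,
]


def validate_address(array, additional_items=True):
    """additional_items is True (anything goes), False (no extras),
    or a predicate applied to every item beyond the tuple positions."""
    # zip stops at the shorter sequence, so arrays with fewer items pass.
    for check, value in zip(POSITION_CHECKS, array):
        if not check(value):
            return False
    extras = array[len(POSITION_CHECKS):]
    if additional_items is True:
        return True
    if additional_items is False:
        return not extras
    return all(additional_items(v) for v in extras)
```

Passing is_string as additional_items corresponds to "additionalItems": { "type": "string" } in the last schema above.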
4.5.2 Length
The length of the array can be specified using the minItems and maxItems keywords. The value of each keyword must
be a non-negative integer. These keywords work whether doing List validation (page 39) or Tuple validation (page 41).
{ json schema }
{
"type": "array",
"minItems": 2,
"maxItems": 3
}
%
[]
%
[1]
!
[1, 2]
!
[1, 2, 3]
%
[1, 2, 3, 4]
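As a rough Python equivalent, the check amounts to comparing len() against the two bounds. The function name valid_length and its parameters are assumptions made for this sketch.

```python
def valid_length(array, min_items=0, max_items=None):
    """Mimic minItems / maxItems; max_items=None means no upper bound."""
    if len(array) < min_items:
        return False
    return max_items is None or len(array) <= max_items


# The five sample arrays from the example above, checked against
# {"minItems": 2, "maxItems": 3}.
samples = ([], [1], [1, 2], [1, 2, 3], [1, 2, 3, 4])
results = [valid_length(a, min_items=2, max_items=3) for a in samples]
```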
4.5.3 Uniqueness
A schema can ensure that each of the items in an array is unique. Simply set the uniqueItems keyword to true.
{ json schema }
{
"type": "array",
"uniqueItems": true
}
!
[1, 2, 3, 4, 5]
%
[1, 2, 3, 3, 4]
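A uniqueness check has to compare JSON values rather than Python values, which matters because Python treats True == 1 while JSON true and 1 are different. The following is a minimal sketch under that assumption; json_equal and unique_items are hypothetical helper names, and the pairwise comparison is chosen for clarity, not speed.

```python
def json_equal(a, b):
    # Booleans are handled first: in Python True == 1, in JSON they differ.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    # Numbers compare by mathematical value (1 and 1.0 are equal).
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if type(a) is not type(b):
        return False
    if isinstance(a, list):
        return len(a) == len(b) and all(json_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(json_equal(a[k], b[k]) for k in a)
    return a == b  # strings and None


def unique_items(array):
    """Mimic {"uniqueItems": true} by comparing every pair of items."""
    return not any(
        json_equal(array[i], array[j])
        for i in range(len(array))
        for j in range(i + 1, len(array))
    )
```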
4.6 boolean
The boolean type matches only two special values: true and false. Note that values that evaluate to true or false,
such as 1 and 0, are not accepted by the schema.
Python
In Python, "boolean" is analogous to bool. Note that in JSON, true and false are lower case, whereas in Python
they are capitalized (True and False).
Ruby
In Ruby, "boolean" is analogous to TrueClass and FalseClass. Note that in Ruby there is no Boolean class.
{ json schema }
{ "type": "boolean" }
!
true
!
false
%
"true"
Values that evaluate to true or false are still not accepted by the schema:
%
0
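In Python, the same distinction can be observed by decoding JSON text with the standard json module and testing for bool. This is only a sketch of the type check; is_json_boolean is a name invented for this example.

```python
import json


def is_json_boolean(text):
    """Return True if the JSON text decodes to true or false."""
    # json.loads maps JSON true/false to Python True/False (type bool).
    # Testing for bool, not truthiness, keeps 0, 1 and "true" out.
    return isinstance(json.loads(text), bool)
```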
4.7 null
The null type is generally used to represent a missing value. When a schema specifies a type of null, it has only one
acceptable value: null.
Python
In Python, null is analogous to None.
Ruby
In Ruby, null is analogous to nil.
{ json schema }
{ "type": "null" }
!
null
%
false
%
0
%
""
4.8.1 Metadata
JSON Schema includes a few keywords, title, description, default, and examples that aren’t strictly used for
validation, but are used to describe parts of a schema.
The title and description keywords must be strings. A “title” will preferably be short, whereas a “description” will
provide a more lengthy explanation about the purpose of the data described by the schema. Neither is required, but
both are encouraged as good practice, and can make your schema “self-documenting”.
The default keyword specifies a default value for an item. JSON processing tools may use this information to provide
a default value for a missing key/value pair, though many JSON schema validators simply ignore the default keyword.
The default value should validate against the schema in which it resides, but that isn’t required.
New in draft 6
The examples keyword is a place to provide an array of examples that validate against the schema. This
isn’t used for validation, but may help explain the effect and purpose of the schema to a reader. Each entry
should validate against the schema in which it resides, but that isn’t strictly required. There is no need to duplicate the
default value in the examples array, since default will be treated as another example.
{ json schema }
{
"title" : "Match anything",
"description" : "This is a schema that matches anything.",
"default" : "Default value",
"examples" : [
"Anything",
4035
]
}
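To illustrate how a processing tool might use default for a missing key/value pair, here is a minimal Python sketch. It is not the behavior of any specific library; the function name apply_defaults and the sample schema with color and size properties are assumptions made for this example, and only top-level properties are handled.

```python
import copy


def apply_defaults(schema, instance):
    """Return a copy of instance with missing properties filled in
    from the "default" of the matching subschema, when one is given."""
    result = dict(instance)
    for name, subschema in schema.get("properties", {}).items():
        if name not in result and "default" in subschema:
            # deepcopy so that mutable defaults are not shared between results
            result[name] = copy.deepcopy(subschema["default"])
    return result


# Hypothetical schema used only for this illustration.
schema = {
    "type": "object",
    "properties": {
        "color": {"type": "string", "default": "red"},
        "size": {"type": "number"},
    },
}
```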
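4.8.2 Enumerated values
The enum keyword restricts a value to a fixed set of values, listed in an array. The following schema accepts only three color names:
{ json schema }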
{
"type": "string",
"enum": ["red", "amber", "green"]
}
!
"red"
%
"blue"
You can use enum even without a type, to accept values of different types. Let’s extend the example to use null to
indicate “off”, and also add 42, just for fun.
{ json schema }
{
"enum": ["red", "amber", "green", null, 42]
}
!
"red"
!
null
!
42
%
0
However, in most cases, the elements in the enum array should also be valid against the enclosing schema:
{ json schema }
{
"type": "string",
"enum": ["red", "amber", "green", null]
}
!
"red"
This is in the enum, but it’s invalid against { "type": "string" }, so it’s ultimately invalid:
%
null
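The interaction between type and enum can be sketched in Python: a value has to pass both checks. The helper matches_enum is hypothetical, and comparing Python types is a simplification (it keeps 0 and false apart, but would also treat 42 and 42.0 as different, which a real validator would not).

```python
def matches_enum(value, allowed):
    """Mimic "enum": the value must equal one of the listed values.
    Types are compared too, so that 0/False and 1/True are not conflated."""
    return any(type(value) is type(a) and value == a for a in allowed)


def valid_color(value):
    """Mimic {"type": "string", "enum": ["red", "amber", "green", null]}."""
    allowed = ["red", "amber", "green", None]
    return isinstance(value, str) and matches_enum(value, allowed)
```

Here None is listed in the enum, yet valid_color(None) is False because the string type check fails first, matching the last example above.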
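4.8.3 Constant values
New in draft 6
The const keyword restricts a value to a single value. In the following schema, the country property accepts only one value:
{ json schema }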
{
"properties": {
"country": {
"const": "United States of America"
}
}
}
!
{ "country": "United States of America" }
%
{ "country": "Canada" }
Note that const is merely syntactic sugar for an enum with a single element: { "const": "United States of America" } is
equivalent to { "enum": [ "United States of America" ] }.
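4.9 Combining schemas
JSON Schema includes keywords for combining schemas. In the following schema, the anyOf keyword says that a value is valid if it matches either subschema: a string of at most 5 characters, or a number no smaller than 0.
{ json schema }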
{
"anyOf": [
{ "type": "string", "maxLength": 5 },
{ "type": "number", "minimum": 0 }
]
}
!
"short"
%
"too long"
!
12
%
-5
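One way to picture this is to treat each subschema as a predicate and anyOf as Python's any(). This is a sketch of the semantics only; the function names below are invented for the illustration.

```python
def is_short_string(value):
    """Mimic {"type": "string", "maxLength": 5}."""
    return isinstance(value, str) and len(value) <= 5


def is_non_negative_number(value):
    """Mimic {"type": "number", "minimum": 0}."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and value >= 0


def any_of(value, checks):
    """Mimic "anyOf": valid if at least one subschema check passes."""
    return any(check(value) for check in checks)


CHECKS = [is_short_string, is_non_negative_number]
```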
4.9.1 allOf
To validate against allOf, the given data must be valid against all of the given subschemas.
{ json schema }
{
"allOf": [
{ "type": "string" },
{ "maxLength": 5 }
]
}
!
"short"
%
"too long"
Note that it’s quite easy to create schemas that are logical impossibilities with allOf. The following example creates a
schema that won’t validate against anything (since a value cannot be both a string and a number at the same time):
{ json schema }
{
"allOf": [
{ "type": "string" },
{ "type": "number" }
]
}
%
"No way"
%
-1
It is important to note that the schemas listed in an allOf (page 51), anyOf (page 53) or oneOf (page 54) array know
nothing of one another. While it might be surprising, allOf (page 51) cannot be used to “extend” a schema to add
more details to it in the sense of object-oriented inheritance. For example, say you had a schema for an address in a
definitions section, and wanted to extend it to include an address type:
{ json schema }
{
"definitions": {
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
},
"allOf": [
{ "$ref": "#/definitions/address" },
{ "properties": {
"type": { "enum": [ "residential", "business" ] }
}
}
]
}
!
{
"street_address": "1600 Pennsylvania Avenue NW",
"city": "Washington",
"state": "DC",
"type": "business"
}
This works, but what if we wanted to restrict the schema so no additional properties are allowed? One might try adding
"additionalProperties": false at the top level, as in the last line of the schema below:
{ json schema }
{
"definitions": {
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
},
"allOf": [
{ "$ref": "#/definitions/address" },
{ "properties": {
"type": { "enum": [ "residential", "business" ] }
}
}
],
"additionalProperties": false
}
%
{
"street_address": "1600 Pennsylvania Avenue NW",
"city": "Washington",
"state": "DC",
"type": "business"
}
Unfortunately, now the schema will reject everything. This is because additionalProperties (see Properties, page 28)
applies only to the schema object in which it appears. That top-level schema declares no properties of its own, and
knows nothing about the properties in the subschemas inside of the allOf (page 51) array.
This shortcoming is perhaps one of the biggest surprises of the combining operations in JSON schema: it does not
behave like inheritance in an object-oriented language. There are some proposals to address this in the next version of
the JSON schema specification.
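The scoping rule behind this surprise can be sketched in Python. The function below models only the additionalProperties check, ignoring patternProperties, and looks only at the properties keyword of the same schema object, never inside allOf. The name extra_properties_allowed is hypothetical, the $ref is written inline, and the subschemas are reduced to empty placeholders for brevity. The second schema shows one possible workaround (an assumption of this example, not something the text above prescribes): naming every property next to additionalProperties.

```python
def extra_properties_allowed(schema, instance):
    """Model of "additionalProperties": false for a single schema object."""
    if schema.get("additionalProperties", True) is not False:
        return True
    # Only "properties" declared in this same object count as known names.
    declared = set(schema.get("properties", {}))
    return all(name in declared for name in instance)


surprising = {
    "allOf": [
        {"properties": {"street_address": {}, "city": {}, "state": {}}},
        {"properties": {"type": {"enum": ["residential", "business"]}}},
    ],
    "additionalProperties": False,
}

# Possible workaround: repeat the property names at the same level.
workaround = dict(
    surprising,
    properties={"street_address": {}, "city": {}, "state": {}, "type": {}},
)

instance = {
    "street_address": "1600 Pennsylvania Avenue NW",
    "city": "Washington",
    "state": "DC",
    "type": "business",
}
```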
4.9.2 anyOf
To validate against anyOf, the given data must be valid against any (one or more) of the given subschemas.
{ json schema }
{
"anyOf": [
{ "type": "string" },
{ "type": "number" }
]
}
!
"Yes"
!
42
%
{ "Not a": "string or number" }
4.9.3 oneOf
To validate against oneOf, the given data must be valid against exactly one of the given subschemas.
{ json schema }
{
"oneOf": [
{ "type": "number", "multipleOf": 5 },
{ "type": "number", "multipleOf": 3 }
]
}
!
10
!
9
Note that it’s possible to “factor” out the common parts of the subschemas. The following schema is equivalent to the
one above:
{ json schema }
{
"type": "number",
"oneOf": [
{ "multipleOf": 5 },
{ "multipleOf": 3 }
]
}
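The phrase "exactly one" is what separates oneOf from anyOf: a value that satisfies both subschemas is rejected. A minimal Python sketch, with subschemas modelled as predicates and all names (multiple_of, one_of, CHECKS) invented for this illustration:

```python
def multiple_of(divisor):
    """Build a check mimicking {"type": "number", "multipleOf": divisor}."""
    def check(value):
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        return is_number and value % divisor == 0
    return check


def one_of(value, checks):
    """Mimic "oneOf": valid only if exactly one subschema check passes."""
    return sum(1 for check in checks if check(value)) == 1


CHECKS = [multiple_of(5), multiple_of(3)]
```

With these checks, 10 and 9 each match one subschema, 2 matches none, and 15 matches both, so only 10 and 9 are accepted.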
4.9.4 not
This doesn’t strictly combine schemas, but it belongs in this chapter along with other things that help to modify the
effect of schemas in some way. The not keyword declares that an instance validates if it doesn’t validate against the
given subschema.
For example, the following schema validates against anything that is not a string:
{ json schema }
{ "not": { "type": "string" } }
!
42
!
{ "key": "value" }
%
"I am a string"
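4.10 The $schema keyword
The $schema keyword declares that a JSON document is a JSON Schema, and identifies the version of the standard it was written against.
It is recommended that all JSON Schemas have a $schema entry, which must be at the root. Therefore, most of the time
you’ll want this at the root of your schema: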
"$schema": "http://json-schema.org/schema#"
4.10.1 Advanced
If you need to declare that your schema was written against a specific version of the JSON Schema standard, you should
include the draft name in the path, for example:
• http://json-schema.org/draft-06/schema#
• http://json-schema.org/draft-04/schema#
Additionally, if you have extended the JSON Schema language to include your own custom keywords for validating
values, you can use a custom URI for $schema. It must not be one of the predefined values above, and should probably
include a domain name you own.
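4.11 Regular Expressions
The pattern and patternProperties keywords use regular expressions to express constraints. The syntax is that of JavaScript (ECMA 262); because the complete syntax is not widely supported, it is best to stick to a commonly supported subset.
Python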
This subset of JavaScript regular expressions is compatible with Python regular expressions. Pay close attention
to what is missing, however. Notably, it is not recommended to use . to match any character.
4.11.1 Example
The following example matches a simple North American telephone number with an optional area code:
{ json schema }
{
"type": "string",
"pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"
}
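The same pattern can be tried out with Python's re module. This is a sketch under two assumptions: the schema is loaded with the standard json module (which turns each doubled backslash in the JSON text into a single backslash), and re.search is used because JSON Schema patterns are not implicitly anchored. The name matches_phone is made up for this example.

```python
import json
import re

# Raw string so the backslashes reach the JSON parser unchanged.
schema = json.loads(
    r'{"type": "string", "pattern": "^(\\([0-9]{3}\\))?[0-9]{3}-[0-9]{4}$"}'
)

PHONE = re.compile(schema["pattern"])


def matches_phone(value):
    """Apply the type and pattern checks from the schema above."""
    # search, not match: the ^ and $ in the pattern do the anchoring.
    return isinstance(value, str) and PHONE.search(value) is not None
```

One difference to keep in mind is that Python's $ also matches just before a trailing newline, so the two regular expression dialects are close but not identical.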
!
"555-1212"
!
"(888)555-1212"
%
"(888)555-1212 ext. 532"
%
"(800)FLOWERS"
When writing computer programs of even moderate complexity, it’s commonly accepted that “structuring” the program
into reusable functions is better than copying-and-pasting duplicate bits of code everywhere they are used. Likewise in
JSON Schema, for anything but the most trivial schema, it’s really useful to structure the schema into parts that can be
reused in a number of places. This chapter will present some practical examples that use the tools available for reusing
and structuring schemas.
5.1 Reuse
For this example, let’s say we want to define a customer record, where each customer may have both a shipping and
a billing address. Addresses are always the same—they have a street address, city and state—so we don’t want to
duplicate that part of the schema everywhere we want to store an address. Not only would that make the schema more
verbose, but it makes updating it in the future more difficult. If our imaginary company were to start doing international
business in the future and we wanted to add a country field to all the addresses, it would be better to do this in a single
place rather than everywhere that addresses are used.
So let’s start with the schema that defines an address:
{
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
Since we are going to reuse this schema, it is customary (but not required) to put it in the parent schema under a key
called definitions:
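{
"definitions": {
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
}
}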
We can then refer to this schema snippet from elsewhere using the $ref keyword. The easiest way to describe $ref is
that it gets logically replaced with the thing that it points to. So, to refer to the above, we would include:
{ "$ref": "#/definitions/address" }
This can be used anywhere a schema is expected. You will always use $ref as the only key in an object: any other keys
you put there will be ignored by the validator.
The value of $ref is a URI, and the part after the # sign (the “fragment” or “named anchor”) is in a format called JSON
Pointer.
Note: JSON Pointer aims to serve the same purpose as XPath from the XML world, but it is much simpler.
If you’re using a definition from the same document, the $ref value begins with the pound symbol (#). Following that,
the slash-separated items traverse the keys in the objects in the document. Therefore, in our example "#/definitions/
address" means:
1) go to the root of the document
2) find the value of the key "definitions"
3) within that object, find the value of the key "address"
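The three steps above can be sketched as a small Python function. It handles only same-document references of the form "#/a/b", decodes the JSON Pointer escapes ~1 and ~0, and skips the percent-decoding that a URI fragment may also need. The name resolve_local_ref is an assumption of this example, not part of any library.

```python
def resolve_local_ref(document, ref):
    """Follow a same-document $ref such as "#/definitions/address"."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled here")
    pointer = ref[1:]
    node = document              # step 1: start at the root of the document
    if pointer == "":
        return node              # "#" alone refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("not a JSON Pointer fragment: " + ref)
    for token in pointer[1:].split("/"):
        # JSON Pointer escapes: ~1 stands for "/", ~0 stands for "~"
        token = token.replace("~1", "/").replace("~0", "~")
        # steps 2 and 3: descend one key (or array index) at a time
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


document = {
    "definitions": {
        "address": {"type": "object"},
    },
}
```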
$ref can also be a relative or absolute URI, so if you prefer to include your definitions in separate files, you can also do
that. For example:
{ "$ref": "definitions.json#/address" }
would load the address schema from another file residing alongside this one.
Now let’s put this together and use our address schema to create a schema for a customer:
{ json schema }
{
"$schema": "http://json-schema.org/draft-06/schema#",
"definitions": {
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
},
"type": "object",
"properties": {
"billing_address": { "$ref": "#/definitions/address" },
"shipping_address": { "$ref": "#/definitions/address" }
}
}
!
{
"shipping_address": {
"street_address": "1600 Pennsylvania Avenue NW",
"city": "Washington",
"state": "DC"
},
"billing_address": {
"street_address": "1st Street SE",
"city": "Washington",
"state": "DC"
}
}
Note: Even though the value of a $ref is a URI, it is not a network locator, only an identifier. This means that the
schema doesn’t need to be accessible at that URI, but it may be. It is basically up to the validator implementation how
external schema URIs will be handled, but one should not assume the validator will fetch network resources indicated
in $ref values.
5.1.1 Recursion
$ref elements may be used to create recursive schemas that refer to themselves. For example, you might have a person
schema that has an array of children, each of which is also a person instance.
{ json schema }
{
"$schema": "http://json-schema.org/draft-06/schema#",
"definitions": {
"person": {
"type": "object",
"properties": {
"name": { "type": "string" },
"children": {
"type": "array",
"items": { "$ref": "#/definitions/person" },
"default": []
}
}
}
},
"type": "object",
"properties": {
"person": { "$ref": "#/definitions/person" }
}
}
Above, we created a schema that refers to another part of itself, effectively creating a “loop” in the validator, which is
both allowed and useful. Note, however, that a loop of $ref schemas referring to one another could cause an infinite
loop in the resolver, and is explicitly disallowed.
{ json schema }
{
"definitions": {
"alice": {
"anyOf": [
{ "$ref": "#/definitions/bob" }
]
},
"bob": {
"anyOf": [
{ "$ref": "#/definitions/alice" }
]
}
}
}
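The difference between the two cases is whether the loop passes through a keyword that steps into the instance (such as properties or items) or only through keywords that apply to the same value ($ref, allOf, anyOf, oneOf, not). The following Python sketch detects the second kind of loop. It is an illustration under simplifying assumptions (only "#/..." references, no escaping), and resolve and has_ref_loop are hypothetical names.

```python
IN_PLACE_LISTS = ("allOf", "anyOf", "oneOf")


def resolve(root, ref):
    """Resolve a simple "#/a/b" reference (no escapes handled)."""
    node = root
    for token in ref[2:].split("/"):
        node = node[token]
    return node


def has_ref_loop(root, schema, seen=()):
    """True if following $ref/allOf/anyOf/oneOf/not from schema comes back
    to a $ref already being followed, without stepping into the instance."""
    if not isinstance(schema, dict):
        return False  # boolean schemas contain no references
    ref = schema.get("$ref")
    if ref is not None:
        if ref in seen:
            return True
        return has_ref_loop(root, resolve(root, ref), seen + (ref,))
    branches = [s for key in IN_PLACE_LISTS for s in schema.get(key, [])]
    if "not" in schema:
        branches.append(schema["not"])
    return any(has_ref_loop(root, branch, seen) for branch in branches)


looping = {
    "definitions": {
        "alice": {"anyOf": [{"$ref": "#/definitions/bob"}]},
        "bob": {"anyOf": [{"$ref": "#/definitions/alice"}]},
    }
}

recursive = {
    "definitions": {
        "person": {
            "type": "object",
            "properties": {
                "children": {
                    "type": "array",
                    "items": {"$ref": "#/definitions/person"},
                }
            },
        }
    }
}
```

The person schema is not flagged because its self-reference sits under properties and items, so each trip around the loop moves one level deeper into the data.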
{ json schema }
{ "$id": "http://foo.bar/schemas/address.json" }
This provides a unique identifier for the schema, as well as, in most cases, indicating where it may be downloaded.
But be aware of the second purpose of the $id property: that it declares a base URL for relative $ref URLs elsewhere
in the file. For example, if you had:
{ json schema }
{ "$ref": "person.json" }
in the same file, a JSON schema validation library that supported network fetching would fetch person.json from
http://foo.bar/schemas/person.json, even if address.json was loaded from somewhere else, such as the local
filesystem.
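The base-URI behavior follows ordinary URI reference resolution, which Python exposes as urllib.parse.urljoin. The sketch below only shows how the relative $ref values from this chapter would be resolved against the $id; whether a validator then fetches the result is up to the implementation.

```python
from urllib.parse import urljoin

# The $id declared at the top of the schema acts as the base URI.
base = "http://foo.bar/schemas/address.json"

# A relative $ref replaces the last path segment of the base.
person = urljoin(base, "person.json")

# A reference with a JSON Pointer fragment keeps the fragment.
definition = urljoin(base, "definitions.json#/address")

# A fragment-only reference stays within the same document.
local = urljoin(base, "#/definitions/address")
```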
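New in draft 6
Draft 4
In Draft 4, $id is just id (without the dollar sign).
The $id property should never be the empty string or an empty fragment (#), since that doesn’t really make sense.
$id can also give a subschema a name, so that it can be referenced without a JSON Pointer. In the following schema, the address definition is named #address and is referenced by that name: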
{ json schema }
{
"$schema": "http://json-schema.org/draft-06/schema#",
"definitions": {
"address": {
"$id": "#address",
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
},
"type": "object",
"properties": {
"billing_address": { "$ref": "#address" },
"shipping_address": { "$ref": "#address" }
}
}
Note: This functionality isn’t currently supported by the Python jsonschema library.
5.3 Extending
The power of $ref really shines when it is used with the combining keywords allOf, anyOf and oneOf (see Combining
schemas (page 50)).
Let’s say that for a shipping address, we want to know whether the address is a residential or business address, because
the shipping method used may depend on that. For a billing address, we don’t want to store that information, because
it’s not applicable.
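To handle this, we’ll update our definition of shipping address:
"shipping_address": { "$ref": "#/definitions/address" }
to instead use an allOf keyword entry combining both the core address schema definition and an extra schema snippet
for the address type:
"shipping_address": {
"allOf": [
{ "$ref": "#/definitions/address" },
{ "properties":
{ "type": { "enum": [ "residential", "business" ] } },
"required": ["type"]
}
]
}
Putting this together, the full schema becomes: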
{ json schema }
{
"$schema": "http://json-schema.org/draft-06/schema#",
"definitions": {
"address": {
"type": "object",
"properties": {
"street_address": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" }
},
"required": ["street_address", "city", "state"]
}
},
"type": "object",
"properties": {
"billing_address": { "$ref": "#/definitions/address" },
"shipping_address": {
"allOf": [
{ "$ref": "#/definitions/address" },
{ "properties":
{ "type": { "enum": [ "residential", "business" ] } },
"required": ["type"]
}
]
}
}
}
%
{
"shipping_address": {
"street_address": "1600 Pennsylvania Avenue NW",
"city": "Washington",
"state": "DC"
}
}
!
{
"shipping_address": {
"street_address": "1600 Pennsylvania Avenue NW",
"city": "Washington",
"state": "DC",
"type": "business"
}
}
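The effect of the allOf in this schema can be sketched with two Python predicates that must both hold for the same value. This is a hand-written model of the two subschemas, not generated from the schema; is_address, has_address_type and is_shipping_address are names made up for this example.

```python
ADDRESS_FIELDS = ("street_address", "city", "state")
ADDRESS_TYPES = ("residential", "business")


def is_address(value):
    """Model of the core address definition: an object with three
    required string properties."""
    return isinstance(value, dict) and all(
        isinstance(value.get(field), str) for field in ADDRESS_FIELDS
    )


def has_address_type(value):
    """Model of the extra snippet: "type" is required and must be one of
    the enumerated values. Non-objects pass, as these keywords ignore them."""
    if not isinstance(value, dict):
        return True
    return "type" in value and value["type"] in ADDRESS_TYPES


def is_shipping_address(value):
    """Mimic allOf: both subschema checks must pass."""
    return is_address(value) and has_address_type(value)


def is_billing_address(value):
    """The billing address refers to the core definition only."""
    return is_address(value)
```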
From these basic pieces, it’s possible to build very powerful constructions without a lot of duplication.
Acknowledgments