diff --git a/jsonschema-annotation.md b/jsonschema-annotation.md new file mode 100644 index 00000000..e69de29b diff --git a/jsonschema-core.md b/jsonschema-core.md index 8f184591..82ca95dd 100644 --- a/jsonschema-core.md +++ b/jsonschema-core.md @@ -3,11 +3,13 @@ ## Abstract JSON Schema defines the media type `application/schema+json`, a JSON-based -format for describing the structure of JSON data. JSON Schema asserts what a -JSON document must look like, ways to extract information from it, and how to -interact with it. The `application/schema-instance+json` media type provides -additional feature-rich integration with `application/schema+json` beyond what -can be offered for `application/json` documents. +format for describing the structure of JSON data. At its core, JSON Schema +provides a framework to describe a set of JSON documents. Through additional +behaviors, JSON Schema can be made to perform a multitude of tasks. + +The `application/schema-instance+json` media type provides additional +feature-rich integration with `application/schema+json` beyond what can be +offered for `application/json` documents. ## Note to Readers @@ -23,18 +25,15 @@ the homepage, or email the document editors. ## Introduction -JSON Schema is a JSON media type for defining the structure of JSON data. JSON -Schema is intended to define validation, documentation, hyperlink navigation, -and interaction control of JSON data. +JSON Schema is a JSON media type for defining the structure of JSON data. This specification defines JSON Schema core terminology and mechanisms, -including pointing to another JSON Schema by reference, dereferencing a JSON -Schema reference, specifying the dialect being used, specifying a dialect's -vocabulary requirements, and defining terms. +including pointing to another JSON Schema by reference, dereferencing such a +reference, specifying the dialect being used, and defining terms. 
-Other specifications define the vocabularies that perform assertions about -validation, linking, annotation, navigation, and interaction as well as output -formats. +Other specifications define additional behaviors. Examples of such behaviors +include validation, linking, annotation, navigation, interaction, and other +related concepts such as output formats. ## Conventions and Terminology @@ -51,44 +50,35 @@ document are to be interpreted as defined in [RFC 8259](#rfc8259). This document proposes a new media type `application/schema+json` to identify a JSON Schema for describing JSON data. It also proposes a further optional media type, `application/schema-instance+json`, to provide additional integration -features. JSON Schemas are themselves JSON documents. This, and related -specifications, define keywords allowing authors to describe JSON data in +features. JSON Schemas are themselves JSON documents. This and related +specifications define keywords which allow authors to describe JSON data in several ways. -JSON Schema uses keywords to assert constraints on JSON instances or annotate -those instances with additional information. Additional keywords are used to -apply assertions and annotations to more complex JSON data structures, or based -on some sort of condition. - -To facilitate re-use, keywords can be organized into vocabularies. A vocabulary -consists of a list of keywords, together with their syntax and semantics. A -dialect is defined as a set of vocabularies and their required support -identified in a meta-schema. - -JSON Schema can be extended either by defining additional vocabularies, or less -formally by defining additional keywords outside of any vocabulary. Unrecognized -individual keywords simply have their values collected as annotations, while the -behavior with respect to an unrecognized vocabulary can be controlled when -declaring which vocabularies are in use. 
- -This document defines a core vocabulary that MUST be supported by any -implementation, and cannot be disabled. Its keywords are each prefixed with a -"$" character to emphasize their required nature. This vocabulary is essential -to the functioning of the `application/schema+json` media type, and is used to -bootstrap the loading of other vocabularies. - -Additionally, this document defines a RECOMMENDED vocabulary of keywords for -applying subschemas conditionally, and for applying subschemas to the contents -of objects and arrays. Either this vocabulary or one very much like it is -required to write schemas for non-trivial JSON instances, whether those schemas -are intended for assertion validation, annotation, or both. While not part of -the required core vocabulary, for maximum interoperability this additional -vocabulary is included in this document and its use is strongly encouraged. - -Further vocabularies for purposes such as structural validation or hypermedia -annotation are defined in other documents. These other documents each define a -dialect collecting the standard sets of vocabularies needed to write schemas for -that document's purpose. +JSON Schema uses behaviors, expressed through keywords, to apply constraints to +JSON instances. These behaviors are described by this and other specifications. + +This document defines a set of core keywords that MUST be supported by any +implementation, and cannot be disabled. These keywords are each prefixed with a +"$" character to emphasize their required nature. These keywords are essential +to the functioning of the `application/schema+json` media type. + +Additionally, this document defines a RECOMMENDED set of keywords for applying +subschemas conditionally, and for applying subschemas to the contents of objects +and arrays. These keywords, or a set very much like them, are required to write +schemas for non-trivial JSON instances. 
While not part of the required core set, +for maximum interoperability this additional set is included in this document +and its use is strongly encouraged. + +Additional specifications MAY define new behaviors, expressing those behaviors +either by defining new keywords or augmenting the behaviors of keywords defined +in other specifications. + + + +Collectively, the specifications which define a set of keywords and their +behaviors form a dialect. ## Definitions @@ -107,24 +97,89 @@ model can be interpreted against a JSON Schema, including media types like ### Instance -A JSON document to which a schema is applied is known as an "instance". - JSON Schema is defined over `application/json` or compatible documents, including media types with the `+json` structured syntax suffix. Among these, this specification defines the `application/schema-instance+json` media type which defines handling for fragments in the IRI. -#### Instance Data Model {#data-model} +When JSON Schema is applied to a JSON value, that value is referred to as an +"instance" and is interpreted by the rules defined in {{interpreting-instances}}. + +### JSON Schema Document {#schema-document} + +A JSON Schema document, or simply a schema, is a JSON document used to describe +an instance. A schema can itself be interpreted as an instance, but SHOULD +always be given the media type `application/schema+json` rather than +`application/schema-instance+json`. The `application/schema+json` media type is +defined to offer a superset of the fragment identifier syntax and semantics +provided by `application/schema-instance+json`. + +The structure of JSON Schema is defined in {{schema-structure}}. -JSON Schema interprets documents according to a data model. A JSON value -interpreted according to this data model is called an "instance". +### Keyword + +A property in a schema object which carries with it defined behaviors is called +a "keyword". 
+ +Keywords which exist within the same schema object are called "adjacent +keywords". + +More information about keywords can be found in {{keywords}}. + +### Subschema + +A schema which appears as the value or within the value of a keyword is called a +"subschema". A single keyword could define one or more subschemas in various +structures as needed to fulfill its purpose. + +```jsonschema +{ + "title": "root", + "items": { + "title": "array item" + } +} +``` + +In this example schema document, the schema titled "array item" is a subschema, and the +schema titled "root" is the root schema. + +As with the root schema, a subschema is either an object or a boolean. + +### Meta-Schema + +A schema that itself describes a schema is called a meta-schema. Meta-schemas +are used to specify the keywords and behaviors available to those schemas. + +### Lexical Scope + +The lexical scope is the nested JSON data structure of objects and arrays that +comprises the schema document itself, without following any references. The largest +scope is the full schema document; the smallest scope is a single schema object +with no subschemas. + +As evaluation proceeds, the lexical scope changes so that it only contains the +schema object being processed. + +### Dynamic Scope + +During evaluation, the dynamic scope is the sequence of schema resources through +which evaluation passes. + +As evaluation proceeds, the dynamic scope grows to include schema resources it enters and shrinks to remove schema resources it leaves. + +## Interpreting Instances {#interpreting-instances} + +This section defines rules and behaviors for interpreting instance data. 
+ +### Instance Data Model {#data-model} An instance has one of six primitive types, and a range of possible values depending on the type: - *null*: A JSON "null" value -- boolean: A "true" or "false" value, from the JSON "true" or "false" value +- *boolean*: A "true" or "false" value, from the JSON "true" or "false" values - *object*: An unordered set of properties mapping a string to an instance, from the JSON "object" value - *array*: An ordered list of instances, from the JSON "array" value @@ -133,23 +188,23 @@ depending on the type: - *string*: A string of Unicode code points, from the JSON "string" value Whitespace and formatting concerns, including different lexical representations -of numbers that are equal within the data model, are thus outside the scope of -JSON Schema. JSON Schema [vocabularies](#vocabulary) that wish to work with such -differences in lexical representations SHOULD define keywords to precisely -interpret formatted strings within the data model rather than relying on having -the original JSON representation Unicode characters available. +of numbers that are equal within the data model, are outside the scope of +JSON Schema. Extensions to JSON Schema that wish to work with such differences +in lexical representations SHOULD define keywords to precisely interpret +formatted strings within the data model rather than relying on having the +original textual representation available. -Since an object cannot have two properties with the same key, behavior for a -JSON document that tries to define two properties with the same key in a single -object is undefined. +Objects are defined to have a distinct set of properties. Behavior for data that +tries to define two properties with the same key in a single object is +undefined. -Note that JSON Schema vocabularies are free to define their own extended type +Note that JSON Schema extensions are free to define their own extended type system. 
This should not be confused with the core data model types defined here. -As an example, "integer" is a reasonable type for a vocabulary to define as a -value for a keyword, but the data model makes no distinction between integers -and other numbers. +As an example, "integer" is a reasonable type to define as a value for a +keyword, but the data model makes no distinction between integers and other +numbers. -#### Instance Equality +### Instance Equality Two JSON instances are said to be equal if and only if they are of the same type and have the same value according to the data model. Specifically, this means: @@ -169,117 +224,78 @@ no way to define multiple properties with the same key, and mere formatting differences (indentation, placement of commas, trailing zeros) are insignificant. -#### Non-JSON Instances +### Non-JSON Instances It is possible to use JSON Schema with a superset of the JSON Schema data model, where an instance may be outside any of the six JSON data types. -In this case, annotations still apply; but most validation keywords will not be -useful, as they will always pass or always fail. +An extension MAY define support for a superset of the core data model. -A custom vocabulary may define support for a superset of the core data model. -The schema itself may only be expressible in this superset; for example, to make -use of the `const` keyword. +### Fragment Identifiers for Instances {#instance-fragments} -### JSON Schema Documents {#schema-document} - -A JSON Schema document, or simply a schema, is a JSON document used to describe -an instance. A schema can itself be interpreted as an instance, but SHOULD -always be given the media type `application/schema+json` rather than -`application/schema-instance+json`. The `application/schema+json` media type is -defined to offer a superset of the fragment identifier syntax and semantics -provided by `application/schema-instance+json`. - -A JSON Schema MUST be an object or a boolean. 
- -#### JSON Schema Objects and Keywords +In accordance with section 3.1 of [RFC 6839](#rfc6839), the syntax and semantics +of fragment identifiers specified for any +json media type SHOULD be as +specified for `application/json`. However (as of the publication of this +document), there is no fragment identification syntax defined for +`application/json`. -Object properties that are applied to the instance are called keywords, or -schema keywords. Broadly speaking, keywords fall into one of five categories: +The `application/schema-instance+json` media type supports one fragment +identifier structure: JSON Pointers. -- *identifiers*: control schema identification through setting a IRI for the - schema and/or changing how the base IRI is determined -- *assertions*: produce a boolean result when applied to an instance -- *annotations*: attach information to an instance for application use -- *applicators*: apply one or more subschemas to a particular location in the - instance, and combine or modify their results -- *reserved locations*: do not directly affect results, but reserve a place for - a specific purpose to ensure interoperability +## Schema Structure {#schema-structure} -Keywords may fall into multiple categories, although applicators SHOULD only -produce assertion results based on their subschemas' results. They should not -define additional constraints independent of their subschemas. +A JSON Schema MUST either be an object or a boolean. -Keywords which are properties within the same schema object are referred to as -adjacent keywords. +### JSON Schema Objects and Keywords -Extension keywords, meaning those defined outside of this document and its -companions, are free to define other behaviors as well. +Object properties which apply behaviors to the instance are called keywords, or +schema keywords. Keywords can apply multiple behaviors as defined by this and +other documents. 
Keywords and their behaviors are described in more detail in +{{keywords}}. -A JSON Schema MAY contain properties which are not schema keywords or are not -recognized as schema keywords. The behavior of such keywords is governed by +A JSON Schema MUST NOT contain properties which are not schema keywords or are +not recognized as schema keywords. The behavior of such keywords is governed by {{unrecognized}}. -An empty schema is a JSON Schema with no properties. - -#### Boolean JSON Schemas +An empty schema is a JSON Schema object with no keywords. -The boolean schema values `true` and `false` are trivial schemas that always -produce themselves as assertion results, regardless of the instance value. They -never produce annotation results. +### Boolean JSON Schemas -These boolean schemas exist to clarify schema author intent and facilitate -schema processing optimizations. They behave identically to the following schema -objects (where `not` is part of the subschema application vocabulary defined in -this document). +The boolean schema values `true` and `false` are trivial schemas and exist to +clarify schema author intent and facilitate schema processing optimizations. +They behave identically to the following schema objects (where `not` is defined +in the [Applicators Vocabulary](./json-schema-vocab-applicators.md)). -- `true`: Always passes validation, as if the empty schema `{}` -- `false`: Always fails validation, as if the schema `{ "not": {} }` +| Boolean Schema | Equivalent Object Schema | +|:--------------:|:------------------------:| +| `true` | `{}` | +| `false` | `{ "not": {} }` | -While the empty schema object is unambiguous, there are many possible +While the empty schema object is unambiguous, there could be many possible equivalents to the `false` schema. Using the boolean values ensures that the intent is clear to both human readers and implementations. 
-#### Schema Vocabularies - -A schema vocabulary, or simply a vocabulary, is a set of keywords, their syntax, -and their semantics. A vocabulary is generally organized around a particular -purpose. Different uses of JSON Schema, such as validation, hypermedia, or user -interface generation, will involve different sets of vocabularies. - -Vocabularies are the primary unit of re-use in JSON Schema, as schema authors -can indicate what vocabularies are required or optional in order to process the -schema. Since vocabularies are identified by IRIs in the meta-schema, generic -implementations can load extensions to support previously unknown vocabularies. -While keywords can be supported outside of any vocabulary, there is no analogous -mechanism to indicate individual keyword usage. - -A schema vocabulary can be defined by anything from an informal description to a -standards proposal, depending on the audience and interoperability expectations. -In particular, in order to facilitate vocabulary use within non-public -organizations, a vocabulary specification need not be published outside of its -scope of use. +### Root Schema and Subschemas and Resources {#root} -#### Meta-Schemas - -A schema that itself describes a schema is called a meta-schema. Meta-schemas -are used to validate JSON Schemas and specify which vocabularies they are using. +A JSON Schema resource is a schema which is [canonically](#rfc6596) identified +by an [absolute IRI](#rfc3987). Schema resources MAY also be identified by IRIs +with fragments if the resulting secondary resource (as defined by +[section 3.5 of RFC 3986](#rfc3986)) is identical to the primary resource. This +can occur with the empty fragment, or when one schema resource is embedded in +another. Any such IRIs with fragments are considered to be non-canonical. -Typically, a meta-schema will specify a set of vocabularies, and validate -schemas that conform to the syntax of those vocabularies. 
However, meta-schemas -and vocabularies are separate in order to allow meta-schemas to validate schema -conformance more strictly or more loosely than the vocabularies' specifications -call for. Meta-schemas may also describe and validate additional keywords that -are not part of a formal vocabulary. + The root schema is the schema that comprises the entire JSON document in question. The root schema is always a schema resource, where the IRI is @@ -289,22 +305,6 @@ determined as described in {{initial-base}}.[^1] root schema resource in this sense. Exactly how such usages fit with the JSON Schema document and resource concepts will be clarified in a future draft. -Some keywords take schemas themselves, allowing JSON Schemas to be nested: - -```jsonschema -{ - "title": "root", - "items": { - "title": "array item" - } -} -``` - -In this example document, the schema titled "array item" is a subschema, and the -schema titled "root" is the root schema. - -As with the root schema, a subschema is either an object or a boolean. - As discussed in {{id-keyword}}, a JSON Schema document can contain multiple JSON Schema resources. When used without qualification, the term "root schema" refers to the document's root schema. In some cases, resource root schemas are @@ -315,17 +315,20 @@ standalone JSON Schema document. Whether multiple schema resources are embedded or linked with a reference, they are processed in the same way, with the same available behaviors. -## Fragment Identifiers {#fragments} +Canonical schema IRIs MUST NOT change during evaluation. + +### Fragment Identifiers for Schemas {#schema-fragments} In accordance with section 3.1 of [RFC 6839](#rfc6839), the syntax and semantics of fragment identifiers specified for any +json media type SHOULD be as -specified for `application/json`. (At publication of this document, there is no -fragment identification syntax defined for `application/json`.) +specified for `application/json`. 
However (as of the publication of this +document), there is no fragment identification syntax defined for +`application/json`. + + Additionally, the `application/schema+json` media type supports two fragment -identifier structures: plain names and JSON Pointers. The -`application/schema-instance+json` media type supports one fragment identifier -structure: JSON Pointers. +identifier structures: plain names and JSON Pointers. The use of JSON Pointers as IRI fragment identifiers is described in [RFC 6901](#rfc6901). For `application/schema+json`, which supports two fragment @@ -364,6 +367,78 @@ Defining and referencing a plain name fragment identifier within an `application/schema+json` document are specified in the [`$anchor` keyword](#anchors) section. +## Schema Keywords {#keywords} + +### Keyword Behaviors {#behaviors} + +JSON Schema keywords can exhibit different behaviors in different contexts. +Additional documents MAY: + +- define other behaviors as needed for their specific purpose; +- define new keywords; +- augment behaviors onto keywords which already have behaviors defined + elsewhere. + +This document defines the applicability of keywords which governs how a keyword +applies itself and its contents to an instance and its descendants. This +behavior is described in greater detail in {{applicability}}. + +### Keyword Independence {#independence} + +In general, keywords operate independently of adjacent keywords and their +subschemas. + +When necessary, a keyword's behavior MAY be defined by the presence, contents, +or evaluation outcome of a sibling keyword. Such dependencies + +- MUST be explicitly stated as part of the keyword definition; +- MUST NOT result in a circular dependency. + +Keywords MAY modify their own behavior based on the presence, contents, or +evaluation outcome of another keyword, but they MUST NOT modify the behavior of +another keyword. 
+ + + + + +### Default Behaviors {#default-behaviors} + +A missing keyword MUST NOT apply any behaviors defined for it. + +In some cases, the behavior of a keyword when it is absent is identical to that +produced by that keyword's presence with a certain value, and keyword +definitions SHOULD note such values where known. + +### Extending JSON Schema {#extending} + +Any entity MAY extend JSON Schema by defining new behaviors and keywords as +described by {{behaviors}}. + +Implementations MAY provide the ability to register or load handlers for +keywords that they do not support directly. The exact mechanism for registering +and implementing such handlers is implementation-dependent. + +### Handling of unrecognized or unsupported keywords {#unrecognized} + +Implementations MUST refuse to process schemas which contain keywords they: + +- do not recognize; +- recognize but do not support. + +Likewise, implementations MUST refuse to process schemas which require +unsupported behaviors of recognized keywords. + ## General Considerations ### Range of JSON Values @@ -413,85 +488,42 @@ Finally, implementations MUST NOT take regular expressions to be anchored, neither at the beginning nor at the end. This means, for instance, the pattern "es" matches "expression". -### Extending JSON Schema {#extending} - -Additional schema keywords and schema vocabularies MAY be defined by any entity. -Save for explicit agreement, schema authors SHALL NOT expect these additional -keywords and vocabularies to be supported by implementations that do not -explicitly document such support. - -Implementations MAY provide the ability to register or load handlers for -vocabularies that they do not support directly. The exact mechanism for -registering and implementing such handlers is implementation-dependent. - -#### Explicit annotation keywords {#explicit-annotations} - -The values of keywords which begin with "x-" MUST be collected as annotations. 
- -Keywords which begin with "x-" symbol MUST NOT affect evaluation of a schema in -any way other than annotation collection. +## Evaluation -Consequently, the "x-" prefix is reserved for this purpose, and extension -vocabularies MUST NOT define any keywords which begin with this prefix. - -#### Handling of unrecognized or unsupported keywords {#unrecognized} - -Implementations SHOULD treat keywords they do not recognize, or that they -recognize but do not support, as annotations, where the value of the keyword is -the value of the annotation. Whether an implementation collects these -annotations or not, they MUST otherwise ignore the keywords. - -## Keyword Behaviors +Evaluating an instance against a schema involves processing all of the keywords +and applying all of their behaviors against the appropriate locations within the +instance. Evaluation recurses into subschemas of applicator keywords until a +schema with no applicators is found. -JSON Schema keywords fall into several general behavior categories. Assertions -validate that an instance satisfies constraints, producing a boolean result. -Annotations attach information that applications may use in any way they see -fit. Applicators apply subschemas to parts of the instance and combine their -results. +Evaluation of a parent schema object can complete once all of its subschemas +have been evaluated; although, as allowed by the aggregate behaviors, evaluation +MAY be short-circuited if the final outcome can be determined before processing +completes. -Extension keywords SHOULD stay within these categories, keeping in mind that -annotations in particular are extremely flexible. Complex behavior is usually -better delegated to applications on the basis of annotation data than -implemented directly as schema keywords. However, extension keywords MAY define -other behaviors for specialized purposes. 
+### Keyword Evaluation Order -Evaluating an instance against a schema involves processing all of the keywords -in the schema against the appropriate locations within the instance. Typically, -applicator keywords are processed until a schema object with no applicators (and -therefore no subschemas) is reached. The appropriate location in the instance is -evaluated against the assertion and annotation keywords in the schema object. -The interactions of those keyword results to produce the schema object results -are governed by {{annot-assert}}, while the relationship of subschema results to -the results of the applicator keyword that applied them is described by -{{applicators}}. +As per {{independence}}, unless otherwise specified, keywords can generally be evaluated in any order. However, there are a few exceptions among the core set of keywords defined by this document. -Evaluation of a parent schema object can complete once all of its subschemas -have been evaluated, although in some circumstances evaluation may be -short-circuited due to assertion results. When annotations are being collected, -some assertion result short-circuiting is not possible due to the need to -examine all subschemas for annotation collection, including those that cannot -further change the assertion result. +- `$schema` MUST be evaluated first as this keyword governs how the rest of the schema is interpreted. +- `$id` MUST be evaluated second as this keyword defines the URI for a schema resource, on which other processing relies. ### Lexical Scope and Dynamic Scope {#scopes} -While most JSON Schema keywords can be evaluated on their own, or at most need -to take into account the values or results of adjacent keywords in the same -schema object, a few have more complex behavior. - -The lexical scope of a keyword is determined by the nested JSON data structure -of objects and arrays. The largest such scope is an entire schema document. 
The -smallest scope is a single schema object with no subschemas. + Keywords MAY be defined with a partial value, such as a IRI-reference, which must be resolved against another value, such as another IRI-reference or a full IRI, which is found through the lexical structure of the JSON document. The -`$id`, `$ref`, and `$dynamicRef` core keywords, and the "base" JSON Hyper-Schema +`$id`, `$ref`, and `$dynamicRef` core keywords, and the `base` JSON Hyper-Schema keyword, are examples of this sort of behavior. Note that some keywords, such as `$schema`, apply to the lexical scope of the entire schema resource, and therefore MUST only appear in a schema resource's root schema. + + Other keywords may take into account the dynamic scope that exists during the evaluation of a schema, typically together with an instance document. The outermost dynamic scope is the schema object at which processing begins, even if @@ -514,56 +546,11 @@ and collected annotations, as it may be possible to revisit the same lexical scope repeatedly with different dynamic scopes. In such cases, it is important to inform the user of the evaluation path that produced the error or annotation. -### Keyword Interactions - -Keyword behavior MAY be defined in terms of the annotation results of -[subschemas](#root) and/or adjacent keywords (keywords within the same schema -object) and their subschemas. Such keywords MUST NOT result in a circular -dependency. Keywords MAY modify their behavior based on the presence or absence -of another keyword in the same [schema object](#schema-document). - -### Default Behaviors {#default-behaviors} - -A missing keyword MUST NOT produce a false assertion result, MUST NOT produce -annotation results, and MUST NOT cause any other schema to be evaluated as part -of its own behavioral definition. However, given that missing keywords do not -contribute annotations, the lack of annotation results may indirectly change the -behavior of other keywords. 
- -In some cases, the missing keyword assertion behavior of a keyword is identical -to that produced by a certain value, and keyword definitions SHOULD note such -values where known. However, even if the value which produces the default -behavior would produce annotation results if present, the default behavior still -MUST NOT result in annotations. +## Applicability Behavior {#applicability} -Because annotation collection can add significant cost in terms of both -computation and memory, implementations MAY opt out of this feature. Keywords -that are specified in terms of collected annotations SHOULD describe reasonable -alternate approaches when appropriate. This approach is demonstrated by the -[`items`](#items) and [`additionalProperties`](#additionalproperties) keywords -in this document. - -Note that when no such alternate approach is possible for a keyword, -implementations that do not support annotation collections will not be able to -support those keywords or vocabularies that contain them. - -### Identifiers - -Identifiers define IRIs for a schema, or affect how such IRIs are resolved in -[references](#referenced), or both. The Core vocabulary defined in this document -defines several identifying keywords, most notably `$id`. - -Canonical schema IRIs MUST NOT change while processing an instance, but keywords -that affect IRI-reference resolution MAY have behavior that is only fully -determined at runtime. - -While custom identifier keywords are possible, vocabulary designers should take -care not to disrupt the functioning of core keywords. For example, the -`$dynamicAnchor` keyword in this specification limits its IRI resolution effects -to the matching `$dynamicRef` keyword, leaving the behavior of `$ref` -undisturbed. - -### Applicators {#applicators} +The applicability behavior defines how keywords which contain subschemas apply +those subschemas during evaluation. Keywords which exhibit the applicability +behavior are known as "applicators". 
Applicators allow for building more complex schemas than can be accomplished with a single schema object. Evaluation of an instance against a [schema @@ -589,7 +576,7 @@ values. Applicator keywords do not play a direct role in this preservation. #### Referenced and Referencing Schemas {#referenced} -As noted in {{applicators}}, an applicator keyword may refer to a schema to be +As noted in {{applicability}}, an applicator keyword may refer to a schema to be applied, rather than including it as a subschema in the applicator's value. In such situations, the schema being applied is known as the referenced schema, while the schema containing the applicator keyword is the referencing schema. @@ -606,208 +593,13 @@ Others, such as `$dynamicRef` (with `$dynamicAnchor`), may make use of dynamic scoping, and therefore only be resolvable in the process of evaluating the schema with an instance. -### Assertions {#assertions} - -JSON Schema can be used to assert constraints on a JSON document, which either -passes or fails the assertions. This approach can be used to validate -conformance with the constraints, or document what is needed to satisfy them. - -JSON Schema implementations produce a single boolean result when evaluating an -instance against schema assertions. - -An instance can only fail an assertion that is present in the schema. - -#### Assertions and Instance Primitive Types - -Most assertions only constrain values within a certain primitive type. When the -type of the instance is not of the type targeted by the keyword, the instance is -considered to conform to the assertion. - -For example, the `maxLength` keyword from the companion [validation -vocabulary](#json-schema-validation): will only restrict certain strings (that -are too long) from being valid. If the instance is a number, boolean, null, -array, or object, then it is valid against this assertion. 
- -This behavior allows keywords to be used more easily with instances that can be -of multiple primitive types. The companion validation vocabulary also includes a -`type` keyword which can independently restrict the instance to one or more -primitive types. This allows for a concise expression of use cases such as a -function that might return either a string of a certain length or a null value: - -```jsonschema -{ - "type": ["string", "null"], - "maxLength": 255 -} -``` - -If `maxLength` also restricted the instance type to be a string, then this would -be substantially more cumbersome to express because the example as written would -not actually allow null values. Each keyword is evaluated separately unless -explicitly specified otherwise, so if `maxLength` restricted the instance to -strings, then including `"null"` in `type` would not have any useful effect. - -### Annotations {#annotations} - -JSON Schema can annotate an instance with information, whenever the instance -validates against the schema object containing the annotation, and all of its -parent schema objects. The information can be a simple value, or can be -calculated based on the instance contents. - -Annotations are attached to specific locations in an instance. Since many -subschemas can be applied to any single location, applications may need to -decide how to handle differing annotation values being attached to the same -instance location by the same schema keyword in different schema objects. - -Unlike assertion results, annotation data can take a wide variety of forms, -which are provided to applications to use as they see fit. JSON Schema -implementations are not expected to make use of the collected information on -behalf of applications. - -Unless otherwise specified, the value of an annotation keyword is the keyword's -value. However, other behaviors are possible. 
For example, [JSON -Hyper-Schema's](#json-hyper-schema) `links` keyword is a complex annotation that -produces a value based in part on the instance data. - -While "short-circuit" evaluation is possible for assertions, collecting -annotations requires examining all schemas that apply to an instance location, -even if they cannot change the overall assertion result. The only exception is -that subschemas of a schema object that has failed validation MAY be skipped, as -annotations are not retained for failing schemas. - -#### Collecting Annotations {#collect} - -Annotations are collected by keywords that explicitly define -annotation-collecting behavior. Note that boolean schemas cannot produce -annotations as they do not make use of keywords. - -A collected annotation MUST include the following information: - -- The name of the keyword that produces the annotation -- The instance location to which it is attached, as a JSON Pointer -- The evaluation path, indicating how reference keywords such as `$ref` were - followed to reach the absolute schema location. -- The absolute schema location of the attaching keyword, as a IRI. This MAY be - omitted if it is the same as the evaluation path from above. -- The attached value(s) - -##### Distinguishing Among Multiple Values - -Applications MAY make decisions on which of multiple annotation values to use -based on the schema location that contributed the value. This is intended to -allow flexible usage. Collecting the schema location facilitates such usage. - -For example, consider this schema, which uses annotations and assertions from -the [Validation specification](#json-schema-validation): - -Note that some lines are wrapped for clarity. 
- -```jsonschema -{ - "title": "Feature list", - "type": "array", - "prefixItems": [ - { - "title": "Feature A", - "properties": { - "enabled": { - "$ref": "#/$defs/enabledToggle", - "default": true - } - } - }, - { - "title": "Feature B", - "properties": { - "enabled": { - "description": "If set to null, Feature B - inherits the enabled - value from Feature A", - "$ref": "#/$defs/enabledToggle" - } - } - } - ], - "$defs": { - "enabledToggle": { - "title": "Enabled", - "description": "Whether the feature is enabled (true), - disabled (false), or under - automatic control (null)", - "type": ["boolean", "null"], - "default": null - } - } -} -``` - -In this example, both Feature A and Feature B make use of the re-usable -"enabledToggle" schema. That schema uses the `title`, `description`, and -`default` annotations. Therefore the application has to decide how to handle the -additional `default` value for Feature A, and the additional `description` value -for Feature B. - -The application programmer and the schema author need to agree on the usage. For -this example, let's assume that they agree that the most specific `default` -value will be used, and any additional, more generic `default` values will be -silently ignored. Let's also assume that they agree that all `description` text -is to be used, starting with the most generic, and ending with the most -specific. This requires the schema author to write descriptions that work when -combined in this way. - -The application can use the evaluation path to determine which values are which. -The values in the feature's immediate "enabled" property schema are more -specific, while the values under the re-usable schema that is referenced to with -`$ref` are more generic. The evaluation path will show whether each value was -found by crossing a `$ref` or not. - -Feature A will therefore use a default value of true, while Feature B will use -the generic default value of null. 
Feature A will only have the generic -description from the "enabledToggle" schema, while Feature B will use that -description, and also append its locally defined description that explains how -to interpret a null value. - -Note that there are other reasonable approaches that a different application -might take. For example, an application may consider the presence of two -different values for `default` to be an error, regardless of their schema -locations. - -##### Annotations and Assertions {#annot-assert} - -Schema objects that produce a false assertion result MUST NOT produce any -annotation results, whether from their own keywords or from keywords in -subschemas. - -Note that the overall schema results may still include annotations collected -from other schema locations. Given this schema: - -```jsonschema -{ - "oneOf": [ - { - "title": "Integer Value", - "type": "integer" - }, - { - "title": "String Value", - "type": "string" - } - ] -} -``` - -Against the instance `"This is a string"`, the title annotation "Integer Value" -is discarded because the type assertion in that schema object fails. The title -annotation "String Value" is kept, as the instance passes the string type -assertions. - ### Reserved Locations A fourth category of keywords simply reserve a location to hold re-usable components or data of interest to schema authors that is not suitable for re-use. These keywords do not affect validation or annotation results. Their -purpose in the core vocabulary is to ensure that locations are available for -certain purposes and will not be redefined by extension keywords. +purpose is to ensure that locations are available for certain purposes and will +not be redefined by extension keywords. While these keywords do not directly affect results, as explained in {{non-schemas}} unrecognized extension keywords that reserve locations for @@ -816,9 +608,9 @@ circumstances. 
### Loading Instance Data -While none of the vocabularies defined as part of this or the associated -documents define a keyword which may target and/or load instance data, it is -possible that other vocabularies may wish to do so. +While none of the keywords defined as part of this or the associated +documents define a keyword which may target and/or load instance data, it is +possible that extensions may wish to do so. Keywords MAY be defined to use JSON Pointers or Relative JSON Pointers to examine parts of an instance outside the current evaluation location. @@ -826,64 +618,31 @@ examine parts of an instance outside the current evaluation location. Keywords that allow adjusting the location using a Relative JSON Pointer SHOULD default to using the current location if a default is desireable. -## The JSON Schema Core Vocabulary {#core} - -Keywords declared in this section, which all begin with "$", make up the JSON -Schema Core vocabulary. These keywords are either required in order to process -any schema or meta-schema, including those split across multiple documents, or -exist to reserve keywords for purposes that require guaranteed interoperability. - -The Core vocabulary MUST be considered mandatory at all times, in order to -bootstrap the processing of further vocabularies. Meta-schemas that use the -[`$vocabulary`](#vocabulary) keyword to declare the vocabularies in use MUST -explicitly list the Core vocabulary, which MUST have a value of true indicating -that it is required. - -The behavior of a false value for this vocabulary (and only this vocabulary) is -undefined, as is the behavior when `$vocabulary` is present but the Core -vocabulary is not included. However, it is RECOMMENDED that implementations -detect these cases and raise an error when they occur. It is not meaningful to -declare that a meta-schema optionally uses Core.
+## The JSON Schema Core Keywords {#core} -Meta-schemas that do not use `$vocabulary` MUST be considered to require the -Core vocabulary as if its IRI were present with a value of true. +Keywords declared in this section, which all begin with "$", are essential to +processing JSON Schema. These keywords inform implementations how to process any +schema or meta-schema, including those split across multiple documents, or exist +to reserve keywords for purposes that require guaranteed interoperability. -The current IRI for the Core vocabulary is: -`https://json-schema.org/draft/next/vocab/core`. +Support for these keywords MUST be considered mandatory at all times in order to +bootstrap the processing of further keywords. -The current IRI for the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/core`. +The "$" prefix is reserved for use by this specification. Extensions MUST NOT +define new keywords that begin with "$". -The "$" prefix is reserved for use by the Core vocabulary. Vocabulary extensions -MUST NOT define new keywords that begin with "$". +### Meta-Schemas -### Meta-Schemas and Vocabularies {#vocabulary} +Meta-schemas are used to inform an implementation how to interpret a schema. +Every schema has a meta-schema, which can be explicitly declared using the +`$schema` keyword. -Two concepts, meta-schemas and vocabularies, are used to inform an -implementation how to interpret a schema. Every schema has a meta-schema, which -can be declared using the `$schema` keyword. - -The meta-schema serves two purposes: - -Declaring the vocabularies in use: The `$vocabulary` keyword, when it appears in -a meta-schema, declares which vocabularies are available to be used in schemas -that refer to that meta-schema. Vocabularies define keyword semantics, as well -as their general syntax. By combining various vocabularies, distinct -sets of keywords can be made available for use in a schema. This collection of -vocabularies defines a dialect. 
- -Describing valid schema syntax: A schema MUST successfully validate against its -meta-schema, which constrains the syntax of the available keywords. The syntax -described is expected to be compatible with the vocabularies declared; while it -is possible to describe an incompatible syntax, such a meta-schema would be -unlikely to be useful. - -Meta-schemas are separate from vocabularies to allow for vocabularies to be -combined in different ways, and for meta-schema authors to impose additional -constraints such as forbidding certain keywords, or performing unusually strict -syntactical validation, as might be done during a development and testing cycle. -Each vocabulary typically identifies a meta-schema consisting only of the -vocabulary's keywords. +The meta-schema serves to describe valid schema syntax. A schema MUST +successfully validate against its meta-schema, which constrains the syntax of +the available keywords. The syntax described for a given keyword is expected to +be compatible with the document which defines the keyword; while it is possible +to describe an incompatible syntax, such a meta-schema would be unlikely to be +useful. Meta-schema authoring is an advanced usage of JSON Schema, so the design of meta-schema features emphasizes flexibility over simplicity. @@ -913,7 +672,7 @@ steps. (Note that steps 2 and 3 are mutually exclusive.) If the dialect is not specified through one of these methods, the implementation -MUST refuse to process the schema, as with unsupported required vocabularies. +MUST refuse to process the schema. #### The `$schema` Keyword {#keyword-schema} @@ -937,127 +696,6 @@ keyword appears in a non-resource root schema object, the behavior is undefined. Values for this property are defined elsewhere in this and other documents, and by other parties. 
-#### The `$vocabulary` Keyword - -The `$vocabulary` keyword is used in meta-schemas to identify the vocabularies -available for use in schemas described by that meta-schema, and whether each -vocabulary is required or optional. Together, this information forms a dialect. - -The value of this keyword MUST be an object. The property names in the object -MUST be IRIs (containing a scheme) and each IRI MUST be normalized. Each IRI -that appears as a property name identifies a specific set of keywords and their -semantics. - -The IRI MAY be a URL, but the nature of the retrievable resource is currently -undefined, and reserved for future use. Vocabulary authors MAY use the URL of -the vocabulary specification, in a human-readable media type such as `text/html` -or `text/plain`, as the vocabulary IRI.[^2] - -[^2]: Vocabulary documents may be added in forthcoming drafts. For now, -identifying the keyword set is deemed sufficient as that, along with meta-schema -validation, is how the current "vocabularies" work today. Any future vocabulary -document format will be specified as a JSON document, so using `text/html` or -other non-JSON formats in the meantime will not produce any future ambiguity. - -The values of the object properties MUST be booleans. If the value is true, then -the vocabulary MUST be considered to be required. If the value is false, then -the vocabulary MUST be considered to be optional. - -##### Required, optional, and omitted vocabularies - -A schema is said to use a dialect and its constituent vocabularies if it is -associated with a meta-schema defining the dialect with `$vocabulary`, either -through `$schema`, through appropriately defined media type parameters or link -relation types, or through documented default implementation-defined behavior in -the absence of an explicit meta-schema. If a meta-schema does not contain -`$vocabulary`, the set of vocabularies in use is determined according to -{{default-vocabs}}. 
- -Any vocabulary in use by a schema and understood by the implementation MUST be -processed in a manner consistent with the semantic definitions contained within -the vocabulary, regardless of whether that vocabulary is required or optional. - -Any vocabulary that is not present in `$vocabulary` MUST NOT be made available -for use in schemas described by that meta-schema, except for the core vocabulary -as specified by the introduction to {{core}}. - -Implementations that do not support a vocabulary required by a schema MUST -refuse to process that schema. - -Implementations that do not support a vocabulary that is optionally used by a -schema SHOULD proceed with processing the schema. The keywords will be -considered to be unrecognized keywords as addressed by {{unrecognized}}. Note -that since the recommended behavior for such keywords is to collect them as -annotations, vocabularies consisting only of annotations will have the same -behavior when used optionally whether the implementation supports them or not. -This allows annotation-only vocabularies to be supported without custom code, -even in implementations that do not support providing custom code for extension -vocabularies. - -##### Vocabularies are schema resource-scoped - -The `$vocabulary` keyword SHOULD be used in the root schema of any schema -resource intended for use as a meta-schema. It MUST NOT appear in subschemas. - -The `$vocabulary` keyword MUST be ignored in schema resources that are not being -processed as a meta-schema. This allows validating a meta-schema M against its -own meta-schema M' without requiring the validator to understand the -vocabularies declared by M. - -##### Vocabulary and non-vocabulary keywords - -Keywords from different vocabularies, as well as non-vocabulary extension -keywords, can have identical names. These are not considered to be the same -keyword from the perspective of enabling or disabling them through -`$vocabulary`. 
- -In particular the keywords defined in this specification and its companion -documents MUST be considered to be vocabulary keywords, with availability -governed by `$vocabulary` even in implementations that do not support any -extension vocabularies. - -Guidance regarding vocabularies with identically-named keywords is provided in -{{vocab-practices}}. - -##### Default vocabularies {#default-vocabs} - -If `$vocabulary` is absent, an implementation MAY determine behavior based on -the meta-schema if it is recognized from the IRI value of the referring schema's -`$schema` keyword. This is how behavior (such as Hyper-Schema usage) has been -recognized prior to the existence of vocabularies. - -If the meta-schema, as referenced by the schema, is not recognized, or is -missing, then the behavior is implementation-defined. If the implementation -proceeds with processing the schema, it MUST assume the use of the core -vocabulary. If the implementation is built for a specific purpose, then it -SHOULD assume the use of all of the most relevant vocabularies for that purpose. - -For example, an implementation that is a validator SHOULD assume the use of all -vocabularies in this specification and the companion Validation specification. - -##### Non-inheritability of vocabularies - -Note that the processing restrictions on `$vocabulary` mean that meta-schemas -that reference other meta-schemas using `$ref` or similar keywords do not -automatically inherit the vocabulary declarations of those other meta-schemas. -All such declarations must be repeated in the root of each schema document -intended for use as a meta-schema. This is demonstrated in [the example -meta-schema](#example-meta-schema).[^3] - -[^3]: This requirement allows implementations to find all vocabulary requirement -information in a single place for each meta-schema. 
As schema extensibility -means that there are endless potential ways to combine more fine-grained -meta-schemas by reference, requiring implementations to anticipate all -possibilities and search for vocabularies in referenced meta-schemas would be -overly burdensome. - -#### Updates to Meta-Schema and Vocabulary IRIs - -Updated vocabulary and meta-schema IRIs MAY be published between specification -drafts in order to correct errors. Implementations SHOULD consider IRIs dated -after this specification draft and before the next to indicate the same syntax -and semantics as those listed here. - ### Base IRI, Anchors, and Dereferencing To differentiate between schemas in a vast ecosystem, schemas are identified by @@ -1134,7 +772,7 @@ Therefore it is RECOMMENDED that `$anchor` be used to create plain name fragments unless there is a clear need for `$dynamicAnchor`. If present, the value of these keywords MUST be a string and MUST conform to the -plain name fragment identifier syntax defined in {{fragments}}.[^4] +plain name fragment identifier syntax defined in {{schema-fragments}}.[^4] [^4]: Note that the anchor string does not include the "#" character, as it is not a IRI-reference. An `$anchor`: "foo" becomes the fragment `#foo` when used @@ -1237,11 +875,6 @@ this string to end users. Tools for editing schemas SHOULD support displaying and editing this keyword. The value of this keyword MAY be used in debug or error output which is intended for developers making use of schemas. -Schema vocabularies SHOULD allow `$comment` within any object containing -vocabulary keywords. Implementations MAY assume `$comment` is allowed unless the -vocabulary specifically forbids it. Vocabularies MUST NOT specify any effect of -`$comment` beyond what is described in this specification. 
- Tools that translate other media types or programming languages to and from `application/schema+json` MAY choose to convert that media type or programming language's native comments to or from `$comment` values. The behavior of such @@ -1319,9 +952,8 @@ processed both ways in the course of one session. Implementations MAY allow a schema to be explicitly passed as a meta-schema, for implementation-specific purposes, such as pre-loading a commonly used -meta-schema and checking its vocabulary support requirements up front. -Meta-schema authors MUST NOT expect such features to be interoperable across -implementations. +meta-schema and checking its requirements up front. Meta-schema authors MUST NOT +expect such features to be interoperable across implementations. ### Dereferencing @@ -1471,7 +1103,7 @@ the same document to ease transportation. Each embedded Schema Resource MUST be treated as an individual Schema Resource, following standard schema loading and processing requirements, including -determining vocabulary support. +determining keyword support. #### Bundling @@ -1553,10 +1185,10 @@ recursive nesting like this; the behavior is undefined. #### References to Possible Non-Schemas {#non-schemas} Subschema objects (or booleans) are recognized by their use with known -applicator keywords or with location-reserving keywords such as [`$defs`](#defs) -that take one or more subschemas as a value. These keywords may be `$defs` and -the standard applicators from this document, or extension keywords from a known -vocabulary, or implementation-specific custom keywords. +applicator keywords or with location-reserving keywords such as +[`$defs`](#defs) that take one or more subschemas as a value. These keywords may +be `$defs` and the standard applicators from this document or +implementation-specific custom keywords. Multi-level structures of unknown keywords are capable of introducing nested subschemas, which would be subject to the processing rules for `$id`. 
Therefore, @@ -1655,27 +1287,17 @@ User-Agent: product-name/5.4.1 so-cool-json-schema/1.0.2 curl/7.43.0 Clients SHOULD be able to make requests with a "From" header so that server operators can contact the owner of a potentially misbehaving script. -## A Vocabulary for Applying Subschemas {#applicatorvocab} - -This section defines a vocabulary of applicator keywords that are RECOMMENDED -for use as the basis of other vocabularies. +## Keywords for Applying Subschemas -Meta-schemas that do not use `$vocabulary` SHOULD be considered to require this -vocabulary as if its IRI were present with a value of true. - -The current IRI for this vocabulary, known as the Applicator vocabulary, is: -`https://json-schema.org/draft/next/vocab/applicator`. - -The current IRI for the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/applicator`. +This section defines a set of keywords that enable schema combinations and +composition. ### Keyword Independence Schema keywords typically operate independently, without affecting each other's outcomes. -For schema author convenience, there are some exceptions among the keywords in -this vocabulary: +For schema author convenience, there are some exceptions among these keywords: - `additionalProperties`, whose behavior is defined in terms of `properties` and `patternProperties` @@ -1849,8 +1471,7 @@ keyword. If the `items` subschema is applied to any positions within the instance array, it produces an annotation result of boolean true, indicating that all remaining array elements have been evaluated against this keyword's subschema. This -annotation affects the behavior of `unevaluatedItems` in the Unevaluated -vocabulary. +annotation affects the behavior of `unevaluatedItems`. Omitting this keyword has the same assertion behavior as an empty schema. @@ -1872,8 +1493,7 @@ validates against the corresponding schema. 
The annotation result of this keyword is the set of instance property names which are also present under this keyword. This annotation affects the behavior -of `additionalProperties` (in this vocabulary) and `unevaluatedProperties` in -the Unevaluated vocabulary. +of `additionalProperties` and `unevaluatedProperties`. Omitting this keyword has the same assertion behavior as an empty object. @@ -1892,8 +1512,7 @@ not implicitly anchored. The annotation result of this keyword is the set of instance property names matched by at least one property under this keyword. This annotation affects the -behavior of `additionalProperties` (in this vocabulary) and -`unevaluatedProperties` (in the Unevaluated vocabulary). +behavior of `additionalProperties` and `unevaluatedProperties`. Omitting this keyword has the same assertion behavior as an empty object. @@ -1912,7 +1531,7 @@ against the `additionalProperties` schema. The annotation result of this keyword is the set of instance property names validated by this keyword's subschema. This annotation affects the behavior of -`unevaluatedProperties` in the Unevaluated vocabulary. +`unevaluatedProperties`. Omitting this keyword has the same assertion behavior as an empty schema. @@ -1996,14 +1615,13 @@ successfully when applied to every index of the instance. The annotation MUST be present if the instance array to which this keyword's schema applies is empty. -This annotation affects the behavior of `unevaluatedItems` in the Unevaluated -vocabulary. +This annotation affects the behavior of `unevaluatedItems`. The subschema MUST be applied to every array element even after the first match has been found, in order to collect annotations for use by other keywords. This is to ensure that all possible annotations are collected. 
-## A Vocabulary for Unevaluated Locations +## Keywords for Unevaluated Locations The purpose of these keywords is to enable schema authors to apply subschemas to array items or object properties that have not been successfully evaluated @@ -2030,19 +1648,10 @@ subschemas. The behavior of these keywords depend on the annotation results of adjacent keywords that apply to the instance location being validated. -Meta-schemas that do not use `$vocabulary` SHOULD be considered to require this -vocabulary as if its IRI were present with a value of true. - -The current IRI for this vocabulary, known as the Unevaluated Applicator -vocabulary, is: `https://json-schema.org/draft/next/vocab/unevaluated`. - -The current IRI for the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/unevaluated`. - ### Keyword Independence Schema keywords typically operate independently, without affecting each other's -outcomes. However, the keywords in this vocabulary are notable exceptions: +outcomes. However, these keywords are notable exceptions: - `unevaluatedItems`, whose behavior is defined in terms of annotations from `prefixItems`, `items`, `contains`, and itself @@ -2219,7 +1828,7 @@ Servers MUST ensure that malicious parties cannot change the functionality of existing schemas by uploading a schema with a pre-existing or very similar `$id`. -Individual JSON Schema vocabularies are liable to also have their own security +Individual JSON Schema extensions are liable to also have their own security considerations. Consult the respective specifications for more information. Schema authors should take care with `$comment` contents, as a malicious @@ -2250,7 +1859,7 @@ Security considerations:: See {{security}} above. Interoperability considerations:: See Sections [6.2](#language), [6.3](#integers), and [6.4](#regex) above. 
-Fragment identifier considerations:: See {{fragments}} +Fragment identifier considerations:: See {{schema-fragments}} ### `application/schema-instance+json` @@ -2271,7 +1880,7 @@ Security considerations:: See {{security}} above. Interoperability considerations:: See Sections [6.2](#language), [6.3](#integers), and [6.4](#regex) above. -Fragment identifier considerations:: See {{fragments}} +Fragment identifier considerations:: See {{instance-fragments}} ## References @@ -2413,7 +2022,7 @@ name fragment identifiers. The schemas at the following IRI-encoded [JSON Pointers](#rfc6901) (relative to the root schema) have the following base IRIs, and are identifiable by any -listed IRI in accordance with {{fragments}} and {{embedded}} above. +listed IRI in accordance with {{schema-fragments}} and {{embedded}} above. `#` (document root): canonical (and base) IRI: `https://example.com/root.json` - canonical resource IRI plus pointer fragment: `https://example.com/root.json#` @@ -2598,150 +2207,6 @@ of the node schema objects were moved under `$defs`. It is the matching `$dynamicAnchor` values which tell us how to resolve the dynamic reference, not any sort of correlation in JSON structure. -## [Appendix] Working with vocabularies - -### Best practices for vocabulary and meta-schema authors {#vocab-practices} - -Vocabulary authors should take care to avoid keyword name collisions if the -vocabulary is intended for broad use, and potentially combined with other -vocabularies. JSON Schema does not provide any formal namespacing system, but -also does not constrain keyword names, allowing for any number of namespacing -approaches. - -Vocabularies may build on each other, such as by defining the behavior of their -keywords with respect to the behavior of keywords from another vocabulary, or by -using a keyword from another vocabulary with a restricted or expanded set of -acceptable values. 
Not all such vocabulary re-use will result in a new -vocabulary that is compatible with the vocabulary on which it is built. -Vocabulary authors should clearly document what level of compatibility, if any, -is expected. - -Meta-schema authors should not use `$vocabulary` to combine multiple -vocabularies that define conflicting syntax or semantics for the same keyword. -As semantic conflicts are not generally detectable through schema validation, -implementations are not expected to detect such conflicts. If conflicting -vocabularies are declared, the resulting behavior is undefined. - -Vocabulary authors SHOULD provide a meta-schema that validates the expected -usage of the vocabulary's keywords on their own. Such meta-schemas SHOULD not -forbid additional keywords, and MUST not forbid any keywords from the Core -vocabulary. - -It is recommended that meta-schema authors reference each vocabulary's -meta-schema using the [`allOf`](#allof) keyword, although other mechanisms for -constructing the meta-schema may be appropriate for certain use cases. - -The recursive nature of meta-schemas makes the `$dynamicAnchor` and -`$dynamicRef` keywords particularly useful for extending existing meta-schemas, -as can be seen in the JSON Hyper-Schema meta-schema which extends the Validation -meta-schema. - -Meta-schemas may impose additional constraints, including describing keywords -not present in any vocabulary, beyond what the meta-schemas associated with the -declared vocabularies describe. This allows for restricting usage to a subset of -a vocabulary, and for validating locally defined keywords not intended for -re-use. - -However, meta-schemas should not contradict any vocabularies that they declare, -such as by requiring a different JSON type than the vocabulary expects. The -resulting behavior is undefined. - -Meta-schemas intended for local use, with no need to test for vocabulary support -in arbitrary implementations, can safely omit `$vocabulary` entirely. 
-
-### Example meta-schema with vocabulary declarations {#example-meta-schema}
-
-This meta-schema explicitly declares both the Core and Applicator vocabularies,
-together with an extension vocabulary, and combines their meta-schemas with an
-`allOf`. The extension vocabulary's meta-schema, which describes only the
-keywords in that vocabulary, is shown after the main example meta-schema.
-
-The main example meta-schema also restricts the usage of the Unevaluated
-vocabulary by forbidding the keywords prefixed with "unevaluated", which are
-particularly complex to implement. This does not change the semantics or set of
-keywords defined by the other vocabularies. It just ensures that schemas using
-this meta-schema that attempt to use the keywords prefixed with "unevaluated"
-will fail validation against this meta-schema.
-
-Finally, this meta-schema describes the syntax of a keyword, "localKeyword",
-that is not part of any vocabulary. Presumably, the implementors and users of
-this meta-schema will understand the semantics of "localKeyword". JSON Schema
-does not define any mechanism for expressing keyword semantics outside of
-vocabularies, making them unsuitable for use except in a specific environment in
-which they are understood.
-
-This meta-schema combines several vocabularies for general use.
-
-```jsonschema
-{
-  "$schema": "https://json-schema.org/draft/next/schema",
-  "$id": "https://example.com/meta/general-use-example",
-  "$dynamicAnchor": "meta",
-  "$vocabulary": {
-    "https://json-schema.org/draft/next/vocab/core": true,
-    "https://json-schema.org/draft/next/vocab/applicator": true,
-    "https://json-schema.org/draft/next/vocab/validation": true,
-    "https://example.com/vocab/example-vocab": true
-  },
-  "allOf": [
-    {"$ref": "https://json-schema.org/draft/next/meta/core"},
-    {"$ref": "https://json-schema.org/draft/next/meta/applicator"},
-    {"$ref": "https://json-schema.org/draft/next/meta/validation"},
-    {"$ref": "https://example.com/meta/example-vocab"},
-  ],
-  "patternProperties": {
-    "^unevaluated": false
-  },
-  "properties": {
-    "localKeyword": {
-      "$comment": "Not in vocabulary, but validated if used",
-      "type": "string"
-    }
-  }
-}
-```
-
-This meta-schema describes only a single extension vocabulary.
-
-```jsonschema
-{
-  "$schema": "https://json-schema.org/draft/next/schema",
-  "$id": "https://example.com/meta/example-vocab",
-  "$dynamicAnchor": "meta",
-  "$vocabulary": {
-    "https://example.com/vocab/example-vocab": true,
-  },
-  "type": ["object", "boolean"],
-  "properties": {
-    "minDate": {
-      "type": "string",
-      "pattern": "\\d\\d\\d\\d-\\d\\d-\\d\\d",
-      "format": "date",
-    }
-  }
-}
-```
-
-As shown above, even though each of the single-vocabulary meta-schemas
-referenced in the general-use meta-schema's `allOf` declares its corresponding
-vocabulary, this new meta-schema must re-declare them.
-
-The standard meta-schemas that combine all vocabularies defined by the Core and
-Validation specification, and that combine all vocabularies defined by those
-specifications as well as the Hyper-Schema specification, demonstrate additional
-complex combinations. These IRIs for these meta-schemas may be found in the
-Validation and Hyper-Schema specifications, respectively.
- -While the general-use meta-schema can validate the syntax of `minDate`, it is -the vocabulary that defines the logic behind the semantic meaning of `minDate`. -Without an understanding of the semantics (in this example, that the instance -value must be a date equal to or after the date provided as the keyword's value -in the schema), an implementation can only validate the syntactic usage. In this -case, that means validating that it is a date-formatted string (using `pattern` -to ensure that it is validated even when `format` functions purely as an -annotation, as explained in the [Validation -specification](#json-schema-validation). - ## [Appendix] References and generative use cases While the presence of references is expected to be transparent to validation diff --git a/jsonschema-validation.md b/jsonschema-validation.md index acf6b781..d3352514 100644 --- a/jsonschema-validation.md +++ b/jsonschema-validation.md @@ -55,11 +55,11 @@ which it applies. This greatly simplifies the implementation requirements for validators by ensuring that they do not need to maintain state across the document-wide validation process. -This specification defines a set of assertion keywords, as well as a small -vocabulary of metadata keywords that can be used to annotate the JSON instance -with useful information. The {{format}} keyword is intended primarily as an -annotation, but can optionally be used as an assertion. The {{content}} keywords -are annotations for working with documents embedded as JSON strings. +This specification defines a set of assertion keywords, as well as a number of +metadata keywords that can be used to annotate the JSON instance with useful +information. The {{format}} keyword is intended primarily as an annotation, but +can optionally be used as an assertion. The {{content}} keywords are annotations +for working with documents embedded as JSON strings. 
## Interoperability Considerations @@ -87,32 +87,21 @@ regular expressions in the [JSON Schema Core](#json-schema) specification. The current IRI for the default JSON Schema dialect meta-schema is `https://json-schema.org/draft/next/schema`. For schema author convenience, this -meta-schema describes a dialect consisting of all vocabularies defined in this -specification and the JSON Schema Core specification, as well as two former -keywords which are reserved for a transitional period. Individual vocabulary and -vocabulary meta-schema IRIs are given for each section below. Certain -vocabularies are optional to support, which is explained in detail in the -relevant sections. +meta-schema describes a dialect consisting of all keywords defined in this +specification and the JSON Schema Core specification. Certain keywords specify +some functionality which is optional to support and is explained in detail in +the relevant sections. -Updated vocabulary and meta-schema IRIs MAY be published between specification -drafts in order to correct errors. Implementations SHOULD consider IRIs dated -after this specification draft and before the next to indicate the same syntax -and semantics as those listed here. +Updated meta-schema IRIs MAY be published between specification drafts in order +to correct errors. Implementations SHOULD consider IRIs dated after this +specification draft and before the next to indicate the same syntax and +semantics as those listed here. -## A Vocabulary for Structural Validation +## Keywords for Structural Validation Validation keywords in a schema impose requirements for successful validation of an instance. These keywords are all assertions without any annotation behavior. -Meta-schemas that do not use `$vocabulary` SHOULD be considered to require this -vocabulary as if its IRI were present with a value of true. - -The current IRI for this vocabulary, known as the Validation vocabulary, is: -`https://json-schema.org/draft/next/vocab/validation`. 
- -The current IRI for the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/validation`. - ### Validation Keywords for Any Instance Type {#general} #### `type` @@ -295,7 +284,7 @@ the name of a property in the instance. Omitting this keyword has the same behavior as an empty object. -## Vocabularies for Semantic Content With `format` {#format} +## Semantic Content With `format` {#format} ### Foreword @@ -320,115 +309,57 @@ can be used alongside the `type` keyword with a value of "integer", or could be explicitly defined to always pass if the number is not an integer, which produces essentially the same behavior as only applying to integers. -The current IRI for this vocabulary, known as the Format-Annotation vocabulary, -is: `https://json-schema.org/draft/next/vocab/format-annotation`. The current -IRI for the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/format-annotation`. Implementing -support for this vocabulary is REQUIRED. - -In addition to the Format-Annotation vocabulary, a secondary vocabulary is -available for custom meta-schemas that defines `format` as an assertion. The IRI -for the Format-Assertion vocabulary, is: -`https://json-schema.org/draft/next/vocab/format-assertion`. The current IRI for -the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/format-assertion`. Implementing support -for the Format-Assertion vocabulary is OPTIONAL. - -Specifying both the Format-Annotation and the Format-Assertion vocabularies is -functionally equivalent to specifying only the Format-Assertion vocabulary since -its requirements are a superset of the Format-Annotation vocabulary. - -### Implementation Requirements - -The `format` keyword functions as defined by the vocabulary which is referenced. - -#### Format-Annotation Vocabulary - -The value of format MUST be collected as an annotation, if the implementation -supports annotation collection. 
This enables application-level validation when -schema validation is unavailable or inadequate. - -Implementations MAY still treat `format` as an assertion in addition to an -annotation and attempt to validate the value's conformance to the specified -semantics. The implementation MUST provide options to enable and disable such -evaluation and MUST be disabled by default. Implementations SHOULD document -their level of support for such validation.[^2] +Implementing support for `format` as an annotation is REQUIRED (if the +implementation supports annotation collection). -[^2]: Specifying the Format-Annotation vocabulary and enabling validation in an -implementation should not be viewed as being equivalent to specifying the -Format-Assertion vocabulary since implementations are not required to provide -full validation support when the Format-Assertion vocabulary is not specified. - -When the implementation is configured for assertion behavior, it: +Implementing support for `format` as an assertion is OPTIONAL. Implementations +which choose to support assertion behavior: +- MUST still collect the keyword's value as an annotation (if the implementation + supports annotation collection), +- MUST provide a configuration option to enable assertion behavior, defaulting to + annotation-only behavior - SHOULD provide an implementation-specific best effort validation for each - format attribute defined below; + format attribute defined below;[^3] - MAY choose to implement validation of any or all format attributes as a no-op - by always producing a validation result of true;[^3] + by always producing a validation result of true;[^4] +- SHOULD use a common parsing library for each format, or a well-known regular + expression; +- SHOULD clearly document how and to what degree each format attribute is + validated. + +[^3]: The expectation is that for simple formats such as date-time, syntactic +validation will be thorough. 
For a complex format such as email addresses, which +are the amalgamation of various standards and numerous adjustments over time, +with obscure and/or obsolete rules that may or may not be restricted by other +applications making use of the value, a minimal validation is sufficient. For +example, an instance string that does not contain an "@" is clearly not a valid +email address, and an "email" or "hostname" containing characters outside of +7-bit ASCII is likewise clearly invalid. -[^3]: This matches the current reality of implementations, which provide widely +[^4]: This matches the current reality of implementations, which provide widely varying levels of validation, including no validation at all, for some or all format attributes. It is also designed to encourage relying only on the annotation behavior and performing semantic validation in the application, which is the recommended best practice. -#### Format-Assertion Vocabulary - -When the Format-Assertion vocabulary is declared with a value of true, -implementations MUST provide full validation support for all of the formats -defined by this specification. Implementations that cannot provide full -validation support MUST refuse to process the schema. - -An implementation that supports the Format-Assertion vocabulary: - -- MUST still collect `format` as an annotation if the implementation supports - annotation collection; -- MUST evaluate `format` as an assertion; -- MUST implement syntactic validation for all format attributes defined in this - specification, and for any additional format attributes that it recognizes, - such that there exist possible instance values of the correct type that will - fail validation. - The requirement for minimal validation of format attributes is intentionally vague and permissive, due to the complexity involved in many of the attributes. 
Note in particular that the requirement is limited to syntactic checking; it is not to be expected that an implementation would send an email, attempt to connect to a URL, or otherwise check the existence of an entity -identified by a format instance.[^4] - -[^4]: The expectation is that for simple formats such as date-time, syntactic -validation will be thorough. For a complex format such as email addresses, which -are the amalgamation of various standards and numerous adjustments over time, -with obscure and/or obsolete rules that may or may not be restricted by other -applications making use of the value, a minimal validation is sufficient. For -example, an instance string that does not contain an "@" is clearly not a valid -email address, and an "email" or "hostname" containing characters outside of -7-bit ASCII is likewise clearly invalid. - -It is RECOMMENDED that implementations use a common parsing library for each -format, or a well-known regular expression. Implementations SHOULD clearly -document how and to what degree each format attribute is validated. - -The [standard core and validation meta-schema](#meta-schema) includes this -vocabulary in its `$vocabulary` keyword with a value of false, since by default -implementations are not required to support this keyword as an assertion. -Supporting the format vocabulary with a value of true is understood to greatly -increase code size and in some cases execution time, and will not be appropriate -for all implementations. +identified by a format instance. #### Custom format attributes Implementations MAY support custom format attributes. Save for agreement between parties, schema authors SHALL NOT expect a peer implementation to support such -custom format attributes. An implementation MUST NOT fail to collect unknown -formats as annotations. When the Format-Assertion vocabulary is specified, -implementations MUST fail upon encountering unknown formats. +custom format attributes. 
-Vocabularies do not support specifically declaring different value sets for -keywords. Due to this limitation, and the historically uneven implementation of -this keyword, it is RECOMMENDED to define additional keywords in a custom -vocabulary rather than additional format attributes if interoperability is -desired. +An implementation MUST NOT fail to collect unknown formats as annotations. + +When configured for assertion behavior for `format`, implementations MUST fail +upon encountering unknown formats. ### Defined Formats @@ -560,7 +491,7 @@ Implementations that validate formats MUST accept at least the subset of ECMA-262 defined in {{regexinterop}}), and SHOULD accept all valid ECMA-262 expressions. -## A Vocabulary for the Contents of String-Encoded Data {#content} +## Keywords for the Contents of String-Encoded Data {#content} ### Foreword @@ -573,15 +504,6 @@ encoded, and/or how it may be validated. They do not function as validation assertions; a malformed string-encoded document MUST NOT cause the containing instance to be considered invalid. -Meta-schemas that do not use `$vocabulary` SHOULD be considered to require this -vocabulary as if its IRI were present with a value of true. - -The current IRI for this vocabulary, known as the Content vocabulary, is: -`https://json-schema.org/draft/next/vocab/content`. - -The current IRI for the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/content`. - ### Implementation Requirements Due to security and performance concerns, as well as the open-ended nature of @@ -710,20 +632,12 @@ structures: first the header, and then the payload. Since the JWT media type ensures that the JWT can be represented in a JSON string, there is no need for further encoding or decoding. -## A Vocabulary for Basic Meta-Data Annotations These general-purpose annotation -keywords provide commonly used information for documentation and user interface -display purposes. 
They are not intended to form a comprehensive set of features. -Rather, additional vocabularies can be defined for more complex annotation-based -applications. - -Meta-schemas that do not use `$vocabulary` SHOULD be considered to require this -vocabulary as if its IRI were present with a value of true. +## Keywords for Basic Meta-Data Annotations -The current IRI for this vocabulary, known as the Meta-Data vocabulary, is: -`https://json-schema.org/draft/next/vocab/meta-data`. - -The current IRI for the corresponding meta-schema is: -`https://json-schema.org/draft/next/meta/meta-data`. +These general-purpose annotation keywords provide commonly used information for +documentation and user interface display purposes. They are not intended to form +a comprehensive set of features. Rather, additional keywords can be defined +for more complex annotation-based applications. ### `title` and `description` @@ -815,10 +729,10 @@ example. If `examples` is absent, `default` MAY still be used in this manner. ## Security Considerations {#security} -JSON Schema validation defines a vocabulary for JSON Schema core and concerns -all the security considerations listed there. +JSON Schema Validation assumes all the security considerations listed in the +JSON Schema Core specification. -JSON Schema validation allows the use of Regular Expressions, which have +JSON Schema Validation allows the use of Regular Expressions, which have numerous different (often incompatible) implementations. Some implementations allow the embedding of arbitrary code, which is outside the scope of JSON Schema and MUST NOT be permitted. Regular expressions can often also be crafted to be @@ -969,40 +883,6 @@ draft-bhutton-json-schema-01, June 2022, Hoehrmann, B., "Scripting Media Types", RFC 4329, DOI 10.17487/RFC4329, April 2006, <>. 
-## [Appendix] Keywords Moved from Validation to Core - -Several keywords have been moved from this document into the [Core -Specification](#json-schema) starting with draft 2019-09, in some cases with -re-naming or other changes. This affects the following former validation -keywords: - -- *`definitions`* Renamed to `$defs` to match `$ref` and be shorter to type. - Schema vocabulary authors SHOULD NOT define a `definitions` keyword with - different behavior in order to avoid invalidating schemas that still use the - older name. While `definitions` is absent in the single-vocabulary - meta-schemas referenced by this document, it remains present in the default - meta-schema, and implementations SHOULD assume that `$defs` and `definitions` - have the same behavior when that meta-schema is used. -- *`allOf`, `anyOf`, `oneOf`, `not`, `if`, `then`, `else`, `items`, - `additionalItems`, `contains`, `propertyNames`, `properties`, - `patternProperties`, `additionalProperties`* All of these keywords apply - subschemas to the instance and combine their results, without asserting any - conditions of their own. Without assertion keywords, these applicators can - only cause assertion failures by using the false boolean schema, or by - inverting the result of the true boolean schema (or equivalent schema - objects). For this reason, they are better defined as a generic mechanism on - which validation, hyper-schema, and extension vocabularies can all be based. -- *`maxContains`, `minContains`* These keywords modify the behavior of - `contains`, and are therefore grouped with it in the applicator vocabulary. -- *`dependencies`* This keyword had two different modes of behavior, which made - it relatively challenging to implement and reason about. The schema form has - been moved to Core and renamed to `dependentSchemas`, as part of the - applicator vocabulary. 
It is analogous to `properties`, except that instead of - applying its subschema to the property value, it applies it to the object - containing the property. The property name array form is retained here and - renamed to `dependentRequired`, as it is an assertion which is a shortcut for - the conditional use of the `required` assertion keyword. - ## [Appendix] Acknowledgments Thanks to Gary Court, Francis Galiegue, Kris Zyp, Geraint Luff, and Henry diff --git a/jsonschema-vocab-annotations.md b/jsonschema-vocab-annotations.md new file mode 100644 index 00000000..e69de29b diff --git a/jsonschema-vocab-applicators.md b/jsonschema-vocab-applicators.md new file mode 100644 index 00000000..e69de29b diff --git a/jsonschema-vocab-assertions.md b/jsonschema-vocab-assertions.md new file mode 100644 index 00000000..e69de29b diff --git a/jsonschema-vocab-format.md b/jsonschema-vocab-format.md new file mode 100644 index 00000000..e69de29b
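
A minimal sketch of the revised `format` requirements introduced in the `jsonschema-validation.md` hunks above. The property name `created` and the instance values are hypothetical and do not come from the specification text.

```jsonschema
{
  "$schema": "https://json-schema.org/draft/next/schema",
  "type": "object",
  "properties": {
    "created": {
      "$comment": "Hypothetical property; `format` is annotation-only unless assertion behavior is enabled",
      "type": "string",
      "format": "date"
    }
  }
}
```

Under the default annotation-only behavior, an instance such as `{"created": "tomorrow"}` passes validation, and an implementation that supports annotation collection records `"date"` as the `format` annotation for `/created`. If the implementation's configuration option for assertion behavior is enabled, an implementation that performs best-effort validation of `date` would be expected to reject that instance. One that implements the attribute as a no-op would still accept it, which the revised text permits.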