From d5da98be23fe029f2109fb900c6a6d556508271d Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Wed, 7 May 2025 13:09:48 -0700 Subject: [PATCH 01/13] Support ordered multipart including streaming This adds support for all `multipart` media types that do not have named parts, including support for streaming such media types. Note that `multipart/mixed` defines the basic processing rules for all `multipart` types, and implementations that encounter unrecognized `multipart` subtypes are required to process them as `multipart/mixed`. Therefore support for `multipart/mixed` addresses all other subtypes to some degree. This builds on the recent support for sequential media types: * `multipart/mixed` and similar meet the definition for a sequential media type, requiring it to be modeled as an array. This does use an expansive definition of "repeating the same structure", where the structure is literally any content with a media type. * As a sequential media type, it also supports `itemSchema` * Adding a parallel `itemEncoding` is the obvious solution to `multipart/mixed` streams requiring an Encoding Object * We have regularly received requests to support truly mixed `multipart/mixed` payloads, and previously claimed such support from 3.0.0 onwards, without actually supporting it. Adding `prefixEncoding` along with `itemEncoding` supports this use case with a clear parallel to `prefixItems`, which is the schema construct needed to support this case. * There is no need for a `prefixSchema` field because the streaming use case requires a repetition of the same schema for each item. 
Therefore all mixed use cases can use `schema` and `prefixItems` --- src/oas.md | 143 ++++++++++++++++++++++++++--- src/schemas/validation/schema.yaml | 13 ++- 2 files changed, 141 insertions(+), 15 deletions(-) diff --git a/src/oas.md b/src/oas.md index e91c9cb08a..a88ac47bd9 100644 --- a/src/oas.md +++ b/src/oas.md @@ -101,14 +101,17 @@ Some examples of sequential media types (including some that are not IANA-regist application/json-seq application/geo+json-seq text/event-stream + multipart/mixed ``` In the first three above, the repeating structure is any [JSON value](https://tools.ietf.org/html/rfc8259#section-3). -The fourth repeats `application/geo+json`-structured values, while the last repeats a custom text format related to Server-Sent Events. +The fourth repeats `application/geo+json`-structured values, while `text/event-stream` repeats a custom text format related to Server-Sent Events. +The final media type listed above, `multipart/mixed`, provides an ordered list of documents of any media type, and is sometimes streamed. Implementations MUST support mapping sequential media types into the JSON Schema data model by treating them as if the values were in an array in the same order. See [Complete vs Streaming Content](#complete-vs-streaming-content) for more information on handling sequential media types in a streaming context, including special considerations for `text/event-stream` content. +For `multipart` types, see also [Encoding By Position](#encoding-by-position). #### Media Type Registry @@ -1253,7 +1256,9 @@ See [Working With Examples](#working-with-examples) for further guidance regardi | itemSchema | [Schema Object](#schema-object) | A schema describing each item within a [sequential media type](#sequential-media-types). | | example | Any | Example of the media type; see [Working With Examples](#working-with-examples). 
| | examples | Map[ `string`, [Example Object](#example-object) \| [Reference Object](#reference-object)] | Examples of the media type; see [Working With Examples](#working-with-examples). | -| encoding | Map[`string`, [Encoding Object](#encoding-object)] | A map between a property name and its encoding information, as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). The `encoding` field SHALL only apply when the media type is `multipart` or `application/x-www-form-urlencoded`. If no Encoding Object is provided for a property, the behavior is determined by the default values documented for the Encoding Object. | +| encoding | Map[`string`, [Encoding Object](#encoding-object)] | A map between a property name and its encoding information, as defined under [Encoding By Name](#encoding-by-name). The `encoding` field SHALL only apply when the media type is `multipart` or `application/x-www-form-urlencoded`. If no Encoding Object is provided for a property, the behavior is determined by the default values documented for the Encoding Object. This field MUST NOT be present if `prefixEncoding` or `itemEncoding` are present. | +| prefixEncoding | [[Encoding Object](#encoding-object)] | An array of positional encoding information, as defined under [Encoding By Position](#encoding-by-position). The `prefixEncoding` field SHALL only apply when the media type is `multipart`. If no Encoding Object is provided for a property, the behavior is determined by the default values documented for the Encoding Object. This field MUST NOT be present if `encoding` is present. | +| itemEncoding | [Encoding Object](#encoding-object) | A single Encoding Object that provides encoding information for multiple array items, as defined under [Encoding By Position](#encoding-by-position). The `itemEncoding` field SHALL only apply when the media type is `multipart`. 
If no Encoding Object is provided for a property, the behavior is determined by the default values documented for the Encoding Object. This field MUST NOT be present if `encoding` is present. | This object MAY be extended with [Specification Extensions](#specification-extensions). @@ -1273,7 +1278,8 @@ For this use case, `maxLength` MAY be implemented outside of regular JSON Schema ###### Streaming Sequential Media Types -The `itemSchema` field is provided to support streaming use cases for sequential media types. +The `itemSchema` field is provided to support streaming use cases for sequential media types, with `itemEncoding` as a corresponding encoding mechanism for streaming [positional `multipart` media types](#encoding-by-position). + Unlike `schema`, which is applied to the complete content (treated as an array as described in the [sequential media types](#sequential-media-types) section), `itemSchema` MUST be applied to each item in the stream independently, which supports processing each item as it is read from the stream. Both `schema` and `itemSchema` MAY be used in the same Media Type Object. @@ -1309,13 +1315,16 @@ properties: ##### Encoding Usage and Restrictions -The `encoding` field defines how to map each [Encoding Object](#encoding-object) to a specific value in the data. +The three encoding fields define how to map each [Encoding Object](#encoding object) to a specific value in the data. +Each field has its own set of media types with which it can be used; for all other media types all three fields SHALL be ignored. -To use the `encoding` field, a `schema` MUST exist, and the `encoding` field's keys MUST exist in the schema as properties. -Array properties MUST be handled by applying the given Encoding Object to one part per array item, each with the same `name`, as is recommended by [[?RFC7578]] [Section 4.3](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. 
-For all other value types for both top-level non-array properties and for values, including array values, within a top-level array, the Encoding Object MUST be applied to the entire value. +###### Encoding By Name The behavior of the `encoding` field is designed to support web forms, and is therefore only defined for media types structured as name-value pairs that allow repeat values, most notably `application/x-www-form-urlencoded` and `multipart/form-data`. + +To use the `encoding` field, each key under the field MUST exist in the `schema` as a property. +Array properties MUST be handled by applying the given Encoding Object to produce one encoded value per array item, each with the same `name`, as is recommended by [[?RFC7578]] [Section 4.3](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. +For all other value types for both top-level non-array properties and for values, including array values, within a top-level array, the Encoding Object MUST be applied to the entire value. The order of these name-value pairs in the target media type is implementation-defined. For `application/x-www-form-urlencoded`, the encoding keys MUST map to parameter names, with the values produced according to the rules of the [Encoding Object](#encoding-object). @@ -1324,15 +1333,29 @@ See [Encoding the `x-www-form-urlencoded` Media Type](#encoding-the-x-www-form-u For `multipart`, the encoding keys MUST map to the [`name` parameter](https://www.rfc-editor.org/rfc/rfc7578#section-4.2) of the `Content-Disposition: form-data` header of each part, as is defined for `multipart/form-data` in [[?RFC7578]]. See [[?RFC7578]] [Section 5](https://www.rfc-editor.org/rfc/rfc7578.html#section-5) for guidance regarding non-ASCII part names. -Other `multipart` media types are not directly supported as they do not define a mechanism for part names. 
-However, the usage of a `name` [`Content-Disposition` parameter](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-2) is defined for the `form-data` [`Content-Disposition` value](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-1), which is not restricted to `multipart/form-data`. -Implementations MAY choose to support the a `Conent-Disposition` of `form-data` with a `name` parameter in other `multipart` media types in order to use the `encoding` field with them, but this usage is unlikely to be supported by generic `multipart` implementations. - See [Encoding `multipart` Media Types](#encoding-multipart-media-types) for further guidance and examples, both with and without the `encoding` field. +###### Encoding By Position + +Most `multipart` media types, including `multipart/mixed` which defines the underlying rules for parsing all `multipart` types, do not have named parts. +Data for these media types are modeled as an array, with one item per part, in order. + +To use the `prefixEncoding` and/or `itemEncoding` fields, either an array `schema` or `itemSchema` MUST be present. +These fields are analogous to the `prefixItems` and `items` JSON Schema keywords, with `prefixEncoding` (if present) providing an array of Encoding Objects that are each applied to the value at the same position in the data array, and `itemEncoding` applying its single Encoding Object to all remaining items in the array. + +The `itemEncoding` field can also be used with `itemSchema` to support streaming `multipart` content. + +###### Additional Encoding Approaches + +The `prefixEncoding` field can be used with any `multipart` content to require a fixed part order. +This includes `multipart/form-data`, for which the Encoding Object's `headers` field MUST be used to provide the `Content-Disposition` and part name, as no property names exist to provide the names automatically. 
+ +Prior versions of this specifications advised using the `name` [`Content-Disposition` parameter](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-2) of the `form-data` [`Content-Disposition` value](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-1) with `multipart` media types other than `multipart/form-data` in order to work around the limitations of the `encoding` field. +Implementations MAY choose to support this workaround, but as this usage is not common, implementations of non-`form-data` `multipart` media types are unlikely to support it. + ##### Media Type Examples -For form-related media type examples, see the [Encoding Object](#encoding-object). +For form-related and `multipart` media type examples, see the [Encoding Object](#encoding-object). ###### JSON @@ -1645,8 +1668,9 @@ These fields MAY be used either with or without the RFC6570-style serialization This object MAY be extended with [Specification Extensions](#specification-extensions). The default values for `contentType` are as follows, where an _n/a_ in the `contentEncoding` column means that the presence or value of `contentEncoding` is irrelevant. -This table is based on the value to which the Encoding Object is being applied, which as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions) is the array item for properties of type `"array"`, and the entire value for all other types. -Therefore the `array` row in this table applies only to array values inside of a top-level array. +This table is based on the value to which the Encoding Object is being applied as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). +Note that in the case of [Encoding By Name](#encoding-by-name), this value is the array item for properties of type `"array"`, and the entire value for all other types. 
+Therefore the `array` row in this table applies only to array values inside of a top-level array when encoding by name. | `type` | `contentEncoding` | Default `contentType` | | ---- | ---- | ---- | @@ -1869,6 +1893,97 @@ requestBody: As seen in the [Encoding Object's `contentType` field documentation](#encoding-content-type), the empty schema for `items` indicates a media type of `application/octet-stream`. +###### Example: Ordered, Unnamed Multipart + +A `multipart/mixed` payload consisting of a JSON metadata document followed by an image which the metadata describes: + +```yaml +multipart/mixed: + schema: + prefixItems: + - # default content type for objects + # is `application/json`type: object + properties: + author: + type: string + created: + type: string + format: date-time + copyright: + type: string + license: + type: string + - # default content type for a schema without `type` + # is `application/octet-stream`, which we need + # to override. + {} + prefixEncoding: + - # Encoding Object defaults are correct for JSON + {} + - contentType: image/* +``` + +###### Example: Ordered Multipart With Required Header + +As described in [[?RFC2557]], a set of HTML pages can be sent in a `multipart/related` payload, preserving links among themselves by defining a `Content-Location` header for each page. + +See [Appendix D](#appendix-d-serializing-headers-and-cookies) for an explanation of why `content: {text/plain: {...}}` is used to describe the header value. + +```yaml +multipart/related: + schema: + items: + type: string + itemEncoding: + contentType: text/html + headers: + Content-Location: + required: true + content: + text/plain: + schema: + type: string + format: uri +``` + +While the above example could have used `itemSchema` instead, if the payload is expected to be processed all at once, using `schema` ensures that tools will wait until the complete response is available before processing.
+###### Example: Streaming Multipart + +This example assumes a device that takes large sets of pictures and streams them to the caller. +Unlike the previous example, we use `itemSchema` here because the expectation is that each image is processed as it arrives (or in small batches), since we know that buffering the entire stream will take too much memory. + +```yaml +multipart/mixed: + itemSchema: + $comment: A single data image from the device + itemEncoding: + contentType: image/jpeg +``` + +###### Example: Streaming Byte Ranges + +For `multipart/byteranges` [[RFC9110]] [Section 14.6](https://www.rfc-editor.org/rfc/rfc9110.html#section-14.6), a `Content-Range` header is required. + +See [Appendix D](#appendix-d-serializing-headers-and-cookies) for an explanation of why `content: {text/plain: {...}}` is used to describe the header value. + +```yaml +multipart/byteranges: + itemSchema: + $comment: A single range of bytes from a video + itemEncoding: + contentType: video/mp4 + headers: + Content-Range: + required: true + content: + text/plain: + schema: + # A suitable "pattern" constraint for this + # header is left as an exercise for the reader + type: string +``` + #### Responses Object A container for the expected responses of an operation.
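Note (illustrative, not part of the patch): the commit message describes truly mixed `multipart/mixed` payloads that combine `prefixEncoding` with `itemEncoding`, in parallel to `prefixItems` and `items`, but none of the examples added in this diff shows both fields together. A minimal sketch of that combination, assuming a hypothetical payload of one JSON metadata part followed by any number of PNG images (the `title` property and the `image/png` content type are assumptions chosen only for illustration):

```yaml
multipart/mixed:
  schema:
    type: array
    prefixItems:
      # First part: an object, so the Encoding Object's
      # default `contentType` of `application/json` applies
      - type: object
        properties:
          title:
            type: string
    # Remaining parts: no `type`, so the default would be
    # `application/octet-stream` unless overridden below
    items: {}
  prefixEncoding:
    # Position 0: the defaults are already correct for JSON
    - {}
  itemEncoding:
    # Applied to every part after the prefix
    contentType: image/png
```

Under the positional rules described in this patch, the single entry in `prefixEncoding` pairs with the first part, and `itemEncoding` covers every later part.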
diff --git a/src/schemas/validation/schema.yaml b/src/schemas/validation/schema.yaml index 9990fefb67..529241f982 100644 --- a/src/schemas/validation/schema.yaml +++ b/src/schemas/validation/schema.yaml @@ -533,9 +533,20 @@ $defs: type: object additionalProperties: $ref: '#/$defs/encoding' + prefixEncoding: + type: array + items: + $ref: '#/$defs/encoding' + itemEncoding: + $ref: '#/$defs/encoding' allOf: - - $ref: '#/$defs/specification-extensions' - $ref: '#/$defs/examples' + - $ref: '#/$defs/specification-extensions' + - dependentSchema: + encoding: + properties: + prefixEncoding: false + itemEncoding: false unevaluatedProperties: false encoding: From 699f7e093b31af56167db69e54f6a9a7e75d255f Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Sun, 18 May 2025 07:44:34 -0700 Subject: [PATCH 02/13] Fix formatting in Encoding guidance --- src/oas.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index a88ac47bd9..e2124288f9 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1795,7 +1795,7 @@ Note that there are significant restrictions on what headers can be used with `m Note also that `Content-Transfer-Encoding` is deprecated for `multipart/form-data` ([RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.7)) where binary data is supported, as it is in HTTP. Using `contentEncoding` for a multipart field is equivalent to specifying an [Encoding Object](#encoding-object) with a `headers` field containing `Content-Transfer-Encoding` with a schema that requires the value used in `contentEncoding`. -+If `contentEncoding` is used for a multipart field that has an Encoding Object with a `headers` field containing `Content-Transfer-Encoding` with a schema that disallows the value from `contentEncoding`, the result is undefined for serialization and parsing. 
+If `contentEncoding` is used for a multipart field that has an Encoding Object with a `headers` field containing `Content-Transfer-Encoding` with a schema that disallows the value from `contentEncoding`, the result is undefined for serialization and parsing. Note that as stated in [Working with Binary Data](#working-with-binary-data), if the Encoding Object's `contentType`, whether set explicitly or implicitly through its default value rules, disagrees with the `contentMediaType` in a Schema Object, the `contentMediaType` SHALL be ignored. Because of this, and because the Encoding Object's `contentType` defaulting rules do not take the Schema Object's`contentMediaType` into account, the use of `contentMediaType` with an Encoding Object is NOT RECOMMENDED. From 534b658072ba06a3c0306d093abc76b9fea6d1ae Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Tue, 27 May 2025 13:29:13 -0700 Subject: [PATCH 03/13] Clarify multipart preamble/epilogue --- src/oas.md | 1 + 1 file changed, 1 insertion(+) diff --git a/src/oas.md b/src/oas.md index e2124288f9..fff233c5d9 100644 --- a/src/oas.md +++ b/src/oas.md @@ -107,6 +107,7 @@ Some examples of sequential media types (including some that are not IANA-regist In the first three above, the repeating structure is any [JSON value](https://tools.ietf.org/html/rfc8259#section-3). The fourth repeats `application/geo+json`-structured values, while `text/event-stream` repeats a custom text format related to Server-Sent Events. The final media type listed above, `multipart/mixed`, provides an ordered list of documents of any media type, and is sometimes streamed. +Note that while `multipart` formats technically allow a preamble and an epilogue, the RFC directs that they are to be ignored, making the effectively comments, and this specification does not model them. Implementations MUST support mapping sequential media types into the JSON Schema data model by treating them as if the values were in an array in the same order. 
From ea8d4e65555f98fb9a0390f2729f526524b8b4a3 Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Tue, 27 May 2025 13:30:41 -0700 Subject: [PATCH 04/13] Clarify lack of nested multipart support. --- src/oas.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/oas.md b/src/oas.md index fff233c5d9..a95fcd95f0 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1686,6 +1686,8 @@ Determining how to handle a `type` value of `null` depends on how `null` values If `null` values are entirely omitted, then the `contentType` is irrelevant. See [Appendix B](#appendix-b-data-type-conversion) for a discussion of data type conversion options. +It is not currently possible to model nested `multipart` media types. + ###### Fixed Fields for RFC6570-style Serialization | Field Name | Type | Description | From f1de57d260105d3449dbd9c8f20acbc286fc3102 Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Fri, 30 May 2025 15:14:59 -0700 Subject: [PATCH 05/13] Fix typo Thanks to @thecheatah for catching this. --- src/oas.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index a95fcd95f0..e9e5207d17 100644 --- a/src/oas.md +++ b/src/oas.md @@ -107,7 +107,7 @@ Some examples of sequential media types (including some that are not IANA-regist In the first three above, the repeating structure is any [JSON value](https://tools.ietf.org/html/rfc8259#section-3). The fourth repeats `application/geo+json`-structured values, while `text/event-stream` repeats a custom text format related to Server-Sent Events. The final media type listed above, `multipart/mixed`, provides an ordered list of documents of any media type, and is sometimes streamed. -Note that while `multipart` formats technically allow a preamble and an epilogue, the RFC directs that they are to be ignored, making the effectively comments, and this specification does not model them. 
+Note that while `multipart` formats technically allow a preamble and an epilogue, the RFC directs that they are to be ignored, making them effectively comments, and this specification does not model them. Implementations MUST support mapping sequential media types into the JSON Schema data model by treating them as if the values were in an array in the same order. From 8bfb1849dffa2f0428c432d32c873fc039ac5911 Mon Sep 17 00:00:00 2001 From: Henry Andrews Date: Fri, 30 May 2025 19:24:15 -0700 Subject: [PATCH 06/13] Apply suggestions from code review Co-authored-by: Jeremy Fiel <32110157+jeremyfiel@users.noreply.github.com> --- src/oas.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/oas.md b/src/oas.md index e9e5207d17..968c441278 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1351,7 +1351,7 @@ The `itemEncoding` field can also be used with `itemSchema` to support streaming The `prefixEncoding` field can be used with any `multipart` content to require a fixed part order. This includes `multipart/form-data`, for which the Encoding Object's `headers` field MUST be used to provide the `Content-Disposition` and part name, as no property names exist to provide the names automatically. -Prior versions of this specifications advised using the `name` [`Content-Disposition` parameter](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-2) of the `form-data` [`Content-Disposition` value](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-1) with `multipart` media types other than `multipart/form-data` in order to work around the limitations of the `encoding` field. 
+Prior versions of this specification advised using the `name` [`Content-Disposition` parameter](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-2) of the `form-data` [`Content-Disposition` value](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-1) with `multipart` media types other than `multipart/form-data` in order to work around the limitations of the `encoding` field. Implementations MAY choose to support this workaround, but as this usage is not common, implementations of non-`form-data` `multipart` media types are unlikely to support it. ##### Media Type Examples @@ -1905,7 +1905,8 @@ multipart/mixed: schema: prefixItems: - # default content type for objects - # is `application/json`type: object + # is `application/json` + type: object properties: author: type: string From 21d54c9aa68abb1b6dc7462317b121ad6481cbb7 Mon Sep 17 00:00:00 2001 From: Henry Andrews Date: Sun, 1 Jun 2025 15:35:48 -0700 Subject: [PATCH 07/13] Fix typo Co-authored-by: Jeremy Fiel <32110157+jeremyfiel@users.noreply.github.com> --- src/schemas/validation/schema.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/schemas/validation/schema.yaml b/src/schemas/validation/schema.yaml index 529241f982..1b5881671d 100644 --- a/src/schemas/validation/schema.yaml +++ b/src/schemas/validation/schema.yaml @@ -542,7 +542,7 @@ $defs: allOf: - $ref: '#/$defs/examples' - $ref: '#/$defs/specification-extensions' - - dependentSchema: + - dependentSchemas: encoding: properties: prefixEncoding: false From d2cf873fd5c398058f0c3bf0c09c7fa7cad6803d Mon Sep 17 00:00:00 2001 From: "Henry H. 
Andrews" Date: Mon, 9 Jun 2025 09:27:12 -0700 Subject: [PATCH 08/13] Better word ordering Co-authored-by: Ralf Handl --- src/oas.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index 968c441278..1079912b03 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1341,7 +1341,7 @@ See [Encoding `multipart` Media Types](#encoding-multipart-media-types) for furt Most `multipart` media types, including `multipart/mixed` which defines the underlying rules for parsing all `multipart` types, do not have named parts. Data for these media types are modeled as an array, with one item per part, in order. -To use the `prefixEncoding` and/or `itemEncoding` fields, either an array `schema` or `itemSchema` MUST be present. +To use the `prefixEncoding` and/or `itemEncoding` fields, either `itemSchema` or an array `schema` MUST be present. These fields are analogous to the `prefixItems` and `items` JSON Schema keywords, with `prefixEncoding` (if present) providing an array of Encoding Objects that are each applied to the value at the same position in the data array, and `itemEncoding` applying its single Encoding Object to all remaining items in the array. The `itemEncoding` field can also be used with `itemSchema` to support streaming `multipart` content. From 085d5ee5ebfecb718d78a234b4bd9019811a68d2 Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Mon, 9 Jun 2025 09:53:47 -0700 Subject: [PATCH 09/13] Make Encoding type resolution explicit --- src/oas.md | 47 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 46 insertions(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index 1079912b03..609a6d1678 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1319,6 +1319,51 @@ properties: The three encoding fields define how to map each [Encoding Object](#encoding object) to a specific value in the data. Each field has its own set of media types with which it can be used; for all other media types all three fields SHALL be ignored. 
+###### Encoding and `type` + +Several encoding behaviors, including which of the Media Type Object's encoding fields can be used, the way in which those fields apply Encoding Objects to either the immediate instance or to items in an instance array, and the default behavior of the Encoding Object's `contentType` field, depend on the relevant schema's `type`. + +When only a single Schema Object with a single-valued `type` keyword is relevant, this behavior is easily determined. +However, when schemas are assembled through references, this can be more challenging. +When schemas have constraints that are only resolvable at runtime, determining the type prior to runtime can be impossible. + +When working with in-memory data at runtime, if an implementation cannot locate an appropriate `type` keyword but the data is valid according to all relevant Schema Objects, then the runtime type of the data MUST be used to determine the behavior. + +When parsing a data format using an Encoding Object, implementations MUST support using a single-valued `type` keyword that is either in the corresponding Schema Object, or reachable from that Schema Object by following a chain of `$ref` and/or `allOf` keywords. +Note that if `allOf` is used to combine single-valued `type` keywords with conflicting type values, schema validation will always fail, so such conflicts need not be detected in advance as long as validation is applied to the parsed result. +Note also that while the `type` value `"integer"` can be applied with `type: "number"`, the encoding behavior is the same for JSON numbers whether they meet JSON Schema's additional `type: "integer"` constraint or not, so these types do not conflict.
+For example, the relevant type of the `"foo"` property is `"number"`, found by following the `$ref` to the `"Thing"` schema, then the `$ref` under the `"foo"` property schema to the `"Foo"` schema, then the `$ref` under the first branch of the `allOf` to the `"Bar"` schema, which defines a `type` keyword with the single value `"number"`, a primitive type: + +```yaml +components: + schemas: + Thing: + type: object + properties: + foo: + $ref: "#/components/schemas/Foo" + Foo: + allOf: + - $ref: "#/components/schemas/Bar" + - $ref: "#/components/schemas/Baz" + Bar: + type: number + Baz: + minimum: 0 + requestBodies: + Thing: + schema: + $ref: "#/components/schemas/Thing" + encoding: + foo: + # The default `contentType` is `text/plain` +``` + +Implementations MAY attempt to handle more complex schema arrangements, in which case they MUST document what is handled and with what behavior. +If they do, then `type` keywords that contain multiple values (e.g. `type: ["number", "nul"]`) SHOULD be handled by attempting to parse according to each type in the order provided, falling back to the next type until the list is exhausted. +However OAD authors are advised that depending on handling scenarios other than `$ref`/`allOf`-reachable single-valued `type` keywords is not interoperable. + ###### Encoding By Name The behavior of the `encoding` field is designed to support web forms, and is therefore only defined for media types structured as name-value pairs that allow repeat values, most notably `application/x-www-form-urlencoded` and `multipart/form-data`. @@ -1669,7 +1714,7 @@ These fields MAY be used either with or without the RFC6570-style serialization This object MAY be extended with [Specification Extensions](#specification-extensions). The default values for `contentType` are as follows, where an _n/a_ in the `contentEncoding` column means that the presence or value of `contentEncoding` is irrelevant.
-This table is based on the value to which the Encoding Object is being applied as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions).
+This table is based on the value to which the Encoding Object is being applied as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions), where determining the type is done as described under [Encoding and `type`](#encoding-and-type).
 Note that in the case of [Encoding By Name](#encoding-by-name), this value is the array item for properties of type `"array"`, and the entire value for all other types.
 Therefore the `array` row in this table applies only to array values inside of a top-level array when encoding by name.
 

From af80e2fb785a92ccc7b1999c112c132bbcbc28c3 Mon Sep 17 00:00:00 2001
From: "Henry H. Andrews"
Date: Thu, 12 Jun 2025 20:38:48 -0700
Subject: [PATCH 10/13] Add schema tests for encoding fields.

---
 tests/schema/pass/media-type-examples.yaml | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/tests/schema/pass/media-type-examples.yaml b/tests/schema/pass/media-type-examples.yaml
index 2ab4e68076..584608017e 100644
--- a/tests/schema/pass/media-type-examples.yaml
+++ b/tests/schema/pass/media-type-examples.yaml
@@ -138,3 +138,13 @@ paths:
               forCoverage2:
                 style: spaceDelimited
                 explode: true
+          multipart/related:
+            schema:
+              type: array
+            itemEncoding:
+              contentType: text/plain
+            prefixEncoding:
+              - headers:
+                  Content-Location:
+                    schema:
+                      type: string

From 3b6b93cbf1fc57ec0157d71dbd6f8477fa196175 Mon Sep 17 00:00:00 2001
From: "Henry H. Andrews"
Date: Fri, 13 Jun 2025 09:17:35 -0700
Subject: [PATCH 11/13] Add negative tests for new encoding fields

---
 tests/schema/fail/media-type-enc-item-exclusion.yaml | 10 ++++++++++
 tests/schema/fail/media-type-enc-prefix-exclusion.yaml | 10 ++++++++++
 2 files changed, 20 insertions(+)
 create mode 100644 tests/schema/fail/media-type-enc-item-exclusion.yaml
 create mode 100644 tests/schema/fail/media-type-enc-prefix-exclusion.yaml

diff --git a/tests/schema/fail/media-type-enc-item-exclusion.yaml b/tests/schema/fail/media-type-enc-item-exclusion.yaml
new file mode 100644
index 0000000000..012f1f44c8
--- /dev/null
+++ b/tests/schema/fail/media-type-enc-item-exclusion.yaml
@@ -0,0 +1,10 @@
+openapi: 3.2.0
+info:
+  title: API
+  version: 1.0.0
+components:
+  requestBodies:
+    content:
+      multipart/mixed:
+        encoding: {}
+        itemEncoding: {}
diff --git a/tests/schema/fail/media-type-enc-prefix-exclusion.yaml b/tests/schema/fail/media-type-enc-prefix-exclusion.yaml
new file mode 100644
index 0000000000..d57c463b9d
--- /dev/null
+++ b/tests/schema/fail/media-type-enc-prefix-exclusion.yaml
@@ -0,0 +1,10 @@
+openapi: 3.2.0
+info:
+  title: API
+  version: 1.0.0
+components:
+  requestBodies:
+    content:
+      multipart/mixed:
+        encoding: {}
+        prefixEncoding: {}

From b6dde4728babb9f4847e87dc8ee104411f8da798 Mon Sep 17 00:00:00 2001
From: Henry Andrews
Date: Fri, 13 Jun 2025 09:26:27 -0700
Subject: [PATCH 12/13] Better wording, fix typo

Co-authored-by: Ralf Handl
---
 src/oas.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/oas.md b/src/oas.md
index 609a6d1678..079380c837 100644
--- a/src/oas.md
+++ b/src/oas.md
@@ -1361,7 +1361,7 @@ components:
 ```
 
 Implementations MAY attempt to handle more complex schema arrangements, in which case they MUST document what is handled and with what behavior.
-If they do, then `type` keywords that contain multiple values (e.g. `type: ["number", "nul"]`) SHOULD be handled by attempting to parse according to each type in the order provided, falling back to the next type until the list is exhausted.
+If they do, then `type` keywords that contain multiple values (e.g. `type: ["number", "null"]`) SHOULD be handled by attempting to parse according to each type in the order provided.
 However OAD authors are advised that depending on handling scenarios other than `$ref`/`allOf`-reachable single-valued `type` keywords is not interoperable.
 
 ###### Encoding By Name

From d1cc23f3758af10957f5c1e76f3071506694575e Mon Sep 17 00:00:00 2001
From: "Henry H. Andrews"
Date: Wed, 18 Jun 2025 14:05:21 -0700
Subject: [PATCH 13/13] Improve example, add security considerations.

---
 src/oas.md | 66 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 47 insertions(+), 19 deletions(-)

diff --git a/src/oas.md b/src/oas.md
index 079380c837..c0bd9d0e97 100644
--- a/src/oas.md
+++ b/src/oas.md
@@ -1708,7 +1708,7 @@ These fields MAY be used either with or without the RFC6570-style serialization
 
 | Field Name | Type | Description |
 | ---- | :----: | ---- |
-| contentType | `string` | The `Content-Type` for encoding a specific property. The value is a comma-separated list, each element of which is either a specific media type (e.g. `image/png`) or a wildcard media type (e.g. `image/*`). Default value depends on the property type as shown in the table below. |
+| contentType | `string` | The `Content-Type` for encoding a specific property. The value is a comma-separated list, each element of which is either a specific media type (e.g. `image/png`) or a wildcard media type (e.g. `image/*`). See [Detecting Media Types](#detecting-media-types) for related security concerns. Default value depends on the property type as shown in the table below. |
 | headers | Map[`string`, [Header Object](#header-object) \| [Reference Object](#reference-object)] | A map allowing additional information to be provided as headers. `Content-Type` is described separately and SHALL be ignored in this section. This field SHALL be ignored if the media type is not a `multipart`. |
 
 This object MAY be extended with [Specification Extensions](#specification-extensions).
@@ -1974,29 +1974,52 @@ multipart/mixed:
 
 ###### Example: Ordered Multipart With Required Header
 
-As described in [[?RFC2557]], a set of HTML pages can be sent in a `multipart/related` payload, preserving links among themselves by defining a `Content-Location` header for each page.
+As described in [[?RFC2557]], a set of resources making up a web page can be sent in a `multipart/related` payload, preserving links among themselves by defining a `Content-Location` header for each page.
+The first part is used as the root resource (unless using `Content-ID`, which RFC2557 advises against), so we use `prefixItems` and `prefixEncoding` to define that it must be an HTML resource, and then allow any of several different types of resources in any order to follow.
 
-See [Appendix D](appendix-d-serializing-headers-and-cookies) for an explanation of why `content: {text/plain: {...}}` is used to describe the header value.
+The `Content-Location` header is defined using `content: {text/plain: {...}}` to avoid percent-encoding its URI value; see [Appendix D](#appendix-d-serializing-headers-and-cookies) for further details.
 
 ```yaml
-multipart/related:
-  schema:
-    items:
-      type: string
-  itemEncoding:
-    contentType: text/html
-    headers:
-      Content-Location:
-        required: true
-        content:
-          text/plain:
-            schema:
-              type: string
-              format: uri
+components:
+  headers:
+    RFC2557ContentId:
+      description: Use Content-Location instead of Content-ID
+      schema: false
+    RFC2557ContentLocation:
+      required: true
+      content:
+        text/plain:
+          schema:
+            $comment: Use a full URI (not a relative reference)
+            type: string
+            format: uri
+  requestBodies:
+    RFC2557:
+      content:
+        multipart/related; type=text/html:
+          schema:
+            prefixItems:
+              - type: string
+            items:
+              anyOf:
+                - type: string
+                - $comment: To allow binary, this must always pass
+          prefixEncoding:
+            - contentType: text/html
+              headers:
+                Content-ID:
+                  $ref: '#/components/headers/RFC2557ContentId'
+                Content-Location:
+                  $ref: '#/components/headers/RFC2557ContentLocation'
+          itemEncoding:
+            contentType: text/html, text/css, text/javascript
+            headers:
+              Content-ID:
+                $ref: '#/components/headers/RFC2557ContentId'
+              Content-Location:
+                $ref: '#/components/headers/RFC2557ContentLocation'
 ```
 
-While the above example could have used `itemSchema` instead, if the payload is expected to be processed all at once, using `schema` ensures that tools will wait until the complete response is available before processing.
-
 ###### Example: Streaming Multipart
 
 This example assumes a device that takes large sets of pictures and streams them to the caller.
@@ -4138,6 +4161,11 @@ The rules for connecting a [Security Requirement Object](#security-requirement-o
 
 OpenAPI Descriptions may contain references to external resources that may be dereferenced automatically by consuming tools. External resources may be hosted on different domains that may be untrusted.
 
+### Detecting Media Types
+
+Scenarios such as those documented under [Example: Ordered Multipart With Required Header](#example-ordered-multipart-with-required-header) can require an implementation to test whether data matches one of several different media types.
+Each media type has its own security considerations to be taken into account during such testing.
+
 ### Handling Reference Cycles
 
 References in an OpenAPI Description may cause a cycle. Tooling must detect and handle cycles to prevent resource exhaustion.
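
The cycle handling required above also applies to the `$ref`/`allOf` type lookup described under "Encoding and `type`" earlier in this series. The following is a minimal Python sketch of that lookup, not a reference implementation. It assumes same-document `#/...` references only and skips percent-decoding of the URI fragment; `resolve_pointer` and `find_type` are hypothetical helper names, not part of any OpenAPI tool.

```python
from typing import Any, Optional


def resolve_pointer(doc: dict, ref: str) -> Any:
    """Resolve a same-document reference such as '#/components/schemas/Foo'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only same-document references are handled: {ref!r}")
    node: Any = doc
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping (RFC 6901): '~1' is '/', '~0' is '~'.
        node = node[token.replace("~1", "/").replace("~0", "~")]
    return node


def find_type(doc: dict, schema: Any, seen: frozenset = frozenset()) -> Optional[str]:
    """Return the first single-valued `type` reachable via `$ref` and `allOf`."""
    if not isinstance(schema, dict):
        return None  # boolean schemas carry no `type`
    declared = schema.get("type")
    if isinstance(declared, str):
        return declared
    ref = schema.get("$ref")
    # A `$ref` already followed on this path means a cycle, so it is skipped.
    if isinstance(ref, str) and ref not in seen:
        found = find_type(doc, resolve_pointer(doc, ref), seen | {ref})
        if found is not None:
            return found
    for branch in schema.get("allOf", []):
        found = find_type(doc, branch, seen)
        if found is not None:
            return found
    return None


# The schemas from the "Thing"/"Foo"/"Bar"/"Baz" example, as a Python dict.
doc = {
    "components": {
        "schemas": {
            "Thing": {
                "type": "object",
                "properties": {"foo": {"$ref": "#/components/schemas/Foo"}},
            },
            "Foo": {
                "allOf": [
                    {"$ref": "#/components/schemas/Bar"},
                    {"$ref": "#/components/schemas/Baz"},
                ]
            },
            "Bar": {"type": "number"},
            "Baz": {"minimum": 0},
        }
    }
}

foo = doc["components"]["schemas"]["Thing"]["properties"]["foo"]
print(find_type(doc, foo))  # number
```

Tracking the set of references followed on the current path keeps the lookup finite when two schemas refer to each other; in that case the sketch returns `None` rather than recursing without bound.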