From f2879df2735e2043b967ab2d376f76bc8e64a28a Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Thu, 1 May 2025 13:39:12 -0700 Subject: [PATCH 1/6] Arrange encoding information more clearly Refactor this to put the rules for mapping Encoding Objects to values with the `encoding` field (which performs the mapping) rather than having most of it in the Encoding Object (which should focus on how to apply a single Encoding Object to a single value). This notably takes the special handling of arrays as repeated values out of the Encoding Object section (and the default `contentType` field value table) and moves it to the Media Type Object. The Encoding Object behavior is now consistent for all types, while the _mapping_ done by the `encoding` field handles the special case. The only change (as opposed to re-organization and re-wording) in this PR is the addition of a default `contentType` of `application/json` for array values, which in the context of the existing behavior is only relevant for array values nested under a top-level array. Past OAS versions were silent on this topic, and presumably it just does not come up much, but it was a gap we should fill. As discussed in today's TDC call, we have increasing (and modern) use cases for supporting `multipart/mixed` (which we previously claimed to support but never did). This refactor makes possible future support easier by moving the array special case, which is governed by the `multipart/form-data` RFC, out of the Encoding Object (which needs to work with other `multipart` formats) and places it with the `encoding` field (which is web form-format-specific). 
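As a rough illustration of the mapping described above (a sketch only: the `tags` and `matrix` property names are hypothetical and do not appear in the specification), a top-level array property is split into one part per item, and the Encoding Object and the default `contentType` table are applied to each item rather than to the array as a whole:

```yaml
requestBody:
  content:
    multipart/form-data:
      schema:
        type: object
        properties:
          # hypothetical property: each string item becomes its own
          # part, and every such part uses the same name, "tags"
          tags:
            type: array
            items:
              type: string
          # hypothetical property: each item is itself an array, so
          # each part named "matrix" carries one inner array; with no
          # Encoding Object given, the `array` row of the default
          # table applies to the item, i.e. `application/json`
          matrix:
            type: array
            items:
              type: array
              items:
                type: number
      encoding:
        tags:
          # applied to each item of `tags` individually; this matches
          # the default for strings and is spelled out only for clarity
          contentType: text/plain
```

Under these assumptions, the nested arrays in `matrix` are the only place where the new `application/json` default for `array` comes into play.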
--- src/oas.md | 62 +++++++++++++++++++++++++++++++----------------------- 1 file changed, 36 insertions(+), 26 deletions(-) diff --git a/src/oas.md b/src/oas.md index faf401ab7c..673a7c784f 100644 --- a/src/oas.md +++ b/src/oas.md @@ -84,6 +84,8 @@ Some examples of possible media type definitions: application/vnd.github.v3.patch ``` +#### Media Type Registry + ### HTTP Status Codes The HTTP Status Codes are used to indicate the status of the executed operation. @@ -1615,10 +1617,33 @@ See [Working With Examples](#working-with-examples) for further guidance regardi | schema | [Schema Object](#schema-object) | The schema defining the content of the request, response, parameter, or header. | | example | Any | Example of the media type; see [Working With Examples](#working-with-examples). | | examples | Map[ `string`, [Example Object](#example-object) \| [Reference Object](#reference-object)] | Examples of the media type; see [Working With Examples](#working-with-examples). | -| encoding | Map[`string`, [Encoding Object](#encoding-object)] | A map between a property name and its encoding information. The key, being the property name, MUST exist in the schema as a property. The `encoding` field SHALL only apply when the media type is `multipart` or `application/x-www-form-urlencoded`. If no Encoding Object is provided for a property, the behavior is determined by the default values documented for the Encoding Object. | +| encoding | Map[`string`, [Encoding Object](#encoding-object)] | A map between a property name and its encoding information for media types supporting name-value pairs and allowing duplicate names, as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). | + This object MAY be extended with [Specification Extensions](#specification-extensions). +##### Encoding Usage and Restrictions + +To use the `encoding` field, a `schema` MUST exist, and the `encoding` field's keys MUST exist in the schema as a property. 
+Array properties MUST be handled by applying the given Encoding Object to multiple parts (or query parameters) with the same `name`, as is recommended by [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. +For all other property types, including array values within a top-level array, the Encoding Object MUST be applied to the entire values. + +The behavior of the `encoding` field is only defined for media types structured as name-value pairs that allow repeat values. +The order of these name-value pairs in the target media type is implementation-defined. + +For `application/x-www-form-urlencoded`, the encoding keys MUST map to parameter names, with the values produced according to the rules of the [Encoding Object](#encoding-object). +See [Encoding the `x-www-form-urlencoded` Media Type](#encoding-the-x-www-form-urlencoded-media-type) for guidance and examples, both with and without the `encoding` field. + +For `multipart/*`, the encoding keys MUST map to the [`name` parameter](https://www.rfc-editor.org/rfc/rfc7578#section-4.2) of the `Content-Disposition: form-data` header of each part. +See [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-5) for guidance regarding non-ASCII part names. + +This usage of a `name` [`Content-Disposition` parameter](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-2) is defined for `multipart/form-data` ([[?RFC7578]]) and the `form-data` [`Content-Disposition` value](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-1). +Implementations MAY choose to support the `name` `Content-Disposition` parameter and the `encoding` field with other `multipart` formats, but this usage is unlikely to be supported by generic `multipart` implementations. + +See [Encoding `multipart` Media Types](#encoding-multipart-media-types) for further guidance and examples, both with and without the `encoding` field. 
+ +For all media types where no mapping is defined by either this specification or the [Media Type Registry](#media-type-registry), the `encoding` field SHALL be ignored. + ##### Media Type Examples ```json @@ -1732,21 +1757,11 @@ requestBody: To upload multiple files, a `multipart` media type MUST be used as shown under [Example: Multipart Form with Multiple Files](#example-multipart-form-with-multiple-files). -##### Support for x-www-form-urlencoded Request Bodies - -See [Encoding the `x-www-form-urlencoded` Media Type](#encoding-the-x-www-form-urlencoded-media-type) for guidance and examples, both with and without the `encoding` field. - -##### Special Considerations for `multipart` Content - -See [Encoding `multipart` Media Types](#encoding-multipart-media-types) for further guidance and examples, both with and without the `encoding` field. - #### Encoding Object -A single encoding definition applied to a single schema property. -See [Appendix B](#appendix-b-data-type-conversion) for a discussion of converting values of various types to string representations. +A single encoding definition applied to a single value, as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). -Properties are correlated with `multipart` parts using the [`name` parameter](https://www.rfc-editor.org/rfc/rfc7578#section-4.2) of `Content-Disposition: form-data`, and with `application/x-www-form-urlencoded` using the query string parameter names. -In both cases, their order is implementation-defined. +See [Appendix B](#appendix-b-data-type-conversion) for a discussion of converting values of various types to string representations. See [Appendix E](#appendix-e-percent-encoding-and-form-media-types) for a detailed examination of percent-encoding concerns for form media types. 
@@ -1763,7 +1778,8 @@ These fields MAY be used either with or without the RFC6570-style serialization This object MAY be extended with [Specification Extensions](#specification-extensions). -The default values for `contentType` are as follows, where an _n/a_ in the `contentEncoding` column means that the presence or value of `contentEncoding` is irrelevant: +The default values for `contentType` are as follows, where an _n/a_ in the `contentEncoding` column means that the presence or value of `contentEncoding` is irrelevant. +This table is based on the value to which the Encoding Object is being applied, which as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions) is the array item for properties of type `"array"`, and the entire value for all other types. | `type` | `contentEncoding` | Default `contentType` | | ---- | ---- | ---- | @@ -1772,7 +1788,7 @@ The default values for `contentType` are as follows, where an _n/a_ in the `cont | `string` | _absent_ | `text/plain` | | `number`, `integer`, or `boolean` | _n/a_ | `text/plain` | | `object` | _n/a_ | `application/json` | -| `array` | _n/a_ | according to the `type` of the `items` schema | +| `array` | _n/a_ | `application/json` | Determining how to handle a `type` value of `null` depends on how `null` values are being serialized. If `null` values are entirely omitted, then the `contentType` is irrelevant. @@ -1880,20 +1896,13 @@ However, this is not guaranteed, so it may be more interoperable to keep the pad ##### Encoding `multipart` Media Types -It is common to use `multipart/form-data` as a `Content-Type` when transferring forms as request bodies. In contrast to OpenAPI 2.0, a `schema` is REQUIRED to define the input parameters to the operation when using `multipart` content. This supports complex structures as well as supporting mechanisms for multiple file uploads. 
- -The `form-data` disposition and its `name` parameter are mandatory for `multipart/form-data` ([RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.2)). -Array properties are handled by applying the same `name` to multiple parts, as is recommended by [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. -See [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-5) for guidance regarding non-ASCII part names. - -Various other `multipart` types, most notable `multipart/mixed` ([RFC2046](https://www.rfc-editor.org/rfc/rfc2046.html#section-5.1.3)) neither require nor forbid specific `Content-Disposition` values, which means care must be taken to ensure that any values used are supported by all relevant software. -It is not currently possible to correlate schema properties with unnamed, ordered parts in media types such as `multipart/mixed`, but implementations MAY choose to support such types when `Content-Disposition: form-data` is used with a `name` parameter. +See [Encoding Usage and Restrictions](#encoding-usage-and-restrictions) for guidance on correlating schema properties with parts. Note that there are significant restrictions on what headers can be used with `multipart` media types in general ([RFC2046](https://www.rfc-editor.org/rfc/rfc2046.html#section-5.1)) and `multi-part/form-data` in particular ([RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.8)). Note also that `Content-Transfer-Encoding` is deprecated for `multipart/form-data` ([RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.7)) where binary data is supported, as it is in HTTP. -+Using `contentEncoding` for a multipart field is equivalent to specifying an [Encoding Object](#encoding-object) with a `headers` field containing `Content-Transfer-Encoding` with a schema that requires the value used in `contentEncoding`. 
+Using `contentEncoding` for a multipart field is equivalent to specifying an [Encoding Object](#encoding-object) with a `headers` field containing `Content-Transfer-Encoding` with a schema that requires the value used in `contentEncoding`. +If `contentEncoding` is used for a multipart field that has an Encoding Object with a `headers` field containing `Content-Transfer-Encoding` with a schema that disallows the value from `contentEncoding`, the result is undefined for serialization and parsing. Note that as stated in [Working with Binary Data](#working-with-binary-data), if the Encoding Object's `contentType`, whether set explicitly or implicitly through its default value rules, disagrees with the `contentMediaType` in a Schema Object, the `contentMediaType` SHALL be ignored. @@ -1921,8 +1930,9 @@ requestBody: type: string format: binary addresses: - # default for arrays is based on the type in the `items` - # subschema, which is an object, so `application/json` + # for arrays, the Encoding Object applies to each item + # individually based on that item's type, which in this + # example is an object, so `application/json` type: array items: $ref: '#/components/schemas/Address' From 98229e0726d690104df05e508fe70a94f426cb91 Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Sun, 4 May 2025 10:33:02 -0700 Subject: [PATCH 2/6] Improved wording --- src/oas.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index 673a7c784f..6efd25c896 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1759,7 +1759,7 @@ To upload multiple files, a `multipart` media type MUST be used as shown under [ #### Encoding Object -A single encoding definition applied to a single value, as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). 
+A single encoding definition applied to a single value, with the mapping of Encoding Objects to values determined by the [Media Type Object](#media-type-object) as described under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). See [Appendix B](#appendix-b-data-type-conversion) for a discussion of converting values of various types to string representations. @@ -1780,6 +1780,7 @@ This object MAY be extended with [Specification Extensions](#specification-exten The default values for `contentType` are as follows, where an _n/a_ in the `contentEncoding` column means that the presence or value of `contentEncoding` is irrelevant. This table is based on the value to which the Encoding Object is being applied, which as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions) is the array item for properties of type `"array"`, and the entire value for all other types. +Therefore the `array` row in this table applies only to array values inside of a top-level array. | `type` | `contentEncoding` | Default `contentType` | | ---- | ---- | ---- | From 4fa8b7dcbcf072ce47be656b06b8bd9d94dc0593 Mon Sep 17 00:00:00 2001 From: Henry Andrews Date: Mon, 5 May 2025 16:40:57 -0700 Subject: [PATCH 3/6] Fix wording error from copy-paste Co-authored-by: Mike Kistler --- src/oas.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index 6efd25c896..21107ef3f4 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1624,7 +1624,7 @@ This object MAY be extended with [Specification Extensions](#specification-exten ##### Encoding Usage and Restrictions -To use the `encoding` field, a `schema` MUST exist, and the `encoding` field's keys MUST exist in the schema as a property. +To use the `encoding` field, a `schema` MUST exist, and the `encoding` field's keys MUST exist in the schema as properties. 
Array properties MUST be handled by applying the given Encoding Object to multiple parts (or query parameters) with the same `name`, as is recommended by [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. For all other property types, including array values within a top-level array, the Encoding Object MUST be applied to the entire values. From bb0a4ca0128c14ce1be2032a22212404e7f02665 Mon Sep 17 00:00:00 2001 From: Henry Andrews Date: Mon, 5 May 2025 16:41:10 -0700 Subject: [PATCH 4/6] Fix grammar. Co-authored-by: Mike Kistler --- src/oas.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index 21107ef3f4..e02c3a1c31 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1626,7 +1626,7 @@ This object MAY be extended with [Specification Extensions](#specification-exten To use the `encoding` field, a `schema` MUST exist, and the `encoding` field's keys MUST exist in the schema as properties. Array properties MUST be handled by applying the given Encoding Object to multiple parts (or query parameters) with the same `name`, as is recommended by [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. -For all other property types, including array values within a top-level array, the Encoding Object MUST be applied to the entire values. +For all other property types, including array values within a top-level array, the Encoding Object MUST be applied to the entire value. The behavior of the `encoding` field is only defined for media types structured as name-value pairs that allow repeat values. The order of these name-value pairs in the target media type is implementation-defined. From 35dc9355974ecf1391fd60e60d81abacf8a109a5 Mon Sep 17 00:00:00 2001 From: "Henry H. 
Andrews" Date: Mon, 5 May 2025 20:48:18 -0700 Subject: [PATCH 5/6] Clarify the rationale for the encoding field The oddities of its media type support derive from its history as the OAS implementation of web forms. --- src/oas.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/oas.md b/src/oas.md index e02c3a1c31..2397c821fa 100644 --- a/src/oas.md +++ b/src/oas.md @@ -1628,7 +1628,7 @@ To use the `encoding` field, a `schema` MUST exist, and the `encoding` field's Array properties MUST be handled by applying the given Encoding Object to multiple parts (or query parameters) with the same `name`, as is recommended by [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. For all other property types, including array values within a top-level array, the Encoding Object MUST be applied to the entire value. -The behavior of the `encoding` field is only defined for media types structured as name-value pairs that allow repeat values. +The behavior of the `encoding` field is designed to support web forms, and is therefore only defined for media types structured as name-value pairs that allow repeat values. The order of these name-value pairs in the target media type is implementation-defined. For `application/x-www-form-urlencoded`, the encoding keys MUST map to parameter names, with the values produced according to the rules of the [Encoding Object](#encoding-object). From ed6073354be60af6d3f3b0801e6587fc443af5fa Mon Sep 17 00:00:00 2001 From: "Henry H. Andrews" Date: Wed, 7 May 2025 09:14:33 -0700 Subject: [PATCH 6/6] Remove media type registry mentions for encoding. This removes the more general language allowing for future expansion with the media type registry (although the general language still had the same effect of restricting to `multipart` and `application/x-www-form-urlencoded` in practice). 
--- src/oas.md | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/src/oas.md b/src/oas.md index 2397c821fa..ec69a54fe9 100644 --- a/src/oas.md +++ b/src/oas.md @@ -84,8 +84,6 @@ Some examples of possible media type definitions: application/vnd.github.v3.patch ``` -#### Media Type Registry - ### HTTP Status Codes The HTTP Status Codes are used to indicate the status of the executed operation. @@ -1617,33 +1615,33 @@ See [Working With Examples](#working-with-examples) for further guidance regardi | schema | [Schema Object](#schema-object) | The schema defining the content of the request, response, parameter, or header. | | example | Any | Example of the media type; see [Working With Examples](#working-with-examples). | | examples | Map[ `string`, [Example Object](#example-object) \| [Reference Object](#reference-object)] | Examples of the media type; see [Working With Examples](#working-with-examples). | -| encoding | Map[`string`, [Encoding Object](#encoding-object)] | A map between a property name and its encoding information for media types supporting name-value pairs and allowing duplicate names, as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). | - +| encoding | Map[`string`, [Encoding Object](#encoding-object)] | A map between a property name and its encoding information, as defined under [Encoding Usage and Restrictions](#encoding-usage-and-restrictions). The `encoding` field SHALL only apply when the media type is `multipart` or `application/x-www-form-urlencoded`. If no Encoding Object is provided for a property, the behavior is determined by the default values documented for the Encoding Object. | This object MAY be extended with [Specification Extensions](#specification-extensions). ##### Encoding Usage and Restrictions +The `encoding` field defines how to map each [Encoding Object](#encoding-object) to a specific value in the data. 
+ To use the `encoding` field, a `schema` MUST exist, and the `encoding` field's keys MUST exist in the schema as properties. -Array properties MUST be handled by applying the given Encoding Object to multiple parts (or query parameters) with the same `name`, as is recommended by [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. -For all other property types, including array values within a top-level array, the Encoding Object MUST be applied to the entire value. +Array properties MUST be handled by applying the given Encoding Object to one part per array item, each with the same `name`, as is recommended by [[?RFC7578]] [Section 4.3](https://www.rfc-editor.org/rfc/rfc7578.html#section-4.3) for supplying multiple values per form field. +For all other value types for both top-level non-array properties and for values, including array values, within a top-level array, the Encoding Object MUST be applied to the entire value. -The behavior of the `encoding` field is designed to support web forms, and is therefore only defined for media types structured as name-value pairs that allow repeat values. +The behavior of the `encoding` field is designed to support web forms, and is therefore only defined for media types structured as name-value pairs that allow repeat values, most notably `application/x-www-form-urlencoded` and `multipart/form-data`. The order of these name-value pairs in the target media type is implementation-defined. For `application/x-www-form-urlencoded`, the encoding keys MUST map to parameter names, with the values produced according to the rules of the [Encoding Object](#encoding-object). See [Encoding the `x-www-form-urlencoded` Media Type](#encoding-the-x-www-form-urlencoded-media-type) for guidance and examples, both with and without the `encoding` field. 
-For `multipart/*`, the encoding keys MUST map to the [`name` parameter](https://www.rfc-editor.org/rfc/rfc7578#section-4.2) of the `Content-Disposition: form-data` header of each part. -See [RFC7578](https://www.rfc-editor.org/rfc/rfc7578.html#section-5) for guidance regarding non-ASCII part names. +For `multipart`, the encoding keys MUST map to the [`name` parameter](https://www.rfc-editor.org/rfc/rfc7578#section-4.2) of the `Content-Disposition: form-data` header of each part, as is defined for `multipart/form-data` in [[?RFC7578]]. +See [[?RFC7578]] [Section 5](https://www.rfc-editor.org/rfc/rfc7578.html#section-5) for guidance regarding non-ASCII part names. -This usage of a `name` [`Content-Disposition` parameter](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-2) is defined for `multipart/form-data` ([[?RFC7578]]) and the `form-data` [`Content-Disposition` value](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-1). -Implementations MAY choose to support the `name` `Content-Disposition` parameter and the `encoding` field with other `multipart` formats, but this usage is unlikely to be supported by generic `multipart` implementations. +Other `multipart` media types are not directly supported as they do not define a mechanism for part names. +However, the usage of a `name` [`Content-Disposition` parameter](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-2) is defined for the `form-data` [`Content-Disposition` value](https://www.iana.org/assignments/cont-disp/cont-disp.xhtml#cont-disp-1), which is not restricted to `multipart/form-data`. +Implementations MAY choose to support a `Content-Disposition` of `form-data` with a `name` parameter in other `multipart` media types in order to use the `encoding` field with them, but this usage is unlikely to be supported by generic `multipart` implementations. 
See [Encoding `multipart` Media Types](#encoding-multipart-media-types) for further guidance and examples, both with and without the `encoding` field. -For all media types where no mapping is defined by either this specification or the [Media Type Registry](#media-type-registry), the `encoding` field SHALL be ignored. - ##### Media Type Examples ```json