rfc9512
Internet Engineering Task Force (IETF) R. Polli
Request for Comments: 9512 DTD, Italian Government
Category: Informational E. Wilde
ISSN: 2070-1721 Axway
E. Aro
Mozilla
February 2024
YAML Media Type
Abstract
This document registers the application/yaml media type and the +yaml
structured syntax suffix with IANA. Both identify document
components that are serialized according to the YAML specification.
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Not all documents
approved by the IESG are candidates for any level of Internet
Standard; see Section 2 of RFC 7841.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9512.
Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Revised BSD License text as described in Section 4.e of the
Trust Legal Provisions and are provided without warranty as described
in the Revised BSD License.
Table of Contents
1. Introduction
1.1. Notational Conventions
1.2. Fragment Identification
1.2.1. Fragment Identification via Alias Nodes
2. Media Type and Structured Syntax Suffix Registrations
2.1. Media Type application/yaml
2.2. The +yaml Structured Syntax Suffix
3. Interoperability Considerations
3.1. YAML Is an Evolving Language
3.2. YAML Streams
3.3. Filename Extension
3.4. YAML and JSON
3.5. Fragment Identifiers
4. Security Considerations
4.1. Arbitrary Code Execution
4.2. Resource Exhaustion
4.3. YAML Streams
4.4. Expressing Booleans
5. IANA Considerations
6. References
6.1. Normative References
6.2. Informative References
Appendix A. Examples Related to Fragment Identifier
Interoperability
A.1. Unreferenceable Nodes
A.2. Referencing a Missing Node
A.3. Representation Graph with Anchors and Cyclic References
Acknowledgements
Authors' Addresses
1. Introduction
YAML [YAML] is a data serialization format that is capable of
conveying one or multiple documents in a single presentation stream
(e.g., a file or a network resource). It is widely used on the
Internet, including in the API sector (e.g., see [OAS]), but a
corresponding media type and structured syntax suffix had not
previously been registered by IANA.
To increase interoperability when exchanging YAML streams and
leverage content negotiation mechanisms when exchanging YAML
resources, this specification registers the application/yaml media
type and the +yaml structured syntax suffix [MEDIATYPE].
Moreover, it provides security considerations and interoperability
considerations related to [YAML], including its relation with [JSON].
1.1. Notational Conventions
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
The terms "content negotiation" and "resource" in this document are
to be interpreted as in [HTTP].
The terms "fragment" and "fragment identifier" in this document are
to be interpreted as in [URI].
The terms "presentation", "stream", "YAML document", "representation
graph", "tag", "serialization detail", "node", "alias node",
"anchor", and "anchor name" in this document are to be interpreted as
in [YAML].
Figures containing YAML code always start with the %YAML directive to
improve readability.
1.2. Fragment Identification
A fragment identifies a node in a stream.
A fragment identifier starting with "*" is to be interpreted as a
YAML alias node (see Section 1.2.1).
For single-document YAML streams, a fragment identifier that is empty
or that starts with "/" is to be interpreted as a JSON Pointer
[JSON-POINTER] and is evaluated on the YAML representation graph,
traversing alias nodes; in particular, the empty fragment identifier
references the root node. This syntax can only reference the YAML
nodes that are on a path that is made up of nodes interoperable with
the JSON data model (see Section 3.4).
A fragment identifier is not guaranteed to reference an existing
node. Therefore, applications SHOULD define how an unresolved alias
node ought to be handled.
1.2.1. Fragment Identification via Alias Nodes
This section describes how to use alias nodes (see Sections 3.2.2.2
and 7.1 of [YAML]) as fragment identifiers to designate nodes.
A YAML alias node can be represented in a URI fragment identifier by
encoding it into bytes using UTF-8 [UTF-8], but percent-encoding of
those characters is not allowed by the fragment rule in Section 3.5
of [URI].
If multiple nodes match a fragment identifier, the first occurrence
of such a match is selected.
Users concerned with interoperability of fragment identifiers:
* SHOULD limit alias nodes to a set of characters that do not
require encoding to be expressed as URI fragment identifiers (this
is generally possible since anchor names are a serialization
detail), and
* SHOULD NOT use alias nodes that match multiple nodes.
In the example resource below, the relative reference (see
Section 4.2 of [URI]) file.yaml#*foo identifies the first alias node
*foo pointing to the node with value scalar and not to the one in the
second document, whereas the relative reference file.yaml#*document_2
identifies the root node of the second document {one: [a, sequence]}.
%YAML 1.2
---
one: &foo scalar
two: &bar
- some
- sequence
- items
...
%YAML 1.2
---
&document_2
one: &foo [a, sequence]
Figure 1: A YAML Stream Containing Two YAML Documents
2. Media Type and Structured Syntax Suffix Registrations
This section includes the information required for IANA to register
the application/yaml media type and the +yaml structured syntax
suffix per [MEDIATYPE].
2.1. Media Type application/yaml
The media type for YAML is application/yaml; the following
information serves as the registration form for this media type.
Type name: application
Subtype name: yaml
Required parameters: N/A
Optional parameters: N/A; unrecognized parameters should be ignored.
Encoding considerations: binary
Security considerations: See Section 4 of this document.
Interoperability considerations: See Section 3 of this document.
Published specification: [YAML], this document
Applications that use this media type: Applications that need a
human-friendly, cross-language, and Unicode-based data
serialization language designed around the common data types of
dynamic programming languages.
Fragment identifier considerations: See Section 1.2 of this
document.
Additional information:
Deprecated alias names for this type: application/x-yaml, text/
yaml, and text/x-yaml. These names are used but are not
registered.
Magic number(s): N/A
File extension(s): "yaml" (preferred) and "yml". See Section 3.3
of this document.
Macintosh file type code(s): N/A
Windows Clipboard Name: YAML
Person and email address to contact for further information: See the
Authors' Addresses section of this document.
Intended usage: COMMON
Restrictions on usage: None
Author: See the Authors' Addresses section of this document.
Change controller: IETF
2.2. The +yaml Structured Syntax Suffix
The suffix +yaml MAY be used with any media type whose representation
follows that established for application/yaml. The structured syntax
suffix registration form follows. See [MEDIATYPE] for definitions of
each part of the registration form.
Name: YAML Ain't Markup Language (YAML)
+suffix: +yaml
References: [YAML], this document
Encoding considerations: Same as application/yaml
Interoperability considerations: Same as application/yaml
Fragment identifier considerations: Unlike application/yaml, there
is no fragment identification syntax defined for +yaml.
A specific xxx/yyy+yaml media type needs to define the syntax and
semantics for fragment identifiers because the ones defined for
application/yaml do not apply unless explicitly expressed.
Security considerations: Same as application/yaml
Contact: httpapi@ietf.org or art@ietf.org
Author: See the Authors' Addresses section of this document.
Change controller: IETF
3. Interoperability Considerations
3.1. YAML Is an Evolving Language
YAML is an evolving language, and over time, some features have been
added and others removed.
The application/yaml media type registration is independent of the
YAML version. This allows content negotiation of version-independent
YAML resources.
Implementers concerned about features related to a specific YAML
version can specify it in YAML documents using the %YAML directive
(see Section 6.8.1 of [YAML]).
3.2. YAML Streams
A YAML stream can contain zero or more YAML documents.
When receiving a multi-document stream, an application that only
expects single-document streams should signal an error instead of
ignoring the extra documents.
Current implementations consider different documents in a stream
independent, similarly to JSON text sequences (see [RFC7464]);
elements such as anchors are not guaranteed to be referenceable
across different documents.
3.3. Filename Extension
The "yaml" filename extension is the preferred one; it is the most
popular and widely used on the web. The "yml" filename extension is
still used. The simultaneous usage of two filename extensions in the
same context might cause interoperability issues (e.g., when both a
"config.yaml" and a "config.yml" are present).
3.4. YAML and JSON
When using flow collection styles (see Section 7.4 of [YAML]), a YAML
document could look like JSON [JSON]; thus, similar interoperability
considerations apply.
When using YAML as a more efficient format to serialize information
intended to be consumed as JSON, information not reflected in the
representation graph and classified as presentation or serialization
details (see Section 3.2 of [YAML]) can be discarded. This includes
comments (see Section 3.2.3.3 of [YAML]), directives, and alias nodes
(see Section 7.1 of [YAML]) that do not have a JSON counterpart.
%YAML 1.2
---
# This comment will be lost
# when serializing in JSON.
Title:
type: string
maxLength: &text_limit 64
Name:
type: string
maxLength: *text_limit # Replaced by the value 64.
Figure 2: JSON Replaces Alias Nodes with Static Values
Implementers need to ensure that relevant information will not be
lost during processing. For example, they might consider alias nodes
being replaced by static values as acceptable.
In some cases, an implementer may want to define a list of allowed
YAML features, taking into account that the following features might
have interoperability issues with [JSON]:
* multi-document YAML streams
* non-UTF-8 encoding. Before encoding YAML streams in UTF-16 or
UTF-32, it is important to note that Section 8.1 of [JSON]
mandates the use of UTF-8 when exchanging JSON texts between
systems that are not part of a closed ecosystem and that
Section 5.2 of [YAML] recommends the use of UTF-8.
* mapping keys that are not strings
* cyclic references represented using anchors (see Section 4.2 and
Figure 4)
* .inf and .nan float values, since JSON does not support them
* non-JSON types, including the ones associated with tags like
!!timestamp that were included in the default schema of older YAML
versions
* tags in general, specifically ones that do not map to JSON types,
e.g., custom and local tags such as !!python/object and !mytag
(see Section 2.4 of [YAML])
%YAML 1.2
---
non-json-keys:
0: a number
[0, 1]: a sequence
? {k: v}
: a map
---
non-json-keys:
!date 2020-01-01: a timestamp
non-json-value: !date 2020-01-01
...
Figure 3: Example of Mapping Keys and Values Not Supported in
JSON in a Multi-Document YAML Stream
3.5. Fragment Identifiers
To allow fragment identifiers to traverse alias nodes, the YAML
representation graph needs to be generated before the fragment
identifier evaluation. It is important that this evaluation does not
cause the issues mentioned in Sections 3.4 and 4, such as infinite
loops and unexpected code execution.
Implementers need to consider that the YAML version and supported
features (e.g., merge keys) can affect the generation of the
representation graph (see Figure 9).
In Section 1.2, this document extends the use of specifications based
on the JSON data model with support for YAML fragment identifiers.
This is to improve the interoperability of already-consolidated
practices, such as writing OpenAPI documents [OAS] in YAML.
Appendix A provides a non-exhaustive list of examples to help readers
understand interoperability issues related to fragment identifiers.
4. Security Considerations
Security requirements for both media types and media type suffixes
are discussed in Section 4.6 of [MEDIATYPE].
4.1. Arbitrary Code Execution
Care should be used when using YAML tags because their resolution
might trigger unexpected code execution.
Code execution in deserializers should be disabled by default and
only be enabled explicitly. In the latter case, the implementation
should ensure (for example, via specific functions) that the code
execution results in strictly bounded time/memory limits.
Many implementations provide safe deserializers that address these
issues.
4.2. Resource Exhaustion
YAML documents are rooted, connected, directed graphs and can contain
reference cycles, so they can't be treated as simple trees (see
Section 3.2.1 of [YAML]). An implementation that treats them as
simple trees risks going into an infinite loop while traversing the
YAML representation graph. This can happen:
* when trying to serialize it as JSON or
* when searching/identifying nodes using specifications based on the
JSON data model (e.g., [JSON-POINTER]).
%YAML 1.2
---
x: &x
y: *x
Figure 4: A Cyclic Document
Even if a representation graph is not cyclic, treating it as a simple
tree could lead to improper behaviors, such as triggering an
Exponential Data Expansion (e.g., a Billion Laughs Attack).
%YAML 1.2
---
x1: &a1 ["a", "a"]
x2: &a2 [*a1, *a1]
x3: &a3 [*a2, *a2]
Figure 5: A Billion Laughs Document
This can be addressed using processors that limit the anchor
recursion depth and validate the input before processing it; even in
these cases, it is important to carefully test the implementation you
are going to use. The same considerations apply when serializing a
YAML representation graph in a format that does not support reference
cycles (see Section 3.4).
4.3. YAML Streams
Incremental parsing and processing of a YAML stream can produce
partial results and later indicate failure to parse the remainder of
the stream; to prevent partial processing, implementers might prefer
validating and processing all the documents in a stream at the same
time.
Repeated parsing and re-encoding of a YAML stream can result in the
addition or removal of document delimiters (e.g., --- or ...) as well
as the modification of anchor names and other serialization details
that can break signature validation.
4.4. Expressing Booleans
Section 10.3.2 of [YAML] specifies that only the scalars matching the
regular expression true|True|TRUE|false|False|FALSE are interpreted
as booleans. Older YAML versions were more tolerant (e.g.,
interpreting NO and N as False and interpreting YES and Y as True).
When the older syntax is used, a YAML implementation could then
interpret {insecure: n} as {insecure: "n"} instead of {insecure:
false}. Using the syntax defined in Section 10.3.2 of [YAML] prevents
these issues.
5. IANA Considerations
IANA has updated the "Media Types" registry
(https://www.iana.org/assignments/media-types) with the registration
information in Section 2.1 for the media type application/yaml.
IANA has updated the "Structured Syntax Suffixes" registry
(https://www.iana.org/assignments/media-type-structured-suffix) with
the registration information in Section 2.2 for the structured syntax
suffix +yaml.
6. References
6.1. Normative References
[HTTP] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
Ed., "HTTP Semantics", STD 97, RFC 9110,
DOI 10.17487/RFC9110, June 2022,
<https://www.rfc-editor.org/info/rfc9110>.
[JSON] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
Interchange Format", STD 90, RFC 8259,
DOI 10.17487/RFC8259, December 2017,
<https://www.rfc-editor.org/info/rfc8259>.
[JSON-POINTER]
Bryan, P., Ed., Zyp, K., and M. Nottingham, Ed.,
"JavaScript Object Notation (JSON) Pointer", RFC 6901,
DOI 10.17487/RFC6901, April 2013,
<https://www.rfc-editor.org/info/rfc6901>.
[MEDIATYPE]
Freed, N., Klensin, J., and T. Hansen, "Media Type
Specifications and Registration Procedures", BCP 13,
RFC 6838, DOI 10.17487/RFC6838, January 2013,
<https://www.rfc-editor.org/info/rfc6838>.
[OAS] Miller, D., Whitlock, J., Gardiner, M., Ralphson, M.,
Ratovsky, R., and U. Sarid, "OpenAPI Specification",
v3.0.0, 26 July 2017.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[URI] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
Resource Identifier (URI): Generic Syntax", STD 66,
RFC 3986, DOI 10.17487/RFC3986, January 2005,
<https://www.rfc-editor.org/info/rfc3986>.
[UTF-8] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
2003, <https://www.rfc-editor.org/info/rfc3629>.
[YAML] Ben-Kiki, O., Evans, C., dot Net, I., Müller, T.,
Antoniou, P., Aro, E., and T. Smith, "YAML Ain't Markup
Language Version 1.2", 1 October 2021,
<https://yaml.org/spec/1.2.2/>.
6.2. Informative References
[RFC7464] Williams, N., "JavaScript Object Notation (JSON) Text
Sequences", RFC 7464, DOI 10.17487/RFC7464, February 2015,
<https://www.rfc-editor.org/info/rfc7464>.
Appendix A. Examples Related to Fragment Identifier Interoperability
A.1. Unreferenceable Nodes
This example shows a couple of YAML nodes that cannot be referenced
based on the JSON data model since their mapping keys are not
strings.
%YAML 1.2
---
a-map-cannot:
? {be: expressed}
: with a JSON Pointer
0: no numeric mapping keys in JSON
Figure 6: Example of YAML Nodes That Are Not Referenceable Based
on JSON Data Model
A.2. Referencing a Missing Node
In this example, the fragment #/0 does not reference an existing
node.
%YAML 1.2
---
0: "JSON Pointer `#/0` references a string mapping key."
Figure 7: Example of a JSON Pointer That Does Not Reference an
Existing Node
A.3. Representation Graph with Anchors and Cyclic References
In this YAML document, the #/foo/bar/baz fragment identifier
traverses the representation graph and references the string you.
Moreover, the presence of a cyclic reference implies that there are
infinite fragment identifiers #/foo/bat/../bat/bar referencing the
&anchor node.
%YAML 1.2
---
anchor: &anchor
baz: you
foo: &foo
bar: *anchor
bat: *foo
Figure 8: Example of a Cyclic Reference and Alias Nodes
Many YAML implementations will resolve the merge key "<<:"
(https://yaml.org/type/merge.html) defined in YAML 1.1 in the
representation graph. This means that the fragment #/book/author/
given_name references the string Federico and that the fragment
#/book/<< will not reference any existing node.
%YAML 1.1
---
# Many implementations use merge keys.
the-viceroys: &the-viceroys
title: The Viceroys
author:
given_name: Federico
family_name: De Roberto
book:
<<: *the-viceroys
title: The Illusion
Figure 9: Example of YAML Merge Keys
Acknowledgements
Thanks to Erik Wilde and David Biesack for being the initial
contributors to this specification and to Darrel Miller and Rich Salz
for their support during the adoption phase.
In addition, this document owes a lot to the extensive discussion
inside and outside the HTTPAPI Working Group. The following
contributors helped improve this specification by opening pull
requests, reporting bugs, asking smart questions, drafting or
reviewing text, and evaluating open issues: Tina (tinita) Müller, Ben
Hutton, Carsten Bormann, Manu Sporny, and Jason Desrosiers.
Authors' Addresses
Roberto Polli
Digital Transformation Department, Italian Government
Italy
Email: robipolli@gmail.com
Erik Wilde
Axway
Switzerland
Email: erik.wilde@dret.net
Eemeli Aro
Mozilla
Finland
Email: eemeli@gmail.com
ERRATA