Network Working Group | P. Hoffman |
Internet-Draft | VPN Consortium |
Obsoletes: 2629 (if approved) | January 28, 2014 |
Intended status: Standards Track | |
Expires: August 01, 2014 |
The 'XML2RFC' version 3 Vocabulary
draft-hoffman-xml2rfc-00
This document defines the 'XML2RFC' version 3 vocabulary, an XML-based language used for writing RFCs and Internet-Drafts. It is heavily derived from the version 2 vocabulary, which is also under discussion.
Discussion of this draft takes place on the rfc-interest mailing list (rfc-interest@rfc-editor.org), which has its home page at https://www.rfc-editor.org/mailman/listinfo/rfc-interest.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 01, 2014.
Copyright (c) 2014 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document describes version 3 ('v3') of the 'XML2RFC' vocabulary, an XML-based language ('Extensible Markup Language', [XML]) used for writing RFCs ([RFCSTYLE]) and Internet-Drafts ([IDGUIDE]).
This document obsoletes the version 2 ("v2") vocabulary [XML2RFCv2], which contains the extended language definition. That document in turn obsoletes the original version ("v1") [RFC2629]. This document copies material directly from [XML2RFCv2] where possible; as that document makes its way toward RFC publication, this document will incorporate as many of its changes as possible.
Note that the vocabulary contains certain constructs that might not be used when generating the final text; however, they can provide useful data for other uses (such as index generation, populating a keyword database, or syntax checks).
The following is a hopefully complete list of the technical changes between [XML2RFCv2] and this document. Note that the list covers the current version of this document only. *Many* additional changes to the v3 vocabulary are being discussed and are expected. The design choices behind the differences listed here are also expected to change.
The XML vocabulary here is defined in prose, based on the Relax NG schema ([RNC]) contained in Appendix B (specified in Relax NG Compact Notation, "RNC").
Note that the schema can be used for automated validity checks, but certain constraints are only described in prose (example: the conditionally required presence of the "abbrev" attribute).
The sections below describe all elements and their attributes.
Note that attributes not labeled "mandatory" are optional.
Contains the abstract of the document. The abstract ought to be self-contained and thus should not contain references or unexpanded abbreviations. See [RFCSTYLE] for more information.
This element appears as child element of: <front> (Section 2.20).
One or more <t> elements (Section 2.40)
Provides address information for the author.
This element appears as child element of: <author> (Section 2.6).
In any order:
Provides additional prose augmenting a bibliographical reference.
For instance:
<annotation> Latest version available at <eref target='http://www.w3.org/TR/xml'/>. </annotation>
...will generate the text used in the reference for [XML].
This element appears as child element of: <reference> (Section 2.33).
In any order:
Provides information about the IETF area this document applies to (currently not used when generating documents).
This element appears as child element of: <front> (Section 2.20).
Content model: only text content.
This element allows the inclusion of "artwork" into the document.
<artwork> provides full control of horizontal whitespace and line breaks, and thus is used for a variety of things, such as:
Alternatively, the "src" attribute allows referencing an external graphics file, such as a bitmap or a vector drawing. In this case, the textual content acts as fallback for output formats that do not support graphics, and thus ought to contain either a "line art" variant of the graphics, or otherwise prose that describes the included image in sufficient detail. Note that RFCs occasionally are published with enhanced diagrams; a recent example is [RFC5598].
This element appears as child element of: <figure> (Section 2.18).
Text
Controls whether the artwork appears left (default), centered, or right.
Allowed values:
Alternative text description of the artwork (not just the caption).
The suggested height of the graphics included using the "src" attribute.
This attribute is format-dependent and ought to be avoided.
When generating HTML output, current implementations copy the attribute "as is". For other output formats it is usually ignored.
A filename suitable for the contents (such as for extraction to a local file).
This attribute generally isn't used for document generation, but it can be helpful for other kinds of tools (such as automated syntax checkers which work by extracting the source code).
The URI of a graphics file.
Note that this can be a "data" URI ([RFC2397]) as well, in which case the graphics file essentially is in-lined.
Specifies the type of the artwork.
The value either is a well-known keyword (such as "abnf"), or an Internet Media Type (see [RFC2046]).
How it is used depends on context and application. For instance, a formatter can attempt to syntax-highlight code in certain known languages.
The suggested width of the graphics included using the "src" attribute.
This attribute is format-dependent and ought to be avoided.
When generating HTML output, current implementations copy the attribute "as is". For other output formats it is usually ignored.
Allows specification of the language used. This is sometimes useful for renderers which display different fonts for CJK characters.
Determines whitespace handling.
"preserve" is both the default value and the only meaningful setting anyway (because that's what the <artwork> element is for).
See also [XML].
Allowed values:
Provides information about a document author.
The <author> elements contained within the document's <front> element are used to fill the boilerplate, and also to generate the "Author's Address" section (see [RFCSTYLE]).
Note that an "author" can also be just an organization (by not specifying any of the name attributes, but adding the <
Furthermore, the "role" attribute can be used to mark an author as "editor". This is reflected both on the front page and in bibliographical references. Note that this specification does not define a precise meaning for the term "editor".
See Section "Authors vs. Contributors" of [RFCPOLICY] for more information.
This element appears as child element of: <front> (Section 2.20).
In this order:
The full name (used in the automatically generated "Author's Address" section).
Author initials (used on the front page and in references).
Initials should be provided as a whitespace separated list of pairs of a letter and a dot.
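The format of the initials can be checked mechanically. The following Python sketch is illustrative only and is not part of the vocabulary; the function name initials_are_well_formed is hypothetical, and "letter" is assumed to mean a single Unicode letter.

```python
import re

# One "pair" is a single letter followed by a dot; pairs are separated
# by whitespace. [^\W\d_] matches a word character that is neither a
# digit nor an underscore, i.e. a letter.
_INITIALS = re.compile(r"[^\W\d_]\.(?:\s+[^\W\d_]\.)*")


def initials_are_well_formed(value: str) -> bool:
    """Return True if value is a whitespace-separated list of letter-dot pairs."""
    return _INITIALS.fullmatch(value.strip()) is not None
```

Under these assumptions, "J. F." is accepted, while "JF" and "J.F." are rejected because the pairs are not separated by whitespace.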
Specifies the role the author had in creating the document.
Allowed values:
The author's surname.
Allows specification of the language used. This is sometimes useful for renderers which display different fonts for CJK characters.
Causes the text to be displayed in a bold font.
This element appears as child element of: <annotation> (Section 2.3), <c> (Section 2.9), <i> (Section 2.21), <postamble> (Section 2.31), <preamble> (Section 2.32), <t> (Section 2.40), and <tt> (Section 2.43).
In any order:
Contains the "back" part of the document: the references and appendices.
This element appears as child element of: <rfc> (Section 2.36).
In this order:
Provides the content of a cell in a table.
This element appears as child element of: <texttable> (Section 2.41).
In any order:
Gives the city name in a postal address.
This element appears as child element of: <postal> (Section 2.29).
Content model: only text content.
Gives the postal region code.
This element appears as child element of: <postal> (Section 2.29).
Content model: only text content.
Gives the country in a postal address.
This element appears as child element of: <postal> (Section 2.29).
Content model: only text content.
Represents a comment.
Comments can be used in a document while it is work-in-progress. They usually appear either visually highlighted, at the end of the document (depending on file format and settings of the formatter), or not at all (when generating an RFC).
This element appears as child element of: <annotation> (Section 2.3), <b> (Section 2.7), <c> (Section 2.9), <i> (Section 2.21), <postamble> (Section 2.31), <preamble> (Section 2.32), <t> (Section 2.40), <tt> (Section 2.43), and <ttcol> (Section 2.44).
Content model: only text content.
Holds the "source" of a comment, such as the name or the initials of the person who made the comment.
Provides information about the publication date.
Note that this element is used both for the boilerplate of the document being produced, and also inside bibliographic references.
In the first case, it defines the publication date, which, when producing Internet-Drafts, will be used for computing the expiration date (see [IDGUIDE]). When "year", "month", or "day" is left out, the processor will attempt to fill in the missing values from the current system date, provided that the attributes that are specified match the system date.
Note that month names need to match the full (English) month name ("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", or "December") in order for expiration calculations to work (some implementations might support additional formats, though).
In the second case, the date information will be embedded as-is into the reference text. Therefore, vague dates ("ca. 2000"), date ranges, and so on are also allowed.
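The dependency of the expiration calculation on full English month names can be illustrated with a short Python sketch. It is not a description of any actual processor: the function names are hypothetical, and the 185-day lifetime is an assumption, chosen only because it reproduces the dates on the front page of this draft (January 28, 2014 and August 01, 2014).

```python
from datetime import date, timedelta

# Full English month names, as required for expiration calculations.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Assumption: a lifetime of 185 days; the text above only says "six months".
DRAFT_LIFETIME = timedelta(days=185)


def publication_date(year: str, month: str, day: str) -> date:
    """Convert the attribute values of a fully specified <date> to a date."""
    if month not in MONTHS:
        raise ValueError(f"unsupported month name: {month!r}")
    return date(int(year), MONTHS.index(month) + 1, int(day))


def expiration_date(published: date) -> date:
    """Compute the expiration date of an Internet-Draft (hypothetical rule)."""
    return published + DRAFT_LIFETIME
```

An abbreviated month name such as "Jan" raises ValueError in this sketch, which mirrors the requirement that month names be spelled out in full.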
This element appears as child element of: <front> (Section 2.20).
Content model: this element does not have any contents.
Day of publication.
Month of publication.
Year of publication.
Provides an email address.
The value is expected to be the scheme-specific part of a "mailto" URI (so does not include the prefix "mailto:"). See [RFC6068] for details.
This element appears as child element of: <address> (Section 2.2).
Content model: only text content.
Represents an "external" link (as specified in the "target" attribute).
If the element has text content, that content will be used. Otherwise, the value of the target attribute will be inserted in angle brackets ([RFC3986]).
This element appears as child element of: <annotation> (Section 2.3), <b> (Section 2.7), <c> (Section 2.9), <i> (Section 2.21), <postamble> (Section 2.31), <preamble> (Section 2.32), <t> (Section 2.40), <tt> (Section 2.43), and <ttcol> (Section 2.44).
Content model: only text content.
URI of the link target (see [RFC3986]).
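The rule for choosing between the text content and the "target" attribute can be sketched as follows. This is a minimal Python illustration under the assumption that output is plain text; the function name render_eref is hypothetical and does not correspond to any existing tool.

```python
import xml.etree.ElementTree as ET


def render_eref(xml_fragment: str) -> str:
    """Render a single <eref> element as plain text."""
    element = ET.fromstring(xml_fragment)
    text = (element.text or "").strip()
    if text:
        # The element has text content: use it as is.
        return text
    # No text content: insert the target in angle brackets.
    return "<" + element.get("target", "") + ">"
```

For the element used in the <annotation> example earlier in this document, which has no text content, the sketch yields the target URI enclosed in angle brackets.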
Represents the phone number of a fax machine.
The value is expected to be the scheme-specific part of a "tel" URI (so does not include the prefix "tel:"), using the "global numbers" syntax. See [RFC3966] for details.
This element appears as child element of: <address> (Section 2.2).
Content model: only text content.
This element appears as child element of: <section> (Section 2.37) and <t> (Section 2.40).
In this order:
Used to change the alignment of <
Note: does not affect title or <
Allowed values:
Duplicates functionality available on <artwork>; avoid it.
Duplicates functionality available on <artwork>; avoid it.
Duplicates functionality available on <artwork>; avoid it.
Figures that have an "anchor" attribute will automatically get an autogenerated title (such as "Figure 1"). Setting this attribute to "false" will prevent this.
Allowed values:
Duplicates functionality available on <artwork>; avoid it.
Provides a link to an additional format variant for a reference.
Note that these additional links are neither used in published RFCs, nor supported by all tools. If the goal is to provide a single URI for a reference, the "target" attribute on <
This element appears as child element of: <reference> (Section 2.33).
Content model: this element does not have any contents.
Octet length of linked-to document.
URI of document.
The type of the linked-to document, such as "TXT", "HTML", or "PDF".
Represents the "front matter": metadata (such as author information), abstract, and additional notes.
This element appears as child element of: <reference> (Section 2.33) and <rfc> (Section 2.36).
In this order:
Causes the text to be displayed in an italic font.
This element appears as child element of: <annotation> (Section 2.3), <b> (Section 2.7), <c> (Section 2.9), <postamble> (Section 2.31), <preamble> (Section 2.32), <t> (Section 2.40), and <tt> (Section 2.43).
In any order:
Provides terms for the document's index.
Index entries can be either single items (when just the "item" attribute is given) or nested items (by specifying "subitem" as well).
For instance:
<iref item="Grammar" subitem="item"/>
will produce an index entry for "Grammar, item".
This element appears as child element of: <annotation> (Section 2.3), <b> (Section 2.7), <c> (Section 2.9), <figure> (Section 2.18), <i> (Section 2.21), <postamble> (Section 2.31), <preamble> (Section 2.32), <section> (Section 2.37), <t> (Section 2.40), <tt> (Section 2.43), and <ttcol> (Section 2.44).
Content model: this element does not have any contents.
The item to include.
Setting this to "true" declares the occurrence as "primary", which might cause it to be highlighted in the index.
Allowed values:
The subitem to include.
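How the "item" and "subitem" attributes combine into an index entry can be sketched in Python. The function name index_entry is hypothetical, and the "item, subitem" form simply follows the "Grammar, item" example given above; real formatters may lay out nested entries differently.

```python
import xml.etree.ElementTree as ET


def index_entry(xml_fragment: str) -> str:
    """Build the index entry text for a single <iref> element."""
    element = ET.fromstring(xml_fragment)
    item = element.get("item", "")
    subitem = element.get("subitem")
    # Nested entry when "subitem" is given, single entry otherwise.
    return f"{item}, {subitem}" if subitem else item
```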
Specifies a keyword applicable to the document.
Note that each element should only contain a single keyword; for multiple keywords, the element can simply be repeated.
Keywords are used both in the RFC Index and in the metadata of generated document formats.
This element appears as child element of: <front> (Section 2.20).
Content model: only text content.
Delineates a text list.
Each list item is represented by a <t> element. <vspace> (Section 2.46) can be used as a workaround.
This element appears as child element of: <t> (Section 2.40).
One or more <t> elements (Section 2.40)
This attribute holds a token that serves as an identifier for a counter. The intended use is continuation of lists.
Note that this attribute functions only when the style attribute is using the "format..." syntax (Section 2.24.3); otherwise, it is ignored.
For list styles with potentially wide labels, this attribute can override the default indentation level, measured in characters.
Note that it only affects style with variable-width labels ("format..." and "hanging", see below), and it may not affect formats in which the list item text appears below the label.
This attribute is used to control the display of a list.
The value of this attribute is inherited by any nested lists that do not have this attribute set. It may be set to:
And, finally:
Represents the main content of the document.
This element appears as child element of: <rfc> (Section 2.36).
One or more <section> elements (Section 2.37)
Creates an unnumbered section that appears after the abstract.
It is usually used for additional information to reviewers (working group information, mailing list, ...), or for additional publication information such as "IESG Notes".
This element appears as child element of: <front> (Section 2.20).
One or more <t> elements (Section 2.40)
The title of the note.
Specifies the affiliation of an author.
This information appears in both the "Author's Address" section and on the front page ([RFCSTYLE]). If the value is long, an abbreviated variant can be specified in the "abbrev" attribute.
This element appears as child element of: <author> (Section 2.6).
Content model: only text content.
Abbreviated variant.
Represents a phone number.
The value is expected to be the scheme-specific part of a "tel" URI (so does not include the prefix "tel:"), using the "global numbers" syntax. See [RFC3966] for details.
This element appears as child element of: <address> (Section 2.2).
Content model: only text content.
Contains optional child elements providing postal information. These elements will be displayed in an order that is processor-specific. Thus, a postal address should probably contain only a set of <
This element appears as child element of: <address> (Section 2.2).
In any order:
A method for presenting a postal address without using <
This element appears as child element of: <postal> (Section 2.29).
Content model: only text content.
Gives text that appears at the bottom of a figure or table.
This element appears as child element of: <figure> (Section 2.18) and <texttable> (Section 2.41).
In any order:
Gives text that appears at the top of a figure or table.
This element appears as child element of: <figure> (Section 2.18) and <texttable> (Section 2.41).
In any order:
Represents a bibliographical reference.
This element appears as child element of: <references> (Section 2.34).
In this order:
Holds the URI for the reference.
Note that depending on the <
Contains a set of bibliographical references.
In the early days of the RFC series, there was only one "References" section per RFC. This convention was later changed to group references into two sets, "Normative" and "Informative" (see item x of [RFCSTYLE]). This vocabulary supports the split with the "title" attribute.
This element appears as child element of: <back> (Section 2.8).
One or more <reference> elements (Section 2.33)
Provides the title for the References section (defaulting to "References").
In general, the title should be either "Normative References" or "Informative References".
Provides the region name in a postal address.
This element appears as child element of: <postal> (Section 2.29).
Content model: only text content.
This is the root element of the xml2rfc vocabulary.
Processors distinguish between RFC mode ("number" attribute being present) and Internet-Draft mode ("docName" attribute being present): it is invalid to specify both. Setting neither "number" nor "docName" can be useful for producing other types of document but is out-of-scope for this specification.
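The mode selection described above amounts to a small decision procedure. The following Python sketch is illustrative only; the function name processing_mode and the returned labels are hypothetical and are not defined by this specification.

```python
import xml.etree.ElementTree as ET


def processing_mode(rfc_element: ET.Element) -> str:
    """Decide the processing mode from the attributes of the root element."""
    has_number = rfc_element.get("number") is not None
    has_doc_name = rfc_element.get("docName") is not None
    if has_number and has_doc_name:
        # Specifying both attributes is invalid.
        raise ValueError('"number" and "docName" must not both be specified')
    if has_number:
        return "rfc"
    if has_doc_name:
        return "internet-draft"
    # Neither attribute set: other document types, out of scope here.
    return "other"
```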
In this order:
Document category (see Appendix A.1).
Allowed values:
Affects the generated boilerplate.
See [RFC5741] for more information.
Allowed values:
For Internet-Drafts, this specifies the draft name (which appears below the title).
Note that the file extension is not part of the draft name, so in general the name should end with the current draft number ("-", plus two digits).
Furthermore, it is good practice to disambiguate current editor copies from submitted drafts (for instance, by replacing the draft number with the string "latest").
See [IDGUIDE] for further information.
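The naming conventions above can be checked with a short Python sketch. The function name doc_name_kind and its return labels are hypothetical; the check only covers the two conventions mentioned here (a trailing "-" plus two digits, or the string "latest" in place of the draft number) and is not a full validation of draft names.

```python
import re

# A submitted draft name ends with "-" followed by exactly two digits.
_SUBMITTED = re.compile(r".+-\d{2}")


def doc_name_kind(doc_name: str) -> str:
    """Classify a "docName" value according to the conventions above."""
    if doc_name.endswith("-latest"):
        return "editor-copy"
    if _SUBMITTED.fullmatch(doc_name):
        return "submitted"
    # For example, a name that still carries a file extension.
    return "unexpected"
```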
Represents the Intellectual Property status of the document. See Appendix A.2 for details.
Allowed values:
Identifies a Section within the document for which extraction "as-is" is explicitly allowed (only relevant for historic values of the "ipr" attribute).
The number of the RFC to be produced.
A comma-separated list of RFC numbers or Internet-Draft names.
When producing a document within a document series (such as "STD"): the number within that series.
The document stream.
See [RFC5741] for details.
Allowed values:
A comma-separated list of RFC numbers or Internet-Draft names.
The natural language used in the document (defaults to "en").
See [XML] for more information.
Represents a section (when inside a <middle> element) or an appendix (when inside a <back> element).
Sub-sections are created by nesting <section> elements inside <section> elements.
This element appears as child element of: <back> (Section 2.8), <middle> (Section 2.25), and <section> (Section 2.37).
In this order:
If set to "no", this section does not get a section number. Processors will verify that such a section is not followed by a numbered section in a part, and will verify that the section is a top-level section.
Allowed values:
If set to "yes", this section is marked in the processor with text indicating that it should be removed before the document is published as an RFC.
Allowed values:
The title of the section.
Determines whether the section is included in the Table Of Contents.
Allowed values:
Specifies the document series in which this document appears, and also specifies an identifier within that series.
This element appears as child element of: <reference> (Section 2.33).
Content model: this element does not have any contents.
The name of the series.
The following names trigger specific processing (such as for auto-generating links, and adding descriptions such as "work in progress"): "BCP", "FYI", "Internet-Draft", "RFC", and "STD".
The identifier within the series specified by the "name" attribute.
For BCPs, FYIs, RFCs, and STDs this is the number within the series. For Internet-Drafts, it is the full draft name (ending with the two-digit version number).
Provides a street address.
This element appears as child element of: <postal> (Section 2.29).
Content model: only text content.
Contains a paragraph of text.
This element appears as child element of: <abstract> (Section 2.1), <list> (Section 2.24), <note> (Section 2.26), and <section> (Section 2.37).
In any order:
Contains a table, consisting of an optional preamble, a header line, rows, and an optional postamble.
The number of columns in the table is determined by the number of <
This element appears as child element of: <section> (Section 2.37).
In this order:
Determines the horizontal alignment of the table.
Allowed values:
Allowed values:
Allowed values:
Represents the document title.
When this element appears in the <front> element of the current document, the title might also appear in page headers or footers. If it is long (~40 characters), the "abbrev" attribute is used to specify an abbreviated variant.
This element appears as child element of: <front> (Section 2.20).
Content model: only text content.
Specifies an abbreviated variant of the document title.
Causes the text to be displayed in a constant-width font.
This element appears as child element of: <annotation> (Section 2.3), <b> (Section 2.7), <c> (Section 2.9), <i> (Section 2.21), <postamble> (Section 2.31), <preamble> (Section 2.32), and <t> (Section 2.40).
In any order:
Contains a column heading in a table.
This element appears as child element of: <texttable> (Section 2.41).
In any order:
Determines the horizontal alignment within the table column.
Allowed values:
Contains a web address associated with the author.
The contents should be a valid URI (see [RFC3986]).
This element appears as child element of: <address> (Section 2.2).
Content model: only text content.
This element appears as child element of: <t> (Section 2.40).
Content model: this element does not have any contents.
This element is used to specify the Working Group the document originates from, if any. The recommended format is the official name of the Working Group (with some capitalization).
In Internet-Drafts, this is used in the upper left corner of the boilerplate, replacing the "Network Working Group" string. Formatting software can append the words "Working Group" or "Research Group", depending on the "submissionType" property on the <rfc> element (Section 2.36.9).
This element appears as child element of: <front> (Section 2.20).
Content model: only text content.
This element appears as child element of: <annotation> (Section 2.3), <b> (Section 2.7), <c> (Section 2.9), <i> (Section 2.21), <postamble> (Section 2.31), <preamble> (Section 2.32), <t> (Section 2.40), <tt> (Section 2.43), and <ttcol> (Section 2.44).
Content model: only text content.
Allowed values:
Unused.
It's unclear what the purpose of this attribute is; processors seem to ignore it and it never was documented.
Allowed values:
This format is based on [XML], thus does not have any issues representing arbitrary Unicode [UNICODE] characters in text content.
However, the current canonical RFC format is restricted to US-ASCII [USASCII] characters (see [RFCSTYLE]). Future versions are likely to relax this rule, and it is expected that the vocabulary will be extended so that US-ASCII alternatives can be provided when that makes sense (for instance, in contact information).
The "name" attribute on the <artwork> element (Section 2.5.4) can be used to derive a filename for saving to a local file system. Trusting this kind of information without pre-processing is a known security risk; see [RFC6266] for more information.
Furthermore, all security considerations related to XML processing are relevant as well (see [RFC3470]).
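As an illustration of the kind of pre-processing meant here, the following Python sketch derives a local filename from an untrusted "name" attribute value (Section 2.5.4). It is one possible approach under stated assumptions, not a recommendation from [RFC6266]: the function name safe_filename, the fallback name "artwork.txt", and the permitted character set are all hypothetical choices.

```python
import re


def safe_filename(name: str, default: str = "artwork.txt") -> str:
    """Derive a local filename from an untrusted "name" attribute value."""
    # Keep only the last path component; both "/" and "\" count as separators.
    candidate = re.split(r"[\\/]", name)[-1]
    # Replace anything outside a conservative character set.
    candidate = re.sub(r"[^A-Za-z0-9._-]", "_", candidate)
    # Drop leading dots so that "." and ".." (and hidden files) are not produced.
    candidate = candidate.lstrip(".")
    return candidate or default
```

The sketch discards directory components entirely, so a value such as "../../etc/passwd" can only ever produce a file in the directory chosen by the extracting tool.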
IANA maintains the registry of Internet media types [BCP13] at http://www.iana.org/assignments/media-types.
This document serves as the specification for the Internet media type "application/rfc+xml". The following is to be registered with IANA.
Thanks to everybody who reviewed this document and provided feedback and/or specification text. Thanks especially go to Julian Reschke for editing [XML2RFCv2] and those who provided feedback on that document.
We also thank Marshall T. Rose for both the original design and the reference implementation of the "xml2rfc" formatter.
[XML] | Maler, E., Yergeau, F., Paoli, J., Sperberg-McQueen, M. and T. Bray, "Extensible Markup Language (XML) 1.0 (Fifth Edition)", W3C Recommendation REC-xml-20081126, November 2008. Latest version available at <http://www.w3.org/TR/xml>. |
[XML2RFCv2] | Freed, N. and J. F. Reschke, "The 'XML2RFC' version 2 Vocabulary", Internet-Draft draft-reschke-xml2rfc, January 2014. |
For RFCs, the category determines the "maturity level" (see [RFC2026]). The allowed values are "std" for "Standards Track", "bcp" for "BCP", "info" for "Informational", "exp" for "Experimental", and "historic" for - surprise - "Historic".
For Internet-Drafts, the category attribute is not needed, but will appear on the front page as "Intended Status". Supplying this information can be useful to reviewers.
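The mapping from category values to maturity levels can be written down as a simple lookup. The Python sketch below only restates the values listed above; the names MATURITY_LEVELS and category_label are hypothetical.

```python
# Category values and the maturity levels they stand for, as listed above.
MATURITY_LEVELS = {
    "std": "Standards Track",
    "bcp": "BCP",
    "info": "Informational",
    "exp": "Experimental",
    "historic": "Historic",
}


def category_label(category: str) -> str:
    """Return the maturity level for a "category" attribute value."""
    try:
        return MATURITY_LEVELS[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None
```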
This attribute value can take a long list of values, each of which describes an IPR policy for the document. This attribute's values are not the result of a grand plan, but remain simply for historic reasons. Of these values, only a few are currently in use; all others are supported by the various tools for backwards compatibility with old source files.
Disclaimer: THIS ONLY PROVIDES IMPLEMENTATION INFORMATION. IF YOU NEED LEGAL ADVICE, PLEASE CONTACT A LAWYER. For further information, refer to http://trustee.ietf.org/docs/IETF-Copyright-FAQ.pdf.
For the current "Status Of This Memo" text, the submissionType attribute determines whether a statement about "Code Components" is inserted (which is the case for the value "IETF", which is the default). Other values, such as "independent", suppress this part of the text.
The name for these values refers to the "IETF TRUST Legal Provisions Relating to IETF Documents", sometimes simply called the "TLP", that went into effect on February 15, 2009 ([TLP2.0]). Updates to this document were published on September 12, 2009 ([TLP3.0]) and on December 28, 2009 ([TLP4.0]), modifying the license for code components (see http://trustee.ietf.org/license-info/ for further information). The actual text is located in Section 6 ("Text To Be Included in IETF Documents") of these documents.
The tools will automatically produce the "correct" text depending on the document's date information (see above):
TLP | starting with publication date |
---|---|
[TLP3.0] | 2009-11-01 |
[TLP4.0] | 2010-04-01 |
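The date-based selection in the table above can be sketched in Python as follows. The function name tlp_version is hypothetical, and the fallback to [TLP2.0] for earlier publication dates is an assumption based on it being the first version of the provisions mentioned above; dates before February 15, 2009 are not handled.

```python
from datetime import date

# (first publication date, TLP version), newest first; dates from the table above.
_TLP_BY_START = [
    (date(2010, 4, 1), "TLP4.0"),
    (date(2009, 11, 1), "TLP3.0"),
]


def tlp_version(published: date) -> str:
    """Select the TLP version whose text applies to a publication date."""
    for start, version in _TLP_BY_START:
        if published >= start:
            return version
    # Assumption: earlier documents use the original provisions.
    return "TLP2.0"
```

With these assumptions, a document dated January 28, 2014 (such as this one) selects [TLP4.0].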
This should be the default, unless one of the more specific '*trust200902' values is a better fit. It produces the text in Sections 6.a and 6.b of the TLP.
This produces additional text from Section 6.c.i of the TLP:
This produces the additional text from Section 6.c.ii of the TLP:
This produces the additional text from Section 6.c.iii of the TLP, frequently called the "pre-5378 escape clause":
See Section 4 of http://trustee.ietf.org/docs/IETF-Copyright-FAQ.pdf for further information about when to use this value.
The attribute values "http://trustee.ietf.org/license-info/archive/IETF-Trust-License-Policy_11-10-08.pdf.
The attribute values "
The attribute values "
The attribute values "
The special value "