Internet DRAFT - draft-bormann-rswg-terminology
draft-bormann-rswg-terminology
Network Working Group C. Bormann
Internet-Draft Universität Bremen TZI
Intended status: Informational 27 July 2023
Expires: 28 January 2024
Terminology for RFCXML Evolution
draft-bormann-rswg-terminology-00
Abstract
The canonical format for RFCs is called RFCXML, with the currently
effective details originally documented in the RFC 799x series. This
format has experienced some uncontrolled evolution since, partially
caused by an unwillingness to recognize the need for overt,
deliberate evolution.
Controlled RFCXML evolution is going to be complex. Its discussion
will need agreed terminology, without which it will devolve into a
Tower of Babel.
About This Document
This note is to be removed before publishing as an RFC.
Status information for this document may be found at
https://datatracker.ietf.org/doc/draft-bormann-rswg-terminology/.
Discussion of this document takes place on the rswg Working Group
mailing list (mailto:rswg@ietf.org), which is archived at
https://mailarchive.ietf.org/arch/browse/rswg/. Subscribe at
https://www.ietf.org/mailman/listinfo/rswg/.
Source for this draft and an issue tracker can be found at
https://github.com/cabo/rswg-terminology.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Bormann Expires 28 January 2024 [Page 1]
Internet-Draft Terminology for RFCXML Evolution July 2023
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 28 January 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Conventions and Definitions . . . . . . . . . . . . . . . 3
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1. Format . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2. Instances . . . . . . . . . . . . . . . . . . . . . . . . 4
2.3. Evolution . . . . . . . . . . . . . . . . . . . . . . . . 5
2.4. Types of Evolution . . . . . . . . . . . . . . . . . . . 6
2.5. Correcting Errors . . . . . . . . . . . . . . . . . . . . 7
3. Security Considerations . . . . . . . . . . . . . . . . . . . 7
4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 7
5. Normative References . . . . . . . . . . . . . . . . . . . . 7
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 7
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 7
1. Introduction
The canonical format for RFCs is called RFCXML, with the currently
effective details originally documented in the RFC 799x series. This
format has experienced some uncontrolled evolution since, partially
caused by an unwillingness to recognize the need for overt,
deliberate evolution.
Controlled RFCXML evolution is going to be complex. Its discussion
will need agreed terminology, without which it will devolve into a
Tower of Babel.
Bormann Expires 28 January 2024 [Page 2]
Internet-Draft Terminology for RFCXML Evolution July 2023
1.1. Conventions and Definitions
Although this document is not an IETF Standards Track publication, it
adopts the conventions for normative language to provide clarity of
instructions to the implementer. The key words "MUST", "MUST NOT",
"REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14 [RFC2119]
[RFC8174] when, and only when, they appear in all capitals, as shown
here.
2. Terminology
Ultimately, this document should turn into a definitions section of
some other document. For now, we will use a mix of prose and
definition styles.
2.1. Format
XML does not define the meaning of its instances. Saying "this
document is in XML" doesn't tell you much more about its semantics
than "this document is in ASCII".
When we talk about the specific semantics instilled into an XML
document by the RFCXML format, we will therefore always use the term
RFCXML. This term can be split into several aspects:
* *syntactic* aspects. As XML is (mostly) a tree, this is often
reduced to a (tree) *grammar* of XML elements and XML attributes.
However, there are other syntactical aspects, such as for the text
in elements and attribute values (*lexical* aspects): the meaning
of specific characters (e.g., format effectors; hyphenation
semantics) and even whether some text is allowed in certain
elements or attribute values.
* *semantic* aspects. The elements and attributes carry specific
semantics. These semantics are perceived by document users
through *renderings*. Semantic markup is about keeping the
semantics mostly at a domain level, with an ability to infer the
right kind of *layout* (in a wide sense, e.g., including font
choice) in a rendering process. However, there are also semantics
that are defined at a rendering level, e.g., those of <em> and
<strong>. (Officially, these are also semantic markup, but as
soon as a "Conventions" section says "Newly defined terms are
shown in italics", that is no longer true.)
Bormann Expires 28 January 2024 [Page 3]
Internet-Draft Terminology for RFCXML Evolution July 2023
Semantic aspects that are not rendered are *hidden* semantics. E.g.,
the <keywords> element is entirely not rendered in today's
renderings; it is intended for processes outside of rendering/human
consumption (e.g., search). The <sourcecode name= attribute is
rendered only in certain cases, but can be used by sourcecode
extraction processes (e.g., for CI or for re-use of pseudo-code in
other contexts) in other cases, too.
RFCXML currently has three rendering *targets* offered by the RFC
editor: *TXT*, *HTML*, *PDF*. HTML and PDF are *typographic*
renderings, TXT is a *typewriter* rendering. IETF datatracker also
has a forth and fifth rendering target, which uses HTML or PDF, but
tries to emulate TXT rendering while doing so.
Some semantics is hidden in some of these renderings, but not others.
E.g., the <tt> element serves to identify text as different from
normal running text, semantically similar to the way <sourcecode> and
<artwork> do, but syntactically more like <em> or <strong>. Since
xml2rfc 3.10.0, the semantics of <tt> are *suppressed* in TXT
renderings, which leads to problems (not just the middle-of-the-river
semantic change, but also for new documents: the inability to express
certain semantics in a way that they are recognizable in TXT but not
distracting in the typographic renderings). A recent poll whether
<em> should also be suppressed in TXT ended with a negative result.
Note that suppression of certain semantics in certain rendering
targets is fine if the semantics is *ancillary*. Different documents
differ in their usage of certain markup semantics, and even different
authors of the same document may disagree whether some usage is
ancillary or *essential* (i.e., of semantic intent, conveying
meaning): From an author's view, usage of specific markup can be for
aesthetic purposes, it can increase ease of use of the document, it
can help prevent a misunderstanding (which can have very different
levels of likelihood to occur), or it can be essential.
2.2. Instances
RFCs are *instances* of RFCXML, specifically the *publishing* subset
of RFCXML. As of today, these instances are *immutable*. Format
evolution may call for a way to evolve the instances along with an
evolved format specification.
Most RFCs are the result of a *consensus* process, either full IETF
consensus or maybe just the review of a smaller group whether the
document should be published (IAB, IRTF RG, ISE review).
Bormann Expires 28 January 2024 [Page 4]
Internet-Draft Terminology for RFCXML Evolution July 2023
This consensus is almost exclusively formed by review processes that
involve reviewing renderings, only very rarely by looking at the
RFCXML instance itself. These review processes are often extremely
expensive, as they involve contributions from sought-after experts in
the field. Their output constitutes much of the value of the RFC
series.
During the review processes, the document instance is not an RFC.
Specifically, the *authoring* subset of RFCXML is used, which has
slightly different characteristics from the publishing subset. As
mentioned, we sometimes also use different renderers during the
authoring/reviewing process (e.g., datatracker's distinct HTML/PDF
renderings), reducing the congruence of the reviewed document with
what its users will see.
2.3. Evolution
The definition of RFCXML will evolve, by adding functionality, or by
taking elements and attributes out of service (sometimes called
*deprecating*, but see below) that have been obsoleted in some way.
This is relatively straightforward for new documents.
Documents that have been in the authoring process and have already
received expensive review generally need a *transition* strategy,
such as translation from the format defined by an older RFCXML
specification to a newer one. This transition often needs to be
synchronized with tool development more than with consensus processes
on the format itself, which can give tools a de-facto normative role.
Documents that already have been published cannot benefit from format
evolution as long as their XML instances are immutable. This can be
accommodated by keeping RFCXML able to process published documents —
just those, not the entirety of potential instances of a previous
RFCXML specification. This support would be tagged as for backwards
compatibility only. (Backwards compatibility for documents in
authoring/reviewing stage would reduce disruption.)
The corpus of published RFCXML-form documents is large enough that
any translation processes to a new RFCXML specification need to be
*automated*. Such automated processes can then also be made available
for authoring/reviewing (xml2rfc's --v2v3 process is a nicely carried
out example for that) or just focused on the finite set of documents
published to a previous RFCXML specification.
A format change can affect the Syntax (grammar, other syntactic
details not captured in the grammar), the Semantics, and/or the
Rendering (possibly hiding some information in some renderings).
Bormann Expires 28 January 2024 [Page 5]
Internet-Draft Terminology for RFCXML Evolution July 2023
2.4. Types of Evolution
A term that has been used in a non-standard way in the creation of
RFCXMLv3 is *deprecation*. In RFC799x, it means that the deprecated
feature is no longer available for publishing. It is still available
during authoring/reviewing, with an understanding that these
processes provide a way to do a reviewed manual translation or to at
least review automated translation.
* Backwards compatibility (*BC*) means the ability of new systems to
work with old data.
* Forward compatibility means the ability of old systems to work
with new data. Forward compatibility is of little interest for
the current discussion, as we generally view tooling as updated in
sync with evolution processes. xml2rfc's input validation actively
prevents forward compatibility, there is no "ignore-unknown"
functionality even for semantics that could be ancillary.
Here, Backwards compatibility often can only be ascertained by manual
review: It is not sufficient that the new system does not crash with
the old data, the old data MUST be useful in the sense that it would
survive the same review processes. (These are generally too
expensive to be redone just for an RFCXML format change.)
A non-backwards-compatible (*NBC*) change to the RFCXML format can
have *detectable* impact on a document, e.g., by now failing its
validation. Or the impact can be *non-detectable*, i.e., requiring
human review to detect, such as a semantic change that creates a
different rendering that (potentially) has a different meaning.
A *semantic refinement* allows instances of the updated RFCXML
specification to express more detailed information than previously
possible. E.g., the <em> element could be split into usages for term
definitions, true emphasis, and other usages of italic type. It
could carry hints as to how to emulate it in typewriter renderings.
A semantic refinement can be done in a roughly backwards-compatible
way, by retaining the unrefined alternative (e.g. <em>). Giving that
alternative more limited semantics (e.g., by adding an attribute with
a default value) is no longer truly backwards-compatible, as it is a
(usually hidden!) semantic change. Retaining it without
"deprecating" it will require some will-power --- but many documents
may not have a need for the specific refinement (e.g., proposed in
the example) and would be well-served by retaining the unrefined
alternative.
Bormann Expires 28 January 2024 [Page 6]
Internet-Draft Terminology for RFCXML Evolution July 2023
2.5. Correcting Errors
If there is a need to translate RFC instances to new format
specifications, they are no longer immutable (and/or their names need
to be augmented by a revision indicator, possibly with a way added to
obtain the most recent revision).
Opening up mutability provides an opportunity to correct errors in
the originally published document, such as errata.
Such an *instance update* also can be used to replace now deprecated
(in the English sense) markup by modern one.
An example for a detectable NBC change would be to only allow digits
and single spaces between them in <rfc updates= attributes.
Correcting this in the now failing instances would probably be done
by manual intervention, as the number of instances is too small to
justify automation.
3. Security Considerations
TBD
4. IANA Considerations
This document has no IANA actions.
5. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
Acknowledgments
TBD
Author's Address
Bormann Expires 28 January 2024 [Page 7]
Internet-Draft Terminology for RFCXML Evolution July 2023
Carsten Bormann
Universität Bremen TZI
Postfach 330440
D-28359 Bremen
Germany
Phone: +49-421-218-63921
Email: cabo@tzi.org
Bormann Expires 28 January 2024 [Page 8]