RFC Beautification Working Group                               R. Gieben
Internet-Draft
Intended status: Informational                             July 05, 2013
Expires: January 06, 2014
Writing I-Ds and RFCs using Pandoc and a bit of XML
draft-gieben-pandoc2rfc-02
This document presents a technique for using a Markdown syntax variant, called Pandoc, and a bit of XML [RFC2629] as a source format for documents in the Internet-Drafts (I-Ds) and Request for Comments (RFC) series.
The goal of this technique is to let an author of an I-D focus on the main body of text without being distracted too much by XML tags. It does not, however, remove the need to typeset some files in XML.
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on January 06, 2014.
Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
This document presents a technique for using a Markdown [Markdown] syntax variant, called Pandoc [Pandoc], and a bit of XML [RFC2629] as a source format for documents in the Internet-Drafts (I-Ds) and Request for Comments (RFC) series.
The goal of this technique is to let an author of an I-D focus on the main body of text without being distracted too much by XML tags. It does not, however, remove the need to typeset some files in XML.
Pandoc is an almost plain text format and therefore particularly well suited for editing RFC-like documents. The syntax itself is a superset of the syntax championed by Markdown.
Pandoc's syntax is easy to learn and write, and it can be translated to numerous output formats, including, but not limited to: HTML, EPUB, (plain) Markdown and DocBook XML.
Pandoc2rfc allows authors to write in Pandoc syntax, which is then transformed to XML and given to xml2rfc. The conversions are, in a way, amusing: we start off with (almost) plain text, use elaborate XML and end up with plain text again.
    +-------------------+   pandoc   +---------+
    | ALMOST PLAIN TEXT |   ------>  | DOCBOOK |
    +-------------------+       1    +---------+
                  |                       |
    non-existent  |                     2 | xsltproc
      faster way  |                       |
                  v                       v
          +------------+    xml2rfc  +---------+
          | PLAIN TEXT |  <--------  |   XML   |
          +------------+      3      +---------+
Figure 1: Attempt to justify Pandoc2rfc.
The output of step 2 in Figure 1 is XML which is suitable for inclusion in either the middle or back section of an RFC.
Even though Pandoc2rfc abstracts away a lot of XML details, there are still places where XML files need to be edited, most notably in the front section of an RFC.
The simplest way to start using Pandoc2rfc is to create a template XML file and include the appropriate XML for the front, middle and back sections:
    <?xml version='1.0' ?>
    <!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [
    <!ENTITY pandocAbstract PUBLIC '' 'abstract.xml'>
    <!ENTITY pandocMiddle   PUBLIC '' 'middle.xml'>
    <!ENTITY pandocBack     PUBLIC '' 'back.xml'>
    <!ENTITY rfc.2629       PUBLIC '' 'reference.RFC.2629.xml'>
    ]>

    <rfc ipr='trust200902' docName='draft-gieben-pandoc2rfc-01'>
    <front>
        <title>Writing I-Ds and RFCs using Pandoc</title>
        <author>
            <organization/>
            <address><uri>http://www.example.com</uri></address>
        </author>
        <date/>
        <abstract>
            &pandocAbstract;
        </abstract>
    </front>
    <middle>
        &pandocMiddle;
    </middle>
    <back>
        <references title="Normative References">
            &rfc.2629;
        </references>
        &pandocBack;
    </back>
    </rfc>
Figure 2: A minimal template.xml.
In this case you will need to edit four documents: the Pandoc sources abstract.mkd, middle.mkd and back.mkd, and template.xml.
Up-to-date source code for Pandoc2rfc can be found at [Pandoc2rfc]. This includes the style sheet transform.xsl, which is used for the XML transformation (also see Section 3).
Pandoc2rfc needs xsltproc [XSLT] and pandoc [Pandoc] to be installed. The conversion to xml2rfc XML is done with a style sheet based on XSLT version 1.0 [W3C.REC-xslt-19991116].
When using the template from Figure 2, xml2rfc version 2 (or higher) must be used.
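Before attempting a build, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch and not part of Pandoc2rfc; the function name missing_tools is hypothetical, and the check only looks for the executables, not for a particular version of xml2rfc.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pandoc2rfc): print every named tool
# that cannot be found on PATH, one name per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call; prints nothing when all three tools are installed:
#   missing_tools pandoc xsltproc xml2rfc
```

An empty result means the three programs named in this section were found; whether the installed xml2rfc is version 2 or higher still has to be checked separately.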
Assuming the setup from Section 2, we can build an I-D as follows (in a Unix-like environment):
    for i in abstract middle back; do
        pandoc -st docbook $i.mkd | xsltproc --nonet transform.xsl - > $i.xml
    done

    xml2rfc template.xml -f draft.txt --text   # create text output
    xml2rfc template.xml -f draft.html --html  # or create HTML output
    xml2rfc template.xml -f draft.xml --exp    # or create XML output
Figure 3: Building an I-D.
Note that the output file names (abstract.xml, middle.xml and back.xml) must match the names used as the XML entities in template.xml (see the !ENTITY lines in Figure 2). The Pandoc2rfc source repository includes a shell script that incorporates the above transformations. Creating a draft.txt or a draft.xml can be done with pandoc2rfc *.mkd and pandoc2rfc -X *.mkd, respectively.
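The requirement that the generated file names match the entity declarations can be checked mechanically. The sketch below is an illustration only and is not taken from the Pandoc2rfc repository. It assumes one !ENTITY declaration per line, single-quoted file names, and entity names that start with "pandoc", as in Figure 2; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Pandoc2rfc).

# Print the file name of each <!ENTITY pandoc... '...'> declaration in the
# template given as $1. Assumes one declaration per line and single quotes.
pandoc_entity_files() {
    sed -n "s/.*<!ENTITY  *pandoc[^>]*'\([^']*\)'>.*/\1/p" "$1"
}

# Print those entity files that do not exist in the template's directory,
# i.e. files that still have to be generated by the pandoc/xsltproc step.
missing_entity_files() {
    dir=$(dirname "$1")
    pandoc_entity_files "$1" | while IFS= read -r f; do
        [ -e "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Example call, run after the conversion loop:
#   missing_entity_files template.xml
```

Entities for references (such as rfc.2629 in Figure 2) are deliberately skipped, since those files are not produced by the conversion loop.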
The full description of Pandoc's syntax can be found in [PandocGuide]. The following features of xml2rfc are supported by Pandoc2rfc (also see Table 1 in Appendix B for a "cheat sheet"):
    Term 1
    :   Definition 1
Figure 4: Pandoc syntax used for a hanging paragraph.
With Pandoc2rfc an author of an I-D can get a long way without needing to input XML, but it is not a 100% solution. The initial setup and the reference library still force the author to edit XML files. The metadata feature (Pandoc's "Title Block" extension) is not used in Pandoc2rfc. This information (authors, date, keywords and URLs) should be put in template.xml.
Some other quirks:
The following sections detail how to use the Pandoc syntax for figures, tables and references to get the desired output.
Indent the paragraph with 4 spaces as mandated by Pandoc. If you add an inline footnote directly after the figure, the artwork gets a title attribute with the text of that footnote (and a possible anchor).
A table can be entered by using Pandoc's table syntax. You can choose from multiple input styles, but they are all converted to the same plain <texttable> style in xml2rfc. If you add an inline footnote directly after the table, it will get a title attribute with the text of that footnote (and a possible anchor). Pandoc's built-in syntax for creating a caption with Table: should not be used.
Pandoc provides a syntax that can be used for references. Its syntax is repeated in this paragraph. Any reference like: [Click here](URI), is an external reference. An internal (i.e. see Section X) reference is typeset with: [](#localid).
To reference RFCs (and other citations), you need to add the reference source to the template as an external XML entity; Figure 2 provides an example. After that you can use an internal reference, [](#RFC2629), to reference RFC 2629.
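Because every citation needs a matching entity in the template, it can be useful to list the internal reference targets used in a Pandoc file. The following sketch is an illustration, not part of Pandoc2rfc; the function name internal_refs is hypothetical, and it relies on grep -o, which is widely available but not required by POSIX.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pandoc2rfc): print the target id of
# every internal reference "[](#id)" in the file given as $1, one per line.
# External references such as "[Click here](URI)" have link text between
# the square brackets and are therefore not matched.
internal_refs() {
    grep -o '[[][]](#[^)]*)' "$1" | sed -e 's/^[[][]](#//' -e 's/)$//'
}

# Example call:
#   internal_refs middle.mkd | sort -u
```

The output mixes section, figure, table and citation targets; deciding which of them need an entity in template.xml is left to the author.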
There is no direct support for referencing tables, figures and artworks, but pandoc2rfc employs the following "hack". If an inline footnote is added after the figure or table, the text of the footnote is used as the title. The first word up until a double colon :: will be used as the anchor. If a figure has an anchor it will be centered on the page.
Figure 2, for instance, is followed by this inline footnote:
^[fig:minimal::A minimal template.xml.]
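The anchor and title convention can be illustrated with a few lines of shell. This is only a sketch of the rule described above, not the actual implementation (which lives in transform.xsl); the function names are hypothetical, and the behaviour for a footnote without a double colon (no anchor, whole text as title) is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the "anchor::title" footnote convention.

# Print the anchor: the text before the first "::", or an empty line
# when the footnote text contains no "::".
footnote_anchor() {
    case $1 in
        *::*) printf '%s\n' "${1%%::*}" ;;
        *)    printf '\n' ;;
    esac
}

# Print the title: the text after the first "::", or the whole text
# when the footnote text contains no "::" (an assumption).
footnote_title() {
    case $1 in
        *::*) printf '%s\n' "${1#*::}" ;;
        *)    printf '%s\n' "$1" ;;
    esac
}

# Example calls:
#   footnote_anchor 'fig:minimal::A minimal template.xml.'
#   footnote_title  'fig:minimal::A minimal template.xml.'
```

For the footnote shown above, the first call yields the anchor fig:minimal and the second the caption text of Figure 2. A single colon inside the anchor is left untouched; only the double colon separates anchor and title.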
The following people have helped shape Pandoc2rfc: Benno Overeinder, Erlend Hamnaberg, Matthijs Mekking and Trygve Laugstøl.
This document raises no security issues.
This document has no actions for IANA.
[Markdown]  Gruber, J., "Markdown", 2004.
[XSLT]  Veillard, D., "The XSLT C library for GNOME", 2006.
[Pandoc]  MacFarlane, J., "Pandoc, a universal document converter", 2006.
[PandocGuide]  MacFarlane, J., "Pandoc User's Guide", 2006.
[Pandoc2rfc]  Gieben, R., "Pandoc2rfc git repository", October 2012.
[RFC2629]  Rose, M.T., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999.
[W3C.REC-xslt-19991116]  Clark, J., "XSL Transformations (XSLT) Version 1.0", World Wide Web Consortium Recommendation REC-xslt-19991116, November 1999.
[This section should be removed by the RFC editor before publishing]
Textual construct | Pandoc syntax | Text output |
---|---|---|
Section Header | # Section | 1. Section |
Unordered List | * item | o item |
Unordered List | #. item | item |
Ordered List | 1. item | 1. item |
Ordered List | a. item | a. item |
Ordered List | ii. item | i. item |
Ordered List | II. item | (1) item |
Ordered List | A. item | A. item |
Emphasis | _text_ | text |
Strong Emphasis | **text** | text |
Verbatim | `text` | "text" |
Block Quote | > quote | quote |
External Reference | [Click](URI) | Click [1] |
Internal Reference | [](#id) | Section 1 |
Figure Anchor | ^[fid::text] | N/A |
Figure Reference | [](#fid) | Figure 1 |
Table Anchor | ^[tid::text] | N/A |
Table Reference | [](#tid) | Table 1 |
Citations | [](#RFC2119) | [RFC2119] |
Table | Tables | |
Figures | Code Blocks | |
Definition List | Definition |