Internet DRAFT - draft-cordell-lumas
Internet Engineering Task Force P. Cordell
Internet Draft Tech-Know-Ware Ltd
draft-cordell-lumas-05.txt
February 1, 2007
Expires: August 1, 2007
Lumas -
Language for Universal Message
Abstraction and Specification
STATUS OF THIS MEMO
By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on August 1, 2007.
Copyright Notice
Copyright (C) The IETF Trust (2007).
Abstract
A number of methods and tools are available for defining the format
of messages used for application protocols. However, many of these
methods and tools have been designed for purposes other than message
definition, and have been adopted on the basis that they are
Cordell Expires August 1, 2007 [Page 1]
Internet Draft Lumas February 2007
available rather than being ideally suited to the task. This often
means that the methods make it difficult to get definitions correct,
or result in unnecessary complexity and verbosity both in the
definition and on the wire.
Lumas - Language for Universal Message Abstraction and Specification
- has been custom designed for the purpose of message definition. It
is thus easy to specify messages in a compact, extensible format that
is readily machine manipulated to produce a compact encoding on the
wire.
Table of Contents
1. Introduction
2. About Lumas
3. Lumas and Other Message Definition Languages
4. Terminology
5. Example Lumas Message Definition and Message Encoding
5.1 Principles of the Message Definition
5.2 An Example Message Definition
6. Formal Message Definition Syntax
6.1 Lumas Keywords
6.2 Lumas Parameters
6.3 Simple Parameters
6.4 The Simple Types
6.5 Simple Type Definition
6.6 The Pattern Constraint
6.7 The Name
6.8 Cardinality
6.9 Tagging
6.10 The Plugin Extension Mechanism
6.11 Reference Parameters
6.12 Compound Parameters
6.13 Struct Parameters
6.14 Union Parameters
6.15 Combined Parameters
6.16 Referenced Parameters
6.17 External Extensions - Plug and Pluggable
6.18 Module Definition and Directives
6.19 The Top Level Definition
6.20 Locating Lumas within a Specification
7. On-the-Wire Representation
7.1 Principles of the default On-the-Wire Encoding
7.2 Formal On-the-Wire Representation
7.3 Marking Message Boundaries
7.4 Examples of Encoded Types
8. Common ABNF Definitions
9. Notes on Comments
10. Locating Lumas Modules
11. Mandatory to Understand
12. Security Considerations
13. Normative References
14. Informative References
15. Author's Address
1. Introduction
Lumas is a lightweight message definition language that is both
flexible and highly extensible. This document defines the Lumas
message definition language, and the default text encoding method for
messages defined in this way.
2. About Lumas
Lumas - Language for Universal Message Abstraction and Specification
- is a simple message definition language that can be used to define
the messages used by protocols. In this context, a message is
defined as a collection of data used to convey information between
two or more machines (or processes). Typically Lumas is used to
define application layer messages (e.g. at the layer at which the
likes of SMTP [SMTP] is defined), but there is no practical reason
why Lumas should not be used at other layers.
The design objectives of Lumas are simplicity, ease of use,
efficiency, and extensibility.
Lumas provides a high-level method for defining messages and a
default set of encoding rules for character based protocols. The
encoding rules describe how instances of messages that conform to the
defined high-level definition are represented on the wire. It is
also possible to define alternative encoding rules that could be used
to define representations of messages in binary form, or other
character based forms; e.g. XML [XML] or JSON [JSON]. In general
Lumas is not able to describe messages with arbitrary sequences of
characters and bytes, any more than a C compiler is able to specify
arbitrary sequences of assembler instructions.
Lumas recognises that message definition is a small part of the
overall development process and thus should not warrant a
disproportionately large investment in learning the language. Lumas
uses the 80/20 principle to keep it simple. Lumas is designed to
readily allow the use of Lumas aware software tools to aid in the
development process. Lumas messages are text-encoded by default so
that they are easy to read, and it is easy to create test messages
for debugging. Using Lumas in applications is designed to be simple
and efficient. Lumas addresses a number of different types of
extensibility, including versioning, external extensions, and
component based architectures.
This makes Lumas an ideal definition language to use where
simplicity, efficiency, compactness and/or a high degree of
extensibility is required, especially where the extensibility
involves plugging external modules into the base syntax.
3. Lumas and Other Message Definition Languages
Over the years a number of message definition methods have been
developed. These include XDR [XDR], ASN.1 [ASN1], various flavours
of IDL (such as OMG IDL [OMGIDL]), 'bit pictures,' various flavours
of BNF (e.g. ABNF [ABNF]), and XML [XML]. It is therefore worthwhile
considering how Lumas relates to these other message definition
languages.
Lumas differs from XDR in that Lumas is primarily a language for
defining text-encoded messages. XDR is fixed to defining binary
messages of very specific types.
ASN.1 is also primarily a language for defining binary messages,
although recently there have been XML encoding rules defined. ASN.1
information object classes are difficult to understand and a
deterrent to its use. The complexity of some of the encoding rules,
such as BER and PER, make the method difficult to use without using
special tools. ASN.1 has found uses in the IETF, notably in the
areas of cryptography (CMS [CMS] etc) and SNMP [SNMP]. However, it
is not much loved, and efforts such as SMING have been undertaken to
replace its usage (although at the time of writing this effort seems
to have stalled).
The IDL languages such as OMG IDL have similarities with message
definition languages, but are subtly different. IDLs define a
collection of objects, each of which describes a remote procedure
call. They also define a return value for the procedure call. A
protocol message set is typically a single object that can have a
number of variants. A protocol will typically send another message
in response to a message rather than sending a return value.
Perhaps for the reasons mentioned, the above methods have not
received wide usage within the IETF. The main workhorses for message
definition in the IETF have been 'bit pictures,' various types of BNF
and more recently XML.
The term 'bit pictures' is used to refer to the pictures of bits and
bytes that are used to capture the layout of parameters within a
message, such as used to define IP [IP], UDP [UDP] and TCP [TCP].
This is very low-level and really only suitable for protocols
containing a few parameters which ideally have fixed positions.
At a level higher than pure 'bit pictures' is the scheme used in TLS
[TLS], but this again is specific to defining binary messages.
Diameter [DIAMETER] presents another variation on this approach.
A number of types of BNF have been defined over the years, most
recently ABNF. Until recently, the BNFs have been the main workhorse
of IETF application level protocol definition. ABNF is very
low-level, and is much like programming in assembler when high-level
languages would be more useful. It is very difficult to get
definitions correct, and issues such as ensuring extensibility have
to be addressed not only for each message definition, but also for
each parameter within the definition. The implementation route from
ABNF can also be long as there is typically not enough high level
information in the specification for tools to extract the important
elements.
This leaves XML. XML is a comprehensive and powerful way of defining
messages. It would be a long and unproductive exercise to list all
the things that XML gets right. Instead, the focus here is on the
areas that a developer may wish to consider when choosing between
Lumas and XML.
The main differences between Lumas and XML are in the areas of
simplicity and efficiency. Whether these differences are significant
will depend on the application.
There are two parts of the XML route: XML itself, and the method used
to define the XML messages.
Some of the less significant issues to consider are to do with XML
itself. For example, it has long been recognised that the format of
XML messages, with its start and end tags, is inefficient. (It is
the author's belief that the extra tagging also makes the messages
harder to read, because the message is dominated by tags rather than
the important part, which is the values. Hence, what works well when
there is a high ratio of PCDATA to tags, is detrimental when that
ratio is significantly reduced.) The separation of parameters into
attributes and elements adds complexity, but adds no real value in a
protocol, and is an artefact of markup use. The provision for
multiple character encodings (such as UTF-8, UTF-16BE, UTF-16LE,
ISO-8859-1 etc) places demands on a parser as does the implementation
of namespaces (where in a start tag the namespace is defined after
the first use of the namespace), which requires double parsing or
significant intermediate storage. The task of converting a namespace
prefix to a namespace is potentially an area involving significant
lookup effort. Once expanded, the effective tag is a long sequence
of characters on which comparison operations are performed, the size
of which potentially reduces efficiency. User definable general
entities and parameter entities are additional burdens that have
little value for message definition, as is the white space handling
which is a hangover from XML as a markup language. While these are
surmountable problems, the consideration for a developer has to be
'why pay for it if I don't need it?'
The second issue is how to define the XML messages. Arguably the
current favourite is W3C XML Schema, although there are other methods
including RELAX NG [RELAX] and Schematron [STRON]. First of all, it
has to be admitted that this is currently a controversial area and
the existence of the latter two is largely due to concerns about the
former. The main concern with XML Schema is again complexity. Maybe
in the future one of the other methods will prevail.
Keeping with XML Schema for now, firstly the language can be very
difficult to learn. The specification is some 350 pages long
(ignoring XML itself, and XML namespaces etc), and uses a formal
language that is very confusing to interpret. In a number of areas
there is even debate among the experts about what is intended. The
constructs can be confusing and apparently contradictory in a number
of areas, such as the notion of complexType with simpleContent and so
on. While XML Schema is touted as being extensible, in practice for
the unwary, there are a number of traps to fall into. For example,
incorporated attribute and element groups, especially those from
different schemas can easily result in name clashes when they are
extended independently. Enumerated strings can not be extended
without careful consideration. Indeed, the Unique Particle
Attribution Constraint makes defining an extensible schema messy and
not something that happens by accident [XMLVER]. There is no support
for capturing what has changed from one version of a schema to the
next, other than doing a diff operation on two files. This again
makes it difficult for tools. Other features also make it difficult
for tools, such as the ability to use patterns to restrict the format
of basic types such as floating point numbers. XML Schema has no
concise way of specifying short tag names while at the same time
specifying descriptive formal names. For example, the most common
XML like syntax, HTML [HTML], has an abundance of short tags such as
<a>, <p>, <b> etc. This makes it easy for the expert to type, and it
must be assumed that the approach has some merit otherwise it
wouldn't have been done that way. But XML Schema does not readily
support this. Verbosity is even more of an issue when it comes to
XML Schema, in a number of cases requiring five or more lines of text
when only one would do. This means extra scrolling or page turning
when editing and viewing, which makes a schema harder to write,
harder to check, and harder for a third-party to understand.
Many of these problems are subjective. Some can also be avoided by
defining style guides and best practices for using XML Schema (for
example [XMLBCP]). Compression can be used to reduce the size of
messages. However, this really just addresses the complexity by
adding more complexity. Not only does this make it harder to learn,
it is important to remember that where there is complexity, there is
the potential for bugs. And bugs not only affect the integrity of
the code, but can affect the security of the system on which the code
runs also. Complexity is also a barrier to implementation. It could
be argued that the Internet has been successful because of its use of
simple protocols. Using XML Schema would seem to be at odds with
that principle.
By being designed to be simple, Lumas avoids these problems.
In summary, currently the main tools used for message definition in
the IETF are ABNF and XML Schema. In many respects these represent
two extremes, one simple and very low-level, and the other complex
and high-level. Lumas is a data point between these two extremes,
giving much of the flexibility of XML with the ease of understanding
and compactness of ABNF. As such it is a useful extra tool that
allows protocol developers to better tailor protocols to their needs.
On another level, although message definition languages have been
around for many years now, the relative paucity of options available,
and the fact that XML is being trumpeted as a breakthrough in
inter-platform communication suggests that in terms of evolution, the
field is in its infancy. It's easy to see why this might be.
Message definition has not been seen as a core activity, and
developers simply make-do by borrowing what is already available in
other fields, even if they are not an ideal fit to their
requirements. This would suggest that there is scope for much
development, and it may transpire that XML turns out to be the
FORTRAN or COBOL of the message definition world, and there is much
more exciting stuff to come. It is hoped that Lumas can play a part
in that story.
4. Terminology
The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
in this document are to be interpreted as described in [KWORDS].
For the purposes of this document, a "tag" is a fixed sequence of
characters used on the wire as an identifier for the value or values
that it is associated with. Thus identified, the value can be
interpreted and processed in the right way.
5. Example Lumas Message Definition and Message Encoding
As the Lumas message definition syntax is C-like it is felt that many
will immediately understand the majority of a message definition.
For this reason the basic principles of the definition language and a
short example are presented before describing the format in detail.
5.1 Principles of the Message Definition
Following the C language format, the basic format of a parameter
definition is:
type name ;
'Type' specifies things like integers, booleans, ASCII strings,
Unicode strings and so on.
The 'name' is the name of the parameter.
Thus a parameter definition might be:
ascii rfc-name ;
This says that 'rfc-name' is an ASCII string. In addition, a
parameter definition can express constraints on the type, constraints
on the cardinality (how many instances of the type are valid in a
message), and the tag to be used for the value on the wire. (A tag
is a fixed sequence of characters that is used to identify the value
or values that it is associated with.) For example, an integer may
be limited to the values 0 to 255, and an ASCII string may be limited
to a maximum size. The fuller format of a parameter definition has
the form:
type <constraint> name [cardinality] tagging ;
For example:
int <1..30000> referenced-rfcs [0..255] as refers ;
This defines an integer that can have values between 1 and 30000.
The name of the parameter is 'referenced-rfcs', but is tagged
on-the-wire using the character sequence 'refers'. The parameter can
consist of between 0 and 255 instances of the integer in a valid
message.
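As a rough illustration of what this definition constrains, the following Python sketch checks a list of values against the value range and cardinality given above. The function name and the representation of a parameter as a plain list are assumptions made for illustration; they are not part of Lumas.

```python
def check_param(values, min_value, max_value, min_count, max_count):
    """Return True if `values` satisfies the value and cardinality limits."""
    # Cardinality: how many instances of the parameter appear in the message.
    if not (min_count <= len(values) <= max_count):
        return False
    # Constraint: every instance must lie within the permitted range.
    return all(min_value <= v <= max_value for v in values)


# Limits taken from: int <1..30000> referenced-rfcs [0..255] as refers ;
print(check_param([2119, 2234], 1, 30000, 0, 255))  # two in-range instances
print(check_param([], 1, 30000, 0, 255))            # zero instances is permitted
print(check_param([0], 1, 30000, 0, 255))           # 0 is below the minimum
```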
Two main types of compound parameter are possible, these being
'struct' and 'union'. Having much the same meaning as they have in
C, a struct specifies a group of parameters, all of which may be used
in a particular instance of the struct. A union similarly specifies
a group of parameters, but in this case only one of the parameters
can be used in any one instance of the union.
An example of a struct is:
struct rfc-info
{
    ascii rfc-name;
    int <1..30000> referenced-rfcs[0..255] as refers ;
};
A third form of compound type called 'combi' is also available. The
name is short for 'combined' and the type allows a number of values
to be concatenated together into what looks like a single value.
Hence it can be used to define constructs like the character sequence
'HTTP/1.0', and to specify that the '1' and the '0' are the major and
minor version numbers.
5.2 An Example Message Definition and How it is Encoded
The following is an example message definition that is intended to
represent a very crude meeting controller:
lumas module com.tech-know-ware.my-example;
/*
   An example Lumas definition
*/
import com.tech-know-ware.general as tkwg;
struct my-example
{
    int <0..255> participant-id as ?;
    Action action as ?;
    struct my-addition[0..1]
            as new.tech-know-ware.com plugin
    {
        bool tkw-app-capable as ?;
    };
};
union Action
{
    Join join;
    Message message as msg;
    void leave;
};
struct Join
{
    unicode<0..63> name;
};
struct Message
{
    int <0..255> to-participants[1..127] as to;
    unicode<1..255> message as msg;
    [ // Version 2 additions
        tkwg::Priority priority;
    ]
    [ // Version 5 additions
        ascii<0..16> font-name[0..1] as font;
        void bold[0..1];
        void italic[0..1];
        void underlined[0..1] as ul;
    ]
};
The first construct (in this case the struct my-example) is the root
of all messages for the protocol. Each message identifies a
participant using an integer in the range 0 to 255, called
'participant-id'. When encoded on the wire, this parameter will be
untagged due to the 'as ?' specification.
On-the-wire, the default encoding generally encodes parameters in the
form:
tag = value
where 'value' is a textual representation of the parameter's value.
However, if a parameter is marked as untagged, then it is represented
simply as:
value
Hence, if in a message an instance of participant-id is to have a
value of, say, 12, then, due to being marked as untagged, it is
encoded simply as:
12
Rather than the following, which would be the case if it was not
marked as untagged:
participant-id = 12
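The choice between the tagged and untagged forms can be sketched in Python as follows. The function and argument names are hypothetical, and the sketch covers only a single simple value, not the full encoding rules of Section 7.

```python
def encode_param(name, value, explicit_tag=None, untagged=False):
    """Encode one simple parameter in the default text form."""
    if untagged:
        # Corresponds to 'as ?' in the definition: the value appears alone.
        return str(value)
    # Without an explicit 'as <tag>', the parameter's name is used as the tag.
    tag = explicit_tag if explicit_tag is not None else name
    return f"{tag} = {value}"


print(encode_param("participant-id", 12, untagged=True))  # 12
print(encode_param("participant-id", 12))                 # participant-id = 12
```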
In this example, each message then has an action, which is also
untagged. The type of the action parameter is not immediately
specified, and instead references the 'Action' definition.
The Action definition is a union in which only one of the specified
parameters may appear in an instance of the Action construct. This
effectively represents a fork in the semantics of any given message.
In this case the options within Action can indicate that somebody has
joined the meeting, left the meeting, or is sending a message to
other participants.
There is no explicit tag for the 'join' and 'leave' options, so these
will be tagged on-the-wire by the parameters' names, 'join' and
'leave' respectively. Conversely, an explicit tag for the 'message'
parameter is specified, and hence the message option will be tagged
by 'msg' on-the-wire.
The join parameter also has a referenced definition; the struct named
Join. For the purposes of this example, when a person joins a
meeting, all the other participants are informed of their name. The
name member in the struct is a UTF-8 encoded Unicode string that has
a minimum length of 0 characters and a maximum length of 63
characters. Hence an example of the join parameter encoded on the
wire is:
join = { name = "Alice" }
Here, the braces delimit the extent of the members in the struct and
the double quotes delimit the characters representing the name.
The message option is also a referenced definition. Conceptually, to
send a message, the 'participant-id' is used to identify the sender,
and the 'to-participants' field contains the participant ids of all
the people to whom the message is being sent. On-the-wire, the
to-participants parameter will be tagged with 'to'. Between 1 and
127 (inclusive) instances of the to-participants parameter may appear
in a message. For efficiency, Lumas allows multiple occurrences of
the same parameter to be represented as a comma separated list.
Hence an example of the on-the-wire encoding of the to-participants
parameter would be:
to = 2, 5, 8, 58
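A minimal Python sketch of this list form is shown below. The helper name is an assumption made for illustration, and only integer values are handled.

```python
def encode_list(tag, values):
    """Encode repeated instances of one parameter as a comma separated list."""
    return f"{tag} = " + ", ".join(str(v) for v in values)


print(encode_list("to", [2, 5, 8, 58]))  # to = 2, 5, 8, 58
```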
Also, the message itself is included. The message will consist of
Unicode characters and can be between 1 and 255 Unicode characters
long. On-the-wire, the message parameter will have the tag 'msg'.
An example of the on-the-wire format is thus:
msg = "Where are we going for dinner"
The priority field within the message struct has been added in a
later version of the protocol. This is indicated by the square
brackets in which the parameter is wrapped. Similarly, font-name,
and the associated parameters have, according to the comment, been
added in version 5 of the protocol. The type of the 'priority'
parameter is defined in an external module that has the alias 'tkwg'.
The 'import' directive at the beginning of the example indicates that
the 'tkwg' alias corresponds to the module
'com.tech-know-ware.general', and it is in this module that the
definition of 'Priority' is located. The definition indicates that
'font-name' is an ASCII string. The reader should already understand
enough of the definition language to understand the meaning of the
other fields.
Returning to the 'my-example' root, a third-party has added an
extension to the protocol in the form of the 'my-addition' parameter.
It is identified as not being part of the base specification by the
keyword 'plugin'. On-the-wire, the additional parameter will be
identified by the tag 'new.tech-know-ware.com' to differentiate it
from additions that may be made by other third parties.
In summary, the following are complete examples of the default
on-the-wire representation of the example message definition:
12
join = { name = "Alice" }
new.tech-know-ware.com = { True }
and:
12
msg = { to = 2, 5, 8, 58
        msg = "Where are we going for dinner"
        font = 'Arial' }
and:
12
leave
Note that the placing of each parameter on a separate line is not
significant. Lumas is free form with respect to white space. Hence,
the first message above could equally be represented as:
12 join={name="Alice"} new.tech-know-ware.com={True}
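The following Python sketch illustrates why the two layouts are equivalent: once white space between tokens is discarded, both yield the same token sequence. The token pattern is a simplification assumed for this example (double-quoted strings, the punctuation characters '{', '}', '=' and ',', and bare words); it is not the formal on-the-wire grammar.

```python
import re

# Simplified token pattern (an assumption for this example): a double-quoted
# string, one punctuation character, or a run of other non-space characters.
TOKEN = re.compile(r'"[^"]*"|[{}=,]|[^\s{}=,"]+')


def tokens(text):
    """Split an encoded message into tokens, ignoring white space between them."""
    return TOKEN.findall(text)


multi_line = '12\njoin = { name = "Alice" }\nnew.tech-know-ware.com = { True }'
single_line = '12 join={name="Alice"} new.tech-know-ware.com={True}'

print(tokens(single_line))
print(tokens(multi_line) == tokens(single_line))
```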
6. Formal Message Definition Syntax
The sections below describe the Lumas message definition syntax. The
'top-level' production is 'lumas-definition', which is defined in
6.19, "The Top Level Definition". The following sections define the
components of the message definition language building up to the
top-level production.
The Lumas syntax is defined using ABNF [ABNF].
6.1 Lumas Keywords
Lumas keywords are case-sensitive. Therefore "AS" can not be used in
place of "as". As ABNF literal strings are case-insensitive, this
section defines the Lumas keywords in a case-sensitive way.
as-kw = %x61.73 ; as in lowercase
ascii-kw = %x61.73.63.69.69 ; ascii in lowercase
b = %x62
bool-kw = %x62.6F.6F.6C ; bool in lowercase
bytes-kw = %x62.79.74.65.73 ; bytes in lowercase
combi-kw = %x63.6F.6D.62.69 ; combi in lowercase
const-kw = %x63.6F.6E.73.74 ; const in lowercase
d-upper = %x44 ; Uppercase D
d = %x64
date-kw = %x64.61.74.65 ; date in lowercase
double-kw = %x64.6F.75.62.6C.65 ; double in lowercase
embedded-kw = %x65.6D.62.65.64.64.65.64 ; embedded in lowercase
endmodule-kw = %x65.6E.64.6D.6F.64.75.6C.65
; endmodule in lowercase
extends-kw = %x65.78.74.65.6E.64.73 ; extends in lowercase
f = %x66
float-kw = %x66.6C.6F.61.74 ; float in lowercase
import-kw = %x69.6D.70.6F.72.74 ; import in lowercase
int-kw = %x69.6E.74 ; int in lowercase
into-kw = %x69.6E.74.6F ; into in lowercase
ipv4-kw = %x69.70.76.34 ; ipv4 in lowercase
ipv6-kw = %x69.70.76.36 ; ipv6 in lowercase
lumas-kw = %x6C.75.6D.61.73 ; lumas in lowercase
module-kw = %x6D.6F.64.75.6C.65 ; module in lowercase
n = %x6E
oid-kw = %x6F.69.64 ; oid in lowercase
plug-kw = %x70.6C.75.67 ; plug in lowercase
pluggable-kw = %x70.6C.75.67.67.61.62.6C.65
; pluggable in lowercase
plugin-kw = %x70.6C.75.67.69.6E ; plugin in lowercase
r = %x72
s-upper = %x53 ; Uppercase S
s = %x73
single-kw = %x73.69.6E.67.6C.65 ; single in lowercase
struct-kw = %x73.74.72.75.63.74 ; struct in lowercase
t = %x74
time-kw = %x74.69.6D.65 ; time in lowercase
unicode-kw = %x75.6E.69.63.6F.64.65 ; unicode in lowercase
union-kw = %x75.6E.69.6F.6E ; union in lowercase
unquoted-ascii-kw = %x75.6E.71.75.6F.74.65.64.2D.61.73.63.69.69
; unquoted-ascii in lowercase
void-kw = %x76.6F.69.64 ; void in lowercase
w = %x77
w-upper = %x57 ; Uppercase W
x = %x78
z = %x7A
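The hexadecimal terminals above can be decoded mechanically. The Python sketch below (the function name is hypothetical) turns a '%x' terminal into the string it denotes and shows that a comparison against it is case-sensitive.

```python
def decode_abnf_hex(terminal):
    """Decode an ABNF terminal such as '%x61.73' into the string it denotes."""
    if not terminal.startswith("%x"):
        raise ValueError("expected a %x terminal")
    # Each dot-separated field is the hexadecimal code of one character.
    return "".join(chr(int(field, 16)) for field in terminal[2:].split("."))


AS_KW = decode_abnf_hex("%x61.73")
print(AS_KW)            # as
print("AS" == AS_KW)    # False: the keyword matches in lower case only
print(decode_abnf_hex("%x75.6E.71.75.6F.74.65.64.2D.61.73.63.69.69"))
```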
6.2 Lumas Parameters
The main building block of a Lumas message definition is the
parameter. There are three classes of parameter in Lumas: simple
parameters, compound parameters and reference parameters, which are
defined as:
lumas-parameter = simple-param / compound-param /
                  reference-param
A simple parameter typically describes a simple value such as a
string, integer or date. They may represent a name, a temperature or
a birthday.
Compound parameters are collections of simple parameters and other
compound parameters, similar to how Java and C++ classes group
together simple variables and other classes.
Reference parameters allow a parameter to be defined in terms of a
type (either simple, compound or reference) that is defined elsewhere
in the message definition.
6.3 Simple Parameters
The ABNF definition of a simple parameter is:
simple-param = simple-type WS name [ OWS cardinality ]
               [ WS as-kw WS explicit-tag ]
               [ WS plugin-kw ] OWS ";" OWS
where 'WS' represents white space, and 'OWS' represents optional
white space. ('WS' and 'OWS' are defined in Section 8 - 'Common ABNF
Definitions'. Generally, comments can be included wherever white
space is allowed.)
As can be seen, the main parts of the definition of a simple
parameter are the simple type and the name. Additional specification
allows further control of the message contents. These fields are
discussed below.
6.4 The Simple Types
Simple parameters have simple types such as integers, booleans etc.
Each of Lumas' simple types is listed and described below. How
these simple types are specified in a message definition is described
in the following section.
The Lumas simple types are:
void
A parameter that has no value. This is most useful in unions
(wherein it converts a union into an enumerated type), and can
also be used in a struct to represent boolean events wherein
the absence of the parameter indicates false, and the presence
of the parameter indicates true. It is more useful than you
might at first think!
bool
A Boolean value. Can be true or false.
int
An integer value.
float
A floating point value. The constraints of a float specify the
float to be either in accordance with a single precision value
or a double precision value as specified in IEEE 754 [IEEE754].
The absence of a constraint indicates a single precision value.
ipv4
Represents an IPv4 address, but not the port.
ipv6
Represents an IPv6 address, but not the port.
date
Date according to the Gregorian calendar, with year, month and
day of month. Other calendar types may be constructed from
primitive types if required.
time
Represents the time in hours, minutes and seconds using the 24
hour clock notation. By default the time MUST be adjusted to
UTC, unless the time can be guaranteed to have only local
significance.
oid
This is an ASN.1 style Object Identifier. This is primarily
included to enable identification of security protocols.
ascii
A string made up of ASCII characters, limited to the values 0
to 127.
unquoted-ascii
An ascii string usually has quote marks around it. This type
does not have quotes around it. Consequently it can not have
any white space, or include any special characters (such as
"=", ")", and "}") that would confuse the parser.
unicode
A string representing Unicode characters.
const
This type allows a constant value to be inserted into the
encoded message. It will typically be untagged. One thing it
might be used for is identifying the protocol of the message
definition. For example:
const <HTTP> protocol as ?;
bytes
An array of bytes. Also useful for carriage of opaque data.
embedded
The value is an embedded Lumas message. This allows layering
of message definitions.
6.5 Simple Type Definition
Lumas simple types are specified in a Lumas message as described in
this section. The 'simple-type' construct represents the type of the
parameter. It has the following form:
simple-type = void-kw / bool-kw / integer-type / float-type /
              ipv4-kw / ipv6-kw / date-kw / time-kw / oid-kw /
              string-type / const-type / bytes-type /
              embedded-type
As can be seen, many of the types are specified using a single
keyword. Other types such as integers and strings allow the
specification of additional constraints (such as the maximum value
that an integer is allowed to have). The definitions of these types
are as follows:
integer-type = int-kw OWS "<" OWS int-constraint OWS ">"
float-type = float-kw OWS [ "<" OWS float-constraint OWS ">" ]
string-type = ( ascii-kw / unquoted-ascii-kw / unicode-kw )
              [ OWS "<" OWS string-constraint OWS ">" ]
const-type = const-kw OWS "<" first-safe-char *( safe-char ) ">"
             ; See the section 'Notes on Comments' below
bytes-type = bytes-kw [ OWS "<" OWS length-constraint OWS ">" ]
embedded-type = embedded-kw [OWS "<" OWS embed-constraint OWS ">"]
The constraints for the numerical types are specified as follows:
int-constraint = min-int-constraint OWS ".." OWS max-int-constraint
[ OWS use-leading-zero-marker ]
min-int-constraint = ["-"] pos-number
max-int-constraint = ["-"] pos-number
use-leading-zero-marker = z ; lower case z
float-constraint = single-kw / double-kw
The constraints for the string, const, bytes and embedded types are
as follows:
string-constraint = [ length-constraint ] [ OWS pattern-constraint ]
embed-constraint = [ length-constraint ]
[ OWS embedded-module-constraint ]
embedded-module-constraint = "(" OWS module-name OWS ")"
length-constraint =
[ min-len-constraint OWS ".." OWS ] max-len-constraint
min-len-constraint = pos-number
max-len-constraint = pos-number / unlimited-length-token
unlimited-length-token = "*"
These constraints use the following definition:
pos-number = 1*DIGIT ; Decimal number
/ "0"x 1*HEXDIG ; Hex number
/ 1*DIGIT b ; Specifies number of binary bits
In the case of 'integer-type', the mandatory constraint specifies the
minimum and maximum permissible values that the integer can take. If
the 'use-leading-zero-marker' character ('z') is included in the
constraint, then where necessary the integer MUST be represented on
the wire with leading zeros to make the value fixed width. (This is
primarily applicable to combined types.)
The 'pos-number' construct used to specify the integer value
constraint has a form that can specify the number of binary bits.
The number of bits specified does not include any sign bits. Hence
an unsigned 32 bit number can be represented as 0..32b, whereas a
signed 32 bit number can be represented as -31b..31b (although this
will actually exclude the most negative value of a signed 32 bit
number).
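The bit-count form of 'pos-number' can be illustrated with a short
Python sketch. It is not part of Lumas; the function names
pos_number and int_bounds are hypothetical, and the sketch assumes
that "Nb" denotes the largest value representable in N binary bits,
which is consistent with the 0..32b and -31b..31b examples above.

```python
def pos_number(text):
    """Return the value denoted by a 'pos-number' token (illustrative only)."""
    lowered = text.lower()
    if lowered.startswith("0x"):        # hex form, checked first so "0x1b" is hex
        return int(lowered[2:], 16)
    if lowered.endswith("b"):           # "32b" means: fits in 32 binary bits
        return (1 << int(lowered[:-1])) - 1
    return int(lowered)                 # plain decimal


def int_bounds(constraint):
    """Split 'min..max[z]' into (minimum, maximum, use_leading_zeros)."""
    body = constraint.strip()
    leading_zeros = body.endswith("z")
    if leading_zeros:
        body = body[:-1]
    lo_text, hi_text = (part.strip() for part in body.split(".."))

    def signed(token):
        return -pos_number(token[1:]) if token.startswith("-") else pos_number(token)

    return signed(lo_text), signed(hi_text), leading_zeros
```

Under these assumptions int_bounds("-31b..31b") gives a symmetric
range, which is why the most negative 32 bit value is excluded.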
A float is either a single precision IEEE 754 number or a double
precision IEEE 754 number [IEEE754]. The absence of a constraint
indicates single precision. (Developers are advised that in a number
of cases a binary IEEE 754 number can not be exactly represented in a
text-based base 10 format. Hence the decoder's binary representation
of a floating-point number may differ from the encoder's binary
representation of the number. If such discrepancies are not
acceptable, developers should use an alternative representation for
floating-point numbers.)
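The effect described above can be demonstrated with a small Python
sketch. The helper name round_trips is hypothetical, and the choice
of significant digits is an illustration only; Lumas itself does not
prescribe how many digits an encoder emits.

```python
def round_trips(value, significant_digits):
    """True if formatting 'value' as decimal text and parsing it back
    yields exactly the same binary double."""
    text = f"{value:.{significant_digits}g}"
    return float(text) == value
```

For example, one third survives a round trip when written with 17
significant digits but not when written with 6, whereas 0.5, which is
exactly representable in binary, survives in both cases.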
In the case of 'string-type', the optional constraint specifies the
minimum and maximum number of characters that are allowed to be
represented in a valid encoding and optionally a valid pattern of
characters. The minimum and maximum character constraint specifies
the minimum and maximum number of characters at the application
level, not the actual number of characters that are used to represent
the application level characters on the wire. The format of the
pattern constraint is designed to simplify regular expression
evaluation by preventing the need for the trial and error type
processing of general regular expressions. Thus, in accordance with
Lumas' 80/20 principle, valid patterns MUST NOT require the regular
expression evaluator to do backtracking. The pattern constraint is
described further in Section 6.6.
In the case of 'bytes-type', the optional constraint specifies the
minimum and maximum number of bytes that are allowed to be
represented in a valid encoding. The constraint specifies the
minimum and maximum number of bytes at the application level, not the
number of characters that are used to encode those bytes on the wire.
The optional constraint in 'embedded-type' MAY specify the permitted
length of the embedded message and/or the Lumas module name of the
message that is to be embedded. For example:
embedded<(com.tech-know-ware.scp)> embedded-scp;
In the constraint syntax, a maximum value '*' means infinite or
unbounded.
6.6 The Pattern Constraint
The pattern-constraint has the following form:
pattern-constraint = "/" sub-pattern *( "|" sub-pattern ) "/"
sub-pattern = *pattern-element
pattern-element = pattern-char [ quantifier ]
pattern-char = %x20-29 / %x2C-2E / %x30-3E / %x40-5A
/ %x5D-7A / %x7D-FF ;not \/|[?*+{
/ escaped-char / special-char / character-class
escaped-char = "\\" ; Matches \
/ "\/" ; Matches /
/ "\|" ; Matches |
/ "\[" ; Matches [
/ "\?" ; Matches ?
/ "\*" ; Matches *
/ "\+" ; Matches +
/ "\{" ; Matches {
/ "\." ; Matches .
special-char = "\" r ; Matches the return character
/ "\" n ; Matches the new line character
/ "\" t ; Matches the tab character
/ "\" f ; Matches the form feed character
/ "\" s ; Matches white space [ \t\r\n\f]
/ "\" d ; Matches any digit [0-9]
/ "\" w ; Matches any word character [a-zA-Z_0-9]
/ "\" s-upper ; \S Matches anything not matched by \s
/ "\" d-upper ; \D Matches anything not matched by \d
/ "\" w-upper ; \W Matches anything not matched by \w
/ "." ; Matches any character
character-class = matching-character-class / inverse-character-class
matching-character-class = "[" *(class-char / class-range) "]"
; For a successful match, the character in the string
; being matched must be one of the characters
; specified in the matching-character-class.
inverse-character-class = "[^" *(class-char / class-range) "]"
; For a successful match, the character in the string
; being matched must NOT be one of the characters
; specified in the inverse-character-class.
class-char = class-single-char / class-escaped-char
/ escaped-char / special-char
class-single-char = %x20-2C / %x2E-5B / %x5E-FF ; not - ] \
class-escaped-char =
"\-" ; Matches -
/ "\]" ; Matches ]
; /|[?*+{. need not be escaped within character-class
class-range = first-range-char "-" last-range-char
; The class-range matches all characters that have
; an ASCII value greater than or equal to that of
; first-range-char and less than or equal to
; last-range-char.
first-range-char = class-single-char / class-escaped-char
/ escaped-char
last-range-char = class-single-char / class-escaped-char
/ escaped-char
quantifier = "?" / "*" / "+"
/ "{" quant-min-occurs [ "," [ quant-max-occurs ] ] "}"
; The absence of a quantifier indicates once and only
; once
quant-min-occurs = 1*DIGIT
quant-max-occurs = 1*DIGIT
The 'pattern-constraint' allows a number of 'sub-pattern's to be
defined, any one of which may match the string value. In each
'sub-pattern' there are no grouping or alternation constructs. This
removes the need for backtracking and is suitable for 80% (or more)
of applications.
The pattern matching uses a "greedy" match. Each 'sub-pattern' can
be viewed as a concatenation of 'pattern-element's.
Each 'pattern-element' is a pattern-char and an optional
'quantifier'. The 'pattern-char' may actually match multiple
characters. The 'quantifier' indicates how many times the associated
'pattern-char' may appear in a valid pattern. If the 'quantifier' is
'?', the 'pattern-char' may appear 0 or 1 times. If the 'quantifier'
is '*', the 'pattern-char' may appear 0 or more times. If the
'quantifier' is '+', the 'pattern-char' may appear 1 or more times.
If the quantifier is of the form '{n,m}', the 'pattern-char' may
appear a minimum of n times, and a maximum of m times. If the
quantifier is of the form '{n}', the 'pattern-char' must appear
exactly n times. If the quantifier is of the form '{n,}', the
'pattern-char' may appear n or more times.
To ensure that a string is in a suitable form to represent the value,
the application, subject to the quantifier of a pattern-element,
MUST, starting with the first character, keep matching successive
characters of the string with the first pattern-element until the
match fails. The application MUST then try to match the unmatched
character of the string along with subsequent characters in the
string with the next pattern-element, again taking into account the
quantifier for that pattern-element. If a pattern-element has a
quantifier that allows zero matches, then if the unmatched character
of the previous pattern-element does not match the current
pattern-element, the application should attempt to match the
unmatched character against the next pattern-element, and so on. The
process is repeated until the whole string is matched, or the
application is unable to match the current string character with an
appropriate pattern-element. If the application is unable to match
the current input character with an appropriate pattern-element, the
whole sub-pattern match is deemed to have failed. The application
MUST NOT backtrack to a previous pattern-element in order to attempt
to find a match. This process is repeated for each of the
sub-patterns until one of the sub-patterns matches the string, or all
sub-patterns fail to match the string. The message MUST NOT be
encoded if none of the patterns matches the string.
Example patterns include /\d{4} \d{4} \d{4} \d{4}/ for a (UK) credit
card number, or /\d{4}-\d{2}-\d{2}T\d+:\d+:\d+Z/ for a date & time
matching the form 2003-03-03T12:45:32Z. The pattern / ?\d+|
?\d+\.\d+| ?\d+\.\d+[eE][+\-]?\d+/ matches a floating point number
that can be represented as either an integer, a decimal without
exponent, or full 'scientific' format. This pattern illustrates some
of the impact of not allowing pattern groupings.
For more information on regular expressions, see [PERL].
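The matching procedure of this section can be sketched in Python as
follows. This is an illustrative sketch rather than a reference
implementation: the names compile_pattern and matches are
hypothetical, the pattern is assumed to be syntactically valid (no
error reporting is attempted), and the length constraint that may
precede a pattern is not handled.

```python
SPECIALS = {
    "r": lambda c: c == "\r",
    "n": lambda c: c == "\n",
    "t": lambda c: c == "\t",
    "f": lambda c: c == "\f",
    "s": lambda c: c in " \t\r\n\f",
    "d": lambda c: c in "0123456789",
    "w": lambda c: c == "_" or (c.isascii() and c.isalnum()),
}

def _escape(ch):
    """Predicate for the character that follows a backslash."""
    if ch in SPECIALS:
        return SPECIALS[ch]
    if ch in "SDW":                       # \S, \D, \W negate \s, \d, \w
        inner = SPECIALS[ch.lower()]
        return lambda c: not inner(c)
    return lambda c: c == ch              # escaped literal such as \/ or \.

def _class_item(pat, i):
    """Read one class member: (literal or None, predicate, next index)."""
    if pat[i] == "\\":
        ch = pat[i + 1]
        literal = None if (ch in SPECIALS or ch in "SDW") else ch
        return literal, _escape(ch), i + 2
    ch = pat[i]
    return ch, (lambda c: c == ch), i + 1

def _parse_class(pat, i):
    """Parse a character class starting just after '['."""
    negate = pat[i] == "^"
    if negate:
        i += 1
    preds = []
    while pat[i] != "]":
        lo, pred, i = _class_item(pat, i)
        if lo is not None and pat[i] == "-" and pat[i + 1] != "]":
            hi, _, i = _class_item(pat, i + 1)      # class-range lo-hi
            pred = lambda c, lo=lo, hi=hi: lo <= c <= hi
        preds.append(pred)
    member = lambda c: any(p(c) for p in preds)
    return ((lambda c: not member(c)) if negate else member), i + 1

def _parse_quantifier(pat, i):
    """Return (min, max, next index); a max of None means unbounded."""
    if i < len(pat) and pat[i] in "?*+":
        return {"?": (0, 1), "*": (0, None), "+": (1, None)}[pat[i]] + (i + 1,)
    if i < len(pat) and pat[i] == "{":
        end = pat.index("}", i)
        lo, sep, hi = pat[i + 1:end].partition(",")
        if not sep:
            return int(lo), int(lo), end + 1
        return int(lo), (int(hi) if hi else None), end + 1
    return 1, 1, i                        # no quantifier: once and only once

def compile_pattern(constraint):
    """Turn '/sub|sub/' into a list of sub-patterns of (pred, min, max)."""
    pat = constraint[1:-1]                # strip the enclosing '/' characters
    subs, current, i = [], [], 0
    while i < len(pat):
        ch = pat[i]
        if ch == "|":
            subs.append(current)
            current, i = [], i + 1
            continue
        if ch == "\\":
            pred, i = _escape(pat[i + 1]), i + 2
        elif ch == "[":
            pred, i = _parse_class(pat, i + 1)
        elif ch == ".":
            pred, i = (lambda c: True), i + 1
        else:
            pred, i = (lambda c, ch=ch: c == ch), i + 1
        lo, hi, i = _parse_quantifier(pat, i)
        current.append((pred, lo, hi))
    subs.append(current)
    return subs

def matches(constraint, value):
    """Greedy, left-to-right match that never backtracks."""
    for sub in compile_pattern(constraint):
        pos = 0
        for pred, lo, hi in sub:
            count = 0
            while (pos < len(value) and (hi is None or count < hi)
                   and pred(value[pos])):
                pos, count = pos + 1, count + 1
            if count < lo:
                break                     # this sub-pattern fails
        else:
            if pos == len(value):
                return True
    return False
```

Because earlier elements are never revisited, a pattern such as
/a*ab/ does not match the string "aab" in this sketch: the first
element consumes both 'a' characters and the second element then
fails. A backtracking regular expression engine would accept the
same input.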
6.7 The Name
Referring back to the simple-param definition, 'name' is the name of
the parameter. It has the format:
name = ALPHA *( ALPHA / DIGIT / "-" / "_" )
If there is no explicitly defined tag, then, in the case of character
based protocols, the name is also used as the parameter's tag
on-the-wire. In this case, the name MUST NOT exceed 63
characters in length. See Section 6.9 for more on tagging.
6.8 Cardinality
The cardinality of a parameter specifies how many times a particular
parameter can appear in a message. The format mirrors a C-like array
specification, but uses UML style ranges rather than the single
values used in C. If the cardinality field is absent, then one and
only one instance of the parameter must occur in a valid message.
The format of the cardinality specification is:
cardinality = "[" ( cardinality-range / "?" / "*" / "+" ) "]"
; [?] short hand for [0..1]
; [*] short hand for [0..*]
; [+] short hand for [1..*]
cardinality-range = [ min-occurrences ".." ] max-occurrences
min-occurrences = 1*DIGIT
max-occurrences = 1*DIGIT / unbounded-token
unbounded-token = "*"
Once again, the '*' in max-occurrences represents infinite or
unbounded. If in the 'cardinality-range' only 'max-occurrences' is
present and it has a numerical value, the containing struct MUST have
exactly 'max-occurrences' instances of the parameter.
Example cardinalities are as follows:
[0..1] ; Zero or one time
[?] ; Short hand for zero or one time
[0..*] ; Zero or more times
[*] ; Same as above, zero or more times
[1..*] ; One or more times
[+] ; Same as above, one or more times
[2..*] ; Two or more times
[5] ; Exactly five times
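A tool might reduce each cardinality to a (minimum, maximum) pair
before validating a message. The following Python sketch shows one
way to do so; the function name parse_cardinality is hypothetical,
and None is used here to stand for the unbounded '*' value.

```python
def parse_cardinality(text):
    """Return (min, max) for a cardinality such as '[0..*]'.

    A max of None means unbounded.
    """
    body = text.strip()[1:-1].strip()     # drop the surrounding brackets
    shorthand = {"?": (0, 1), "*": (0, None), "+": (1, None)}
    if body in shorthand:
        return shorthand[body]
    if ".." in body:
        lo, hi = (part.strip() for part in body.split(".."))
        return int(lo), (None if hi == "*" else int(hi))
    return int(body), int(body)           # single value: exactly that many
```

With this representation, an instance count n is acceptable when
n >= min and either max is None or n <= max.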
6.9 Tagging
A parameter can have a tag associated with it. A tag is a fixed
sequence of characters used on the wire to enable a parser to
identify the value or values that it is associated with.
By default, the name of the parameter is used as the tag. If the
name of the parameter is used as the tag, the name MUST NOT exceed 63
characters in length.
Alternatively an explicit tag can be specified. It can be any
sequence of characters that do not have special significance to the
parser. To facilitate buffer management, an explicit tag MUST NOT
exceed 63 characters in length. If the tag definition begins with a
"?", the "?" is discarded. Thus to specify that "?" should be used
as the tag on-the-wire, 'explicit-tag' should be specified as "??".
explicit-tag = [ "?" ] tag ; tag defined in common definitions
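The tag selection rules can be summarised with the following Python
sketch. The function name wire_tag is hypothetical. The sketch
assumes that a lone "?" marks the parameter as untagged (as in the
'as ?' form used elsewhere in this document) and that the 63
character limit applies to the tag after any leading "?" has been
discarded; neither assumption is stated normatively above.

```python
MAX_TAG_LENGTH = 63


def wire_tag(name, explicit_tag=None):
    """Return the on-the-wire tag, or None if the parameter is untagged."""
    if explicit_tag is None:
        tag = name                        # default: the name is the tag
    elif explicit_tag == "?":
        return None                       # assumed: lone '?' means untagged
    elif explicit_tag.startswith("?"):
        tag = explicit_tag[1:]            # leading '?' is discarded
    else:
        tag = explicit_tag
    if len(tag) > MAX_TAG_LENGTH:
        raise ValueError("tag exceeds 63 characters")
    return tag
```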
In certain constructs a parameter may also be untagged. This is
discussed in the relevant sections below.
6.10 The Plugin Extension Mechanism
Marking a parameter as 'plugin' indicates to the developer and the
tools that this parameter is (probably) not part of the original
message definition. For example, it might be a proprietary
extension. It also indicates that the parameter may not be present
in all received messages.
A parameter that is marked as 'plugin' MUST have an explicit-tag
defined for it. The explicit-tag MUST be constructed from a domain
name [DOMAINS] owned by the entity defining the parameter, plus a
sequence of characters that differentiate the explicit-tag from other
explicit-tags defined by the defining entity. The component parts of
the explicit-tag are presented in the normal domain name order so
that the most variable part of the string is at the beginning, thus
improving parsing efficiency.
An example explicit-tag for tech-know-ware.com might be:
my-tag.tech-know-ware.com
6.11 Reference Parameters
In a struct or union, it is also possible to reference types that are
defined elsewhere. The format of a 'reference-param' is:
reference-param = reference-name WS name [ OWS cardinality ]
[ WS as-kw WS explicit-tag ]
[ WS plugin-kw ] OWS ";" OWS
reference-name = [ module-name "::" ] name
Other forms of reference-parameter are defined in the sections below.
6.12 Compound Parameters
The compound types are struct, union and combi. For a struct,
depending on the various parameters' cardinality specifications, any,
all or none of the parameters that a struct groups together may
appear in a valid encoding. In the case of a union, only one of the
parameters may be encoded in a valid instance. The combi form is
effectively a compact encoding of a struct, but is subject to a
number of additional constraints, which are described below.
The definition format of each of the compound parameters is similar
to the simple parameters.
The 'compound-param' has the form:
compound-param = struct-param / union-param / combined-param
6.13 Struct Parameters
The definition of a 'struct-param' is:
struct-param = struct-kw WS name [ OWS cardinality ]
[ WS as-kw WS explicit-tag ]
[ WS pluggable-kw ]
[ WS plugin-kw ]
WS "{" struct-body "}" OWS ";" OWS
'Cardinality' and 'explicit-tag' have the same meaning as for the
simple types. The 'pluggable' keyword is defined in Section 6.17.
The format of the 'struct-body' is:
struct-body = *( untagged-lumas-parameter )
*( lumas-parameter )
*( struct-extension )
The struct body starts with all the untagged parameters. Untagged
parameters may have a cardinality other than one. Note that if the
cardinality of an untagged parameter allows it to be absent, and the
untagged parameter is absent when encoded on the wire, then all
subsequent parameters, including tagged parameters, MUST also be
absent. Thus great care is recommended when defining a message
syntax that allows for an untagged parameter to be absent.
The tagged parameters follow the untagged parameters.
When the message definition is subsequently extended, an instance of
the 'struct-extension' construct MUST be added to the end of the
struct definition for each version in which the struct is extended.
The 'struct-extension' construct wraps the added parameters within
square brackets to indicate that they are added in a new version.
This not only allows a developer to see what has been added in a new
version, but also allows a parser to do the same. This is important
because a parser must always consider absence of the new parameters
to be a valid encoding so that it can receive messages from entities
that are using an earlier version of the protocol. (To do this
manually would dictate that all extension parameters would have to
have a cardinality specification that included zero. This would be
tedious, potentially error prone, and loses some expressiveness.)
During the extension process, all new parameters MUST be added onto
the end of an existing construct, and the order of parameters MUST
NOT be rearranged from one version to the next. Note that
'struct-extension' does not allow the specification of untagged
parameters.
All of these have a similar format to the types already defined,
except that in some cases they may be untagged. To make the ABNF
definition accurate it is therefore necessary to repeat the above
basic definitions with the appropriate tagging specifications.
The definition of the untagged struct parameters is:
untagged-lumas-parameter = untagged-simple-param /
untagged-compound-param /
untagged-reference-param
untagged-simple-param = simple-type WS name [ OWS cardinality ]
WS as-kw WS "?" OWS ";" OWS
untagged-compound-param = untagged-struct-param /
untagged-union-param /
untagged-combined-param
untagged-struct-param =
struct-kw WS name [ OWS cardinality ]
WS as-kw WS "?"
[ WS pluggable-kw ]
WS "{" struct-body "}" OWS ";" OWS
untagged-union-param = union-kw WS name [ OWS cardinality ]
WS as-kw WS "?"
[ WS pluggable-kw ]
WS "{" union-body "}" OWS ";" OWS
untagged-combined-param =
combi-kw WS name [ OWS cardinality ]
WS as-kw WS "?"
WS "{" combined-body "}" OWS ";" OWS
untagged-reference-param = reference-name WS name [ OWS cardinality ]
OWS ";" OWS
Note that the 'plugin' keyword is not applicable to untagged
parameters.
The tagged parameters have the basic parameter definition that was
initially presented, i.e. lumas-parameter.
The struct body extension fields have the format:
struct-extension = "[" OWS 1*( lumas-parameter ) "]" OWS
6.14 Union Parameters
A union parameter has the following definition:
union-param = union-kw WS name [ OWS cardinality ]
[ WS as-kw WS explicit-tag ]
[ WS pluggable-kw ]
[ WS plugin-kw ]
WS "{" union-body "}" OWS ";" OWS
'Cardinality' and 'explicit-tag' have the same meaning as for the
simple types. The 'pluggable' keyword is defined in Section 6.17.
A union-body MAY have a single untagged integer parameter. All other
parameters MUST be tagged and have a cardinality of one and only one.
Other than the cardinality constraints of a union, a union can be
extended in the same way as a struct.
The untagged integer parameter allows integers to be defined that
have wild-carding options. For example, a union might be defined as:
union select
{
int<0..65535> numbered as ?;
void any as *;
};
Examples of the encoded form might be:
select = 12
select = *
The parameters within a union are only allowed unary cardinality to
avoid ambiguity in the on-the-wire encoding. If multiple instances
of a parameter must be included as an option in a union, it is
necessary to wrap the parameters within a struct, using something
similar to:
struct X { X x[1..*] as ?; };
The definition of a union-body is as follows:
union-body = [ integer-type WS name WS as-kw WS "?" OWS ";" OWS ]
*( singular-lumas-parameter )
*( union-extension )
As mentioned previously, most of the parameters within a union are
tagged and have a cardinality of one. Their definition is:
singular-lumas-parameter = singular-simple-param /
singular-compound-param /
singular-reference-param
singular-simple-param = simple-type WS name
[ WS as-kw WS explicit-tag ]
[ WS plugin-kw ] OWS ";" OWS
singular-compound-param = singular-struct-param /
singular-union-param /
singular-combined-param
singular-struct-param = struct-kw WS name [ WS as-kw WS explicit-tag ]
[ WS pluggable-kw ]
[ WS plugin-kw ]
OWS "{" struct-body "}" OWS ";" OWS
singular-union-param = union-kw WS name [ WS as-kw WS explicit-tag ]
[ WS pluggable-kw ]
[ WS plugin-kw ]
OWS "{" union-body "}" OWS ";" OWS
singular-combined-param = combi-kw WS name
[ WS as-kw WS explicit-tag ]
[ WS plugin-kw ]
OWS "{" combined-body "}" OWS ";" OWS
singular-reference-param = reference-name WS name
[ WS as-kw WS explicit-tag ]
[ WS plugin-kw ] OWS ";" OWS
The union extension operates in a similar fashion to that of a
struct, but references singular-lumas-parameters. Its definition is:
union-extension = "[" OWS 1*( singular-lumas-parameter ) "]" OWS
6.15 Combined Parameters
A combined parameter has the following definition:
combined-param = combi-kw WS name [ OWS cardinality ]
[ WS as-kw WS explicit-tag ]
[ WS plugin-kw ]
WS "{" combined-body "}" OWS ";" OWS
The combined compound type provides a simple mechanism for defining
new combined types similar to that used for date and time. All the
members of a combined type are encoded on the wire using their
untagged form and concatenated together with no intervening white
space. The result of the encoding MUST meet all the constraints of
an unquoted-ascii value. In addition, the parameters that make up
the combined type are subject to the following constraints:
- Each unquoted-ascii parameter that is part of a combined
body MUST have a fixed number of characters,
- The first character of unquoted-ascii and const parameters
MUST NOT be a digit,
- integer values MUST NOT be adjacent.
The form of the combined body is:
combined-body = *( combined-simple-type WS name ";" )
combined-simple-type = integer-type / const-type /
unquoted-ascii-kw OWS "<" 1*DIGIT ">"
In many respects the combined type simply makes the encoded form look
prettier, and anything that can be encoded with the combined type can
also be represented with the struct type. The combined type should
also not be used for defining patterns of ASCII or Unicode
characters. Note also that a combined type is not pluggable and
hence can not be extended. It is therefore recommended that the
combined type be used sparingly.
An example of a combined type is:
combi protocol as ?
{
const <HTTP/> const1;
int<0..99> major-version;
const <.> const2;
int<0..99> minor-version;
};
Which might be encoded as: HTTP/1.1
Combined types also allow you to define numbers that contain decimal
points. An example of such is:
union currency as ?
{
void dollars as US$;
void pounds as GBP;
void francs as FFr;
};
combi amount as ?
{
int<-31b..31b> main-denomination;
const <.> const2;
int<0..99z> sub-denomination;
};
Which might be encoded as: US$ 100.05
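The encoding of a combined type can be sketched in Python as follows.
The function name encode_combi and the tuple layout used to describe
members are hypothetical. The sketch also assumes that, when the 'z'
marker is present, the fixed width equals the number of decimal
digits in the maximum permitted value; the text above only requires
that the value be fixed width.

```python
def encode_combi(members):
    """Concatenate untagged members with no intervening white space.

    Each member is ("const", text) or
    ("int", value, max_value, use_leading_zeros).
    """
    parts = []
    previous_kind = None
    for member in members:
        kind = member[0]
        if kind == "int" and previous_kind == "int":
            raise ValueError("integer values MUST NOT be adjacent")
        if kind == "const":
            parts.append(member[1])
        else:
            _, value, max_value, leading_zeros = member
            # Assumed width rule: as many digits as the maximum value has.
            width = len(str(max_value)) if leading_zeros else 0
            parts.append(str(value).zfill(width))
        previous_kind = kind
    return "".join(parts)
```

Under these assumptions the 'amount' members 100, "." and 5 are
encoded as 100.05, and the 'protocol' members are encoded as
HTTP/1.1.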
6.16 Referenced Parameters
It was mentioned previously that structs and unions can reference
types that are defined elsewhere. Referenced types do not have a
cardinality specification, and do not specify an explicit tag. This
is because the cardinality and tagging of the type are defined in the
item that does the referencing, rather than where the referenced type
is defined. (If a referenced type needs a cardinality other than
one, it is recommended that the technique for giving a parameter
within a union a non-unary cardinality be used.)
The definitions of the referenced types are:
referenced-lumas-parameter = referenced-simple-param /
referenced-compound-param /
referenced-reference-param
referenced-simple-param = simple-type WS name OWS ";" OWS
referenced-compound-param = referenced-struct-param /
referenced-union-param /
referenced-combined-param
referenced-struct-param = struct-kw WS name [ WS pluggable-kw ]
OWS "{" struct-body "}" OWS ";" OWS
referenced-union-param = union-kw WS name [ WS pluggable-kw ]
OWS "{" union-body "}" OWS ";" OWS
referenced-combined-param = combi-kw WS name
OWS "{" combined-body "}" OWS ";" OWS
referenced-reference-param = reference-name WS name OWS ";" OWS
6.17 External Extensions - Plug and Pluggable
A protocol may be extended via an external specification without
directly modifying the original definition. This may be to define a
proprietary extension, or to define an external profile of the base
protocol. The specification for this type of extension is:
external-extension =
plug-kw WS
( external-struct-extension /
external-union-extension )
WS into-kw WS into-name
*( OWS COMMA OWS into-name ) OWS ";" OWS
into-name = [ module-name "::" ] hierarchical-name
hierarchical-name = *( name "." ) name
external-struct-extension = 1*lumas-parameter
external-union-extension = 1*singular-lumas-parameter
This specifies a parameter that is to be plugged into an existing
construct. For example, if the following is defined:
plug
ascii cookie as cookie.tech-know-ware.com;
into my-example.my-addition;
The resultant definition would be treated as if it were:
struct my-example
{
int <0..255> participant-id as ?;
Action action as ?;
struct my-addition[0..1]
as new.tech-know-ware.com plugin
{
bool tkw-app-capable as ?;
ascii cookie as cookie.tech-know-ware.com plugin;
};
};
The 'into-name' field indicates the name of the construct that the
item is to be plugged into. The optional 'module-name' part of the
name specifies the name of the module that contains the parameter
into which the extension is to be plugged. The 'hierarchical-name'
specifies the name of the parameter within the module that the
extensions are to be plugged into. The name is hierarchical because
parameters can be locally defined within structs and unions. The
hierarchical name is made up of the names of the parameter's
ancestors plus the name of the parameter itself, joined
together by the '.' character. If the parameter to be extended is
contained within another parameter, the first name is the name of the
outer-most parameter that contains the parameter to be extended (i.e.
one that is not contained within any other parameter), and the second
name is the name of the next outer-most parameter that contains the
parameter to be extended (if present), and so on until the parameter
itself is named. An illustration of the naming is shown in the
example above.
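Splitting an 'into-name' into its parts might look like the following
Python sketch. The function name parse_into_name is hypothetical, and
the module-qualified example in the usage note is constructed for
illustration only.

```python
def parse_into_name(text):
    """Split '[module::]outer.inner' into (module or None, [names])."""
    module, separator, path = text.rpartition("::")
    # The path lists the outer-most parameter first, the target last.
    return (module if separator else None), path.split(".")
```

For the example above, parse_into_name("my-example.my-addition")
yields no module name and the path my-example, my-addition. A
module-qualified form such as
"com.tech-know-ware.scp::my-example.my-addition" yields the module
name as the first element.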
In a struct and union the 'pluggable' keyword is used to indicate
that the construct is a location that the message designers have
formally declared as extendible using the 'plug' mechanism. Lumas
compilers SHOULD emit warnings when extra material is plugged into
locations that are not marked as pluggable, but MUST NOT consider it
an error. Combined types are not pluggable.
If a party other than the original message designers uses the plug
mechanism to define an extension, each added parameter MUST have an
explicit-tag constructed according to the rules described in Section
6.10.
6.18 Module Definition and Directives
A single protocol may be defined in a number of message definition
files. This might be for the purpose of accessing predefined
libraries, or specifying a definition that the current definition
extends. A message definition therefore begins with a set of
optional directives expressing this information. They have the form:
lumas-directives =
[ lumas-kw WS module-kw WS module-name OWS ";" OWS ]
[ extends-kw WS module-name [ WS as-kw WS alias ] OWS ";" OWS ]
*( import-kw WS module-name [ WS as-kw WS alias ] OWS ";" OWS )
module-name = [ "+" ] name *( "." name )
alias = name
The 'module' directive specifies the name of the module.
The 'extends' directive is used in a definition that contains an
external extension. The module-name in the extends specification
indicates the message definition that is being extended.
The 'import' statement indicates a library message definition that
contains referenced types that are referenced within the message
definition.
The 'module-name' is a hierarchical namespace that is based on the
name of the protocol, combined with a domain name [DOMAINS] owned by
the entity defining the protocol. The parts of the module-name are
combined together so that it looks like a regular domain name. The
order in which the domain levels are written is then reversed, so that
the top-level domain becomes the first written domain, and the second
level domain becomes the second written domain and so on. For
example, if a protocol called the Simple Conference Protocol (SCP)
was defined by Tech-Know-Ware Ltd with a domain name of
tech-know-ware.com, the module name might be:
com.tech-know-ware.scp
It is the responsibility of the entity owning the domain name to
ensure that the module names it creates using its domain name are
unique.
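The construction of a module name from a protocol name and a domain
name can be sketched in Python as follows. The function name
module_name is hypothetical.

```python
def module_name(domain, protocol):
    """Combine protocol and domain like a domain name, then reverse it."""
    labels = (protocol + "." + domain).split(".")   # e.g. scp.tech-know-ware.com
    return ".".join(reversed(labels))               # top-level domain first
```

For the SCP example, module_name("tech-know-ware.com", "scp")
produces com.tech-know-ware.scp.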
Lumas defines a number of pseudo top level domains for its own
purposes. These are currently as follows:
+ietf A pseudo top level domain for the Internet Engineering Task
Force.
+iso A pseudo top level domain for the International Standards
Organisation. The sub-domains of this domain follow the
structure of ISO defined Object Identifiers. All spaces must
be removed and numbers in brackets should be ignored when
parsing this domain. E.g. iso(1) member-body(2) us(840)
rsadsi(113549) digestAlgorithm(2) 5 is represented as
+iso(1).member-body(2).us(840).rsadsi(113549).digestAlgorithm(2).5
and looked up as +iso.member-body.us.rsadsi.digestAlgorithm.5 .
+itu A pseudo top level domain for the International
Telecommunications Union. The sub-domains of this domain
follow the structure of ITU defined Object Identifiers.
Processing of such identifiers follows that defined for
processing of ISO Object Identifiers.
+lms A pseudo top level domain for defining Lumas extensions and
libraries.
+uuid A pseudo top level domain that uses Universally Unique
Identifiers for identification. An example is:
+uuid.4d36e96c-e325-11ce-bfc1-08002be10318
National standards bodies such as ANSI and BSI are defined under
their national top-level domain.
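The lookup rule for the +iso and +itu pseudo domains (remove all
spaces and ignore the numbers in brackets) can be sketched in Python
as follows. The function name oid_lookup_name is hypothetical.

```python
import re


def oid_lookup_name(module_name):
    """Strip spaces and bracketed numbers from an OID-style module name."""
    without_spaces = module_name.replace(" ", "")
    return re.sub(r"\(\d+\)", "", without_spaces)
```

Applied to the +iso example given above, this returns
+iso.member-body.us.rsadsi.digestAlgorithm.5 as the name to be looked
up.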
The 'alias' part of the import and extends statements is used as an
alias of the 'module-name', so that items within 'module-name' can be
referenced in the abbreviated form of:
alias::item
For example, if a parameter definition called 'id' is contained in
the module 'com.tech-know-ware.scp', and the following import
statement is specified:
import com.tech-know-ware.scp as tkwscp;
Then 'id' can be referenced by:
tkwscp::id
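A compiler might build an alias table from the directives and use it
to expand abbreviated references. The following Python sketch
illustrates the idea; the names alias_table and expand_reference are
hypothetical, and the regular expression is a simplification that
assumes one directive per line.

```python
import re

_DIRECTIVE = re.compile(
    r"\s*(?:import|extends)\s+([^\s;]+)(?:\s+as\s+([^\s;]+))?\s*;")


def alias_table(directives):
    """Map each declared alias to its full module name."""
    table = {}
    for line in directives:
        found = _DIRECTIVE.match(line)
        if found and found.group(2):
            table[found.group(2)] = found.group(1)
    return table


def expand_reference(reference, table):
    """Expand 'alias::item' to (module-name, item); no prefix gives None."""
    prefix, separator, item = reference.rpartition("::")
    if not separator:
        return None, item
    return table.get(prefix, prefix), item
```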
6.19 The Top Level Definition
Finally, we are in a position to describe a complete Lumas message
definition. This is:
lumas-definition = OWS lumas-directives
*external-extension
*referenced-lumas-parameter
[ OWS endmodule-kw OWS ";" ]
OWS
The first parameter defined within the message definition is the root
of the message definition tree, and is thus the outer-most construct
of an encoded message.
The end of a Lumas definition MAY be marked with the 'endmodule'
keyword. Marking the end of a module in this way allows multiple
Lumas definitions to be included in a single file or document. If
the 'endmodule' keyword is not present, the definition ends at the
end of the file or document.
6.20 Locating Lumas within a Specification
It is not sufficient to use Lumas alone to define a protocol.
Additional narrative is required to define the semantics of a
protocol in addition to the syntax defined by Lumas. Thus Lumas and
narrative typically need to be combined in a single document. The
difficulty is that, at some point, the Lumas must be extracted from
the document to be useful. If the Lumas is intermingled with the
narrative, it can be removed manually using cut and paste, but this
is tedious and error-prone. An alternative is to put all the Lumas
in a separate section so that it can be easily extracted. However,
this distances the Lumas specification from the narrative that
explains it, which is undesirable. A third option is to do both:
interleave one copy of the Lumas with the narrative and keep a
separate copy that can be used for compiling. This approach makes
it difficult to keep the two versions in step, and errors can easily
creep in.
Lumas compilers MUST implement a fourth option. Before parsing a
file, a compiler MUST first look for a line that contains only the
text lumas*/, optionally preceded and followed by white space. If
such a line is found, compilation starts at the following line, and
any subsequent narrative is enclosed in /* */ comment marks. If no
such line is found, compilation begins at the beginning of the
file.
For example, a Lumas compiler must be able to find and process the
following Lumas syntax. This assumes that any */ character sequences
that follow the example are removed; they appear only to discuss how
they are used and hence are not properly matched.
lumas*/
// The first 'official' line of Lumas
struct top
{
not-much not-much;
};
/*
This is narrative.
*/
int <0..1> not-much;
/**
For a fuller description of Lumas comments, see Section 9.
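The start-of-Lumas search described above can be sketched in Python as follows. This is a minimal illustration only: the function name compilation_start and the use of a list of lines as input are assumptions of the example, and it takes the first matching line, which is one reading of the rule.

```python
def compilation_start(lines):
    """Return the index of the first line a compiler should parse.

    A line qualifies as the marker when its only non-white-space text
    is 'lumas*/'. If no marker line exists, parsing starts at line 0.
    """
    for index, line in enumerate(lines):
        if line.strip() == "lumas*/":
            return index + 1
    return 0


document = [
    "Narrative that precedes the definition.",
    "   lumas*/",
    "struct top",
    "{",
    "    not-much not-much;",
    "};",
]
print(document[compilation_start(document)])  # the first compiled line
```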
7. On-the-Wire Representation
This section describes the default character-based on-the-wire
encoding of Lumas messages. Messages defined using the Lumas message
definition language may also be represented using other character
encoding forms or even binary forms.
7.1 Principles of the default On-the-Wire Encoding
The basic format of the default text based on-the-wire encoding is to
use the format:
tag = value
The tag is a fixed sequence of characters that identifies the
parameter with which a particular value (or values) is associated.
For example, there may be multiple parameters that have integer
values within a struct, that might specify, say, width and height.
The tags are used to identify which integer value belongs to which
parameter.
If there are multiple instances of a parameter, then they may either
be conveyed as multiple instances of the above construct, or as a
comma separated list, as in:
tag = value, value, value
If a tag is explicitly specified in the message definition, then this
is used on the wire. If no tag is explicitly specified, then the
name of the parameter is used as the tag.
Tagged items may appear in any order within a struct, and do not have
to be in the same order as they are defined in the struct definition.
It is also possible to specify that no tag should be used on the wire
by specifying 'as ?'. All untagged items MUST appear in a struct in
the same order that they are defined in the message definition, and
MUST appear before any tagged items within a struct definition.
Untagged parameters that have more than one instance MUST be
constructed as a comma separated list. Thus untagged values have the
format:
value
or:
value, value, value
If an untagged parameter has a cardinality that allows it to be
absent from an encoded message, then all subsequent parameters in the
enclosing struct, including tagged parameters, MUST also be absent.
Consequently, great care should be taken when defining a message
definition that allows untagged parameters to be absent.
For the examples quoted earlier, that is:
ascii rfc-name ;
int <1..30000> referenced-rfcs [0..255] as refers;
The format on the wire would be something like (depending on the
actual values in question):
rfc-name = 'Lumas' refers = 2234, 791, 2045
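The "tag = value" principle can be sketched with a small Python encoder. The helper names encode_param and encode_struct_body are assumptions of this example, and the values are taken to be already encoded as text. The first parameter has no explicit tag, so its name is used; the second uses the tag given by 'as refers'.

```python
def encode_param(tag, values):
    """Encode one tagged parameter as 'tag = value, value, ...'.

    'values' is a list of value strings that are already in their
    on-the-wire form.
    """
    return "%s = %s" % (tag, ", ".join(values))


def encode_struct_body(params):
    """Join the encoded parameters, separated by single spaces."""
    return " ".join(encode_param(tag, values) for tag, values in params)


body = encode_struct_body([
    ("rfc-name", ["'Lumas'"]),
    ("refers", ["2234", "791", "2045"]),
])
print(body)  # rfc-name = 'Lumas' refers = 2234, 791, 2045
```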
7.2 Formal On-the-Wire Representation
The principal representation of a Lumas defined message on the wire
is text based.
The top-level construct of a Lumas definition is a referenced type,
which essentially has no tag associated with it. (Indeed, the
presence of such a tag would not convey any information.) The
top-level construct on the wire is therefore either a struct body, or
a union body, as in:
lumas-text-message = (struct-body / union-body) OWS
A struct body can contain untagged and tagged parameters. All
untagged parameters MUST appear before any tagged parameters. The
values of untagged parameters that have non-singular cardinality MUST
be comma separated. Tagged parameters that have non-singular
cardinality may either have a tag followed by a comma separated list
of values, have multiple instances of the "tag = value" form, or some
combination of the two. All parameters in a struct body are
separated by white space, but white space is optional either before
or after the struct body. (This logical specification of where white
space is used leads to an unfortunately complex ABNF definition for a
struct body.)
The definition of a struct-body is therefore:
struct-body = OWS (
struct-untagged-set
/ struct-tagged-set
/ (struct-untagged-set WS struct-tagged-set) )
struct-untagged-param = value *( COMMA value )
struct-untagged-set = struct-untagged-param *(WS struct-untagged-param)
struct-tagged-param = tag ; For a void parameter
/ (tag EQUAL value *( COMMA value ))
struct-tagged-set = struct-tagged-param *(WS struct-tagged-param)
Except for a single integer parameter that may be untagged, all items
of a union body MUST be tagged. Also, parameters must only have a
cardinality of one in the encoding to avoid ambiguities in the
encoded message. Therefore a union body has the form:
union-body = OWS ( integer-value
/ tag ; For a void parameter
/ ( tag EQUAL value ) )
'tag' is defined in the common definitions section, Section 8.
'value' has the following definition:
value = simple-value / compound-value
simple-value = bool-value / integer-value / float-value /
ipv4-value / ipv6-value /
date-value / time-value / oid-value /
ascii-value / unquoted-ascii-value / unicode-value /
const-value / bytes-value / embedded-value
Which in turn are defined as follows:
bool-value = True-kw / False-kw / T / F
integer-value = [ "-" ] 1*DIGIT
float-value = float-number
/ NaN-kw ; IEEE 754 Not a Number
/ INF-kw ; Positive infinity
/ "-" INF-kw ; Negative infinity
; Note that "-0" is included in float-number
float-number = float-mantissa [ (e/E) float-exponent ]
float-mantissa = ["-"] 1*DIGIT ["." 1*DIGIT]
float-exponent = ["-"/"+"] 1*DIGIT
True-kw = %x54.72.75.65 ; 'True'
False-kw = %x46.61.6C.73.65 ; 'False'
T = %x54 ; 'T'
F = %x46 ; 'F'
NaN-kw = %x4E.61.4E ; 'NaN'
INF-kw = %x49.4E.46 ; 'INF'
E = %x45 ; 'E'
e = %x65 ; 'e'
The value encoding of a float is the base 10 representation of a base
2 number. There will typically be a degree of error introduced when
the conversion is made. Hence the float type should be looked upon
as a convenient way to convey floating point information where bit
level accuracy between the encoder's base 2 representation of the
number and the decoder's base 2 representation of the number is not
required. If this is not acceptable, then implementers should seek
other ways of presenting floating point numbers that do not suffer
from this loss of accuracy.
The 'float-mantissa' part of the number is NOT restricted to the
range 1.0 to 9.9.
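The integer-value and float-value rules translate fairly directly into regular expressions. The following Python sketch is one possible reading of the ABNF; the names INTEGER_VALUE, FLOAT_VALUE, is_float_value and decode_float are assumptions of the example.

```python
import re

# integer-value = [ "-" ] 1*DIGIT
INTEGER_VALUE = re.compile(r"-?[0-9]+\Z")

# float-value: a float-number, or one of the NaN, INF and -INF keywords
FLOAT_VALUE = re.compile(
    r"(?:-?[0-9]+(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?|NaN|-?INF)\Z")


def is_float_value(text):
    """Return True if 'text' matches the float-value rule."""
    return FLOAT_VALUE.match(text) is not None


def decode_float(text):
    """Convert a float-value to the nearest binary floating point number."""
    if not is_float_value(text):
        raise ValueError("not a float-value: %r" % text)
    # Python's float() accepts every form matched by FLOAT_VALUE.
    return float(text)


print(decode_float("102.4519"))
print(decode_float("25e-1"))  # the mantissa need not lie between 1.0 and 9.9
```

Because the conversion goes through base 10, a value decoded this way is not guaranteed to be bit-for-bit identical to the encoder's original.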
An 'oid-value' is represented as:
oid-value = 1*DIGIT *( "~" 1*DIGIT )
As can be seen, only the oid's numerical values are encoded.
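A minimal Python sketch of converting between an oid-value and a sequence of integers is shown below. The function names decode_oid and encode_oid are assumptions of the example.

```python
def decode_oid(text):
    """Convert an oid-value such as '1~2~840' into a tuple of integers."""
    arcs = text.split("~")
    # Each arc must be one or more ASCII digits (1*DIGIT).
    if not all(arc.isascii() and arc.isdigit() for arc in arcs):
        raise ValueError("not an oid-value: %r" % text)
    return tuple(int(arc) for arc in arcs)


def encode_oid(arcs):
    """Convert a sequence of non-negative integers into an oid-value."""
    return "~".join(str(arc) for arc in arcs)


print(decode_oid("1~2~840~113549~2~5"))
```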
The IP address values are:
ipv4-value = 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT
ipv6-value = hexseq / hexseq "::" [ hexseq ] / "::" [ hexseq ]
hexseq = hex4 *( ":" hex4)
hex4 = 1*4HEXDIG
Note that the IPv4 address within an IPv6 address format is not
supported.
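The address rules above are purely syntactic, and can be approximated with regular expressions as in the following Python sketch. The names is_ipv4_value and is_ipv6_value are assumptions of the example. Like the ABNF, the patterns do not check that an IPv4 field is at most 255 or that an IPv6 address has a sensible number of groups; a real implementation would probably add such checks.

```python
import re

HEX4 = r"[0-9A-Fa-f]{1,4}"                 # hex4 = 1*4HEXDIG
HEXSEQ = r"%s(?::%s)*" % (HEX4, HEX4)      # hexseq = hex4 *( ":" hex4 )

IPV4_VALUE = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}\Z")
IPV6_VALUE = re.compile(
    r"(?:%s(?:::(?:%s)?)?|::(?:%s)?)\Z" % (HEXSEQ, HEXSEQ, HEXSEQ))


def is_ipv4_value(text):
    """Return True if 'text' matches the ipv4-value rule."""
    return IPV4_VALUE.match(text) is not None


def is_ipv6_value(text):
    """Return True if 'text' matches the ipv6-value rule."""
    return IPV6_VALUE.match(text) is not None


print(is_ipv4_value("192.0.2.1"), is_ipv6_value("2001:DB8::1"))
```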
Date and time parameters have fixed width to aid parsing. As such
the various fields have leading zeros if required. (They adopt one
of the ISO-8601 formats.)
Dates are according to the Gregorian calendar. Other calendar types
may be constructed from other types if required.
Unless the time can be guaranteed to have only local significance,
the time MUST be converted to UTC prior to including it in a message.
The time uses 24-hour clock notation. The absence of the
'time-seconds' field is interpreted as meaning seconds = 0.
date-value = date-year "-" date-month "-" date-day-of-month
date-year = 4DIGIT ; e.g. 2002
date-month = 2DIGIT ; With leading zeros, 01 to 12
date-day-of-month = 2DIGIT ; With leading zeros, 01 to 31
time-value = time-hours ":" time-minutes [ ":" time-seconds ]
time-hours = 2DIGIT ; With leading zeros, e.g. 00 to 23
time-minutes = 2DIGIT ; With leading zeros, e.g. 00 to 59
time-seconds = 2DIGIT ; With leading zeros, e.g. 00 to 59
unquoted-ascii-value = first-safe-char *( safe-char )
; See the section 'Notes on Comments' below
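The fixed-width date and time rules above could be decoded along the following lines. This Python sketch is illustrative only; the names decode_date and decode_time are assumptions of the example, and range checking is delegated to Python's datetime module.

```python
import re
from datetime import date, time

DATE_VALUE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})\Z")
TIME_VALUE = re.compile(r"([0-9]{2}):([0-9]{2})(?::([0-9]{2}))?\Z")


def decode_date(text):
    """Convert a date-value such as '2002-02-28' into a date object."""
    match = DATE_VALUE.match(text)
    if match is None:
        raise ValueError("not a date-value: %r" % text)
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)


def decode_time(text):
    """Convert a time-value into a time object.

    An absent time-seconds field is interpreted as seconds = 0.
    """
    match = TIME_VALUE.match(text)
    if match is None:
        raise ValueError("not a time-value: %r" % text)
    hours, minutes, seconds = match.groups()
    return time(int(hours), int(minutes), int(seconds or "0"))


print(decode_date("2002-02-28"), decode_time("12:00"))
```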
The string types have the format:
ascii-value =
"'" *( %x00-26 / %x28-5B / %x5D-7F / "\\" / "\'" ) "'"
unicode-value = DQUOTE
*( %x00-21 / %x23-5B / %x5D-FF / "\\" / "\" DQUOTE )
DQUOTE
; DQUOTE defined in [ABNF]
For 'unicode-value', each Unicode character is represented on the
wire using the UTF-8 transform [UTF8].
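A minimal Python sketch of producing the two quoted string forms is given below. The function names encode_ascii and encode_unicode are assumptions of the example. In both cases the backslash is escaped first so that the escapes added for the quote character are not themselves doubled.

```python
def encode_ascii(text):
    """Encode 'text' as an ascii-value: '...' with \\ and ' escaped."""
    if any(ord(ch) > 0x7F for ch in text):
        raise ValueError("ascii-value cannot carry non-ASCII characters")
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def encode_unicode(text):
    """Encode 'text' as a unicode-value and return its UTF-8 bytes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return ('"' + escaped + '"').encode("utf-8")


print(encode_ascii("Lumas"))
print(encode_unicode("Lumas"))
```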
The 'bytes-value' encodes binary data using the Base64 transform
[BASE64], and is defined as:
bytes-value = "[" OWSNC base64-line *( WSNC base64-line ) OWSNC "]"
base64-line = 0*18( 4BASE64-CHAR )
(
( 4BASE64-CHAR ) /
( 3BASE64-CHAR "=" ) /
( 2BASE64-CHAR "=" "=" )
)
BASE64-CHAR = ALPHA / DIGIT / "+" / "/"
The white space between base64-lines should include characters to
move to a new line as specified in [BASE64].
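The bytes-value rule limits each base64-line to 19 groups of four characters, that is, 76 characters. The following Python sketch shows one way to produce and consume such a value. The names encode_bytes and decode_bytes, and the choice of CR LF as the white space between lines, are assumptions of the example.

```python
import base64


def encode_bytes(data):
    """Encode binary data as a bytes-value with lines of at most 76 chars."""
    if not data:
        # The grammar requires at least one Base64 group per base64-line.
        raise ValueError("bytes-value cannot represent empty data")
    encoded = base64.b64encode(data).decode("ascii")
    width = 4 * 19
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "[ " + "\r\n".join(lines) + " ]"


def decode_bytes(text):
    """Decode a bytes-value, ignoring white space between base64-lines."""
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("not a bytes-value: %r" % text)
    return base64.b64decode("".join(inner[1:-1].split()), validate=True)


print(encode_bytes(b"Lumas"))  # [ THVtYXM= ]
```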
const-value = first-safe-char *( safe-char )
; See the section 'Notes on Comments' below
embedded-value = "(" *(%x00-FF) ")"
Any occurrence of '(' within an embedded message that is not part of
a string must be matched by a corresponding ')'.
Illustrating the recursiveness of the message format, we have:
compound-value = struct-value / union-value / combined-value
struct-value = "{" struct-body "}"
union-value = union-body
combined-value = first-safe-char *( safe-char )
EQUAL = OWS "=" OWS
7.3 Marking Message Boundaries
Before a message is parsed it is necessary to know the boundary of
the message. There are many ways in which this can be done, and the
method adopted should be specified in the protocol specification.
However, in the absence of any other way, Lumas parsers should take
the presence of an unmatched closing brace to be the end of message
marker. Hence, the definition of a message delimited in this way
becomes:
delimited-lumas-text-message = lumas-text-message ( "}" / ")" )
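Locating the unmatched closing brace can be sketched as a simple scan that tracks nesting depth and skips quoted strings. The Python below is a simplified illustration under stated assumptions: the function name find_message_end is hypothetical, comments are not handled, and unquoted values are assumed not to contain quote or opening bracket characters.

```python
def find_message_end(data):
    """Return the index of the first unmatched '}' or ')', or -1 if none."""
    depth = 0
    quote = None  # quote character of the string currently being skipped
    i = 0
    while i < len(data):
        ch = data[i]
        if quote is not None:
            if ch == "\\":
                i += 1            # skip the character that is escaped
            elif ch == quote:
                quote = None      # end of the quoted string
        elif ch in "'\"":
            quote = ch
        elif ch in "{(":
            depth += 1
        elif ch in "})":
            if depth == 0:
                return i          # the end of message marker
            depth -= 1
        i += 1
    return -1


stream = "a = { b = 1 } c = '}' } next-message"
print(stream[:find_message_end(stream)])
```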
7.4 Examples of Encoded Types
This section illustrates how the types look once they have been
encoded according to the syntax above. The tag of each item has the
format 'my-XXXX'. Except in the case of the 'void' example, the XXXX
part indicates the type that is encoded to the right of the equals
sign.
my-void // Tag only for a void parameter
my-bool = True
my-int = 5643
my-float = 102.4519
my-ipv4 = 192.0.2.1
my-ipv6 = 2001:DB8::1
my-date = 2002-02-28
my-time = 12:00:00
my-oid = 1~2~840~113549~2~5
my-ascii = 'Lumas'
my-unquoted-ascii = Lumas
my-unicode = "Lumas"
my-const = Lumas
my-bytes = [ 01AF3C== ]
my-embedded = ( my-other-int=5 single-closing-bracket-text=')' )
my-struct = { 5434 All time=98787654654 }
my-union = 5434
my-union = Switch
my-union = Volume = 11
8. Common ABNF Definitions
The following definitions are common to both the message definition
syntax and the on-the-wire representation.
tag = first-tag-safe-char 0*62( safe-char )
; Tag MUST NOT exceed 63 characters in length
first-tag-safe-char = %x21 /
; Not "
%x23-26 /
; Not ' ( )
%x2A-2B /
; Not , -
%x2E-2F /
; Not 0 1 2 3 4 5 6 7 8 9
%x3A-3C /
; Not =
%x3E-5A /
; Not [
%x5C-7A /
; Not {
%x7C /
; Not }
%x7E-7F
; Visible characters except = , " ' { } ( ) [ -
; and digits (tags must not get confused with integers)
first-safe-char = first-tag-safe-char / DIGIT / "-"
safe-char = first-safe-char / DQUOTE / "'" / "{" / "(" / "["
; Not = } ) ,
WS = 1*( comment / SP / HTAB / CR / LF )
; HTAB, CR, LF defined in [ABNF]
OWS = [ WS ] ; Optional white space
WSNC = 1*( SP / HTAB / CR / LF ) ; Whitespace - no comment
OWSNC = [ WSNC ] ; Optional white space - no comment
COMMA = OWS "," OWS
; See section 'Notes on Comments' below for more on comments
comment = c-comment / cpp-comment / narrative-comment
c-comment = "/*" <any except */> (nested-end / hard-end )
nested-end = "*/"
hard-end = "**/"
cpp-comment = "//" *( HTAB / %x20-7F ) ( CR / LF )
narrative-comment = "/**" <any except "lumas*/"> "lumas*/"
; A comment is treated as a single space during parsing
ALPHA, DIGIT, HEXDIG and DQUOTE are defined in [ABNF].
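The tag rule can be checked with a few set lookups, as in the following Python sketch. It follows the exclusions listed in the rule's comments, so '(' and ')' are treated as not permitted for the first character. The name is_valid_tag and the two set names are assumptions of the example.

```python
# Characters excluded from first-tag-safe-char: = , " ' { } ( ) [ - and digits
FIRST_TAG_EXCLUDED = set("=,\"'{}()[-0123456789")

# Characters excluded from safe-char: = } ) ,
SAFE_EXCLUDED = set("=}),")


def is_valid_tag(tag):
    """Return True if 'tag' matches the tag rule (1 to 63 characters)."""
    if not 1 <= len(tag) <= 63:
        return False
    # All permitted characters lie in the range %x21 to %x7F.
    if any(not 0x21 <= ord(ch) <= 0x7F for ch in tag):
        return False
    if tag[0] in FIRST_TAG_EXCLUDED:
        return False
    return not any(ch in SAFE_EXCLUDED for ch in tag[1:])


print(is_valid_tag("my-int"), is_valid_tag("5th"))
```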
9. Notes on Comments
To aid development, Lumas allows comments to appear in both a message
definition and on the wire.
On the wire, const and unquoted-ascii values MUST NOT begin with
comment start markers ('//' and '/*'). However, if the values
contain comment start marker characters, the characters MUST be
interpreted as part of the value, and do not indicate the start of a
comment.
For example, in the first of the examples below, the text
"This-is-a-comment" MUST be treated as a comment, whereas in the
second example the text "this-is-part-of-the-value" MUST be treated
as part of the value.
ascii-value = /*This-is-a-comment*/This-is-the-value
ascii-value = and-//this-is-part-of-the-value
In a message definition (but not on the wire) the ABNF c-comment
production allows nesting of comments. In a nested comment, each
occurrence of the '/*' character sequence MUST be matched by a
corresponding occurrence of the '*/' character sequence before the
comment ends. Alternatively, the end of the comment can be forced by
the hard end of comment marker, defined as '**/', which overrides the
nesting. (This provision allows the commenting out of headers and
footers in text only message definition documents.)
To further support Lumas embedded in specification documents, Lumas
supports a 'narrative-comment'. This is a comment whose content,
such as C example code, may coincidentally contain Lumas end of
comment markers. The narrative comment begins with the symbol '/**',
and ends with the symbol 'lumas*/'.
A comment is treated as a single space for the purposes of parsing.
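The nesting and hard end rules for a c-comment in a message definition might be implemented along the following lines. This Python sketch is one interpretation only: the function name c_comment_end is hypothetical, and the caller is assumed to have already determined that the text at 'start' is a c-comment rather than a narrative-comment.

```python
def c_comment_end(text, start):
    """Return the index just past the c-comment that begins at 'start'.

    Returns -1 if the comment is not terminated.
    """
    if not text.startswith("/*", start):
        raise ValueError("no c-comment begins at index %d" % start)
    depth = 1
    i = start + 2
    while i < len(text):
        if text.startswith("**/", i):
            return i + 3          # hard end: overrides any nesting
        if text.startswith("/*", i):
            depth += 1            # a nested comment opens
            i += 2
        elif text.startswith("*/", i):
            depth -= 1            # a nested (or the outermost) comment closes
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    return -1


nested = "/* a /* b */ c */ rest"
print(nested[c_comment_end(nested, 0):])
```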
10. Locating Lumas Modules
It is not intended that applications should find Lumas modules
'on-the-fly'. It is expected that some human involvement will be
required to locate and interpret a Lumas definition. A Lumas
definition does not therefore have any way of specifying the physical
location from where a referenced definition can be acquired.
Instead, the strategy is to exploit the fact that a module definition
can begin with the text "lumas module" followed by the module name.
By entering this text (e.g. "lumas module org.lumas.mine") into a web
search engine (either one that covers the whole Internet, or is
limited to a specific site) a user can locate a particular Lumas
module. Determining whether a Lumas module so located is authentic
is beyond the scope of this document.
11. Mandatory to Understand
Many protocols require the capability to signal that certain
extension parameters are mandatory to understand, and if they are not
understood the message should be rejected in some way. Lumas
provides no built-in mechanism for this feature. Instead,
implementers are advised to use a feature similar to SIP's
'Require' header [SIP] which presents a list of feature identifiers
that must be understood. Naturally, provision for this mechanism
must be included in the first version of the protocol, as it is not
possible to define such semantics at a later time. An example of
such a construct might be:
union require [*] pluggable { };
And could be populated using:
plug
void my-feature;
into require;
12. Security Considerations
Lumas itself does not have any security issues related to it, but the
security requirements of a protocol must be borne in mind when
writing a Lumas message definition. Common advice is that it is
difficult to add security to a protocol once it has been released,
and hence security issues must be considered from the outset. This
matters for a Lumas message definition because it may affect the
format of messages. This is particularly the case for integrity check
values that are effectively appended to the end of the message once
it is encoded. This may mean that it is appropriate to define both a
main message definition and a message definition that is a wrapper
that can provide cryptographic services for the main message
definition. For example, a message definition wrapper might look
like:
struct my-protocol-wrapper
{
embedded main-definition as ?;
bytes<1..64> signature as signed;
oid signature-algorithm as sig-alg;
};
13. Normative References
[ABNF] D. Crocker, & P. Overell, "Augmented BNF for Syntax
Specifications: ABNF," Internet Engineering Task Force, RFC
4234, October 2005.
[BASE64] N. Freed, & N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Bodies,"
Internet Engineering Task Force, RFC 2045, November 1996.
[DOMAINS] J. Postel, "Domain Name System Structure and Delegation,"
Internet Engineering Task Force, RFC 1591, March 1994.
[IEEE754] "IEEE Standard for Binary Floating-Point Arithmetic," IEEE
754-1985, IEEE, 1985.
[KWORDS] S. Bradner, "Key words for use in RFCs to Indicate
Requirement Levels," RFC 2119, March 1997.
[PERL] L. Wall, T. Christiansen, & J. Orwant, "Programming Perl",
O'Reilly, ISBN 0-596-00027-8.
[UTF8] F. Yergeau, "UTF-8, a transformation format of ISO 10646," RFC
2279, January 1998.
14. Informative References
[ASN1] International Organization for Standardization, "Information
Processing Systems - Open Systems Interconnection -
Specification of Abstract Syntax Notation One (ASN.1)", ISO
Standard 8824, December 1990.
[CMS] R. Housley, "Cryptographic Message Syntax," RFC 2630, June
1999.
[DIAMETER] Pat R. Calhoun, John Loughney, Erik Guttman, Glen Zorn,
Jari Arkko, "Diameter Base Protocol,"
draft-ietf-aaa-diameter-xx, Work in Progress.
[IP] "Internet Protocol," RFC 791, September 1981.
[JSON] "Introducing JSON," http://www.json.org/.
[OMGIDL] "Common Object Request Broker Architecture: Core
Specification," Object Management Group, December 2002.
(Accessible via:
http://www.omg.org/technology/documents/corba_spec_catalog.htm)
[RELAX] OASIS Technical Committee: RELAX NG, "RELAX NG Specification",
December 2001,
<http://www.oasis-open.org/committees/relax-ng/spec-20011203.html>.
[SCHEMA] Thompson, H., Beech, D., Maloney, M. and N. Mendelsohn, "XML
Schema Part 1: Structures", W3C REC-xmlschema-1, May 2001,
<http://www.w3.org/TR/xmlschema-1/>, and Biron, P. and A.
Malhotra, "XML Schema Part 2: Datatypes", W3C REC-xmlschema-2,
May 2001, <http://www.w3.org/TR/xmlschema-2/>.
[SIP] J. Rosenberg et al., "SIP: Session Initiation Protocol,"
Internet Engineering Task Force, RFC 3261, June 2002.
[SMTP] Klensin, J. (Ed.), "Simple Mail Transfer Protocol", RFC 2821,
April 2001.
[SNMP] J. Case, M. Fedor, M. Schoffstall, J. Davin, "A Simple Network
Management Protocol (SNMP)," RFC 1157, May 1990.
[STRON] Jelliffe, R., "The Schematron", November 2001,
<http://www.ascc.net/xml/schematron/>.
[TCP] "Transmission Control Protocol," RFC 793, September 1981.
[TLS] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC
2246, January 1999.
[UDP] "User Datagram Protocol," RFC 768, August 1980.
[XDR] R. Srinivasan, "XDR: External Data Representation Standard,"
RFC 1832, August 1995.
[XML] "Extensible Markup Language (XML) 1.0 (Second Edition)", W3C
REC-xml, October 2000.
[XMLBCP] S. Hollenbeck, M. Rose, and L. Masinter, "Guidelines for the
Use of Extensible Markup Language (XML) within IETF Protocols,"
RFC 3470, January 2003.
[XMLVER] David Orchard, "Versioning XML Vocabularies," XML.com,
December 03, 2003,
http://www.xml.com/pub/a/2003/12/03/versioning.html
15. Author's Address
Pete Cordell
Tech-Know-Ware Ltd
P.O. Box 30
Ipswich
IP5 2WY
UK
pete@tech-know-ware.com
http://www.tech-know-ware.com
Full Copyright Statement
Copyright (C) The IETF Trust (2007).
This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.
This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Intellectual Property
The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights. Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.
Acknowledgment
Funding for the RFC Editor function is provided by the IETF
Administrative Support Activity (IASA).