Internet DRAFT - draft-hallambaker-protogen
draft-hallambaker-protogen
Internet Engineering Task Force (IETF) Phillip Hallam-Baker
INTERNET-DRAFT Comodo Group Inc.
Intended Status: July 6, 2015
Expires: January 7, 2016
Protocol Specification Tool
draft-hallambaker-protogen-01
Abstract
The syntax for the PROTOGEN protocol specification tool is described
and the use of the tool to generate protocol specifications,
prototype and production implementations. While the primary focus of
PROTOGEN is to develop protocols using JSON message syntax, the
PROTOGEN framework has been successfully applied to generate
prototypes using ASN.1, TLS, XML and RFC822 style syntax.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Hallam-Baker January 7, 2016 [Page 1]
Internet-Draft Protocol Specification Tool July 2015
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Previous work . . . . . . . . . . . . . . . . . . . . . . 3
1.2. Schema driven documentation . . . . . . . . . . . . . . . 3
1.3. Schema driven code generation . . . . . . . . . . . . . . 4
1.4. Application Examples . . . . . . . . . . . . . . . . . . 4
2. Protocol Specification . . . . . . . . . . . . . . . . . . . . 6
2.1. Protocol . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2. Description . . . . . . . . . . . . . . . . . . . . . . . 7
2.3. Service . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4. Transaction . . . . . . . . . . . . . . . . . . . . . . . 7
2.5. Message . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.6. Structure . . . . . . . . . . . . . . . . . . . . . . . . 8
2.7. Status . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.8. Using . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3. Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.1. Abstract . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2. Inherits . . . . . . . . . . . . . . . . . . . . . . . . 10
3.3. Null Values . . . . . . . . . . . . . . . . . . . . . . . 10
3.4. Lists . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.5. Decimal . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.6. DateTime . . . . . . . . . . . . . . . . . . . . . . . . 11
3.7. Binary . . . . . . . . . . . . . . . . . . . . . . . . . 11
4. Further Work . . . . . . . . . . . . . . . . . . . . . . . . . 11
5. Acnowledgements . . . . . . . . . . . . . . . . . . . . . . . 11
6. References . . . . . . . . . . . . . . . . . . . . . . . . . . 11
6.1. Normative References . . . . . . . . . . . . . . . . . . 11
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 11
Hallam-Baker January 7, 2016 [Page 2]
Internet-Draft Protocol Specification Tool July 2015
1. Introduction
The use of schemas to describe communication protocols is well
established and plays a central role in the development of ASN.1 and
XML based protocols. No such tools are currently widely used for
writing JSON based protocols.
It is the view of the author that the first, last and only purpose of
a protocol schema language is to enable the use of tools to support
the development effort. A schema language that delays rather than
advances the development of correct code and consistent documentation
has become a liabilty, not an enabler.
1.1. Previous work
One of the main reasons for the lack of such a tool has been the
widespread concern as to the complexity of traditional schema tools
and in particular the tendency of such tools to impose a complex data
model on simple problems.
One major difference in the design of the Protogen schema language to
its predecessors is that it does not attempt to support every feature
of the JSON data model. Protogen is designed to allow programmers to
design and implement network service protocols quickly using widely
used programming languages such as C, C# and Java. JSON features that
do not map conveniently to the majority of widely used languages are
best ignored.
The XML Schema language is particularly obtuse presenting a two level
type system in which element definitions provide typing for data and
element types provide a type system for elements. At least three
different inheritance mechanisms are supported.
The ASN.1 schema language introduces a distinction between lists and
sets that is entirely frivolous in a serialization format and
gratuitous distinctions between implicit and explict tagging.
The lesson to be drawn from these abominations is clear: The primary
purpose of a schema language should be to allow the programmer to
forget and ignore the wireline representation of protocol messages.
Features that allow fine tuning of the wireline representation should
be avoided.
While the notion of validating input data against a schema prior to
passing data to an application is superficially attractive, schema
constraints are rarely sufficient for this purpose. Thus applied to
protocol design, schema validation rarely provides a meaningful
benefit over checking that an encoding is well formed.
1.2. Schema driven documentation
Hallam-Baker January 7, 2016 [Page 3]
Internet-Draft Protocol Specification Tool July 2015
1.3. Schema driven code generation
1.4. Application Examples
The following is based on an example from [RFC4627].
[
{
"precision": "zip",
"Latitude": 37.7668,
"Longitude": -122.3959,
"Address": "",
"City": "SAN FRANCISCO",
"State": "CA",
"Zip": "94107",
"Country": "US"
},
{
"precision": "zip",
"Latitude": 37.371991,
"Longitude": -122.026020,
"Address": "",
"City": "SUNNYVALE",
"State": "CA",
"Zip": "94085",
"Country": "US"
}
]
The corresponding Protogen schema is:
Structure SiteList
Description
|A list of sites
Struct Site Sites
Multiple
Structure Site
Description
|A site location
String Country
Description
|ISO ALPHA-2 Country Code.
String precision
Decimal Latitude
Decimal Longitude
String Address
String City
String State
String Zip
Hallam-Baker January 7, 2016 [Page 4]
Internet-Draft Protocol Specification Tool July 2015
For the sake of example, the description of the site structure
entries is elided. While Protogen does not require description
elements to be provided to produce code, descriptions are of course
essential if useful documentation is to be generated.
Protogen is built using the Goedel code metasynthesizer which
attempts to eliminate all unnecessary clutter from the code
specification to minimize error. By default, indentation and the off-
side rule are used to denote block structure following the approach
used in occam and Python. Punctuation characters are only used to
delimit strings ("), text blocks (|) and comments (!).
Note that the Latitude and Longitude are specified using the type
Decimal rather than Float. This allows an implementation to avoid the
loss of precision that inevitably occurs converting between a binary
floating point representation such as IEEE 754 binary 64 and the
decimal encoding used in JSON.
The example fragment is sufficient to describe a data structure and
generate methods for JSON serialization and deserializtion. It is not
however sufficient to generate a useful implementation of a Web
service or client access library. to do this we must define a
protocol with services, transactions and messages defined as follows:
Protocol
A collection of related services.
Service
A set of transactions with a distinct DNS SRV prefix and HTTP
well known service label.
Transaction
A defined sequence of protocol messages supported by a service.
Currently only request-response design pattern is supported.
Message
A JSON document that corresponds to a request or response.
To build a service using the Site structure, we prepend add following
declaration:
Hallam-Baker January 7, 2016 [Page 5]
Internet-Draft Protocol Specification Tool July 2015
Protocol Sitefinder STFND
Service Finder "_siteFinder._wks" "SiteFinder" Request Response
Description
|Find sites for new donut stores.
Message Request
Struct Site WhereIAm
Message Response
Struct Site WhereAreDonuts
Multiple
We can now run Protogen to generate any of the following:
* Documentation in HTML
* Documentation in RFC2XML schema
* A C# client access library.
* A C# stub service library.
* A C header file describing corresponding C structures and data
tables to enable serialization/deserialization.
Support for partial classes makes C# a particularly attractive target
language for code generation as it allows classes produced by
generated code to be conveniently extended. Support for other modern
languages aligned with the Java/.NET data model requires only
straightforward modification of the code generator.
While the C# generator is optimized for development of protocols and
production code, the generator for C is intended for developing
production code after the protocol architecture is largely static.
The generator is intentionally biased towards flexibility rather than
functionality since a modern programer using C is most likely to be
doing so to build on a legacy code base. The ability to easily adapt
the output of the generator to the existing coding style(s) is likely
to be more highly valued than minimizing implementation effort.
2. Protocol Specification
2.1. Protocol
Top level specification of a protocol. The Protocol element contains
two attributes and a list of entries as follows:
Namespace
Namespace identifier for use in .NET and Java style programming
environments
Hallam-Baker January 7, 2016 [Page 6]
Internet-Draft Protocol Specification Tool July 2015
Prefix
Prefix for use in C style programming environments.
Entries
A list of [Service Transaction Message Structure Description
Using] elements
2.2. Description
Describes the parent element. Multiple description elements may be
specified in which case the first SHOULD be a standalone short
description. The description element has one attribute:
Text
Text field data identified by use of the | prefix.
2.3. Service
A service is a named set of transactions within a protocol namespace.
At present, due to an implementation limitation, all request and
response messages used in a service MUST inherit from a single
message type. This is bogus and should be fixed.
The service element has the following attributes:
ID
The code identifier of the service
Discovery
The DNS service prefix of the service for use in SRV, NAPTR
style discovery
WellKnown
The HTTP well known service prefix.
Request
The parent class for all request messages supported by the
service.
Response
The parent class for all response messages supported by the
service.
Entries
A list of [Description Status] entries
Hallam-Baker January 7, 2016 [Page 7]
Internet-Draft Protocol Specification Tool July 2015
2.4. Transaction
Specifies a Request-Response transaction supported by a specified
service.
At present transactions are specific to a service which is kind of
bogus if multiple services were defined.
The Transaction element has the following attributes:
Service
The identifier of the service
ID
The identifier of the transaction
Request
The request message which must not be an abstract type.
Response
The response message returned for normal completion. An
abstract type may be specified.
Entries
A list of [Description Status] entries
2.5. Message
Specifies a protocol message. This is almost the same as a structure
except that the name of a request message is a command to a server
and the name of a response message identifies a response.
Id
The message identifier
Entries
A list of [Description Abstract Inherits Boolean Integer Binary
Float Label Name String URI DateTime Struct Enum Status
Authentication Format Decimal] entries
2.6. Structure
2.7. Status
This feature is not yet implemented, the idea being that status codes
should be represented at both the HTTP layer and JSON layer so that
appropriate handling can be specified at either.
Hallam-Baker January 7, 2016 [Page 8]
Internet-Draft Protocol Specification Tool July 2015
2.8. Using
Specifies a message or structure defined in another schema.
3. Data Types
Protogen recognizes ten intrinsic data types. While this is
considerably larger than the three intrinsic types supported in JSON,
the additional expressive power allows the tools to do more work for
the programmer. For example, distinguishing strings that represent
date-time values from other strings allows the tool to perform the
work of encoding/decoding these values.
The following table sumarizes the Protogen schema types and their
(default) corresponding C#/C equivalents.
+----------+-----------------------------+-------------+------------+
| Schema | JSON | C# | C |
+----------+-----------------------------+-------------+------------+
| Boolean | true | false | bool | bool |
| | | | |
| Float | number | double | double |
| | | | |
| Decimal | number | Int64 | long long |
| | | | |
| Integer | number | Int64 | int |
| | | | |
| Binary | string (base64 encoded) | byte[] Data | BinaryType |
| | | | |
| Label | string | string | StringType |
| | | | |
| Name | string | string | StringType |
| | | | |
| String | string | string | StringType |
| | | | |
| URI | string | string | StringType |
| | | | |
| DateTime | string | DateTime | struct tm |
+----------+-----------------------------+-------------+------------+
Every data type supports the following options:
Required
The minimum number of occurrences is 1.
Multiple
Multiple values may be specified.
Description
Description of the element for use in code generation.
Hallam-Baker January 7, 2016 [Page 9]
Internet-Draft Protocol Specification Tool July 2015
Deaful
Default value for the element if unspecified.
While the Protogen schema definition does include additional options
for some data types (e.g. LengthBits, LengthFixed) these are only
used in the TLS encoding generator and are ignored when JSON encoding
is being used.
3.1. Abstract
Messages and structures may be marked Abstract which means that they
may be used as base classes for inheritance from other messages or
structures but cannot appear on the wire.
3.2. Inherits
Specifies that a message or structure inherits from another message
or structure.
Note that inheritance relationships are represented in the generated
code for languages that support inheritance (e.g. C#) and flattened
out in languages that do not (e.g. C).
3.3. Null Values
No distinction is made between a value that is not present and a
value that is present with the value null. Thus the following JSON
documents are considered to specify the same object.
{ "Value": 1 }
{ "Value": 1,
"Optional": null }
An entry that has the 'Required' option set MUST always be specified
even if the value is null.
3.4. Lists
No distinction is made between a list that is not present, a list
with the null value and an empty list. Thus the following encodings
desribe the same object:
Hallam-Baker January 7, 2016 [Page 10]
Internet-Draft Protocol Specification Tool July 2015
{ "Value": 1 }
{ "Value": 1,
"List": null }
{ "Value": 1,
"List": [] }
To simplify scripting language implementation an entry that has the
'Multiple' option MUST be encoded as a list.
3.5. Decimal
The decimal encoding provides an alternative to use of floating point
to represent decimal fractions.
Since 10 is not a power of 2, conversion between decimal and binary
fractions is inexact and using Real32 or Real64 values for this
purpose introduces an unnecessary loss of precision.
Since modern programming languages lack support for a Decimal
intrinsic type, this is implemented by mapping the datum to a 64 bit
integer with an offset of 1,000,000,000. This approach allows for
numbers up to 9,223,372 to be represented with nine digit precision.
3.6. DateTime
Date Time Values are encoded as strings in IETF format.
3.7. Binary
Binary values are encoded using BASE64URL encoding.
4. Further Work
5. Acnowledgements
6. References
6.1. Normative References
[RFC4627] Crockford, D., "The application/json Media Type for
JavaScript Object Notation (JSON)", RFC 4627, July 2006.
Author's Address
Phillip Hallam-Baker
Comodo Group Inc.
philliph@comodo.com
Hallam-Baker January 7, 2016 [Page 11]