RFC: 9226
Title: Bioctal: Hexadecimal 2.0
Date: 1 April 2022
Status: Experimental
Independent Submission M. Breen
Request for Comments: 9226 mbreen.com
Category: Experimental 1 April 2022
ISSN: 2070-1721
Bioctal: Hexadecimal 2.0
Abstract
The prevailing hexadecimal system was chosen for congruence with
groups of four binary digits, but its design exhibits an indifference
to cognitive factors. An alternative is introduced that is designed
to reduce brain cycles in cases where a hexadecimal number should be
readily convertible to binary by a human being.
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for examination, experimental implementation, and
evaluation.
This document defines an Experimental Protocol for the Internet
community. This is a contribution to the RFC Series, independently
of any other RFC stream. The RFC Editor has chosen to publish this
document at its discretion and makes no statement about its value for
implementation or deployment. Documents approved for publication by
the RFC Editor are not candidates for any level of Internet Standard;
see Section 2 of RFC 7841.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9226.
Copyright Notice
Copyright (c) 2022 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
Table of Contents
1. Introduction
1.1. The Pernicious Advance of Hexadecimal
1.2. Problems with Hexadecimal
1.3. Other Proposals
2. Bioctal
3. Objections to Be Dismissed
4. Security Considerations
5. IANA Considerations
6. Conclusion
7. Informative References
Acknowledgments
Author's Address
1. Introduction
1.1. The Pernicious Advance of Hexadecimal
Octal has long been used to represent groups of three binary digits
as single characters, and that system has the considerable merit of
not requiring any digits other than those already familiar from
decimal numbers. Unfortunately, the increasing use of 16-bit
machines and other machines that have word lengths that are evenly
divisible by four (but not by three) has led to the widespread
adoption of hexadecimal. Table 1 presents the digits of the
hexadecimal alphabet.
+=======+=======+
| Value | Digit |
+=======+=======+
| 0 | 0 |
+-------+-------+
| 1 | 1 |
+-------+-------+
| 2 | 2 |
+-------+-------+
| 3 | 3 |
+-------+-------+
| 4 | 4 |
+-------+-------+
| 5 | 5 |
+-------+-------+
| 6 | 6 |
+-------+-------+
| 7 | 7 |
+-------+-------+
| 8 | 8 |
+-------+-------+
| 9 | 9 |
+-------+-------+
| 10 | A |
+-------+-------+
| 11 | B |
+-------+-------+
| 12 | C |
+-------+-------+
| 13 | D |
+-------+-------+
| 14 | E |
+-------+-------+
| 15 | F |
+-------+-------+
Table 1: The Hexadecimal Alphabet
The choice of alphabet is clearly arbitrary: On the exhaustion of the
decimal digits, the first letters of the Latin alphabet are used in
sequence for the remaining hexadecimal digits. An arbitrary alphabet
may be acceptable on an interim or experimental basis. However,
given the diminishing likelihood of a return to 18-bit computing, a
review of this choice of alphabet is merited before its use, like
that of the QWERTY keyboard, becomes too deeply established to permit
the easy adoption of a more logical alternative.
1.2. Problems with Hexadecimal
One problem with the hexadecimal alphabet is well known: It contains
two vowels, and numbers expressed in hexadecimal have been found to
collide with words offensive to vegetarians and other groups.
Imposing a greater constraint on the solution space, however, is the
difficulty of mentally converting a number expressed in hexadecimal
to (or from) binary. Consider the hexadecimal digit 'D', for
example. First, one must remember that 'D' represents a value of 13
-- and, while it may be easy to recall that 'F' is 15 with all bits
set, for digits in the middle of the non-decimal range, such as 'C'
and 'D', one may resort to counting ("A is ten, B is eleven, ...").
Next, one must subtract eight from that number to arrive at a number
that is in the octal range. Thus, the benefit of representing one
additional bit incurs the cost of two additional mental operations
before one arrives at the position where the task that remains
reduces to the difficulty of converting an octal digit to binary.
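The steps just described can be sketched in Python. This is only an
illustration of the mental procedure, not part of any specification;
the function name hex_digit_to_bits is hypothetical.

```python
def hex_digit_to_bits(digit: str) -> str:
    """Convert one hexadecimal digit to four bits, mimicking the mental steps."""
    letters = "ABCDEF"
    d = digit.upper()
    if len(d) != 1 or d not in "0123456789" + letters:
        raise ValueError(f"not a hexadecimal digit: {digit!r}")
    if d in letters:
        # Step 1: recall (or count up to) the value: "A is ten, B is eleven, ..."
        value = 10 + letters.index(d)
    else:
        value = int(d)
    if value >= 8:
        # Step 2: subtract eight to land in the octal range; the high bit is set.
        high_bit, octal_digit = "1", value - 8
    else:
        high_bit, octal_digit = "0", value
    # Remaining task: convert the octal digit to three bits.
    return high_bit + format(octal_digit, "03b")


print(hex_digit_to_bits("D"))  # 1101
```

The two commented steps are the extra operations that a letter digit
costs before the familiar octal conversion can begin.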
These mental steps are not difficult per se, since a child could do
them, but if it is possible to avoid employing children, then it
should be avoided. An appeal to the authority of cognitive
psychology is perhaps also due here, in particular to the "seven plus
or minus two" principle [Miller] -- either because octal is within
the upper end of that range (nine) and hexadecimal is not, or else
because the difference in the size of the alphabets is greater than
the lower end of that range (five). Either way, it is almost
certainly relevant.
1.3. Other Proposals
Various alternatives have already been suggested. Some of these are
equally arbitrary, e.g., in selecting the last six letters of the
Latin alphabet rather than the first six letters.
The scheme that comes closest to solving the main problem to date is
described by Bruce A. Martin [Martin], who proposes new characters for
the entire hexadecimal alphabet. While his principal motivation is to
distinguish hexadecimal numbers from decimals, the design of each
character uses horizontal lines to directly represent the "ones" of
the corresponding binary number, making mental translation to binary
a trivial task.
Unfortunately for this and other proposals involving new symbols,
proposals to change the US-ASCII character set [USASCII] might no
longer be accepted. Also, it seems unrealistic to expect keyboards
or printer type elements (whether of the golf ball or daisy wheel
kind) to be replaced to accommodate new character designs.
2. Bioctal
Table 2 presents the hexadecimal alphabet once again, this time in a
sequence of two octaves with values increasing left to right and top
to bottom.
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+
| 8 | 9 | A | B | C | D | E | F |
+---+---+---+---+---+---+---+---+
Table 2: The Hexadecimal Alphabet in Sequential Octaves
Arranged thus, the binary representation of each digit in the second
octave is the same as the digit above it, but with the most
significant of the four bits set to '1' instead of '0'.
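The relationship between the two rows of Table 2 can be checked with a
short Python sketch. The names HEX and octave_pairs are hypothetical
and chosen only for this illustration.

```python
HEX = "0123456789ABCDEF"  # Table 2, read left to right, top to bottom


def octave_pairs():
    """Pair each first-octave digit with the digit below it, with 4-bit forms."""
    pairs = []
    for top, bottom in zip(HEX[:8], HEX[8:]):
        top_bits = format(int(top, 16), "04b")
        bottom_bits = format(int(bottom, 16), "04b")
        pairs.append((top, top_bits, bottom, bottom_bits))
    return pairs


for top, top_bits, bottom, bottom_bits in octave_pairs():
    # Each line shows the same low three bits; only the leading bit differs.
    print(top, top_bits, bottom, bottom_bits)
```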
The incongruity of two decimal digits in the second octave also
suggests that, in blindly aligning with four bits, hexadecimal (six
plus ten, neither of which is a power of two) misses an opportunity
to align also with three bits.
Bioctal restores congruence by replacing the second row with
characters mnemonically related to the corresponding character in the
first octave.
Table 3 shows the compelling result.
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+
| c | j | z | w | f | s | b | v |
+---+---+---+---+---+---+---+---+
Table 3: Bioctal in Sequential Octaves
The mnemonic basis is the shape of the lowercase character. This is
seen directly for '2', '5', and '6'. For '3', '4', and '7', the
corresponding letters are the result of a quarter-turn clockwise
(assuming an "open" '4'). The choice of 'c' and 'j' for '0' and '1'
avoids vowels and lowercase 'L', the latter being confusable with '1'
in some fonts.
With this choice of letters, it is immediately evident that both
problems with hexadecimal are solved. Mental conversion is now
straightforward: if the digit is a letter, then the most significant
of the four bits is '1', and the remaining three bits are the
same as for the Arabic numeral with the same shape in the first
octave.
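The conversion rule can be sketched in Python as follows. This is a
minimal illustration assuming the alphabet of Table 3; the names
SHAPE, bioctal_digit_to_bits, to_bioctal, and from_bioctal are
hypothetical and not defined by this document.

```python
# Letter -> numeral of the same shape (second row of Table 3 over the first).
SHAPE = dict(zip("cjzwfsbv", "01234567"))


def bioctal_digit_to_bits(digit: str) -> str:
    """Apply the rule: a letter sets the top bit; the shape gives the rest."""
    if digit in SHAPE:
        return "1" + format(int(SHAPE[digit]), "03b")
    if len(digit) == 1 and digit in "01234567":
        return "0" + format(int(digit), "03b")
    raise ValueError(f"not a bioctal digit: {digit!r}")


def from_bioctal(text: str) -> int:
    """Interpret a string of bioctal digits as a non-negative integer."""
    return int("".join(bioctal_digit_to_bits(ch) for ch in text), 2)


def to_bioctal(number: int) -> str:
    """Write a non-negative integer using the bioctal alphabet."""
    if number < 0:
        raise ValueError("negative numbers are not handled in this sketch")
    digits = "01234567cjzwfsbv"
    out = ""
    while True:
        number, remainder = divmod(number, 16)
        out = digits[remainder] + out
        if number == 0:
            return out


print(bioctal_digit_to_bits("s"))  # 1101
print(to_bioctal(255))             # vv
```

Note that the digit-to-bits function never needs the numeric value of
a letter: it looks only at whether the digit is a letter and at the
numeral it resembles.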
3. Objections to Be Dismissed
Several objections can be anticipated, the first of which concerns
the name. The term "bioctal" is already used to refer to the
combination of two octal characters into a single field on, for
example, paper tape (e.g., [UNIVAC]). However, if the word "bioctal"
must be disadvantaged relative to words such as "biannual" in the
number of meanings it is allowed to have, then it is the paper tapers
who must give way: in that context, the "octal" part of "bioctal"
refers to the number of distinct values that three bits can have,
while the "bi" refers to a doubling of the number of bits, not
values. A meaning depending on such a discordant etymology does not
deserve to endure.
Second, it may be argued that the use of hexadecimal has already
become too entrenched to be changed in the short term: Bioctal should
be introduced only after those working in the industry who have grown
accustomed to hexadecimal have retired. Such a dilatory contention
cannot be allowed to impede the march of progress. Instead, any data
entry technician who claims to have difficulty with bioctal may be
reassigned to duties involving only binary numbers.
A third possible objection is that numbers in bioctal do not sort
numerically. However, this assumes a sort based on the US-ASCII
order of symbols; it is quite possible that bioctal numbers sort
naturally in some lesser known variety of EBCDIC. Further,
resistance to numeric sorting may be an indicator of virtue, being
suggestive of an alphabet with a certain strength of character.
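The sorting behavior under US-ASCII ordering can be observed with a
small Python sketch. The function name values_in_ascii_order is
hypothetical; no claim is made here about any variety of EBCDIC.

```python
HEX = "0123456789abcdef"
BIOCTAL = "01234567cjzwfsbv"


def values_in_ascii_order(alphabet: str) -> list:
    """Sort the digits by character code and report their numeric values."""
    # For these characters, Python's default string ordering matches US-ASCII.
    return [alphabet.index(ch) for ch in sorted(alphabet)]


print(values_in_ascii_order(HEX))
print(values_in_ascii_order(BIOCTAL))
```

For hexadecimal the values come out in ascending order; for bioctal
the second octave comes out as 14, 8, 12, 9, 13, 15, 11, 10.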
One difficulty remains: Not all computers support lowercase letters.
While this is indeed true, it should be confirmed in any particular
instance: the author has observed that in many cases a machine having
a keyboard with buttons marked only with uppercase letters also
supports lowercase letters. In any case, it is permissible to use
uppercase letters instead of the lowercase ones of Table 3; the
morphology mnemonic continues to work for most bioctal digits in
uppercase, although an extra mental cycle is required for 'B'.
4. Security Considerations
The letters 'b', 'c', and 'f' appear in both the bioctal and
hexadecimal alphabets, which makes potential misinterpretation a
concern. A case
of particular hazard arises where two embedded systems engineers work
to develop a miniature lizard detector designed to be worn like a
wristwatch. One engineer works on the lizard proximity sensor and
the other on a minimal two-character display. The interface between
the circuits is 14 bits. To make things easier, the engineer working
on the display arranges for these bits to be set in a pattern that
allows them to be used directly as two seven-bit US-ASCII characters
indicating the most significant lacertilian species detected in the
vicinity of the device. Due to the use of an old US-ASCII table
(i.e., one in hex, not bioctal) and human error, some of the values
specified as outputs for the detection subsystem are in hexadecimal,
not the bioctal the engineer developing that subsystem expects --
including, in the case of one type of lizard, "4b 4f". The result is
that the detector displays "NL" (No Lizards) when it should display
"KO" (Komodo dragon). This may be considered prejudicial to the
security of the user of the device.
Extensive research has uncovered no other security-related scenarios
to date.
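The lizard detector scenario can be reproduced with the following
Python sketch, which reads the same two-character values under each
alphabet. The function name decode_pairs is hypothetical, and the
sketch models only the interpretation of the digits, not the hardware.

```python
HEX = "0123456789abcdef"
BIOCTAL = "01234567cjzwfsbv"


def decode_pairs(text: str, alphabet: str) -> str:
    """Read space-separated base-16 values in the given alphabet as US-ASCII."""
    chars = []
    for token in text.split():
        value = 0
        for digit in token:
            # str.index raises ValueError for a digit outside the alphabet.
            value = value * 16 + alphabet.index(digit)
        chars.append(chr(value))
    return "".join(chars)


print(decode_pairs("4b 4f", HEX))      # KO
print(decode_pairs("4b 4f", BIOCTAL))  # NL
```

In hexadecimal, "4b 4f" is 0x4B 0x4F ("KO"); read as bioctal, 'b' is
14 and 'f' is 12, giving 0x4E 0x4C ("NL").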
5. IANA Considerations
This document has no IANA actions.
6. Conclusion
Bioctal is a significant advance over hexadecimal technology and
promises to reduce the small (but assuredly non-zero) contribution to
anthropogenic global warming of mental hex-to-binary conversions.
Since the mnemonic basis of the alphabet is independent of English or
any other particular natural language, there is no reason that it
should not be adopted immediately around the world, excepting perhaps
certain islands of Indonesia to which _Varanus komodoensis_ is
native.
7. Informative References
[Martin] Martin, B. A., "Letters to the editor: On binary
notation", Communications of the ACM, Vol. 11, No. 10,
DOI 10.1145/364096.364107, October 1968,
<https://doi.org/10.1145/364096.364107>.
[Miller] Miller, G. A., "The Magical Number Seven, Plus or Minus
Two: Some Limits on Our Capacity for Processing
Information", Psychological Review, Vol. 63, No. 2, 1956.
[UNIVAC] Sperry Rand Corporation, "Programmers Reference Manual for
UNIVAC 1218 Computer", Revision C, Update 2, November
1969, <http://bitsavers.computerhistory.org/pdf/univac/
military/1218/PX2910_Univac1218PgmrRef_Nov69.pdf>.
[USASCII] American National Standards Institute, "Coded Character
Set -- 7-bit American Standard Code for Information
Interchange", ANSI X3.4, 1986.
Acknowledgments
The author is indebted to R. Goldberg for assistance with Section 4.
Author's Address
Michael Breen
mbreen.com
Email: rfc@mbreen.com