Internet Draft                                            M. Zeppelzauer
Intended status: Experimental                                  A. Ringot
Expires: September 2019                                  St. Poelten UAS
                                                           March 6, 2019
SoniTalk: An Open Protocol for Data-Over-Sound Communication
draft-zeppelzauer-data-over-sound-00.txt
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. This document may not be modified,
and derivative works of it may not be created, except to publish it
as an RFC and to translate it into languages other than English.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on September 6, 2019.
Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
Zeppelzauer, Ringot Expires September 6, 2019 [Page 1]
Internet-Draft SoniTalk March 2019
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Abstract
This document defines a new protocol for communication via sound (and
in particular via near-ultrasound) that is simple enough to be
implemented on devices with limited computational resources, such as
Internet-of-Things (IoT) devices. The near-ultrasonic frequency band
in the range of 18-22kHz represents a novel and so far hardly used
channel for the communication of different devices, such as mobile
phones, computers, TVs, personal assistants, and potentially a wide
range of IoT devices. Moreover, data-over-sound makes it possible to
connect low-end hardware devices to the Internet through near-field
communication with other Internet-connected devices. Data-over-sound
requires only
a standard loudspeaker and a microphone for communication, and thus
has very low hardware requirements compared to other communication
standards such as Bluetooth, WLAN and NFC. "SoniTalk" is designed as
an open and transparent near-ultrasonic data transmission protocol
for data-over-sound. This document provides a specification of the
protocol at the lowest layer (physical layer) in the sense of the OSI
model.
Table of Contents
1. Introduction
2. Details
3. Security Considerations
4. IANA Considerations
5. Conclusions
6. References
   6.1. Normative References
   6.2. Informative References
7. Acknowledgments
Authors' Addresses
1. Introduction
The typical frequency band for data-over-sound starts at 18kHz. This
band can be corrupted by noise from the environment, which requires a
number of countermeasures to ensure a robust signal transmission.
In particular, the temporally varying characteristics of the channel
make messages that are transmitted over longer time spans more likely
to be corrupted. The proposed protocol tries to mitigate these
sources of error by including redundancy in the encoding. Redundancy
is generated by encoding each bit as a Manchester code, with a
transition from high to low for bit 1 and from low to high for bit 0.
This type of redundancy not only makes the code more robust but also
simplifies the decoding of the message. To minimize the temporal
message duration and maximize the data rate, information is sent in
multiple channels in parallel.
2. Details
Data in the protocol is represented by individual messages. Each
message is represented by an acoustic signal that encodes the
information contained in the message. A message has a temporal and a
spectral dimension, i.e., a two-dimensional layout in terms of
frequency and time (see Figure 1). Along the temporal dimension, a
message is composed of several consecutive blocks. Each message
starts with a "start block", followed by M "message blocks" and an
"end block". Each message block has a duration of D ms. The start
and end blocks each have a duration of D/2 ms. Each block spans C
equally spaced carrier frequencies Fi in {F1, F2, ..., FC}, which
cover a frequency band of B = FC-F1 Hz. The spacing of the
frequencies is S = B/(C-1) Hz, so Fi = F1 + (i-1)*S. Each bit in the
message can be addressed by a block number and carrier frequency.
This layout allows for sending information in parallel on multiple
frequencies.
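As an illustration of the frequency layout, the following Python
sketch derives the carrier frequencies from F1, C and S. The function
name and the example values (F1 = 18000 Hz, C = 8, S = 500 Hz) are
assumptions chosen for illustration only; they are not defined by this
document.

```python
def carrier_frequencies(f1_hz, c, s_hz):
    """Return the C equally spaced carrier frequencies F1..FC in Hz."""
    if c < 1:
        raise ValueError("C must be at least 1")
    return [f1_hz + i * s_hz for i in range(c)]


# Hypothetical example values, not a profile defined by this document.
carriers = carrier_frequencies(18000.0, 8, 500.0)
bandwidth = carriers[-1] - carriers[0]      # B = FC - F1
spacing = bandwidth / (len(carriers) - 1)   # S = B / (C - 1)
```

With these example values the highest carrier FC lies at 21500 Hz and
the occupied band B is 3500 Hz wide.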
Information is encoded in binary form. Each message block encodes one
bit at each carrier frequency Fi. For a logical "1", the amplitude at
frequency Fi is "high" during the first D/2 ms of the block and zero
during the second D/2 ms. For a logical "0" the opposite is the case,
i.e., the amplitude at carrier frequency Fi is zero during the first
D/2 ms of the block and "high" during the second D/2 ms. The
magnitude of the "high" amplitude is not normative and depends on the
actual use case, the employed hardware and the targeted transmission
range.
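The per-bit encoding described above can be sketched in Python as
follows. The constant HIGH is a placeholder, since the magnitude of
the "high" amplitude is not normative. The decoding rule, which
compares the measured amplitudes of the two halves, is an assumption
of this sketch and is not prescribed by this document.

```python
HIGH = 1.0  # placeholder value; the magnitude of "high" is not normative
ZERO = 0.0


def encode_bit(bit):
    """Return the (first-half, second-half) amplitudes for one bit."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (HIGH, ZERO) if bit == 1 else (ZERO, HIGH)


def decode_bit(first_half, second_half):
    """Assumed decision rule: the stronger half determines the bit.

    Equal amplitudes are decoded as 0 in this sketch.
    """
    return 1 if first_half > second_half else 0
```

Because only the relation between the two halves matters, such a
decoder would not need an absolute amplitude threshold.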
The binary message content is encoded across the carrier frequencies
(from lowest to highest frequency, i.e. F1 to FC), starting with the
first message block. That is, the first bit is encoded at message
block 1 and carrier frequency F1, the second bit at message block 1
and carrier frequency F2, and so on up to FC; bit C+1 is then encoded
at message block 2 and carrier frequency F1, etc.
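The following sketch maps a bit index to its message block and carrier
frequency, assuming the ordering shown in Figure 1, where each message
block carries C consecutive bits from F1 to FC. The function names are
hypothetical.

```python
def bit_position(index, c):
    """Map a 0-based bit index to 1-based (message block, carrier)."""
    block, carrier = divmod(index, c)
    return block + 1, carrier + 1


def layout_bits(bits, m, c):
    """Arrange M*C bits as M blocks, each holding C bits for F1..FC."""
    if len(bits) != m * c:
        raise ValueError("a message carries exactly M*C bits")
    return [bits[b * c:(b + 1) * c] for b in range(m)]
```

For C = 8, bit_position(0, 8) yields block 1 at F1, bit_position(1, 8)
yields block 1 at F2, and bit_position(8, 8) yields block 2 at F1.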
A pause of duration P, with P >= 0, can be inserted between
consecutive blocks (including after the start block and before the
end block) and in the middle of each message block (i.e. after the
first D/2 ms of a message block). For a pause, the sending amplitude
is set to
zero. The overall message duration is thus:
D/2 + P + D*M + P*(2*M-1) + P + D/2 = D*(M+1) + P*(2*M+1) ms.
The first and last blocks of a message are the start and end blocks.
They are encoded as follows: a start block has "high" amplitude at
the higher C/2 frequencies (C/2 rounded up in case C is an odd
number) and zero amplitude at the remaining frequencies. For the end
block the opposite is the case, i.e. "high" amplitude is present at
the lower C/2 (C/2 rounded down) carrier frequencies and zero
amplitude at the remaining frequencies.
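The start and end block patterns can be sketched as follows. Each
function returns one flag per carrier, ordered from F1 to FC, where 1
stands for "high" amplitude and 0 for zero amplitude. The function
names and the flag representation are assumptions of this sketch.

```python
def start_block(c):
    """Flags for F1..FC: the upper C/2 (rounded up) carriers are high."""
    n_high = (c + 1) // 2
    return [0] * (c - n_high) + [1] * n_high


def end_block(c):
    """Flags for F1..FC: the lower C/2 (rounded down) carriers are high."""
    n_high = c // 2
    return [1] * n_high + [0] * (c - n_high)
```

With these rounding rules the two patterns are exact complements of
each other for even as well as odd C.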
From the above specification it follows that the number of bits that
can be represented by a message is: M*C. The theoretical maximal
data rate corresponds to 1000 / (D*(M+1) + P*(2*M+1)) * (M*C) bits
per second.
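As a worked example, the duration and data rate formulas can be
evaluated with a few lines of Python. The parameter values D = 30 ms,
P = 0 ms, M = 4 and C = 8 are hypothetical and are used only to show
the arithmetic.

```python
def message_duration_ms(d, p, m):
    """Overall message duration: D*(M+1) + P*(2*M+1) ms."""
    return d * (m + 1) + p * (2 * m + 1)


def max_data_rate_bps(d, p, m, c):
    """Theoretical maximum: M*C bits per message duration in seconds."""
    return 1000.0 * (m * c) / message_duration_ms(d, p, m)


# Hypothetical values: D = 30 ms, P = 0 ms, M = 4, C = 8.
duration = message_duration_ms(30, 0, 4)   # 30*5 + 0*9 = 150 ms
rate = max_data_rate_bps(30, 0, 4, 8)      # 32 bits per 0.15 s
```

For these values a message lasts 150 ms and the rate is roughly 213
bits per second. With the character units of Figure 1 (D = 2, P = 4,
M = 4) the duration formula gives 46, which matches the 46 characters
of each row in the figure.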
The schematic two-dimensional spectro-temporal layout (time at the x-
axis and frequency on the y-axis) of a message for parameters:
M=4 blocks,
C=8 frequencies,
D=2 (corresponding to the spacing of 2 characters along the
temporal axis: "--"),
P=4 (corresponding to the spacing of 4 characters along the
temporal axis: "----"),
encoding the following binary information:
"01010011 01101111 01101110 01101001"
is provided in the following. Character "+" indicates "high"
amplitude and "0" indicates zero amplitude. Pause periods are
indicated with the following pattern "...." for better visibility:
+--------------------------------------------------------------+
|   ^                                                          |
|   |    --------------------------------------------------    |
| f | F8 | +....+....0....+....0....0....+....+....0....0 |    |
| r | F7 | +....+....0....+....0....+....0....0....+....0 |    |
| e | F6 | +....0....+....+....0....+....0....0....+....0 |    |
| q | F5 | +....0....+....+....0....+....0....+....0....0 |    |
| u | F4 | 0....+....0....0....+....0....+....0....+....+ |    |
| e | F3 | 0....0....+....+....0....+....0....+....0....+ |    |
| n | F2 | 0....+....0....+....0....+....0....+....0....+ |    |
| c | F1 | 0....0....+....0....+....0....+....0....+....+ |    |
| y |    --------------------------------------------------    |
|   |                                                          |
|   |    start message   message   message   message   end     |
|   |    block block 1   block 2   block 3   block 4  block    |
|   |                                                          |
|   ---------------------------------------------------------> |
|                             time                             |
+--------------------------------------------------------------+
Figure 1 The spectro-temporal layout of a single message, "msg"
Note that the first eight bits of the message are encoded by the first
half of message block 1 from low to high frequency. The second half
of message block 1 represents the inverted information. The second
eight bits are encoded in the first half of message block 2 from low
to high frequency, etc.
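The layout of Figure 1 can be reproduced with the following sketch,
which builds the time-ordered columns of a message and renders them as
text rows. The function names and the list-based representation are
assumptions made for illustration; the sketch operates on symbols only
and does not synthesize audio.

```python
def message_columns(bits, m, c):
    """Return the time-ordered columns of one message.

    Each column holds C flags for F1..FC (1 = "high", 0 = zero):
    the start block, two half-block columns per message block, and
    the end block.
    """
    if len(bits) != m * c:
        raise ValueError("a message carries exactly M*C bits")
    n_up = (c + 1) // 2                                  # C/2 rounded up
    columns = [[0] * (c - n_up) + [1] * n_up]            # start block
    for b in range(m):
        block_bits = bits[b * c:(b + 1) * c]             # bits for F1..FC
        columns.append(list(block_bits))                 # first half
        columns.append([1 - x for x in block_bits])      # inverted half
    columns.append([1] * (c // 2) + [0] * (c - c // 2))  # end block
    return columns


def render(columns, pause="...."):
    """Draw one text row per carrier, highest frequency first."""
    c = len(columns[0])
    rows = []
    for f in reversed(range(c)):
        symbols = ["+" if col[f] else "0" for col in columns]
        rows.append("F%d | %s" % (f + 1, pause.join(symbols)))
    return rows


bits = [int(ch) for ch in "01010011011011110110111001101001"]
rows = render(message_columns(bits, m=4, c=8))
```

Printing the rows with print("\n".join(rows)) gives the symbol
pattern of the figure, starting with the row for F8 and ending with
the row for F1.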
Different profiles (configurations) of the protocol can be defined to
adapt it to the specific requirements of the respective use-cases.
The definition of a profile requires the following information:
D: the duration of a bit (i.e. a message block) in ms
P: the pause period in ms
F1: the lowest frequency in Hz
C: the number of frequencies
S: the spacing between successive frequencies Fi and Fi+1 in Hz
M: the number of message blocks
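A profile could be represented in software as a small immutable record
holding the six parameters listed above. The class name, the field
names, the validation rules other than P >= 0, and the example values
are assumptions of this sketch; this document does not define any
concrete profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """The six profile parameters D, P, F1, C, S and M."""
    d_ms: float     # D: duration of a message block in ms
    p_ms: float     # P: pause period in ms
    f1_hz: float    # F1: lowest carrier frequency in Hz
    c: int          # C: number of carrier frequencies
    s_hz: float     # S: spacing between adjacent carriers in Hz
    m: int          # M: number of message blocks

    def __post_init__(self):
        # Assumed sanity checks; only P >= 0 is stated in the text.
        if self.d_ms <= 0 or self.p_ms < 0:
            raise ValueError("D must be positive and P must be >= 0")
        if self.c < 1 or self.m < 1 or self.s_hz <= 0:
            raise ValueError("C, M must be >= 1 and S must be positive")

    @property
    def highest_frequency_hz(self):
        """FC = F1 + (C - 1) * S."""
        return self.f1_hz + (self.c - 1) * self.s_hz

    @property
    def bits_per_message(self):
        """Capacity of one message: M * C bits."""
        return self.m * self.c


# Hypothetical example values, not a profile defined by this document.
example = Profile(d_ms=30, p_ms=0, f1_hz=18000, c=8, s_hz=500, m=4)
```

Sender and receiver would have to agree on the same profile in
advance, because the message itself carries no profile information.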
3. Security Considerations
This specification is targeting solely the physical layer of the
protocol. Thus SoniTalk itself provides no communications security,
and therefore a large number of attacks are possible including replay
attacks, sniffing, eavesdropping, denial of service attacks, message
destruction and message insertion. A passive attack is sufficient to
recover the binary information of messages transmitted with SoniTalk.
No endpoint authentication is provided by the protocol as this
definition only targets the physical layer. Sender jamming is
trivial, and therefore making messages unreadable is trivial.
Attacks are however limited to the local environment around the
communicating parties (usually within a few meters). If the
communication takes place in a room, possible attacks are most likely
successful from inside the room and unlikely from outside the room as
near-ultrasonic signals hardly pass through walls.
Message deletion and message modification attacks are unlikely, as
they would require acoustically manipulating the message while it is
sent over the air. Although they cannot be ruled out with absolute
certainty, such attacks would be extremely difficult, e.g. sending an
interfering sound to cancel out a message acoustically. Furthermore,
acoustically modifying individual bits of a message would require
precise timing and would very likely destroy the integrity of the
message, since the acoustic overlay would introduce interference.
To ensure data integrity, the use of an error-detecting code (e.g. a
CRC) or an error-correcting code is highly recommended when encoding
the message. To establish confidentiality, the binary message should
further be encrypted, e.g. by a symmetric or asymmetric encryption
scheme where the keys should be exchanged over an out-of-band channel
(e.g. Bluetooth). Peer entity authentication is also not implemented
at the physical layer and needs to be provided at a higher layer.
It is the particular duty of the developers of applications using the
protocol to comprehensively inform the user about the near-ultrasonic
data exchange (both sending and receiving) and moreover to inform the
users when personal information is sent over the protocol.
Particular care has to be taken in selecting the carrier frequencies
for the data transmission so that no actively or passively
participating party is disturbed by potentially audible artifacts of
the acoustic data transmission. This in particular includes children
as well as animals in the environment.
4. IANA Considerations
This document has no actions for IANA.
5. Conclusions
This Internet-Draft introduces SoniTalk, which is the first open
protocol for acoustic near field communication via the near-
ultrasonic band. Near-ultrasound communication represents an
alternative and complement to other existing near-field communication
protocols, such as Bluetooth, radio-based NFC and WLAN and is
particularly well-suited for IoT devices thanks to its low hardware
requirements. This document specifies the protocol at the physical
layer and thus primarily focuses on the definition of the message
structure for information exchange. Extensions on top of this layer
are subject to future specification efforts.
6. References
6.1. Normative References
6.2. Informative References
[1] Zimmermann, H., "OSI Reference Model - The ISO Model of
    Architecture for Open Systems Interconnection", IEEE
    Transactions on Communications, vol. 28, no. 4, pp. 425-432,
    April 1980.
7. Acknowledgments
The work which led to this protocol specification was funded by
netidee Open Innovations of the Internet Foundation Austria.
This document was prepared using 2-Word-v2.0.template.dot.
Appendix A. Scope and Remarks
A.1. Remarks
It is recommended to split the M*C bits of a message into E parity
bits for error detection and error correction and M*C-E bits for the
payload of the message. The size of the parity information is not
normative and depends on the actual application (e.g. on the
environmental conditions).
The message length is fixed and must not vary. In case the specified
message length is longer than the actual information to be sent, the
remaining bits must be filled (e.g. by some special symbol) to comply
with the protocol specifications.
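One possible way to follow both remarks is sketched below: the payload
is padded to M*C-E bits and E parity bits are appended. The choice of
E = 8, the CRC-8 polynomial x^8 + x^2 + x + 1 with a zero initial
value, and padding with zero bits are all assumptions of this sketch.
This document prescribes neither the parity scheme nor the fill
symbol.

```python
def crc8_bits(bits, poly=0x07):
    """Bit-serial CRC-8 over a list of bits (zero initial value)."""
    reg = 0
    for bit in bits:
        feedback = ((reg >> 7) & 1) ^ bit
        reg = (reg << 1) & 0xFF
        if feedback:
            reg ^= poly
    return [(reg >> i) & 1 for i in range(7, -1, -1)]


def build_message_bits(payload, m, c, e=8, pad_bit=0):
    """Pad the payload to M*C-E bits and append E parity bits."""
    if e != 8:
        raise ValueError("this sketch only implements E = 8")
    room = m * c - e
    if len(payload) > room:
        raise ValueError("payload does not fit into one message")
    padded = list(payload) + [pad_bit] * (room - len(payload))
    return padded + crc8_bits(padded)


def check_message_bits(bits):
    """Return True if the trailing CRC-8 matches the leading bits."""
    return crc8_bits(bits[:-8]) == list(bits[-8:])
```

Note that padding with zero bits hides the original payload length, so
a receiver using this scheme would need to learn the length by other
means.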
A.2. Out of Scope
The spacing of the carrier frequencies, the actual values of the
frequencies, the pause duration P inside a message as well as the
spacing between successive messages are not part of this
specification.
This protocol specification focuses exclusively on the lowest network
layer (i.e. physical layer according to the OSI reference model [1]).
A protocol for distributing information across several messages,
session handling, addressing, error detection and correction as well
as synchronous and asynchronous communication are beyond the scope of
this specification and subject to future standardization efforts.
Appendix B. Comments and Feedback
Please address all comments, discussions, and questions to
matthias.zeppelzauer@fhstp.ac.at
Authors' Addresses
Matthias Zeppelzauer
St. Poelten University of Applied Sciences
Matthias Corvinus-Strasse 15, 3100 St. Poelten
Austria
Email: matthias.zeppelzauer@fhstp.ac.at
Alexis Ringot
St. Poelten University of Applied Sciences
Email: alexis.ringot@fhstp.ac.at