Internet DRAFT - draft-guan-paws-smart-database
PAWS                                                     Jianfeng Guan
Internet-Draft                                              Neng Zhang
Intended status: Informational                            Changqiao Xu
Expires: June 12, 2014                                 Mingchuan Zhang
                                                          Hongke Zhang
                                                                  BUPT
                                                     December 12, 2013
PAWS Smart Database
draft-guan-paws-smart-database-00
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html
This Internet-Draft will expire on June 12, 2014.
Copyright Notice
Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
<Guan, et al.> Expires June 12, 2014 [Page 1]
Internet-Draft PAWS Smart Database December 2013
Abstract
This document describes a Smart Database operation mechanism for
PAWS. With this mechanism, the master device obtains the optimized
white space it should use in the regulatory domain. The mechanism
extends the Protocol to Access Spectrum Database with user behavior
analysis and machine learning concepts.
Table of Contents
1. Introduction
2. Conventions used in this document
3. Procedure Overview
   3.1. Problem Description
   3.2. Multi-Dimensional Aggregation Policy
   3.3. Data Preprocessing
4. Specification
   4.1. Feature Abstraction
   4.2. Dataset Training by Machine Learning Methods
      4.2.1. User Behavior Clustering
      4.2.2. Binary Prediction
      4.2.3. Spectrum Service Recommendation
   4.3. Prediction Results
5. Working flow
   5.1. Spectrum prediction scenario
   5.2. WSDB Recommendation Procedure
6. Security Considerations
7. IANA Considerations
8. Conclusions
9. References
   9.1. Normative References
10. Acknowledgments
Authors' Addresses
1. Introduction
Dynamic spectrum access technology has made white space allocation
and utilization practical. A growing number of spectrum allocation
algorithms and industrial solutions have moved from the lab to
deployment, alongside the standards being developed by the IETF PAWS
working group. In the PAWS protocol, the Database is responsible for
allocating spectrum to the master device. However, a problem has
emerged: spectrum usage behavior differs from user to user, while the
Database can only hand out the same spectra stored in the server to
all users. Such imbalanced spectrum usage is another kind of waste.
From another perspective, although white space could achieve spectrum
usage diversity through dynamic random access, fair allocation and
security considerations make some manual intervention and
administrative control necessary to coordinate spectrum resources
centrally. Likewise, the heavy load caused by competition for one
spectrum could be balanced across multiple white spaces.
Some studies have sought to optimize spectrum allocation in the face
of such diversified access motives and management issues, but few
concentrate on the issues above. The European FP7 FARAMIR project
focuses on spectrum measurement with performance characteristics, to
increase radio environmental and spectral awareness under dynamic
spectrum access scenarios. International communications companies are
carrying out traffic management research and projects to realize
efficient spectrum utilization via cloud computing according to user
behavior and demand. However, relevant standards have not yet
appeared, and so far every user accesses static spectra regardless of
the service in use. Obviously, one size does not fit all.
Based on the above observation, we propose a Smart Database analysis
and operation mechanism for PAWS. Unlike previous work, our approach
characterizes spectrum usage behavior flexibly for different
purposes. The smart Database is proposed to enable user behavior
recognition and demand-driven spectrum distribution. With this
mechanism, the master device and slave device can obtain optimized
white space from the WSDB and communicate with better Quality of
Experience (QoE) in the regulatory domain. Our protocol is an
extension of the existing PAWS protocol that enables advanced network
functions and improves spectrum usage efficiency.
2. Conventions used in this document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
The terminology from the PAWS problem statement, use cases and
requirements [I-D.ietf-paws-problem-stmt-usecases-rqmts] is
applicable to this document.
White Space Database Analysis Server (WSDB AS):
This is a smart WSDB with cognitive ability and new functionalities,
such as selecting historical data as training samples and learning
user behavior. The server acts as a smart analysis center that
learns, controls and coordinates white space use, which benefits both
the WSDB and its clients. The server provides three functions: user
and service clustering, service prediction with data learning, and
recommendation analysis with collaborative filtering. The primary
goal is to provide the proper white space spectrum in response to
users' access requests. This server can be integrated with a normal
WSDB or run as a standalone administrator with other auxiliary
management functions, depending on the regulatory domain scope and
performance requirements.
This draft is in scope because it provides a group of formatted
information for querying the Database using a smart method. Moreover,
the device receives a list of available white space frequencies under
the specified conditions, each with a probability. The device can
select a spectrum and send an acknowledgment to the Database. To some
extent, the Database becomes more cognitive once its functions are
extended to learn the user's condition at query time.
3. Procedure Overview
3.1. Problem Description
As previously mentioned, a typical case in the current PAWS protocol
is that the Database allocates the same spectrum resource to many
users simultaneously. A user with light telephone traffic then has
surplus bandwidth, while users with video delivery may suffer severe
QoS degradation due to interference or limited bandwidth. Our goal is
to allocate spectrum based on its attributes and on usage behavior.
For instance, some low frequency bands with long wavelengths are
suited to coverage, while others are suited to capacity or to large
video transmissions. Relying on user behavior analysis, a smart
Database can recognize and match these needs and select the
appropriate spectrum for users. Such automatic configuration also
conforms to a vital concept in the trend toward software-defined
networking and software-defined radio.
To realize such functions, we employ machine learning methods to
capture user behavior patterns, for two reasons. Firstly, mobile
communication services show different and subtle characteristics that
are hard to analyze with one simple intelligent algorithm. Secondly,
the requirements of the big data era make machine learning methods
outperform, to some extent, other learning methods such as
reinforcement learning or correlation analysis.
We divide the general procedure into sensing, deciding and
recognition. First we describe the label selection. Then we discuss
the data mining methods, followed by the prediction results. After
the analysis performed by the WSDB AS, the protocol interaction with
users is shown along with the newly added optimized parameters.
3.2. Multi-Dimensional Aggregation Policy
For management purposes, the AS can be deployed on different
platforms and send its results either to the Master Device uniformly
or to the Slave Device directly. The AS can be located on a Master
Device, at a telecommunications operator or at an Internet
enterprise. To obtain multi-dimensional user data samples, a series
of packet inspection or traffic monitoring tools can be combined with
our smart analysis function to probe users' potential bandwidth and
service demand. Further, in view of community benefit, we collect and
aggregate data flows according to five common deployment policies:
(1) Data flows for users that share a master device. The Database
can be deployed on a base station for real-time analysis and
computing. This is the most basic method to manage spectrum
occupancy and redistribution.
(2) Data flows for users that go to the same master device. For
security reasons related to traffic volume, we can allocate some
kind of white space, such as a trusted channel, to these users.
(3) Data flows that pass through a backbone network or a
telecommunications operator. This is another common method, used
for promoting the commercial value of spectrum and for traffic
and bandwidth planning.
(4) Data flows at an Internet enterprise such as YouTube or
Facebook. Taking YouTube as an example, users that request the
same video can be aggregated into one behavior pattern and
allocated a relatively large bandwidth.
(5) Data flows for users of the same application, such as WeChat.
Spectrum and traffic demand vary from one service to another, and
a single application may contain a series of services such as
video, audio and text. The spectrum features can be packed into a
bundle of functions.
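The aggregation policies above amount to grouping flow records by a
different key. The following is a minimal sketch of that idea, not
part of the protocol; the record fields ("user", "master", "app",
"bytes") and the sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical flow records; none of these field names come from PAWS.
FLOWS = [
    {"user": "u1", "master": "m1", "app": "video", "bytes": 900},
    {"user": "u2", "master": "m1", "app": "voip", "bytes": 40},
    {"user": "u3", "master": "m2", "app": "video", "bytes": 700},
]


def aggregate(flows, key):
    """Group flow records by one policy key and sum their traffic volume."""
    totals = defaultdict(int)
    for flow in flows:
        totals[flow[key]] += flow["bytes"]
    return dict(totals)


by_master = aggregate(FLOWS, "master")  # policy (1): shared master device
by_app = aggregate(FLOWS, "app")        # policy (5): same application
```

Switching the policy only changes the grouping key, which is why the
same collected data can serve several deployments.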
Although data can be easily collected under these policies, each
company is limited to its own business area, so the potential value
of the data cannot be fully released. Data mining can therefore be
exploited more fully through cooperation among these entities.
3.3. Data Preprocessing
In the future era of ubiquitous networks, personal traffic may carry
all kinds of information sources, including sound, image, video,
fingerprint, product information, biological information or brain
waves. This traffic will traverse countless user devices, making it
more difficult to organize. In our model, we adopt machine learning
algorithms to abstract user behavior features and predict spectrum
usage. In this preprocessing step, our goal is to normalize the messy
data into a training dataset.
Firstly, we adopt general cloud technologies such as HDFS and
MapReduce to perform segmented metadata storage. The raw data is then
aggregated into a dataset under one of the policies above. After
cloud processing and data cleaning, the different types of data can
be normalized into structured data. When the data size is small, a
few samples may suffice for training and prediction. Otherwise,
random sampling is required to reduce the complexity of the big data.
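As an illustration of the last two steps, the sketch below applies
min-max normalization and then random sampling to a small numeric
dataset. The draft does not prescribe a normalization scheme, so
min-max scaling, the function names and the sample values are all
assumptions made for this example.

```python
import random


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def sample_rows(rows, limit, seed=0):
    """Return all rows when the set is small, else a random subset."""
    if len(rows) <= limit:
        return list(rows)
    return random.Random(seed).sample(rows, limit)


# Hypothetical raw samples: (traffic volume, session length).
raw = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
normalized = min_max_normalize(raw)
subset = sample_rows(normalized, limit=2)
```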
4. Specification
This section describes how the Database trains the datasets and
predicts a suitable white space to assign. The general procedure is
to abstract features, train on the datasets and predict results for
new data. This is a good fit for scaling and parallelizing machine
learning algorithms on big data inside the cloud.
4.1. Feature Abstraction
User feature and parameter selection follows a general unsupervised
modeling process. Common feature selection and feature extraction
methods, such as filter methods, Principal Component Analysis (PCA)
and Singular Value Decomposition (SVD), can to some extent find
significant feature subsets for training. Compared with traditional
wireless resource distribution, the features in white space access
are more complicated. Here we elaborate on several typical features
that represent user behavior.
1 Geolocation: Available spectrum is often sensed in a limited area,
so the topographic information of the slave device affects white
space quality and selection. Specific geographic information, such as
latitude, longitude and antenna height, can be quantized into a value
used as a feature for data learning.
2 Time label: This feature is composed of two variables. On the one
hand, different time scales affect the user behavior pattern. For
example, at the beginning of a month, users with sufficient remaining
monthly mobile data may not be driven to seek other resources, so
there is less frequency hopping at the beginning of a month and,
similarly, there are more white space requests at the end. Moreover,
the spectrum requirement varies within a day. On the other hand,
spectrum occupancy behavior is also influenced by the usage time
interval. Based on the timestamp, the usage interval can be quantized
to the minute, and the time scale can be quantized as the hour slot
within a month.
3 Service types: With numerous applications such as streaming video,
Voice over IP (VoIP), e-commerce, Enterprise Resource Planning (ERP)
and others, we intend to differentiate them so as to provide a better
QoE than best-effort service. Different applications have different
demands for delay, jitter, bandwidth, packet loss and availability.
Following the definitions in RFC 4594 [RFC4594] and RFC 5127
[RFC5127], and in view of tolerance to packet loss, delay and jitter,
we classify customer services into four types and ten classes with
priority values. Service types can also be classified more flexibly
by following the standards of operators and other entities.
4 Roam state: This is also a two-dimensional feature, consisting of
the current roaming state and the frequency of handoffs of a device
out of and into its resident area.
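The four features above can be combined into one numeric vector per
spectrum request. The sketch below shows one possible encoding. The
grid step of 0.01 degree, the service names and their priority
values, and the function signature are assumptions made for
illustration only; the draft does not define them.

```python
from datetime import datetime

# Hypothetical service classes and priorities; the draft does not list them.
SERVICE_PRIORITY = {"voip": 3, "video": 2, "web": 1, "bulk": 0}


def encode_request(lat, lon, height_m, start, end, service, roaming, handoffs):
    """Turn one spectrum request into a flat numeric feature vector."""
    # Geolocation: quantize to a coarse grid cell (0.01 degree steps).
    cell_lat, cell_lon = round(lat * 100), round(lon * 100)
    # Time label: hour slot within the month, plus usage interval in minutes.
    hour_slot = (start.day - 1) * 24 + start.hour
    minutes = int((end - start).total_seconds() // 60)
    # Roam state: current roaming flag and handoff count.
    return [cell_lat, cell_lon, height_m, hour_slot, minutes,
            SERVICE_PRIORITY[service], int(roaming), handoffs]


vector = encode_request(
    39.96, 116.35, 30,
    datetime(2013, 12, 12, 9, 0), datetime(2013, 12, 12, 9, 45),
    "video", roaming=False, handoffs=2)
```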
As more mobile apps and services emerge, we expect more features,
such as biological data, to be introduced into the training sets,
which will redefine the feature abstraction criteria used with
machine learning.
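As the feature set grows, feature extraction methods such as the PCA
mentioned at the start of this section become more relevant. The
sketch below estimates only the first principal component, using
power iteration on the sample covariance matrix. It is a simplified
stand-in for a full PCA or SVD, and the data values are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Estimate the dominant PCA direction by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return means, vec


def project(row, means, vec):
    """Reduce one feature row to its coordinate along the component."""
    return sum((x - m) * v for x, m, v in zip(row, means, vec))


# Hypothetical two-feature data lying on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, component = first_principal_component(data)
```

The sketch assumes the data is not constant; a constant dataset has a
zero covariance matrix and no meaningful component.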
4.2. Dataset Training by Machine Learning Methods
This step trains on the established datasets and validates the test
results. The primary goal is to predict the most suitable white space
according to the user's behavior. Moreover, other suitable services
can be predicted and recommended as well to fulfill users' potential
requirements.
For user behavior analytics in a traditional wireless network or with
a small number of users, common clustering methods meet the
classification or prediction requirements. With the explosion of
information and growth of data volume, and depending on the
application purpose, it is necessary to use more scalable, parallel
machine learning tools and methods aimed at such big data. For the
relevant big data and cloud analytics technologies, general industry
standards can be consulted. The user data can also be partitioned
locally based on neighborhood similarity so that machine learning
methods can process it in parallel. Likewise, the Database could be
locally distributed at some scale to carry out dataset training.
The specific methods can be classified according to the following
three analytic models.
4.2.1. User Behavior Clustering
Clustering techniques aggregate items by likelihood and similarity.
In our protocol, such methods can be used to group users with similar
behavior. We then apply the same action, such as uniform spectrum
distribution, to the whole cluster of users. This is a basic
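The draft does not name a specific clustering algorithm. As one
possible instance, the sketch below groups users with plain k-means.
The choice of k-means, the two normalized features and the naive
initialization from the first k points are assumptions for
illustration.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means; initial centroids are simply the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum(
                (x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Hypothetical two-feature users: (normalized traffic, normalized hour).
users = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9)]
labels, centroids = kmeans(users, k=2)
```

Users that share a label would then receive the same spectrum
treatment.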
4.2.2. Binary Prediction
We mainly use this learning model to make decisions and predict a
spectrum with a confidence or probability. Multi-ruleset data mining
tools, such as sparse Bayesian methods and kernel-based methods,
could be implemented preferentially to give better prediction
results.
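To show what a prediction "with probability" looks like in practice,
the sketch below trains a plain logistic regression model by gradient
descent. This is a simpler substitute for the sparse Bayesian and
kernel-based methods named above, chosen only because it fits in a
few lines. The features (normalized bandwidth demand, roaming flag)
and the labels are hypothetical.

```python
import math


def predict(weights, x):
    """Probability that the channel suits a request with features x."""
    z = sum(w * v for w, v in zip(weights, x)) + weights[-1]
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, rate=0.5, epochs=2000):
    """Fit weights (bias stored last) by batch gradient descent."""
    dim = len(samples[0])
    weights = [0.0] * (dim + 1)
    for _ in range(epochs):
        grads = [0.0] * (dim + 1)
        for x, y in zip(samples, labels):
            error = predict(weights, x) - y
            for j in range(dim):
                grads[j] += error * x[j]
            grads[dim] += error
        weights = [w - rate * g / len(samples)
                   for w, g in zip(weights, grads)]
    return weights


# Hypothetical history: label 1 means the channel suited the request.
samples = [[0.1, 0], [0.2, 0], [0.8, 1], [0.9, 1]]
labels = [1, 1, 0, 0]
weights = train_logistic(samples, labels)
p_low_demand = predict(weights, [0.15, 0])
p_high_demand = predict(weights, [0.85, 1])
```

The returned value can be attached to each candidate spectrum as its
confidence.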
4.2.3. Spectrum Service Recommendation
The goal of this model is to predict and recommend a service for
multiple users with similar behavior. Information filtering
technologies and similarity-based recommender systems can match users
with the spectrum and services they are most likely to be interested
in by means of a scoring mechanism. Multi-ruleset collaborative
filtering could be implemented to compute these preferences and
recommend spectrum or other user-oriented services such as data
traffic plans. Moreover, such a correlation and filtering mechanism
could monitor mass spectrum usage activity to prevent cooperative
attacks by malicious users.
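A minimal sketch of user-based collaborative filtering follows. It
scores channels a user has not yet rated by the ratings of similar
users, weighted by cosine similarity. The user names, channel names
and the 1-5 score scale are hypothetical, and this is one common
variant rather than the method defined by this document.

```python
import math

# Hypothetical satisfaction scores (1-5) per user and channel.
RATINGS = {
    "alice": {"ch21": 5, "ch36": 1, "ch44": 4},
    "bob": {"ch21": 4, "ch36": 2},
    "carol": {"ch21": 1, "ch36": 5, "ch44": 2},
}


def cosine(a, b):
    """Cosine similarity over the channels both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = math.sqrt(sum(a[c] ** 2 for c in shared))
    norm_b = math.sqrt(sum(b[c] ** 2 for c in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank channels the user has not rated by similarity-weighted score."""
    scores, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], theirs)
        for channel, score in theirs.items():
            if channel not in ratings[user]:
                scores[channel] = scores.get(channel, 0.0) + sim * score
                weights[channel] = weights.get(channel, 0.0) + sim
    ranked = {c: scores[c] / weights[c] for c in scores if weights[c] > 0}
    return sorted(ranked.items(), key=lambda item: item[1], reverse=True)


suggestions = recommend("bob", RATINGS)
```

Here "bob" resembles "alice" more than "carol", so the predicted
score for the unrated channel leans toward the score "alice" gave.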
4.3. Prediction Results
When a new spectrum request arrives, the Database abstracts the user
features mentioned above and predicts the spectrum based on the
trained model. Since this spectrum is the most suitable or most
frequently used one, the Database can directly respond with a best
candidate spectrum, or a spectrum list with probabilities, instead of
a randomly selected list of available spectrum. For security reasons,
this also ensures access to a stable and trusted Database. Manual
operation is permitted and pre-built into the Database. Similarly,
other recommended results can be pushed via the spectrum response.
Predicted results can be added to the training datasets to improve
prediction accuracy and to automatically adjust the false alarm rate
to adapt the fit.
An alternative recommendation method is as follows. When a requested
spectrum period expires, the master device releases the spectrum and
sends spectrum feedback to the WSDB. This feedback is an evaluation
score that describes the satisfaction with this white space access.
If a spectrum's score is statistically high, the spectrum is ranked
at the top and allocated preferentially to other users next time.
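The feedback-based ranking can be sketched as a running mean score
per channel. The class name, method names and the numeric score scale
below are hypothetical; the draft does not define the feedback
format.

```python
from collections import defaultdict


class FeedbackRanker:
    """Keep a running mean satisfaction score per channel."""

    def __init__(self):
        self._total = defaultdict(float)
        self._count = defaultdict(int)

    def report(self, channel, score):
        """Record one feedback value sent when a spectrum period expires."""
        self._total[channel] += score
        self._count[channel] += 1

    def ranked(self):
        """Channels ordered from highest to lowest mean score."""
        means = {c: self._total[c] / self._count[c] for c in self._count}
        return sorted(means, key=means.get, reverse=True)


ranker = FeedbackRanker()
for channel, score in [("ch21", 4), ("ch36", 2), ("ch21", 5), ("ch44", 3)]:
    ranker.report(channel, score)
order = ranker.ranked()
```

A mean is used here so that a heavily used channel is not ranked
first merely because it received more reports.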
5. Working flow
This section introduces the system implementation architecture. Our
Database should be locally distributed to address mobility and
scalability problems. Since node mobility management involves
registration and termination, localization can reduce query latency
and relieve scalability issues. It also brings the advantage that the
big data can be learned in portions and integrated freely for
analysis at varying granularity by transforming between lower- and
higher-dimensional data spaces.
5.1. Spectrum prediction scenario
+-----------+               +-----------+               +----------+
|           |               |           |               |   WSDB   |
|    WSD    |               |   WSDB    |               | Analysis |
|           |               |           |               |  Server  |
+-----------+               +-----------+               +----------+
      |                           |    all users history      |
      |                           |    feature abstraction    |
      |                           |---------------------------|
      |                           |                           |
      |                           |dataset training & modeling|
      |                           |---------------------------|
      |                           |                           |
      |   AVAIL_SPEC_BATCH_REQ    |                           |
      |-------------------------->|                           |
      |                           |    feature abstraction    |
      |                           |   & spectrum prediction   |
      |                           |<------------------------->|
      |AVAIL_SPEC_BATCH_RESP with |                           |
      |    predicted spectrum     |                           |
      |<--------------------------|                           |
      |                           |                           |
Figure 1: Procedure by which a WSD gets a predicted spectrum from
the WSDB
As Figure 1 shows, the Database does not need to check the currently
available spectrum for every white space device. We can even trace
user activity and preset the spectrum likely to be used by a series
of users. This shortens the query delay and reduces the resource
lookup cost while giving access to an optimized spectrum in return.
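The exchange in Figure 1 can be sketched as two cooperating functions,
one for the WSDB and one for the WSDB AS. The message type strings
follow the figure. All other field names ("deviceId", "location",
"service", "spectra", "probability") and the lookup table standing in
for the trained model are hypothetical and are not PAWS message
definitions.

```python
# Hypothetical stand-in for the trained model held by the WSDB AS:
# maps a coarse service type to channels with predicted probabilities.
MODEL = {
    "video": [("ch21", 0.2), ("ch44", 0.7)],
    "voip": [("ch21", 0.8), ("ch36", 0.1)],
}


def analysis_server_predict(features):
    """WSDB AS side: return candidate channels, most likely first."""
    candidates = MODEL.get(features["service"], [])
    return sorted(candidates, key=lambda item: item[1], reverse=True)


def handle_request(request):
    """WSDB side: abstract features, query the AS, build the response."""
    features = {"service": request["service"],
                "location": request["location"]}
    predicted = analysis_server_predict(features)
    return {
        "type": "AVAIL_SPEC_BATCH_RESP",
        "deviceId": request["deviceId"],
        "spectra": [{"channel": c, "probability": p} for c, p in predicted],
    }


response = handle_request({
    "type": "AVAIL_SPEC_BATCH_REQ",
    "deviceId": "wsd-001",
    "location": (39.96, 116.35),
    "service": "video",
})
```

The WSD would then pick the first entry unless local conditions make
another candidate preferable.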
5.2. WSDB Recommendation Procedure
6. Security Considerations
With regard to the security assumptions in the use cases and
requirements [I-D.ietf-paws-problem-stmt-usecases-rqmts], the Master
Device and the Database may suffer six types of threats. Since it
requires no additional message interaction, our protocol does not
introduce new interception risks. Moreover, a group of malicious
attackers could be easily identified, since they would act with
similar behavior.
7. IANA Considerations
This document makes no request of IANA.
8. Conclusions
This memo discusses the functions of a smart Database during white
space database access and describes some scenarios.
9. References
9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3339] Klyne, G., Ed. and Newman, C., "Date and Time on the
Internet: Timestamps", RFC 3339, July 2002.
[RFC4594] Babiarz, J., Ed. and Chan, K., "Configuration Guidelines
for DiffServ Service Classes", RFC 4594, August 2006.
[RFC5127] Chan, K., Ed. and Baker, F., "Aggregation of Diffserv
Service Classes", RFC 5127, February 2008.
[I-D.ietf-paws-protocol] Chen, V., Das, S., Zhu, L., Malyar, J., and
P. McCann, "Protocol to Access Spectrum Database",
draft-ietf-paws-protocol-03 (work in progress), February 2013.
[I-D.das-paws-protocol] Das, S., Malyar, J., and D. Joslyn, "Device
to Database Protocol for White Space",
draft-das-paws-protocol-02 (work in progress), July 2012.
[I-D.ietf-paws-problem-stmt-usecases-rqmts] Mancuso, A. and B. Patil,
"Protocol to Access White Space (PAWS) Database: Use Cases
and Requirements", draft-ietf-paws-problem-stmt-usecases-
rqmts-12 (work in progress), January 2013.
[I-D.wei-paws-framework] Wei, X., Zhu, L., and P. McCann, "PAWS
Framework", draft-wei-paws-framework-00 (work in progress),
July 2012.
10. Acknowledgments
Thanks to our colleagues for their sincere contributions and
comments during the drafting of this document.
Authors' Addresses
Jianfeng Guan
State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications,
Beijing, 100876, P.R.China
EMail: jfguan@bupt.edu.cn
Neng Zhang
State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications,
Beijing, 100876, P.R.China
EMail: zn@bupt.edu.cn
Changqiao Xu
State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications,
Beijing, 100876, P.R.China
EMail: cqxu@bupt.edu.cn
Hongke Zhang
State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications,
Beijing, 100876, P.R.China
EMail: hkzhang@bupt.edu.cn