End-to-End Learning From Spectrum Data: A Deep Learning Approach For Wireless Signal Identification in Spectrum Monitoring Applications

SPECIAL SECTION ON REAL-TIME EDGE ANALYTICS FOR BIG DATA IN INTERNET OF THINGS
Received February 16, 2018, accepted March 14, 2018, date of publication March 26, 2018, date of current version April 23, 2018.
Digital Object Identifier 10.1109/ACCESS.2018.2818794
MERIMA KULIN 1, TARIK KAZAZ 2 (Student Member, IEEE),
INGRID MOERMAN 1 (Member, IEEE), AND ELI DE POORTER 1
1 Department of Information Technology, Ghent University, B-9052 Ghent, Belgium
2 Faculty of EEMCS, Delft University of Technology, 2628 CD Delft, The Netherlands
Corresponding author: Merima Kulin ([email protected])
This work was supported in part by the EU H2020 eWINE Project under Grant 688116, in part by SBO SAMURAI Project, and in part by
the AWS Educate/GitHub Student Developer Pack.
ABSTRACT This paper presents end-to-end learning from spectrum data, an umbrella term for new sophisticated wireless signal identification approaches in spectrum monitoring applications based on deep neural networks. End-to-end learning makes it possible to: 1) automatically learn features directly from simple wireless signal representations, without requiring the design of hand-crafted expert features like higher order cyclic moments, and 2) train wireless signal classifiers in one end-to-end step, which eliminates the need for complex multi-stage machine learning processing pipelines. The purpose of this paper is to present the conceptual framework of end-to-end learning for spectrum monitoring and systematically introduce a generic methodology to easily design and implement wireless signal classifiers. Furthermore, we investigate the importance of the choice of wireless data representation for various spectrum monitoring tasks. In particular, two case studies are elaborated: 1) modulation recognition and 2) wireless technology interference detection. For each case study, three convolutional neural networks are evaluated for the following wireless signal representations: temporal IQ data, the amplitude/phase representation, and the frequency domain representation. Our analysis shows that the wireless data representation impacts the accuracy depending on the specifics and similarities of the wireless signals that need to be differentiated, with different data representations resulting in accuracy variations of up to 29%. Experimental results show that using the amplitude/phase representation for recognizing modulation formats can lead to performance improvements of up to 2% and 12% for medium to high SNR compared to IQ and frequency domain data, respectively. For the task of detecting interference, the frequency domain representation outperformed the amplitude/phase and IQ data representations by up to 20%.
INDEX TERMS Big spectrum data, spectrum monitoring, end-to-end learning, deep learning, convolutional
neural networks, wireless signal identification, IoT.
2169-3536 © 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. VOLUME 6, 2018

I. INTRODUCTION
Wireless networks are currently experiencing a dramatic evolution. Some trends observed are the increasing number and diversity of wireless devices, with an increasing spectrum demand.

Unfortunately, the radio frequency spectrum is a scarce resource. As a result, particular parts of the spectrum are used heavily whereas other parts are vastly underutilized [1]. For example, the unlicensed bands are extremely overutilized and suffer from cross-technology interference [2].

It is indisputable that monitoring and understanding the spectrum resource usage will become a critical asset for 5G in order to improve and regulate the radio spectrum utilization. However, monitoring the spectrum use in such a complex wireless system requires distributed sensing over a wide frequency range, resulting in a radio spectrum data deluge [3]. Extracting meaningful information about the spectrum usage from massive and complex spectrum datasets requires sophisticated and advanced algorithms. This paves the way for new innovative spectrum access schemes and the development of novel identification mechanisms that will provide awareness about the radio environment. For instance, technology identification, modulation type recognition and interference source detection are essential for interference mitigation strategies to continue effective use of the scarce spectral resources and enable the coexistence of heterogeneous wireless networks.

In this paper, we investigate end-to-end learning from spectrum data as a unified approach to tackle various challenges related to the problems of inefficient spectrum management, utilization and regulation that the next generation of wireless networks is facing. Whether the goal is to recognize a technology or a particular modulation type, identify the interference source or an interference-free frequency channel, we argue that the various problems may be treated as a generic problem type that we refer to as wireless signal identification, which is a natural target for machine learning classification techniques. The term end-to-end implies that the process of extracting wireless signal features and learning a wireless signal classifier consists of a single learning procedure. More generally, end-to-end learning refers to processing architectures where the entire pipeline, connecting the input (i.e. the data representation of a sensed wireless signal) to the desired output (i.e. the predicted type of signal), is learned purely from data [4].

A. SCOPE AND CONTRIBUTIONS
This paper provides a comprehensive introduction to end-to-end learning from spectrum data. The main contributions of this paper are as follows:
• Potential end-to-end learning use cases for spectrum monitoring are identified. In particular, two categories are presented. The first category comprises use cases where detecting spectral opportunities and spectrum sharing is necessary, such as in cognitive radio and emerging cognitive IoT networks. The second comprises scenarios where detecting radio emitters is needed, such as in spectrum regulation.
• To set a preliminary background on this interdisciplinary topic, a brief introduction to machine learning/deep learning is provided and their role for spectrum monitoring is discussed. Then, a reference model for deep learning for spectrum monitoring applications is defined.
• A conceptual framework for end-to-end learning is proposed, followed by a comprehensive overview of the methodology for collecting spectrum data, designing wireless signal representations, forming training data and training deep neural networks for wireless signal classification tasks.
• To demonstrate the approach, experiments are carried out for two case studies: (i) modulation recognition and (ii) wireless technology interference detection, which demonstrate the impact of the choice of wireless data representation on the presented results. For modulation recognition, the following modulation techniques are considered: BPSK (binary phase shift keying), QPSK (quadrature phase shift keying), m-PSK (phase shift keying, for m = 8), m-QAM (quadrature amplitude modulation, for m = 16 and 64), CPFSK (continuous phase frequency shift keying), GFSK (Gaussian frequency shift keying) and m-PAM (pulse amplitude modulation, for m = 4). For wireless technology identification, three representative technologies operating in the unlicensed bands are analysed: IEEE 802.11b/g, IEEE 802.15.4 and IEEE 802.15.1.

The rest of the paper is organized as follows. The remainder of Section I presents related work. Section II presents motivating scenarios for the proposed approach. Section III introduces basic concepts related to machine learning/deep learning, concluded with a high-level processing pipeline for their application to spectrum monitoring scenarios. Section IV presents the end-to-end learning methodology for wireless signal classification. In Section V the methodology is applied to two scenarios and experimental results are discussed. Section VI discusses open challenges related to the implementation and deployment of future end-to-end spectrum management systems. Section VII concludes the paper.

B. RELATED WORK

1) TRADITIONAL SIGNAL IDENTIFICATION
Previous research efforts in wireless communication related to signal identification are dominantly based on signal processing tools for communication [5] such as cyclostationary feature detection [6], sometimes in combination with traditional machine learning techniques [7] (e.g. support vector machines (SVM), decision trees, k-nearest neighbors (k-NN), neural networks (NNs), etc.). The design of these specialized solutions has proven to be time-demanding, as they typically rely on manual extraction of expert features for which a significant amount of domain knowledge and engineering is required.

2) DEEP LEARNING FOR SIGNAL CLASSIFICATION
Motivated by recent advances and the remarkable success of deep learning, especially convolutional neural networks (CNN), in a broad range of problems such as image recognition, speech recognition and machine translation [8], wireless communication engineers recently used similar approaches to improve on the state of the art in signal identification tasks in wireless networks. Among the pioneers in the domain were O’Shea et al. [9], who demonstrated that CNNs trained on time domain in-phase and quadrature (IQ) data significantly outperform traditional approaches for automatic modulation recognition based on expert features such as cyclic-moment based features, and conventional classifiers such as decision trees, k-NNs, SVMs, NNs and Naive Bayes. Selim et al. [10] propose to use amplitude and phase difference data to train CNN classifiers able to detect the presence of radar signals with high accuracy. Akeret et al. [11] propose a novel technique to accurately detect radio frequency interference in radio astronomy by training a CNN on 2D time domain data acquired from a radio telescope. Schmidt et al. [12] propose a novel method for interference identification in unlicensed bands using CNNs trained on frequency domain data. Several wireless technologies (e.g. Digital Video Broadcasting (DVB), Global System for Mobile Communications (GSM), Long-Term Evolution (LTE), etc.) have been classified with high accuracy in [13] using deep learning on averaged magnitude Fast Fourier Transform (FFT) data.

These individual works focus on specific deep learning applications pertaining to wireless signal classification using particular data representations. They do not provide a detailed methodology necessary to understand how to apply the same approach to other potential use cases, nor do they provide sufficient information as a guide for selecting a wireless data representation. This information is necessary for someone aiming to reproduce existing attempts, build upon them or generate new application ideas.

3) DEEP LEARNING FOR WIRELESS NETWORKS
Recently, O’Shea and Hoydis [14] provided an overview of the state of the art and potential future deep learning applications in wireless communication. Yao et al. [15] propose a unified deep learning framework for mobile sensing data. However, none of these studies focuses on spectrum monitoring scenarios and the underlying data models for training wireless signal classifiers.

To remedy these shortcomings, this paper presents end-to-end learning from spectrum data: a deep learning framework for solving various wireless signal classification problems for spectrum monitoring applications in a unified manner. To the best of our knowledge, this article is the first comprehensive work that elaborates in detail the methodology for (i) collecting, transforming and representing spectrum data, (ii) designing and implementing data-driven deep learning classifiers for wireless signal identification problems, and that (iii) looks at several data representations for different classification problems at once. The technical approach depicted in this paper is deeply interdisciplinary and systematic, calling for the synergy of expertise of computer scientists, wireless communication engineers, signal processing and machine learning experts with the ultimate aim of breaking new ground and raising awareness of this emerging interdisciplinary research area. Finally, this paper comes at an opportune time, when (i) recent advances in the field of machine learning, (ii) computational advances and parallelization used to speed up training and (iii) efforts in making large amounts of spectrum data available have paved the way for novel spectrum monitoring solutions.

4) NOTATION AND TERMINOLOGY
We indicate a scalar-valued variable with normal font letters (e.g. $x$ or $X$). Matrices will be denoted using bold capitals such as $\mathbf{X}$. Vectors will be denoted with a bold lower case letter (e.g. $\mathbf{x}$), which may sometimes appear as row or column vectors of a matrix (e.g. $\mathbf{x}_k$ is the k-th column vector). With $x_i$ and $x_{ij}$ we will indicate the entries of $\mathbf{x}$ and $\mathbf{X}$, respectively. The notation $(\cdot)^T$ denotes the transpose of a matrix or vector, while $(\cdot)^*$ denotes complex conjugation. We indicate by $\|\mathbf{x}\|_p = \left(\sum_{n=0}^{N-1} |x_n|^p\right)^{1/p}$ the $l_p$-norm of vector $\mathbf{x}$.

II. CHARACTERISTIC USE CASES FOR END-TO-END LEARNING FROM SPECTRUM DATA
End-to-end learning from spectrum data is a new approach that can automatically learn features directly from simple wireless signal representations, without requiring the design of hand-crafted expert features like higher order cyclic moments. The term end-to-end refers to the fact that the learning procedure can train wireless signal classifiers in one end-to-end step, which eliminates the need for complex multi-stage expert machine learning processing pipelines.

Before diving deep into the concept of end-to-end learning from spectrum data, we first consider the architecture presented in Figure 1 with two motivating scenarios that illustrate characteristic use cases for the presented approach.

FIGURE 1. Data-driven CNN-based flexible spectrum management framework.

A. DETECTING SPECTRAL OPPORTUNITIES & SPECTRUM SHARING

1) COGNITIVE RADIO
The ever-increasing radio spectrum demand, combined with the currently dominant fixed spectrum policy assignment [16], has inspired the concepts of cognitive radio (CR) and dynamic spectrum access (DSA), aiming to improve radio spectrum utilization. A CR network (CRN) is an intelligent wireless communication system that is aware of its radio environment, i.e. spectral opportunities, and can intelligently adapt its operating parameters by interacting and learning from the environment [17]. In this way, the CRN can infer the spectrum occupancy to identify unoccupied frequency bands (white spaces/spectrum holes) and share them with licensed users (primary users (PU)) in an opportunistic manner [18].

Figure 1 a) shows the basic operational process of a data-driven CRN. First, CR users intermittently sense their surrounding radio environment and report their sensing results via a control channel to a nearby base station (BS). Then, the BS forwards the request to a back-end data center (DC), which combines the crowdsourced sensing information from several CR users into a spectrum map. The DC infers the spectrum use in order to determine the presence of PUs (a characteristic wireless signal) and diffuses the spectrum availability information back to the cognitive users. For this purpose, the DC first learns a CNN model offline based on the sensing reports, and then employs the model to discriminate between a spectrum hole and an occupied frequency channel.

2) COGNITIVE IoT
The Internet of Things (IoT) paradigm envisioned a world of ‘‘always connected’’ devices/objects/things to the Internet [19]. In this world, heterogeneous wireless technologies and standards emerge operating in the unlicensed frequency bands, which puts enormous pressure on the available spectrum. The increasing wireless spectrum demand raises several communication challenges such as co-existence, cross-technology interference and scarcity of interference-free spectrum bands [2], [20]. To address these challenges, recent research work proposed a CR-based IoT [21], [22] to enable dynamic spectrum sharing among heterogeneous wireless networks.

Figure 1 a) depicts this situation. It can be seen that CR-IoT devices are equipped with cognitive functionalities allowing them to search for interference-free spectrum bands and accordingly reconfigure their transmission parameters. First, CR-IoT devices send spectrum sensing reports to a CNN-based DC. Then, the DC learns and estimates the presence of other emitters and uses that information to detect interference sources and interference-free channels. This enables smart and effective interference mitigation and spectrum management strategies for co-existence with CR and legacy technologies and modulation types.

B. SPECTRUM MANAGEMENT POLICY AND REGULATION
Spectrum regulatory bodies continuously monitor the radio frequency spectrum use to protect users from harmful interference and allow optimum use thereof [23]. Interference may be a result of unauthorized emissions, electromagnetic interference (EMI) and devices that operate beyond technical specifications. In order to resolve problems associated with wireless interference, spectrum managers traditionally use a combination of engineering analysis and data obtained from spectrum measurements. However, in the era of today’s ‘‘wireless abundance’’, where various services and wireless technologies share the same frequency bands, the identification of unauthorized transmitters can be very difficult to achieve. More intelligent algorithms are needed that can automatically mine the spectrum data and identify interference sources.

Figure 1 b) presents a CNN-based spectrum management framework for spectrum regulation. Deployed sensor devices, e.g. {S1, S2, S3}, collect spectrum measurements and contribute their observations to a DC to create interference maps. The DC uses signal processing techniques together with a CNN model to mine the obtained spectrum data and identify existing interferers. The mined patterns are key for ensuring compliance with national and international spectrum management regulations.

III. THE ROLE OF DEEP LEARNING IN SPECTRUM MONITORING
There are two goals of this section. The first is to introduce the key ideas underlying machine learning/deep learning. The second is to derive a reference model for machine learning/deep learning applications for spectrum monitoring, management and spectrum regulation.

A. MACHINE LEARNING
Machine learning (ML) refers to a set of algorithms that learn a statistical model from historical data. The obtained model is data-driven rather than explicitly derived using domain knowledge.
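As a minimal illustration of what a data-driven model means in practice, the sketch below estimates a single parameter of a linear relation from noisy historical observations instead of deriving it from domain knowledge. The linear model, the synthetic data and all names (`fit_slope`, `theta_true`) are assumptions made for this example and do not come from the paper.

```python
import random

def fit_slope(xs, ys):
    """Closed-form least-squares estimate of theta in y ~ theta * x."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return num / den

random.seed(0)                      # reproducible synthetic "historical data"
theta_true = 2.0                    # hypothetical ground truth, unknown to the learner
xs = [0.1 * k for k in range(1, 101)]
ys = [theta_true * x + random.gauss(0.0, 0.5) for x in xs]

theta_hat = fit_slope(xs, ys)       # learned purely from the (x, y) pairs
print(f"estimated theta: {theta_hat:.3f}")
```

The estimate depends only on the observed input-output pairs; with more or less noisy data the learned value moves accordingly, which is the sense in which the model is data-driven.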

1) PRELIMINARIES
The goal of ML is to find a mathematical function, $f$, that defines the relation between a set of inputs $\mathcal{X}$ and a set of outputs $\mathcal{Y}$, i.e.

$$f : \mathcal{X} \rightarrow \mathcal{Y} \tag{1}$$

The inputs, $\mathbf{X} \in \mathbb{R}^{m \times n}$, represent a number of distinct data points, samples or observations denoted as

$$\mathbf{X} = \begin{bmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_m^T \end{bmatrix} \tag{2}$$

where $m$ is the sample size, while $\mathbf{x}_i \in \mathbb{R}^n$ is a vector of $n$ measurements or features for the $i$-th observation, called a feature vector,

$$\mathbf{x}_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^T, \quad i = 1, \ldots, m \tag{3}$$

The outputs, $\mathbf{y} \in \mathbb{R}^m$, are all the outcomes, labels or target values corresponding to the $m$ inputs $\mathbf{x}_i$, denoted by

$$\mathbf{y} = [y_1, y_2, \ldots, y_m]^T \tag{4}$$

Then the observed data consists of $m$ input-output pairs, called the training data or training set, $S$,

$$S = \{(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_m, y_m)\} \tag{5}$$

Each pair $(\mathbf{x}_i, y_i)$ is called a training example because it is used to train or teach the learning algorithm how to obtain $f$. In machine learning, $f$ is called the predictor, whose task is to predict the outcome $y_i$ based on the input values of $\mathbf{x}_i$. There are two classical data models depending on the prediction type, described by:

$$f(\mathbf{x}) = \begin{cases} \text{regressor:} & \text{if } y \in \mathbb{R} \\ \text{classifier:} & \text{if } y \in \{0, 1\} \end{cases} \tag{6}$$

In short, when the output variable $y$ is continuous or quantitative, the learning problem is a regression problem. If, instead, $y$ takes a discrete or categorical value, it is a classification problem.

2) LEARNING THE MODEL
Given a training set, $S$, the goal of a machine learning algorithm is to learn the mathematical model for $f$. To make sense of this task, we assume there exists a fixed but unknown distribution, $p(\mathbf{x}, y) = p_X(\mathbf{x})\,p(y|\mathbf{x})$, according to which the data samples are independently and identically distributed (i.i.d.). Here, $p_X(\mathbf{x})$ is the marginal distribution that models the uncertainty in the sampling of the input points, while $p(y|\mathbf{x})$ is the conditional distribution that describes the statistical relation between the input and output.

Thus, $f$ is some fixed but unknown function that defines the relation between $\mathcal{X}$ and $\mathcal{Y}$. The chosen ML algorithm determines its functional form or shape. The unknown function $f$ is estimated by applying the selected learning method to the training data, $S$, so that $\hat{f}$ is a good estimator for new unseen data, i.e.

$$y \approx \hat{y} = \hat{f}(\mathbf{x}_{new}) \tag{7}$$

The predictor $f$ is parametrized by a vector $\boldsymbol{\theta} \in \mathbb{R}^n$, and describes a parametric model. In this setup, the problem of estimating $f$ reduces to one of estimating the parameters $\boldsymbol{\theta} = [\theta_1, \theta_2, \ldots, \theta_n]^T$. In most practical applications, the observed data are corrupted versions of the expected values that would be obtained under ideal circumstances. These unavoidable corruptions, typically termed noise, prevent the extraction of true parameters from the observations. With this in mind, the generic data model may be expressed as

$$y = f(\mathbf{x}) + \epsilon \tag{8}$$

where $f(\mathbf{x})$ is the model and $\epsilon$ denotes additive measurement errors and other discrepancies. The goal of ML is to find the input-output relation that will ‘‘best’’ match the noisy observations. Hence, the vector $\boldsymbol{\theta}$ may be estimated by solving a (convex) optimization problem. First, a loss or cost function $l(\mathbf{x}, y, \boldsymbol{\theta})$ is set, which is a (point-wise) measure of the error between the observed data point $y_i$ and the model prediction $\hat{f}(\mathbf{x}_i)$ for each value of $\boldsymbol{\theta}$. However, $\boldsymbol{\theta}$ is estimated on the whole training data, $S$, not just one example. For this task, the average loss over all training examples, called the training loss, $J$, is calculated:

$$J(\boldsymbol{\theta}) \equiv J(S, \boldsymbol{\theta}) = \frac{1}{m} \sum_{(\mathbf{x}_i, y_i) \in S} l(\mathbf{x}_i, y_i, \boldsymbol{\theta}) \tag{9}$$

where $S$ indicates that the error is calculated on the instances from the training set and $i = 1, \ldots, m$. The vector $\boldsymbol{\theta}$ that minimizes the training loss $J(\boldsymbol{\theta})$, that is

$$\underset{\boldsymbol{\theta} \in \mathbb{R}^n}{\operatorname{argmin}} \; J(\boldsymbol{\theta}) \tag{10}$$

will give the desired model. Once the model is estimated, for any given input $\mathbf{x}$, the prediction for $y$ can be made with $\hat{y} = \boldsymbol{\theta}^T \mathbf{x}$ (for a linear model).

In engineering parlance, the process of estimating the parameters of a model that is a mapping between input and output observations is called system identification [43]. System identification or ML classification techniques are well suited for wireless signal identification problems.

B. DEEP LEARNING
The prediction accuracy of ML models heavily depends on the choice of the data representation or features used for training. For that reason, much effort in designing ML models goes into the composition of pre-processing and data transformation chains that result in a representation of the data that can support effective ML predictions. Informally, this is referred to as feature engineering. Feature engineering is the process of extracting, combining and manipulating features by taking advantage of human ingenuity and prior expert knowledge to arrive at more representative ones, that is

$$\phi(\mathbf{d}) : \mathbf{d} \rightarrow \mathbf{x} \tag{11}$$
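The mapping in (11) can be sketched as a small hand-crafted feature extractor. The example below is only an illustration under stated assumptions: the three amplitude statistics, the function name `extract_features` and the two toy symbol sequences are hypothetical choices for this sketch, not the expert features or data used in the paper.

```python
import cmath
import math

def extract_features(d):
    """Hypothetical extractor phi: complex samples d -> feature vector x."""
    n = len(d)
    amps = [abs(s) for s in d]
    mean_amp = sum(amps) / n
    var_amp = sum((a - mean_amp) ** 2 for a in amps) / n
    # Normalized fourth-order moment of the amplitude (a kurtosis-like statistic).
    m2 = sum(a ** 2 for a in amps) / n
    m4 = sum(a ** 4 for a in amps) / n
    kurt = m4 / (m2 ** 2) if m2 > 0 else 0.0
    return [mean_amp, var_amp, kurt]

# Toy constant-envelope (QPSK-like) symbols on the unit circle.
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)] * 8
# Toy multi-level (4-PAM-like) symbols on the real axis.
pam4 = [complex(a, 0.0) for a in (-3, -1, 1, 3)] * 8

x_qpsk = extract_features(qpsk)
x_pam4 = extract_features(pam4)
print("QPSK-like features:", x_qpsk)
print("4-PAM-like features:", x_pam4)
```

In this toy case the amplitude variance alone separates the two sequences, which shows why such features can work well but also why they must be designed anew for each pair of signals to be distinguished.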

Here, the feature extractor $\phi$ transforms the data vector $\mathbf{d} \in \mathbb{R}^d$ into a new form, $\mathbf{x} \in \mathbb{R}^n$, more suitable for making predictions. The importance of feature engineering highlights the bottleneck of machine learning algorithms: their inability to automatically extract the discriminative information from data.

Feature learning is a branch of machine learning that moves the concept of learning from ‘‘learning the model’’ to ‘‘learning the features’’. One popular feature learning method is deep learning. In particular, this paper focuses on convolutional neural networks (CNN).

Convolutional neural networks perform feature learning via non-linear transformations implemented as a series of nested layers. The input data is a multidimensional data array, called a tensor, that is presented at the visible layer. This is typically a grid-like topological structure, e.g. time-series data, which can be seen as a 1D grid taking samples at regular time intervals, pixels in images with a 2D layout, a 3D structure of videos, etc. Then a series of hidden layers extract several abstract features. Those layers are ‘‘hidden’’ because their values are not given. Instead, the deep learning model must determine which data representations are useful for explaining the relationships in the observed data. Each layer consists of several kernels that perform a convolution over the input; therefore, they are also referred to as convolutional layers. Kernels are feature detectors that convolve over the input and produce a transformed version of the data at the output. Those are banks of finite impulse response filters as seen in signal processing, just learned on a hierarchy of layers. The filters are usually multidimensional arrays of parameters that are learnt by the learning algorithm [24] through a training process called backpropagation.

For instance, given a two-dimensional input $x$, a two-dimensional kernel $h$ computes the 2D convolution by

$$(x * h)_{i,j} = x[i, j] * h[i, j] = \sum_{n} \sum_{m} x[n, m] \cdot h[i - n, j - m] \tag{12}$$

i.e. the dot product between their weights and a small region they are connected to in the input.

After the convolution, a bias term is added and a point-wise nonlinearity $g$ is applied, forming a feature map at the filter output. If we denote the $l$-th feature map at a given convolutional layer as $h^l$, whose filters are determined by the coefficients or weights $W^l$, the input $x$ and the bias $b_l$, then the feature map $h^l$ is obtained as follows

$$h^l_{i,j} = g\big((W^l * x)_{i,j} + b_l\big) \tag{13}$$

where $*$ is the 2D convolution defined by Equation 12, while $g(\cdot)$ is the activation function. Typically, the rectifier activation function is used for CNNs, which is defined by $g(x) = \max(0, x)$. Kernels using the rectifier are called ReLU (Rectified Linear Unit) and have been shown to greatly accelerate the convergence during the training process compared to other activation functions. Other common activation functions are the hyperbolic tangent function (tanh), $g(x) = \frac{2}{1 + e^{-2x}} - 1$, and the sigmoid activation $g(x) = \frac{1}{1 + e^{-x}}$.

In order to form a richer representation of the input signal, commonly, multiple filters are stacked so that each hidden layer consists of multiple feature maps, $\{h^{(l)}, l = 0, \ldots, L\}$ (e.g., $L = 64, 128, \ldots$, etc.). The number of filters per layer is a tunable parameter or hyper-parameter. Other tunable parameters are the filter size, the number of layers, etc. The selection of values for hyper-parameters may be quite difficult, and finding them is as much an art as it is a science. An optimal choice may only be feasible by trial and error. The filter sizes are selected according to the input data size so as to have the right level of granularity that can create abstractions at the proper scale. For instance, for a 2D square matrix input, such as spectrograms, common choices are 3 × 3, 5 × 5, 9 × 9, etc. For a wide matrix, such as a real-valued representation of the complex I and Q samples of the wireless signal in $\mathbb{R}^{2 \times N}$, suitable filter sizes may be 1 × 3, 2 × 3, 2 × 5, etc.

The penultimate layer in a CNN consists of neurons that are fully-connected with all feature maps in the preceding layer. Therefore, these layers are called fully-connected or dense layers. The very last layer is a softmax classifier, which computes the posterior probability of each class label over $K$ classes as

$$\hat{y}_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \quad i = 1, \ldots, K \tag{14}$$

That is, the scores $z_i$ computed at the output layer, also called logits, are translated into probabilities. A loss function, $l$, is calculated on the last fully-connected layer that measures the difference between the estimated probabilities, $\hat{y}_i$, and the one-hot encoding of the true class labels, $y_i$. The CNN parameters, $\boldsymbol{\theta}$, are obtained by minimizing the loss function on the training set $\{x_i, y_i\}_{i \in S}$ of size $m$,

$$\min_{\boldsymbol{\theta}} \sum_{i \in S} l(\hat{y}_i, y_i) \tag{15}$$

where $l(\cdot)$ is typically the mean squared error $l(y, \hat{y}) = \|y - \hat{y}\|_2^2$ or the categorical cross-entropy $l(y, \hat{y}) = \sum_{i=1}^{m} y_i \log(\hat{y}_i)$, for which a minus sign is often added in front to get the negative log-likelihood.

To control over-fitting, typically regularization is used in combination with dropout, which is a new extremely effective technique that ‘‘drops out’’ a random set of activations in a layer. Each unit is retained with a fixed probability $p$, typically chosen using a validation set, or set to 0.5, which has been shown to be close to optimal for a wide range of applications [25].

C. DEEP LEARNING FROM SPECTRUM DATA
Intelligence capabilities will be of paramount importance in the development of future wireless communication systems to allow them to observe, learn and respond to their complex and dynamic operating environment. Figure 2 shows a processing pipeline for realizing intelligent behaviour using deep learning in an end-to-end learning from spectrum data setup. The pipeline consists of:

FIGURE 2. Processing pipeline for end-to-end learning from spectrum data.
1) DATA ACQUISITION
Data is a key asset in the design of future intelligent wireless networks [26]. In order to obtain spectrum data, the radio first senses its environment by collecting raw data from various spectrum bands. The raw data consist of $n$ samples, stacked into data vectors $r_k$, which represent the complex envelope of the received wireless signal. These data vectors are the input for end-to-end learning to obtain models that can reason about the presence of wireless signals.

2) DATA PRE-PROCESSING
Data pre-processing is concerned with the analysis and manipulation of the collected spectrum data with the aim of arriving at potentially good wireless data representations. The raw samples organized into data vectors $r_k$ in the previous block are pipelined as input for signal processing (SP) tools that analyze, process and transform the data to arrive at simple data representations such as frequency, amplitude, phase and spectrum, or more complex features $x_k$ such as cyclostationary features. In addition, feature learning such as deep learning may be utilized to automatically extract more low-level and high-level features. In many ML applications the choice of features is just as important as, if not more important than, the choice of the ML algorithm.

3) CLASSIFICATION
The "Classification" processing block enables intelligence capabilities to assess the environmental radio context by detecting the presence of wireless signals. This may be the type of the emitters that are utilizing the spectrum (spectrum access scheme, modulation format, wireless technology, etc.), type of interference, detecting an available spectrum band, etc. We refer to this process as spectrum learning [27]. In future wireless networks ML algorithms may play a key role in automatically classifying wireless signals as a step towards intelligent spectrum access and management schemes.

4) DECISION
The predictions calculated by the ML model are used as input for the decision module. In a CR application, a decision may be related to the best transmission strategy (e.g. frequency band or transmission power) that will maximize the data rate without causing interference to other users. This process is called spectrum decision [18]. In the context of CR-IoTs, the decision may relate to an interference mitigation strategy such as back-off for a certain time period. In other communication scenarios such as spectrum regulation, the decision may relate to a spectrum policy or spectrum compliance enforcement applied to a detected source of harmful interference (e.g. fake GSM tower, rogue access point, etc.).

IV. DATA-DRIVEN END-TO-END LEARNING FOR WIRELESS SIGNAL CLASSIFICATION
The next generation (5G) wireless networks are expected to learn the diverse characteristics of the dynamically changing wireless environment and fluctuating nature of the available spectrum, so as to autonomously determine the optimal system configuration or to support spectrum regulation.

This section introduces a data-driven end-to-end learning framework for spectrum monitoring applications in future 5G networks. First, the representation of wireless signals used in digital communication and a data model for wireless signal acquisition are introduced. Then, a data model for extracting features, creating training data and designing wireless signal classifiers is presented. In particular, deep learning is used for extracting low-level and higher-level wireless signal features and for wireless signal classification.

A. WIRELESS SIGNAL MODEL
A wireless communication system transmits information from one point to another through a wireless medium which is called a channel. At the system level, a wireless communication model consists of the following parts:

1) TRANSMITTER
The transmitter transforms the message, i.e. a stream of bits, produced by the source of information into an appropriate form for transmission over the wireless channel. Figure 3 shows the processing chain at the transmitter side. First, the bits $b_k \in \{0, 1\}$ are mapped into a new binary sequence by a coding technique. The resulting sequence is mapped to symbols $s_k$ from an alphabet or constellation which might be real or complex. This process is called modulation.

In the modulation step, the created symbols are mapped to a discrete waveform or signal via a pulse shaping filter and

sent to the digital-to-analog (D/A) converter module, where the waveform is transformed into an analog continuous-time signal, $s_b(t)$. The resulting signal is a baseband signal that is frequency shifted by the carrier frequency $f_c$ to produce the wireless signal $s(t)$ that is defined by

$$s(t) = \Re\{s_b(t) e^{j2\pi f_c t}\} = \Re\{s_b(t)\}\cos(2\pi f_c t) - \Im\{s_b(t)\}\sin(2\pi f_c t) \tag{16}$$

where $s(t)$ is a real-valued bandpass signal with center frequency $f_c$, while $s_b(t) = \Re\{s_b(t)\} + j\Im\{s_b(t)\}$ is the baseband complex envelope of $s(t)$.

FIGURE 3. End-to-end learning processing chain to obtain radio spectrum feature vectors.

2) WIRELESS CHANNEL
The wireless channel is characterised by the variations of the channel strength over time and over frequency. The variations are modeled as (i) large-scale fading, which characterizes the path loss of the channel as a function of distance and shadowing by large objects such as buildings and hills, and (ii) small-scale fading, which models constructive and destructive interference of the multiple propagation paths between the transmitter and receiver. The channel effects can be modeled as a linear time-varying system described by a complex finite impulse response (FIR) filter $h(t, \tau)$. If $r(t)$ is the signal at the channel output, the input/output relation is given by:

$$r(t) = s(t) * h(t, \tau) \tag{17}$$

where $h(t, \tau)$ is the band-limited bandpass channel impulse response, while $*$ denotes the convolution operation.

3) RECEIVER
The wireless signal at the receiver output will be a corrupted version of the transmitted signal due to channel impairments and hardware imperfections of the transmitter and receiver. Typical hardware related impairments are:
• Noise caused by the resistive components such as the receiver antenna. This thermal noise may be modelled as additive white Gaussian noise (AWGN), $n \sim \mathcal{N}(0, \sigma^2)$.
• Frequency offset caused by the slightly different local oscillator (LO) signal frequencies at the transmitter, $f_c$, and receiver, $f_c'$.
• Phase noise, $\varphi(t)$, caused by the frequency drift in the LOs used to demodulate the received wireless signal. It causes the angle of the LO signals to drift around its intended instantaneous phase $2\pi f_c t$.
• Timing drift caused by the difference in sample rates at the receiver and transmitter.

The received wireless signal model can be given by $r(t) = \Re\{r_b(t) e^{j2\pi f_c t}\}$, where $r_b(t)$ is the baseband complex envelope defined by

$$r_b(t) = \frac{1}{2}\big(s_b(t) * h_b(t, \tau)\big) e^{j2\pi (f_c - f_c') t + \varphi(t)} + n(t) \tag{18}$$

where $h_b(t, \tau)$ is the baseband channel equivalent with $l$ distinct propagation paths, each characterised by a time-varying path attenuation $\alpha_i(t, \tau_i)$ and path delay $\tau_i$, given by

$$h_b(t, \tau) = \sum_{i=0}^{l} \alpha_i(t, \tau) e^{j2\pi f_c \tau_i(t)} \delta(\tau - \tau_i(t)) \tag{19}$$

B. DATA ACQUISITION
To derive a machine learning model for wireless signal identification, adequate training data needs to be collected. Figure 3 summarizes the data acquisition process for collecting wireless signal features. The received signal, $r(t)$, is first amplified, mixed, low-pass filtered and then sent to

the analog-to-digital (A/D) converter, which samples the continuous-time signal at a rate $f_s = 1/T_s$ samples per second and generates the discrete version $r_n$. The discrete signal $r_n = r[nT_s]$ consists of two components, the in-phase, $r_I$, and quadrature component, $r_Q$, i.e.

$$r_n := r[n] = r_I[n] + j r_Q[n] \tag{20}$$

Suppose we sample for a period $T$ and collect a batch of $N$ samples. The signal samples $r[n] \in \mathbb{C}$, $n = 0, \dots, N-1$, are a time-series of complex raw samples which may be represented as a data vector. The $k$-th data vector can be denoted as

$$r_k = [r[0], \dots, r[N-1]]^T \tag{21}$$

These data vectors $r_k$ are windowed or segmented representations of the received continuous sample stream, similar to what is seen in audio signal processing. They carry information for assessing which type of wireless signal is sensed. This may be the type of modulation, the type of wireless technology, interferer, etc.

C. WIRELESS SIGNAL REPRESENTATION
After collecting the $k$-th data vector, the ML receiver baseband processing chain transforms it into a new representation suitable for training. That is, the $k$-th data vector $r_k \in \mathbb{C}^N$ is translated into the $k$-th feature vector $x_k \in \mathbb{R}^N$

$$r_k \mapsto x_k \tag{22}$$

This paper considers three simple data representations. The first is a real-valued equivalent of the raw complex temporal wireless signal inspired by the results in [9]. The second is based on the amplitude and phase of the raw wireless signal, similar to the one used in the work of Selim et al. [10] for identifying radar signals. The last is a frequency domain representation inspired by the work of Danev and Capkun [28], which showed that frequency-based features outperform their time-based equivalents for wireless device identification. Each data representation snapshot has a fixed length of $N$ points.

For each transformation, data is visualized to form some intuition about which data representation may provide the most discriminative features for machine learning. The following data/signal transformations are used:

Transformation 1 (IQ vector): The IQ vector is a mapping of the raw complex samples, i.e. data vector $r_k \in \mathbb{C}^N$, into two sets of real-valued data vectors, one that carries the in-phase samples $x_i$ and one that holds the quadrature component values $x_q$. That is

$$x_k^{IQ} = \begin{bmatrix} x_i^T \\ x_q^T \end{bmatrix} \tag{23}$$

so that $x_k^{IQ} \in \mathbb{R}^{2 \times N}$. Mathematically, this may be written as

$$f : \mathbb{C}^N \to \mathbb{R}^{2 \times N} \tag{24}$$

$$r_k \mapsto x_k^{IQ} \tag{25}$$

Transformation 2 (A/φ vector): The A/φ vector is a mapping from the raw complex data vector $r_k \in \mathbb{C}^N$ into two real-valued vectors, one that represents its phase, $\phi$, and one that represents its magnitude $A$, i.e.

$$x_k^{A/\phi} = \begin{bmatrix} x_A^T \\ x_\phi^T \end{bmatrix} \tag{26}$$

where $x_k^{A/\phi} \in \mathbb{R}^{2 \times N}$, and the phase, $x_\phi \in \mathbb{R}^N$, and magnitude vectors, $x_A \in \mathbb{R}^N$, have the elements

$$x_{\phi_n} = \arctan\left(\frac{r_{q_n}}{r_{i_n}}\right), \quad x_{A_n} = \left(r_{q_n}^2 + r_{i_n}^2\right)^{1/2}, \quad n = 0, \dots, N-1 \tag{27}$$

In short, this may be written as

$$f : \mathbb{C}^N \to \mathbb{R}^{2 \times N} \tag{28}$$

$$r_k \mapsto x_k^{A/\phi} \tag{29}$$

Transformation 3 (FFT vector): The FFT vector is a mapping from the raw time-domain complex data vector $r_k \in \mathbb{C}^N$ into its frequency-domain representation vector consisting of two sets of real-valued data vectors, one that carries the real component of its complex FFT, $x_{F_{re}}$, and one that holds the imaginary component of its FFT, $x_{F_{im}}$. That is

$$x_k^{F} = \begin{bmatrix} x_{F_{re}}^T \\ x_{F_{im}}^T \end{bmatrix} \tag{30}$$

The translation to the frequency domain is performed by a Fast Fourier Transform (FFT) denoted by $\mathcal{F}$ so that

$$\mathcal{F} : r_k \mapsto w \tag{31}$$

$$x_{F_{re}} = \Re\{w\} \tag{32}$$

$$x_{F_{im}} = \Im\{w\} \tag{33}$$

Here, $w \in \mathbb{C}^N$ and $x_{F_{re}}, x_{F_{im}} \in \mathbb{R}^N$, while $\Re\{\cdot\}$ and $\Im\{\cdot\}$ can be conceived as operators giving the real and imaginary parts of a complex vector, respectively. Thus, the resulting FFT data vector is $x_k^{F} \in \mathbb{R}^{2 \times N}$. In short, this may be denoted as

$$f : \mathbb{C}^N \to \mathbb{R}^{2 \times N} \tag{34}$$

$$r_k \mapsto x_k^{F} \tag{35}$$

Figures 4, 5 and 6 visualize examples of IQ, A/φ and FFT feature vectors, respectively.

The visualizations show representations for different modulation formats passed through a channel model with impairments as described in Section IV-A. These are examples of 128 samples for modulation formats taken from the "RadioML Modulation" dataset introduced in Section V-A. Figure 4 shows $x_k^{IQ}$ time plots of the raw sampled complex signal at the receiver for different modulation types. Figure 5 shows the amplitude and phase time plots for modulation format examples. Figure 6 shows their frequency magnitude spectrum. It can be seen that the signals are corrupted due to the wireless channel effects and transmitter-receiver synchronization imperfections, but there are still distinctive patterns
that can be used for deep learning to extract high-level features for wireless signal identification.

FIGURE 4. I and Q signals time plot for various modulation schemes. (a) BPSK. (b) QPSK. (c) 8PSK. (d) QAM16. (e) QAM64. (f) CPFSK. (g) GFSK. (h) PAM4.

The motivation behind using these three transformations is
to train three deep learning models where: one will explore
the raw data to discover the patterns and temporal features
solely from raw samples, one will see the amplitude and
phase information in the time domain, while the third will
see the frequency domain representation to perform feature
extraction in the frequency space.
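
The three transformations of Section IV-C can be illustrated with a short NumPy sketch. This is a minimal illustration rather than the implementation used for the experiments: the function names `to_iq`, `to_amp_phase` and `to_fft` are hypothetical, and `arctan2` is used in place of the plain arctangent of the quadrature-to-in-phase ratio, an assumption made here so that the quadrant is preserved and a zero in-phase sample does not cause a division by zero.

```python
import numpy as np

def to_iq(r):
    """IQ vector: row 0 holds in-phase samples, row 1 holds quadrature samples."""
    r = np.asarray(r, dtype=np.complex128)
    return np.stack([r.real, r.imag])

def to_amp_phase(r):
    """A/phi vector: row 0 holds the magnitude, row 1 holds the phase in radians."""
    r = np.asarray(r, dtype=np.complex128)
    # Assumption: arctan2 replaces arctan(r_q / r_i) to keep the quadrant.
    return np.stack([np.abs(r), np.arctan2(r.imag, r.real)])

def to_fft(r):
    """FFT vector: row 0 holds the real part of the FFT, row 1 the imaginary part."""
    w = np.fft.fft(np.asarray(r, dtype=np.complex128))
    return np.stack([w.real, w.imag])

# Usage on a synthetic complex data vector of N = 128 samples (random data,
# used only to show that every representation has the same 2 x N shape).
rng = np.random.default_rng(0)
N = 128
r_k = rng.standard_normal(N) + 1j * rng.standard_normal(N)
for transform in (to_iq, to_amp_phase, to_fft):
    assert transform(r_k).shape == (2, N)
```

All three functions return a real array of shape 2 × N, which is what allows a single network input layer to be shared across the representations.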
We investigate how the choice of data representation influ-
ences the classification accuracy. The data representations
have been carefully designed so that all of them create a
vector of the same dimension and type in $\mathbb{R}^{2 \times N}$. The reason for that is to obtain a unified vector shape, which allows the same CNN architecture to be used for training on all three data representations and for different use cases.

FIGURE 5. Constellation diagram, Amplitude and Phase signal time plot for various modulation schemes. (a) BPSK. (b) QPSK. (c) 8PSK. (d) QAM16. (e) QAM64. (f) CPFSK. (g) GFSK. (h) PAM4.

D. WIRELESS SIGNAL CLASSIFICATION
The problem of identifying the wireless signals from spectrum data can be treated as a data-driven machine learning classification problem. In order to apply ML techniques to this setup, as described in Section III-A, the wireless communication problem has to be formulated as a parametric estimation problem where certain parameters are unknown and need to be estimated.

Given a set of $K$ wireless signals to be detected, the problem of identifying a signal from this set turns into a $K$-class classification problem. Suppose a data measurement point knows the transmitted signal type (e.g. modulation type, interfering emitter type, etc.) for a time period $t = [0, T)$ (i.e. a "training period") and collects several complex baseband time series of $n$ measurements for each signal type into a data vector $r_k$, as described in Section IV-B. In total, $m$ snapshots for the data vectors $r_k$ are collected. These data vectors contain emitted signals with distinctive features. In order to extract these features, each data vector is transformed into a feature vector, $x_k$, according to the data transformations introduced in Section IV-C, and the results are stacked into an observation matrix $X \in \mathbb{R}^{m \times n}$. Each data vector is further annotated with the corresponding wireless signal type in the form of a discrete one-hot encoded vector $y_k \in \mathbb{R}^K$, $k = 1, \dots, m$.

The obtained data pairs, $\{(x_k, y_k), k = 1, \dots, m\}$, form a dataset suitable to estimate the parameters, $\theta$, that characterize the wireless signal classifier, $f$.

It is instructive to note that the training phase presumes a priori information about the type of wireless signal that was used at the transmitter. However, once the classifier is trained this information will no longer be necessary and the signals

may be automatically identified by the model. That is, for the $i$-th spectrum data vector input, $x_i$, the predictor's last layer can automatically output an estimate of the probability $P(y_i = k \mid x_i; \theta)$, where $k$ ranges from 0 to $K - 1$. That is, a class score. Finally, the predicted class is then the one with the highest score, i.e. $\hat{y}_i = \arg\max_k P(y_i = k \mid x_i; \theta)$.

FIGURE 6. Frequency magnitude spectrum for various modulation schemes. (a) BPSK. (b) QPSK. (c) 8PSK. (d) QAM16. (e) QAM64. (f) CPFSK. (g) GFSK. (h) PAM4.

V. EVALUATION SETUP
To evaluate end-to-end learning from spectrum data, we train CNN wireless signal classifiers for two use cases: (i) Radio signal modulation recognition and (ii) Wireless interference identification, for different wireless data representations.

Radio signal modulation recognition relates to the problem of identifying the modulation structure of the received wireless signal in spectrum monitoring tasks, as a step towards understanding what type of communication scheme and emitter is present. Modulation recognition is vital for radio spectrum regulation and in dynamic spectrum access applications.

Wireless interference identification is the task of identifying the type of coexisting wireless emitter that is operating in the same frequency band. This is essential for effective interference mitigation and coexistence management in unlicensed frequency bands such as, for example, the 2.4GHz industrial, scientific and medical (ISM) band shared by heterogeneous wireless communication systems.

For each task the CNNs were trained on three characteristic data representations: IQ vectors, Amplitude/Phase vectors and FFT vectors, as introduced in Section IV-C. As a result, for each task three datasets, $S$, one per data transformation, are created. That is,

$$S^{IQ} = \{(x_k^{IQ}, y_k), k = 1, \dots, m\} \tag{36}$$

$$S^{A/\phi} = \{(x_k^{A/\phi}, y_k), k = 1, \dots, m\} \tag{37}$$

$$S^{F} = \{(x_k^{F}, y_k), k = 1, \dots, m\} \tag{38}$$

where $m$ is on the order of tens of thousands of instances.

A. DATASETS DESCRIPTION
1) RADIO MODULATION RECOGNITION
To evaluate end-to-end learning for radio modulation type identification, we consider measurements of the received wireless signal for various modulation formats from the "RadioML 2016.10a Modulation" dataset [9]. Specifically, for all experiments performed in this paper we used labelled data vectors for the following modulation formats, eight digital and three analog: BPSK, QPSK, 8-PSK, 16-QAM, 64-QAM, CPFSK, GFSK, 4-PAM, WBFM, AM-DSB, AM-SSB. The data vectors, $x_k$, were collected at a sampling rate of 1MS/s in $N = 128$ sample batches, each containing between 8 and 16 symbols corrupted by random noise, time offset, phase, and wireless channel distortions as described by the channel model in Section IV-A. One-hot encoding is used to create a discrete set of 11 class labels corresponding to the 11 considered modulations, so that the response variable forms a binary 11-vector $y_k \in \mathbb{R}^{11}$. The task of modulation recognition is then an 11-class classification problem. In total, 220,000 data vectors $x_k \in \mathbb{R}^{2 \times 128}$ consisting of I and Q samples are used.

2) WIRELESS INTERFERENCE IDENTIFICATION IN ISM BANDS
The rise of heterogeneous wireless technologies operating in the unlicensed ISM bands has caused severe communication challenges due to cross-technology interference, which adversely affects the performance of wireless networks. To tackle these challenges, novel agile methods that can assess the channel conditions are needed. We showcase end-to-end learning as a promising approach that can determine whether communication is feasible over the wireless link by accurately identifying cross-technology interference. Specifically, the "Wireless interference" dataset [12] is used, which consists of measurements gathered from standardized wireless communication systems based on the IEEE 802.11b/g (WiFi), IEEE 802.15.4 (Zigbee) and IEEE 802.15.1 (Bluetooth) standards, operating in the 2.4GHz frequency band. The dataset is labelled according to the allocated frequency channel and the corresponding wireless technology, resulting in 15 different classes. Compared to the modulation recognition dataset, this dataset consists of measurements gathered assuming a communication channel model with fewer channel impairments. In particular, a flat fading channel with additive white Gaussian noise was assumed. I and Q samples were collected at a sampling rate of 10MS/s in batches of 128 each, thereby capturing 1 to 12 symbols for each utilized wireless technology depending on the symbol duration. In total, 225,225 snapshots were collected.
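
The one-hot labelling used for both datasets can be sketched in a few lines of NumPy. This is an illustrative sketch only: the helper names `one_hot` and `decode` are hypothetical, and the ordering of the class list is an assumption (the text lists the eleven modulation formats but does not fix an index order).

```python
import numpy as np

# Assumption: class order chosen here for illustration only.
MODULATIONS = ["BPSK", "QPSK", "8PSK", "QAM16", "QAM64", "CPFSK",
               "GFSK", "PAM4", "WBFM", "AM-DSB", "AM-SSB"]

def one_hot(labels, classes):
    """Map class names to binary vectors with a single 1 at the class index."""
    index = {name: i for i, name in enumerate(classes)}
    y = np.zeros((len(labels), len(classes)), dtype=np.float32)
    for row, name in enumerate(labels):
        y[row, index[name]] = 1.0
    return y

def decode(y, classes):
    """Recover class names from one-hot rows (or from per-class scores) via argmax."""
    return [classes[i] for i in np.argmax(y, axis=1)]

y = one_hot(["QPSK", "BPSK"], MODULATIONS)  # shape (2, 11)
```

The same `decode` step applies to the classifier output: taking the argmax over the per-class scores yields the predicted label.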

B. CNN NETWORK STRUCTURE
The convolutional neural network structure utilized for end-to-end learning from spectrum data is derived from O'Shea et al. [9], i.e. the CNN2 network, as it has been shown to significantly outperform traditional signal identification approaches.

TABLE 1. CNN structure.

Table 1 provides a summary of the utilized CNN network. The visible layer of the network has a unified size of 2 × 128, receiving either IQ, FFT or Amplitude/Phase captured data vectors, $x_k \in \mathbb{R}^{2 \times 128}$, that contain sample values of the complex wireless signal. Two hidden convolutional layers further extract high-level features from the input wireless signal representation using kernels and ReLU activation functions. The first convolutional layer consists of 256 stacked filters of size 1 × 3 that perform a 2D convolution on the input complex signal representation, padded such that the output has the same length as the original input. These filters generate 256 feature maps of size 2 × 128 that are fed as input to the second layer, which has 80 filters of size 2 × 3. To reduce overfitting, regularization is used in each layer with a Dropout $p = 0.6$. Finally, a fully connected layer with 256 neurons and ReLU units is added. The output of this layer is fed to a softmax classifier that estimates the likelihood of the input signal, $x$, belonging to a particular class, $y$. That is $P(y = k \mid x; \theta)$, where $k$ is a one-hot encoded vector so that $k \in \mathbb{R}^{15}$ for the wireless interference identification case, and $k \in \mathbb{R}^{11}$ for modulation recognition.

C. IMPLEMENTATION DETAILS
The CNNs were trained and validated using the Keras [29] library on a high-performance computing platform on the Amazon Elastic Compute Cloud (EC2) with the central processing unit (CPU) Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, with 60GB RAM and the CUDA-enabled graphics processing unit (GPU) Nvidia Tesla K80. For both use cases, 67% randomly selected examples are used for training in batch sizes of 1024, and 33% for testing and validation. Hence, for modulation recognition 147,400 examples are used for training, while 72,600 examples are used for testing and validation. For the task of interference identification, 151,200 examples are training examples, while 74,025 examples are used to test the model. Both sets of examples are uniformly distributed in Signal-to-Noise Ratio (SNR) from −20dB to +20dB and tagged so that performance can be evaluated on specific subsets. To ensure that the trained CNN can accurately detect signals under time-varying wireless channel conditions, the wireless training data used as input to the CNN learning process need to be sufficiently large and flexible by means of incorporating varying channel distortions on the emitted signal. Once the filter coefficients of the CNN model are extracted, the model may detect the type of the sensed wireless signal in real time. The detection efficiency depends on the complexity of the CNN network structure used at prediction time, i.e. the time needed to calculate the convolutions and activations in all neurons.

We selected the Adaptive moment estimation (Adam) optimizer [30] to estimate the model parameters with a learning rate $\alpha = 0.001$ to ensure convergence. To speed up the model learning and convergence procedure, the input data was normalized and ReLU activation units were selected. The CNNs were trained for 70 epochs and the model with the lowest validation loss was selected for evaluation. In total, 6 CNNs were trained, i.e. one for each use case and signal representation: three for modulation recognition, $CNN^{M}_{IQ}$, $CNN^{M}_{A/\phi}$ and $CNN^{M}_{F}$, and three for technology identification, $CNN^{IF}_{IQ}$, $CNN^{IF}_{A/\phi}$ and $CNN^{IF}_{F}$. The training time on the GPU was approximately 60s per epoch for the CNNs performing interference identification, and 42s per epoch for the modulation recognition CNNs.

D. PERFORMANCE METRICS
In order to characterize and compare the prediction accuracy of the end-to-end wireless signal classification models that recognize modulation type or identify interference, we need to measure how well their predictions match the true response value of the observed spectrum data. Therefore, the performance of the end-to-end signal classification methods can be quantified by means of the prediction accuracy on a test data sample. If the true value and the estimate of the signal classifiers for any instance $i$ are given by $y_i$ and $\hat{y}_i$, respectively, then the overall classification test error over $m_{test}$ testing snapshots can be defined in the following way:

$$E_{test} = \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} l(\hat{y}_i, y_i) \tag{39}$$

The classification accuracy is then obtained with $1 - E_{test}$.

Furthermore, for each signal snapshot in the test set, intermediate statistics, i.e. the number of true positives (TP), false positives (FP) and false negatives (FN), are calculated as follows:
• If a signal is detected as being from a particular class and it is also annotated as such in the labelled test data, that instance is regarded as TP.
• If a signal is predicted as being from a particular class but does not belong to that class according to the labelled test data, that instance is regarded as FP.
• If a signal is not detected in a particular instance but it is present in that instance in the labelled test data, that instance is regarded as FN.
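
The counting rules above can be made concrete with a small NumPy sketch that tallies TP, FP and FN per class from integer class labels and then forms the standard ratios precision = TP/(TP+FP), recall = TP/(TP+FN) and their harmonic mean F1. The function names are hypothetical, and weighting each class by its share of test instances is an assumption about how a prevalence-weighted average would typically be formed, not a description of the authors' evaluation code.

```python
import numpy as np

def per_class_counts(y_true, y_pred, num_classes):
    """Return arrays (tp, fp, fn), each with one entry per class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.zeros(num_classes, dtype=int)
    fp = np.zeros(num_classes, dtype=int)
    fn = np.zeros(num_classes, dtype=int)
    for c in range(num_classes):
        tp[c] = np.sum((y_pred == c) & (y_true == c))  # predicted c, labelled c
        fp[c] = np.sum((y_pred == c) & (y_true != c))  # predicted c, labelled otherwise
        fn[c] = np.sum((y_pred != c) & (y_true == c))  # labelled c, predicted otherwise
    return tp, fp, fn

def weighted_scores(y_true, y_pred, num_classes):
    """Prevalence-weighted averages of per-class precision, recall and F1."""
    tp, fp, fn = per_class_counts(y_true, y_pred, num_classes)
    with np.errstate(divide="ignore", invalid="ignore"):
        precision = np.where(tp + fp > 0, tp / (tp + fp), 0.0)
        recall = np.where(tp + fn > 0, tp / (tp + fn), 0.0)
        f1 = np.where(precision + recall > 0,
                      2 * precision * recall / (precision + recall), 0.0)
    support = tp + fn                  # number of test instances per class
    weights = support / support.sum()  # class prevalence in the test set
    return (float(precision @ weights), float(recall @ weights),
            float(f1 @ weights))

# Usage with three classes and five test snapshots.
p_avg, r_avg, f1_avg = weighted_scores([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], 3)
```

Classes that never occur or are never predicted get a score of zero here instead of raising a division error; other conventions are possible.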

The intermediate statistics are accumulated over all instances in the test set and used to derive three further performance metrics, precision (P), recall (R) and F1 score:

$$P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN} \tag{40}$$

$$F1\ \text{score} = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{41}$$

Precision, recall and F1 score are per-class performance metrics. In order to obtain one measure that quantifies the overall performance of the classifier, multiple per-class performance measures are combined using a prevalence-weighted macro-average across the class metrics, $P_{avg}$, $R_{avg}$ and $F1_{avg}$. For a detailed overview of the per-class performance the confusion matrix is used.

TABLE 2. Performance comparison for the trained CNN signal classifier models for three SNR scenarios.

E. NUMERICAL RESULTS
1) CLASSIFICATION PERFORMANCE
The CNN network described in Table 1 is trained on three data representations for two wireless signal identification problems. Table 2 provides the averaged performance for the six classifiers. That is, the prevalence-weighted macro-average of precision, recall and F1 score under three SNR scenarios, high (SNR=18dB), medium (SNR=0dB) and low (SNR=−8dB).

We observe that the models for interference classification show better performance compared to the modulation recognition case. For high SNR conditions, the $CNN^{IF}$ models achieve a $P_{avg}$, $R_{avg}$ and $F1_{avg}$ between 0.98 and 0.99. For medium SNR the metrics are in the range of 0.94 to 0.99, while under low SNR conditions the performance slightly degrades to 0.81-0.90. The $CNN^{M}$ models show less robustness to varying SNR conditions, and in general achieve lower classification performance for all scenarios. In particular, under high SNR conditions, depending on the used data representation, the achieved $P_{avg}$, $R_{avg}$ and $F1_{avg}$ are in the range of 0.67-0.86. For medium SNR, the performance degrades more than for the $CNN^{IF}$ models, with a $P_{avg}$, $R_{avg}$ and $F1_{avg}$ in the range of 0.59-0.75. Under low SNR, the $CNN^{M}$ models show poor performance with the metric values in the range of 0.22-0.36.

This may be explained by the different channel models used for generating the datasets for the two case studies, and the type of signals that need to be discriminated in each problem. For instance, for the IF case a simple channel model with flat fading was considered, while for modulation recognition the channel model was a time-varying multipath fading channel and other transceiver impairments were also taken into account. Hence, the modulation recognition dataset used a more realistic channel model in the data collection process. However, this impacts the classification performance because it is more challenging to design a robust signal classifier for this case compared to the channel condition considered in the IF classification problem. Furthermore, the signals that are classified for IF detection have different characteristics by design. In particular, they use different medium access schemes, channel bandwidths and modulation techniques, which makes it easier for the classifier to differentiate them. In contrast, the selected modulation recognition signals are more similar to each other, because subsets of modulations are based on similar design principles (e.g. all are single carrier modulations).

To understand the results better, confusion matrices for the $CNN^{M}_{IQ}$, $CNN^{M}_{A/\phi}$ and $CNN^{M}_{F}$ models are presented in Figure 7 for the case of SNR=6dB. It can be seen that the classifiers show good performance by discriminating AM-DSB, AM-SSB, BPSK, CPFSK, GFSK and PAM4 with high accuracy for all three data representations. The main discrepancies are that of QAM16 misclassified as QAM64, which can be explained by the underlying dataset. QAM16 is a subset of QAM64, making it difficult for the classifier to differentiate them. It can be further noticed that the amplitude/phase information helped the model better discriminate QAM16/QAM64, leading to a clearer diagonal for the $CNN^{M}_{A/\phi}$ compared to $CNN^{M}_{IQ}$. There are further difficulties in separating AM-DSB and WBFM signals. This confusion may be caused by periods of absence of the signal, as the modulated signals were created from real audio streams. In case of using the frequency spectrum data, it can be noticed that the $CNN^{M}_{F}$ classifier confuses mostly QPSK, 8PSK, QAM16 and QAM64, which is due to their similarities in the frequency domain after channel distortions, making the received symbols indiscernible from each other.

2) NOISE SENSITIVITY
In this section, we evaluate the detection performance for the CNN signal classifiers under different noise levels. This makes it possible to investigate the communication range over which the classifiers can be effectively used. To estimate the sensitivity to noise the same testing sets were used, labelled with SNR

values from −20dB to +20dB and fed into the signal classifiers to obtain the estimated values for each SNR.

FIGURE 7. Confusion matrices for the modulation recognition data for SNR 6dB. (a) $CNN^{M}_{IQ}$. (b) $CNN^{M}_{A/\phi}$. (c) $CNN^{M}_{F}$.

Figures 8 and 9 show the obtained results for the modulation recognition and IF identification models, respectively.

FIGURE 8. Performance results for modulation recognition classifiers vs. SNR.

FIGURE 9. Performance results for interference identification classifiers vs. SNR.

a: MODULATION RECOGNITION CASE
Figure 8 shows that all three modulation recognition CNN models have similar performance for very low SNRs (<−10dB), for medium SNRs the $CNN^{M}_{IQ}$ outperforms the $CNN^{M}_{A/\phi}$ and $CNN^{M}_{F}$ models by 2-5dB, while for high SNR conditions (>5dB) the $CNN^{M}_{A/\phi}$ model outperforms the $CNN^{M}_{IQ}$ and $CNN^{M}_{F}$ models with up to 2% and 12% accuracy improvements, respectively. O'Shea et al. [9] used IQ data and reported higher accuracy than the results we obtained. We were not able to reproduce their results after various attempts on the IQ data, which may be due to differences in the dataset (e.g. number of training examples), train/test split and hyper-parameter tuning. However, we noticed that the amplitude/phase representation helped the model discriminate the modulation formats better compared to raw IQ time-series data for high SNR scenarios. We regret that results for amplitude/phase representations were not reported in [9] too, as this might have helped improve performance. Using the frequency spectrum data did not improve the classification accuracy compared to the IQ data. This is expected as the underlying dataset has many modulation classes which exhibit common characteristics in the frequency domain after the channel distortion and receiver imperfection effects, particularly QPSK, 8PSK, QAM16 and QAM64. This makes the frequency spectrum a sub-optimal representation for this classification problem.

b: INTERFERENCE DETECTION CASE
The IF identification models in Figure 9 show in general better performance compared to the modulation recognition
classifiers, where the $\mathrm{CNN}^{IF}_{F}$ model showed the best performance in all SNR scenarios. In particular, for low SNR scenarios significant improvements can be noticed compared to the $\mathrm{CNN}^{IF}_{A/\phi}$ and $\mathrm{CNN}^{IF}_{I/Q}$ models, with a performance gain of at least ∼4 dB and a classification accuracy improvement of at least ∼9%. Schmidt et al. [12] used IQ and FFT data representations and reported similar results as our $\mathrm{CNN}^{IF}_{I/Q}$ and $\mathrm{CNN}^{IF}_{F}$ models. However, we again noticed that the amplitude/phase representation is beneficial for discriminating signals compared to raw IQ data. Still, the IF identification classifier performed best on FFT data representations. This may be explained by the fact that the wireless signals from the ISM band standards (ZigBee, WiFi and Bluetooth) have more expressive features in the frequency domain, as they have different frequency spectrum characteristics in terms of bandwidth and modulation/spreading method.
Examples of other existing research attempts that study the application of CNNs to radio signal identification are [10] and [11]. Selim et al. [10] trained a CNN with 5 convolutional and 2 fully connected layers to identify radar signals based on amplitude and phase shift data. Compared to the methodology presented in our work, Selim et al. [10] solved a binary classification problem, and as such the model is evaluated using the probability of radar pulse detection as a metric. Akeret et al. [11] trained a CNN based on the U-Net [31] architecture to detect RF interference in radio astronomy applications. They use different performance metrics, such as the area under the curve (AUC) and the receiver operating characteristic (ROC) curve, without a noise sensitivity performance analysis (model accuracy vs. SNR).

3) TAKEAWAYS
End-to-end learning is a powerful tool for data-driven spectrum monitoring applications. It can be applied to various wireless signals to effectively detect the presence of radio emitters in a unified way without requiring the design of expert features. Experiments have shown that the performance of wireless signal classifiers depends on the data representation used. This suggests that investigating several data representations is important to arrive at accurate wireless signal classifiers for a particular task. Furthermore, the choice of data representation depends on the specifics of the problem, i.e. the wireless signal types considered for classification. Signals within a dataset that exhibit similar characteristics in one data representation are more difficult to discriminate, which puts a higher burden on the model learning procedure. Choosing the right wireless data representation can notably increase the classification performance, and domain knowledge about the specifics of the underlying signals targeted in the spectrum monitoring application can assist in this choice. Additionally, the performance of the classifier can be improved by increasing the quality of the wireless signal dataset, by adding more training examples and more variation among the examples (e.g. varying channel conditions), and by tuning the model hyper-parameters.

VI. OPEN CHALLENGES
Despite the encouraging research results, a deep learning-based end-to-end learning framework for spectrum utilization optimization is still in its infancy. In the following we discuss some of the most important challenges posed by this exciting interdisciplinary field.

A. SCALABLE SPECTRUM MONITORING
The first requirement for a cognitive spectrum monitoring framework is to have an infrastructure that will support scalable spectrum data collection, transfer and storage. In order to obtain a detailed overview of the spectrum use, the end devices will be required to perform distributed spectrum sensing [32] over a wide frequency range and cover the area of interest. In order to limit the data overhead caused by the huge amounts of I and Q samples that are generated by monitoring devices, the predictive models can be pushed to the end devices themselves. Recently, Rajendran et al. [33] proposed Electrosense, an initiative for large-scale spectrum monitoring in different regions of the world using low-cost sensors and providing the processed spectrum data as open spectrum data. Access to large datasets is crucial for evaluating research advances and for enabling a playground for wireless communication researchers who are interested in acquiring a deeper knowledge of spectrum usage and in extracting meaningful knowledge that can be used to design better wireless communication systems.

B. SCALABLE SPECTRUM LEARNING
The heterogeneity of technologies operating in different radio bands requires continuous monitoring of multiple frequency bands, making the volume and velocity of radio spectrum data several orders of magnitude higher compared to the typical data seen in other wireless communication systems such as wireless sensor networks (e.g. temperature, humidity reports, etc.). In order to handle this large volume of data and extract meaningful information over the entire spectrum, a scalable platform for processing, analysing and learning from big spectrum data has to be designed and implemented [3], [34]. Efficient data processing and storage systems and algorithms for massive spectrum data analytics [35] are needed to extract valuable information from such data and incorporate it into the spectrum decision/policy process in real-time.

C. FLEXIBLE SPECTRUM MANAGEMENT
One of the main communication challenges for 5G will be inter-cell and cross-technology interference. To support spectrum decisions and policies in such a complex system, 5G networks need to support an architecture for flexible spectrum management.
Software-ization at the radio level will be a key enabler for flexible spectrum management, as it allows automation of the collection of spectrum data and flexible control and reconfiguration of cognitive radio elements and parameters. There are several individual works that focused on this issue. Some initiatives for embedded devices are WiSCoP [36],
Atomix [37] and [38]. Recently, there is also a growing interest in academia and industry to apply Software Defined Networking (SDN) and Network Function Virtualization (NFV) to wireless networks [39]. Initiatives such as SoftAir [40], Cloud RAN [41], OpenRadio [42] and several others are still at the conceptual or prototype level. To realize flexible spectrum management strategies and bring them to a commercial perspective, a great deal of standardization effort is still required.

D. SPECTRUM PRIVACY
The introduction of intelligent wireless systems raises several privacy issues. The spectrum will be monitored via heterogeneous radios including wireless sensor networks (WSNs), radio-frequency identification (RFID), cellular phones and others, which may lead to misuse of the applications and cause severe privacy-related threats. Therefore, privacy is required at the spectrum data collection level. As spectrum data may be shared along the way, privacy also has to be maintained at the data sharing levels. Thus, data anonymization, restricted data access, proper authentication and strict control of intelligent radio users are required.

VII. CONCLUSION
This paper presents a comprehensive and systematic introduction to end-to-end learning from spectrum data - a deep learning based unified approach for realizing various wireless signal identification tasks, which are the main building blocks of spectrum monitoring systems. The approach revolves around the systematic application of deep learning techniques to obtain accurate wireless signal classifiers in an end-to-end learning pipeline. In particular, convolutional neural networks (CNNs) lend themselves well to this setting, because they consist of many layers of processing units that make it possible to (i) automatically extract non-linear and more abstract wireless signal features that are invariant to local spectral and temporal variations, and (ii) train wireless signal classifiers that can outperform traditional approaches.
With the aim to raise awareness of the potential of this emerging interdisciplinary research area, first, machine learning, deep learning and CNNs were briefly introduced and a reference model for their application to spectrum monitoring scenarios was proposed. Then, a framework for end-to-end learning from spectrum data was presented. In particular, wireless data collection and the design of wireless signal features and classifiers suitable for several wireless signal identification tasks were elaborated. Three common wireless signal representations were defined: the raw IQ temporal wireless signal, the time domain amplitude and phase information data, and the spectral magnitude representation. The presented methodology was validated on two active wireless signal identification research problems: (i) modulation recognition, crucial for dynamic spectrum access applications, and (ii) wireless interference identification, essential for effective interference mitigation strategies in unlicensed bands. Experiments have shown that CNNs are promising feature learning and function approximation techniques, well-suited for different wireless signal classification problems. Furthermore, the presented results indicated that for the wireless communication domain investigating different wireless data representations is important to determine the right representation that exhibits discriminative characteristics for the signals that need to be classified. Specifically, in the modulation recognition case study for medium-high SNR the CNN model trained on amplitude/phase representations outperformed the other two models with a 2% and 10% performance improvement, while for low SNR conditions the model trained on IQ data representations showed the best performance. For the task of detecting interference, the model trained on FFT data outperformed the amplitude/phase and IQ data representation models by up to 20% for low SNR conditions, while for medium-high SNR the classification accuracy improvement was up to 5%. These results demonstrate the importance of choosing both the correct data representation and the correct machine learning approach, both of which are systematically introduced in this paper. By following the proposed methodology, deeper insights can be obtained regarding the optimality of data representations for different research domains. As such, we envisage this paper to empower and guide machine learning/signal processing practitioners and wireless engineers to design new innovative research applications of end-to-end learning from spectrum data that address issues related to cross-technology coexistence, inefficient spectrum utilization and regulation.

ACKNOWLEDGEMENTS
The authors would like to thank Associate Prof. Gerard J. M. Janssen for his insightful comments and Schmidt et al. [12] for sharing the "Wireless interference" dataset.

REFERENCES
[1] M. Höyhtyä et al., "Spectrum occupancy measurements: A survey and use of interference maps," IEEE Commun. Surveys Tuts., vol. 18, no. 4, pp. 2386–2414, 4th Quart., 2016.
[2] A. Hithnawi, H. Shafagh, and S. Duquennoy, "Understanding the impact of cross technology interference on IEEE 802.15.4," in Proc. 9th ACM Int. Workshop Wireless Netw. Testbeds, Experim. Eval. Characterization, 2014, pp. 49–56.
[3] G. Ding, Q. Wu, J. Wang, and Y.-D. Yao. (2014). "Big spectrum data: The new resource for cognitive wireless networking." [Online]. Available: https://arxiv.org/abs/1404.6508
[4] U. Müller, J. Ben, E. Cosatto, B. Flepp, and Y. L. Cun, "Off-road obstacle avoidance through end-to-end learning," in Proc. Adv. Neural Inf. Process. Syst., 2006, pp. 739–746.
[5] E. Axell, G. Leus, E. G. Larsson, and H. V. Poor, "Spectrum sensing for cognitive radio: State-of-the-art and recent advances," IEEE Signal Process. Mag., vol. 29, no. 3, pp. 101–116, May 2012.
[6] K. Kim, I. A. Akbar, K. K. Bae, J.-S. Um, C. M. Spooner, and J. H. Reed, "Cyclostationary approaches to signal detection and classification in cognitive radio," in Proc. 2nd IEEE Int. Symp. New Frontiers Dyn. Spectr. Access Netw. (DySPAN), Apr. 2007, pp. 212–215.
[7] A. Fehske, J. Gaeddert, and J. H. Reed, "A new approach to signal classification using spectral correlation and neural networks," in Proc. 1st IEEE Int. Symp. New Frontiers Dyn. Spectr. Access Netw. (DySPAN), Nov. 2005, pp. 144–150.
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[9] T. J. O'Shea, J. Corgan, and T. C. Clancy, "Convolutional radio modulation recognition networks," in Proc. Int. Conf. Eng. Appl. Neural Netw., 2016, pp. 213–226.
[10] A. Selim, F. Paisana, J. A. Arokkiam, Y. Zhang, L. Doyle, and L. A. DaSilva. (2017). "Spectrum monitoring for radar bands using deep convolutional neural networks." [Online]. Available: https://arxiv.org/abs/1705.00462
[11] J. Akeret, C. Chang, A. Lucchi, and A. Refregier, "Radio frequency interference mitigation using deep convolutional neural networks," Astron. Comput., vol. 18, pp. 35–39, Jan. 2017.
[12] M. Schmidt, D. Block, and U. Meier. (2017). "Wireless interference identification with convolutional neural networks." [Online]. Available: https://arxiv.org/abs/1703.00737
[13] S. Rajendran, W. Meert, D. Giustiniano, V. Lenders, and S. Pollin. (2017). "Distributed deep learning models for wireless signal classification with low-cost spectrum sensors." [Online]. Available: https://arxiv.org/abs/1707.08908
[14] T. O'Shea and J. Hoydis, "An introduction to deep learning for the physical layer," IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017.
[15] S. Yao, S. Hu, Y. Zhao, A. Zhang, and T. Abdelzaher, "Deepsense: A unified deep learning framework for time-series mobile sensing data processing," in Proc. 26th Int. Conf. World Wide Web, 2017, pp. 351–360.
[16] D. Chen, S. Yin, Q. Zhang, M. Liu, and S. Li, "Mining spectrum usage data: A large-scale spectrum measurement study," in Proc. 15th Annu. Int. Conf. Mobile Comput. Netw., 2009, pp. 1–12.
[17] S. Haykin, "Cognitive radio: Brain-empowered wireless communications," IEEE J. Sel. Areas Commun., vol. 23, no. 2, pp. 201–220, Feb. 2005.
[18] I. F. Akyildiz, W.-Y. Lee, M. C. Vuran, and S. Mohanty, "A survey on spectrum management in cognitive radio networks," IEEE Commun. Mag., vol. 46, no. 4, pp. 40–48, Apr. 2008.
[19] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, "Internet of Things (IoT): A vision, architectural elements, and future directions," Future Generat. Comput. Syst., vol. 29, no. 7, pp. 1645–1660, 2013.
[20] S. Gollakota, F. Adib, D. Katabi, and S. Seshan, "Clearing the RF smog: Making 802.11 n robust to cross-technology interference," ACM SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 170–181, 2011.
[21] A. A. Khan, M. H. Rehmani, and A. Rachedi, "Cognitive-radio-based Internet of Things: Applications, architectures, spectrum related functionalities, and future research directions," IEEE Wireless Commun., vol. 24, no. 3, pp. 17–25, Jun. 2017.
[22] Q. Wu et al., "Cognitive Internet of Things: A new paradigm beyond connection," IEEE Internet Things J., vol. 1, no. 2, pp. 129–143, Apr. 2014.
[23] G. Staple and K. Werbach, "The end of spectrum scarcity [spectrum allocation and utilization]," IEEE Spectr., vol. 41, no. 3, pp. 48–52, Mar. 2004.
[24] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016. [Online]. Available: http://www.deeplearningbook.org
[25] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929–1958, 2014.
[26] M. Kulin, C. Fortuna, E. De Poorter, D. Deschrijver, and I. Moerman, "Data-driven design of intelligent wireless networks: An overview and tutorial," Sensors, vol. 16, no. 6, p. 790, 2016.
[27] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, "Machine learning paradigms for next-generation wireless networks," IEEE Wireless Commun., vol. 24, no. 2, pp. 98–105, Apr. 2017.
[28] B. Danev and S. Capkun, "Transient-based identification of wireless sensor nodes," in Proc. Int. Conf. Inf. Process. Sensor Netw., 2009, pp. 25–36.
[29] F. Chollet et al. (2015). Keras. [Online]. Available: https://github.com/fchollet/keras
[30] D. P. Kingma and J. Ba. (2014). "Adam: A method for stochastic optimization." [Online]. Available: https://arxiv.org/abs/1412.6980
[31] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent, 2015, pp. 234–241.
[32] W. Liu et al., "Heterogeneous spectrum sensing: challenges and methodologies," EURASIP J. Wireless Commun. Netw., vol. 2015, p. 70, Dec. 2015.
[33] S. Rajendran et al. (2017). "Electrosense: Open and big spectrum data." [Online]. Available: https://arxiv.org/abs/1703.09989
[34] A. Zaslavsky, C. Perera, and D. Georgakopoulos. (2013). "Sensing as a service and big data." [Online]. Available: https://arxiv.org/abs/1301.0159
[35] A. Sandryhaila and J. M. F. Moura, "Big data analysis with signal processing on graphs: Representation and processing of massive data sets with irregular structure," IEEE Signal Process. Mag., vol. 31, no. 5, pp. 80–90, Sep. 2014.
[36] T. Kazaz, X. Jiao, M. Kulin, and I. Moerman, "Demo: WiSCoP - Wireless sensor communication prototyping platform," in Proc. Int. Conf. Embedded Wireless Syst. Netw., 2017, pp. 246–247.
[37] M. Bansal, A. Schulman, and S. Katti, "Atomix: A framework for deploying signal processing applications on wireless infrastructure," in Proc. NSDI, 2015, pp. 173–188.
[38] T. Kazaz, C. Van Praet, M. Kulin, P. Willemen, and I. Moerman, "Hardware accelerated SDR platform for adaptive air interfaces," in Proc. Workshop Future Radio Technol. (ETSI) Air Interfaces, 2016, pp. 1–26.
[39] Z. Zaidi, V. Friderikos, Z. Yousaf, S. Fletcher, M. Dohler, and H. Aghvami. (2017). "Will SDN be part of 5G?" [Online]. Available: https://arxiv.org/abs/1708.05096
[40] I. F. Akyildiz, P. Wang, and S.-C. Lin, "Softair: A software defined networking architecture for 5G wireless systems," Comput. Netw., vol. 85, pp. 1–18, Jul. 2015.
[41] A. Checko et al., "Cloud RAN for mobile networks - A technology overview," IEEE Commun. Surveys Tuts., vol. 17, no. 1, pp. 405–426, 1st Quart., 2015.
[42] M. Bansal, J. Mehlman, S. Katti, and P. Levis, "Openradio: A programmable wireless dataplane," in Proc. 1st Workshop Hot Topics Softw. Defined Netw., 2012, pp. 109–114.
[43] A.-J. Van Der Veen, E. F. Deprettere, and A. L. Swindlehurst, "Subspace-based signal analysis using singular value decomposition," Proc. IEEE, vol. 81, no. 9, pp. 1277–1308, Sep. 1993.

MERIMA KULIN received the M.Sc. degree (summa cum laude) in electrical engineering from the Department for Telecommunications, University of Sarajevo, in 2012. She is currently pursuing the Ph.D. degree with Ghent University. In 2013, she joined JSC Elektroprivreda, as an Expert Associate for ICT Management and Maintenance. In 2015, she started her research activities with the Department of Information Technology (INTEC), Ghent University. She was actively involved in the EU H2020 WiSHFUL, eWINE, and SBO SAMURAI research projects. Her main research interests include Internet of Things, network architectures and protocols, cognitive radio networks, data mining, machine learning, and self-learning networks.

TARIK KAZAZ received the M.Sc. degree (cum laude) in electrical engineering from the Department for Telecommunications, University of Sarajevo, in 2012. He is currently pursuing the Ph.D. research with the Circuits and Systems Research Group, Delft University of Technology. In 2013, he joined BH Mobile, where he was a Radio Access Network Engineer, while at the same time he was a part-time Teaching Assistant with the Faculty of Electrical Engineering, University of Sarajevo. In 2015, he joined the Department of Information Technology, Ghent University, as a Ph.D. Researcher. He was active in several national and international research projects, including EU H2020 ORCA, WiSHFUL, iMinds' IoT Strategic Research Program, and NWO SuperGPS. His main research interests are wireless networks, signal processing for communications, software-defined radio and cognitive radio, and hardware-software co-design.
INGRID MOERMAN received the degree in electrical engineering and the Ph.D. degree from Ghent University, in 1987 and 1992, respectively. She was a part-time Professor with Ghent University in 2000. She is a Staff Member at IDLab, a core research group of imec with research activities embedded in Ghent University and University of Antwerp. She is coordinating the research activities on mobile and wireless networking. She is leading a research team of about 30 members at Ghent University. She has a longstanding experience in running and coordinating national and EU research funded projects. At the European level, she is in particular very active in the Future Connectivity Systems research area, where she has coordinated and is coordinating several FP7/H2020 projects (CREW, WiSHFUL, eWINE, and ORCA). She has authored or co-authored over 700 publications in international journals or conference proceedings. Her main research interests include Internet of Things, low power wide area networks, high-density wireless access networks, collaborative and cooperative networks, intelligent cognitive radio networks, real-time software-defined radio, flexible hardware/software architectures for radio/network control and management, and experimentally supported research.

ELI DE POORTER received the master's degree in computer science engineering from Ghent University, Belgium, in 2006, and the Ph.D. degree from the Department of Information Technology, Ghent University, in 2011. After obtaining his Ph.D., he received the FWO Post-Doctoral Research Grant and is currently a Professor at the same research group, where he is currently involved in and/or research coordinator of several national and international projects. He is currently a Professor with Ghent University. He has authored or co-authored over 100 papers in international journals or the proceedings of international conferences. His main research interests include wireless network protocols, IoT, network architectures, wireless sensor and ad hoc networks, future Internet, machine learning and self-learning networks, and indoor localization. He is part of the program committee of several conferences.