
PCI Conflict and RSI Collision Detection in LTE Networks

Using Supervised Learning Techniques

Rodrigo Miguel Martins Diz Miranda Veríssimo

Thesis to obtain the Master of Science Degree in:


Electrical and Computer Engineering

Supervisor(s): Doctor António José Castelo Branco Rodrigues


Doctor Maria Paula dos Santos Queluz Rodrigues
Doctor Pedro Manuel de Almeida Carvalho Vieira

Examination Committee
Chairperson: Doctor José Eduardo Charters Ribeiro da Cunha Sanguino
Supervisor: Doctor Pedro Manuel de Almeida Carvalho Vieira
Member of the Committee: Doctor Pedro Joaquim Amaro Sebastião

November 2017
Acknowledgments

First of all, I would like to thank my supervisor, Professor António Rodrigues, and my co-supervisors, Pro-
fessor Pedro Vieira and Professor Maria Paula Queluz, for all the support and insights given throughout
the Thesis. I would also like to thank CELFINET for the unique opportunity to work in a great environ-
ment while doing this project, especially Eng. João Ferraz, who helped me understand the discussed
network conflicts and the database structure. Additionally, I would like to express my gratitude to Eng.
Luzia Carias for helping me in the data gathering process, and also to Eng. Marco Sousa for discussing
ideas related to Data Science and Machine Learning.
I would like to thank the instructors from the Lisbon Data Science Starters Academy for their discus-
sions and guidance related to this Thesis and Data Science in general, namely Eng. Pedro Fonseca,
Eng. Sam Hopkins, Eng. Hugo Lopes and João Ascensão.
To all my friends and colleagues that helped me through these last 5 years in Técnico, by studying
and collaborating in course projects, or by just being great people to be with. Namely, André Rabaça,
Bernardo Gomes, Diogo Arreda, Diogo Marques, Eric Herji, Filipe Fernandes, Francisco Franco, Fran-
cisco Lopes, Gonçalo Vilela, João Escusa, João Ramos, Jorge Atabão, José Dias, Luís Fonseca, Miguel
Santos, Nuno Mendes, Paul Schydlo, Rúben Borralho, Rúben Tadeia, Rodrigo Zenha and Tomás Alves.

Abstract

Nowadays, mobile networks are rapidly changing, which makes it difficult to maintain good and clean
Physical Cell Identity (PCI) and Root Sequence Index (RSI) plans. These are essential for the Quality
of Service (QoS) and mobility of Long Term Evolution (LTE) mobile networks, since bad PCI and RSI
plans can introduce wireless network problems such as failed handovers, service drops and failed ser-
vice establishments and re-establishments. It is therefore possible, in theory, to identify PCI and RSI conflicting cells through the analysis of the Key Performance Indicators (KPI) relevant to both problems. To do so, each cell must be labeled according to its configured cell relations. Machine Learning (ML) classification can then be applied under these conditions.
This thesis aims to present ML approaches to classify time series data from mobile network KPIs, to detect the KPIs most relevant to PCI and RSI conflicts, to construct ML models that classify PCI and RSI conflicting cells with a minimum False Positive (FP) rate and near real time performance, and to report the corresponding test results. To achieve these goals, three hypotheses were tested in order to obtain the best performing ML models. Furthermore, bias was reduced by testing five different classification algorithms, namely Adaptive Boosting (AB), Gradient Boost (GB), Extremely Randomized Trees (ERT), Random Forest (RF) and Support Vector Machines (SVM). The obtained models were evaluated according to their average Precision and peak Precision metrics. All the data used was obtained from a real LTE network.
The best performing models were obtained by using each KPI measurement as an individual fea-
ture. The highest average Precision obtained for PCI confusion detection was 31% and 26% for the 800
MHz and 1800 MHz frequency bands, respectively. No conclusions were drawn concerning PCI collision detection, as the dataset contained only six PCI collisions. The highest average Pre-
cision obtained for RSI collision detection was 61% and 60% for the 800 MHz and 1800 MHz frequency
bands, respectively.

Keywords: Wireless Communications, LTE, Machine Learning, Classification, PCI Conflict, RSI Collision.

Resumo

Mobile networks are nowadays changing rapidly, which makes it difficult to maintain good Physical Cell Identity (PCI) and Root Sequence Index (RSI) plans. These two parameters are essential for good Quality of Service (QoS) and mobility in Long Term Evolution (LTE) mobile networks, since bad PCI and RSI plans may lead to mobile network problems such as handover failures, service establishment and re-establishment failures, and service drops. As such, it is possible, in theory, to identify PCI conflicts and RSI collisions through the analysis of the Key Performance Indicators (KPI) relevant to each problem. To do so, each LTE cell must be labeled as conflicting or non-conflicting according to its neighbor relations. Under these conditions, it is possible to apply Machine Learning (ML) classification algorithms.
This Thesis presents ML approaches for the classification of time series originating from mobile network KPIs, identifies the KPIs most relevant to the detection of PCI conflicts and RSI collisions, and builds ML models with a minimum number of False Positives (FP) and near real time performance. To achieve these objectives, three hypotheses were tested in order to obtain the best performing ML models. Five distinct classification algorithms were tested, namely Adaptive Boosting (AB), Gradient Boost (GB), Extremely Randomized Trees (ERT), Random Forest (RF) and Support Vector Machines (SVM). The obtained models were evaluated according to their average and peak Precision. Lastly, the data was obtained from a real LTE network.
The best models were obtained by using each KPI measurement as an individual feature. The highest average Precision obtained for PCI confusion detection was 31% and 26% for the 800 MHz and 1800 MHz bands, respectively. Due to the very low number of six PCI collisions present in the data, no conclusions could be drawn regarding their detection. The highest average Precision obtained for RSI collision detection was 61% and 60% for the 800 MHz and 1800 MHz bands, respectively.

Keywords: Mobile Communications, LTE, Machine Learning, Classification, PCI Conflict, RSI Collision.

Contents

Acknowledgments iii

Abstract v

Resumo vii

List of Figures xiv

List of Tables xv

List of Symbols xviii

Acronyms xxiii

1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2 LTE Background 3
2.1 Introduction to LTE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 LTE Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2.1 Core Network Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2 Radio Access Network Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Multiple Access Techniques Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.1 OFDMA Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3.2 SC-FDMA Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3.3 MIMO Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Physical Layer Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4.1 Transport Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.2 Modulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.3 Downlink User Data Transmission . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.4.4 Uplink User Data Transmission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5.1 Idle Mode Mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5.2 Intra-LTE Handovers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5.3 Inter-system Handovers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.6 Performance Data Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6.1 Performance Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6.2 Key Performance Indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6.3 Configuration Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3 Machine Learning Background 27


3.1 Machine Learning Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 Machine Learning Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3 Generalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4 Underfitting and Overfitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.5 Dimensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.6 Feature Engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.7 More Data and Cleverer Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.8 Classification in Multivariate Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.9 Proposed Classification Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.9.1 Adaptive Boosting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.9.2 Gradient Boost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.9.3 Extremely Randomized Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.9.4 Random Forest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.9.5 Support Vector Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.10 Classification Model Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

4 Physical Cell Identity Conflict Detection 47


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.2 Key Performance Indicator (KPI) Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.3 Network Vendor Feature Based Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 Global Cell Neighbor Relations Based Detection . . . . . . . . . . . . . . . . . . . . . . . 52
4.4.1 Data Cleaning Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4.2 Classification Based on Peak Traffic Data . . . . . . . . . . . . . . . . . . . . . . . 56
4.4.3 Classification Based on Feature Extraction . . . . . . . . . . . . . . . . . . . . . . 61
4.4.4 Classification Based on Raw Cell Data . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.5 Preliminary Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

5 Root Sequence Index Collision Detection 71
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 Key Performance Indicator Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.3 Global Cell Neighbor Relations Based Detection . . . . . . . . . . . . . . . . . . . . . . . 74
5.3.1 Data Cleaning Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.3.2 Peak Traffic Data Based Classification . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.3.3 Feature Extraction Based Classification . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.4 Raw Cell Data Based Classification . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.4 Preliminary Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

6 Conclusions 87
6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

A PCI and RSI Conflict Detection 91

Bibliography 97

List of Figures

2.1 The EPS network elements (adapted from [6]). . . . . . . . . . . . . . . . . . . . . . . . . 4


2.2 Overall E-UTRAN architecture (adapted from [6]). . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Frequency-domain view of the LTE multiple-access technologies (adapted from [6]). . . . 7
2.4 MIMO principle with two-by-two antenna configuration (adapted from [4]). . . . . . . . . . 8
2.5 Preserving orthogonality between sub-carriers (adapted from [5]). . . . . . . . . . . . . . 8
2.6 OFDMA transmitter and receiver (adapted from [4]). . . . . . . . . . . . . . . . . . . . . . 10
2.7 SC-FDMA transmitter and receiver with frequency domain signal generation (adapted
from [4]). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.8 OFDMA reference symbols to support two eNB transmit antennas (adapted from [4]). . . 12
2.9 LTE modulation constellations (adapted from [4]). . . . . . . . . . . . . . . . . . . . . . . . 14
2.10 Downlink resource allocation at eNB (adapted from [4]). . . . . . . . . . . . . . . . . . . . 14
2.11 Uplink resource allocation controlled by eNB scheduler (adapted from [4]). . . . . . . . . . 17
2.12 Data rate between TTIs in the uplink direction (adapted from [4]). . . . . . . . . . . . . . . 17
2.13 Intra-frequency handover procedure (adapted from [4]). . . . . . . . . . . . . . . . . . . . 20
2.14 Automatic intra-frequency neighbor identification (adapted from [4]). . . . . . . . . . . . . 21
2.15 Overview of the inter-RAT handover from E-UTRAN to UTRAN/GERAN (adapted from [4]). 22

3.1 Procedure of three-fold cross-validation (adapted from [32]). . . . . . . . . . . . . . . . . . 30


3.2 Bias and variance in dart-throwing (adapted from [18]). . . . . . . . . . . . . . . . . . . . . 31
3.3 Bias and variance contributing to total error. . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4 A learning curve showing the model accuracy on test examples as a function of the number
of training examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Example of a Decision Tree to decide whether a football match should be played based
on the weather (adapted from [45]). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.6 Left: The training and test percent error rates using boosting on an Optical Character
Recognition dataset that do not show any signs of overfitting [25]. Right: The training
and test percent error rates on a heart-disease dataset that after five iterations reveal
overfitting [25]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.7 A general tree ensemble algorithm classification procedure. . . . . . . . . . . . . . . . . . 39
3.8 Data mapping from the input space (left) to a high-dimensional feature space (right) to
obtain a linear separation (adapted from [21]). . . . . . . . . . . . . . . . . . . . . . . . . . 42

3.9 The hyperplane constructed by SVMs that maximizes the margin (adapted from [21]). . . 42

4.1 PCI Confusion (left) and PCI Collision (right). . . . . . . . . . . . . . . . . . . . . . . . . . 48


4.2 Time series analysis of KPI values regarding 4200 LTE cells over a single day. . . . . . . 50
4.3 Boxplots of total null value count for each cell per day for three KPIs. . . . . . . . . . . . . 54
4.4 Absolute Pearson correlation heatmap of peak traffic KPI values and the PCI conflict
detection label. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.5 Smoothed Precision-Recall curves for peak traffic PCI confusion detection. . . . . . . . . 59
4.6 Learning curves for peak traffic PCI confusion detection. . . . . . . . . . . . . . . . . . . . 60
4.7 The CPVE for PCI confusion detection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.8 Smoothed Precision-Recall curves for statistical data based PCI confusion detection. . . 63
4.9 Learning curves for statistical data based PCI confusion detection. . . . . . . . . . . . . . 64
4.10 The CPVE for PCI collision detection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.11 The CPVE for PCI confusion detection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.12 Smoothed Precision-Recall curves for raw cell data based PCI confusion detection. . . . 67
4.13 Learning curves for raw cell data PCI confusion detection. . . . . . . . . . . . . . . . . . . 68
4.14 Precision-Recall curves for raw cell data PCI collision detection. . . . . . . . . . . . . . . 68

5.1 Time series analysis of KPI values regarding 23500 LTE cells over a single day. . . . . . . 74
5.2 Boxplots of total null value count for each cell per day for two KPIs. . . . . . . . . . . . . . 76
5.3 Absolute Pearson correlation heatmap of peak traffic KPI values and the RSI collision
detection label. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.4 Smoothed Precision-Recall curves for peak traffic RSI collision detection. . . . . . . . . . 79
5.5 Learning curves for peak traffic RSI collision detection. . . . . . . . . . . . . . . . . . . . . 80
5.6 The CPVE for RSI collision detection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.7 Smoothed Precision-Recall curves for statistical data based RSI collision detection. . . . 82
5.8 Learning curves for statistical data based RSI collision detection. . . . . . . . . . . . . . . 83
5.9 The CPVE for RSI collision detection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.10 Smoothed Precision-Recall curves for raw cell data RSI collision detection. . . . . . . . . 85
5.11 Learning curves for raw cell data RSI collision detection. . . . . . . . . . . . . . . . . . . . 86

A.1 PCI and RSI Conflict Detection Flowchart. . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

List of Tables

2.1 Downlink peak data rates [5]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16


2.2 Uplink peak data rates [4]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3 Differences between both mobility modes. . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Description of the KPI categories and KPI examples. . . . . . . . . . . . . . . . . . . . . . 24
2.5 Netherlands P3 KPI analysis done in 2016 [16]. . . . . . . . . . . . . . . . . . . . . . . . . 24

3.1 The three components of learning algorithms (adapted from [18]). . . . . . . . . . . . . . 29


3.2 Confusion Matrix (adapted from [31]). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.1 Chosen Accessibility and Integrity KPIs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48


4.2 Chosen Mobility, Quality and Retainability KPIs. . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3 The obtained cumulative Confusion Matrix. . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 The obtained Model Evaluation metrics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.5 Resulting dataset composition subsequent to data cleaning. . . . . . . . . . . . . . . . . . 55
4.6 Average importance given to each KPI by each Decision Tree based classifier. . . . . . . 57
4.7 Peak traffic PCI Confusion classification results. . . . . . . . . . . . . . . . . . . . . . . . 58
4.8 PCI Confusion classification training and testing times in seconds. . . . . . . . . . . . . . 60
4.9 Statistical data based PCI confusion classification results. . . . . . . . . . . . . . . . . . . 62
4.10 Statistical data based PCI confusion classification training and testing times in seconds. . 64
4.11 Raw cell data PCI confusion classification results. . . . . . . . . . . . . . . . . . . . . . . 66
4.12 Raw cell data PCI confusion classification training and testing times in seconds. . . . . . 67

5.1 Chosen Accessibility and Mobility KPIs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72


5.2 Chosen Quality and Retainability KPIs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.3 Average importance given to each KPI by each Decision Tree based classifier. . . . . . . 78
5.4 Peak traffic RSI collision classification results. . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.5 RSI collision classification training and testing times in seconds. . . . . . . . . . . . . . . 80
5.6 Statistical data based RSI collision classification results. . . . . . . . . . . . . . . . . . . . 81
5.7 RSI collision classification training and testing times in seconds. . . . . . . . . . . . . . . 82
5.8 Raw cell data RSI collision classification results. . . . . . . . . . . . . . . . . . . . . . . . 84
5.9 RSI collision classification training and testing times in seconds. . . . . . . . . . . . . . . 85

List of Symbols

S_rxlev Rx level value of a cell.
Q_rxlevmeas Reference Signal Received Power from a cell.
Q_rxlevmin Minimum required level for cell camping.
Q_rxlevminoffset Offset used when searching for a Public Land Mobile Network of preferred network operators.
S_ServingCell Rx level value of the serving cell.
S_intrasearch Rx level threshold for the User Equipment to start making intra-frequency measurements.
S_nonintrasearch Rx level threshold for the User Equipment to start making inter-system measurements.
Q_meas Reference Signal Received Power measurement for cell re-selection.
Q_hyst Power domain hysteresis to avoid the ping-pong phenomenon between cells.
Q_offset Offset control parameter to deal with different frequencies and cell characteristics.
T_reselection Time limit to perform cell re-selection.
Thresh_high Higher threshold for a User Equipment to camp on a higher priority layer.
Thresh_low Lower threshold for a User Equipment to camp on a lower priority layer.
x Input vector for a Machine Learning model.
y Output vector that a Machine Learning model aims to predict.
ŷ Output vector that a Machine Learning model predicts.
σ²_ab Covariance matrix of variable vectors a and b.
λ Eigenvalue of a Principal Component.
W_t Weight array at iteration t.
θ_t Parameters of a classification algorithm at iteration t.
α_t Weight of a hypothesis at iteration t.
Z_t Normalization factor at iteration t.
H Machine Learning model.
f Functional dependence between input and output vectors.
f̂ Estimated functional dependence.
ψ Loss function.
g_t Negative gradient of a loss function at iteration t.
E_y Expected prediction loss.
ρ_t Gradient step size at iteration t.
K Number of randomly selected features.
n_min Minimum sample size for splitting a Decision Tree node.
M Total number of Decision Trees to grow in an ensemble.
S Data subset.
f_max^S Maximal value of a variable vector in a data subset S.
f_min^S Minimal value of a variable vector in a data subset S.
f_c Random cut-point of a variable vector.
 Optimization problem for Support Vector Machines.
C Positive regularization constant for Support Vector Machines.
ξ Slack variable that states whether a data sample is on the correct side of a hyperplane.
α Lagrange multiplier.
#SV Number of Support Vectors.
K(·, ·) Support Vector Machines kernel function.
σ Free parameter.
γ Positive regularization constant for Support Vector Machines.
β Weight constant for defining the importance of either the Precision or the Recall metric.
Q1 First quartile.
Q3 Third quartile.
N_rows Number of sequences needed to generate the 64 Random Access Channel preambles.

Acronyms

1NN One Nearest Neighbor

3GPP Third Generation Partnership Project

4G Fourth Generation

AB Adaptive Boosting

AuC Authentication Centre

BCH Broadcast Channel

BPSK Binary Phase Shift Keying

CM Configuration Management

CNN Convolutional Neural Network

CQI Channel Quality Indicator

CPVE Cumulative Proportion of Variance Explained

CRC Cyclic Redundancy Check

CS Circuit-Switched

DFT Discrete Fourier Transform

DL-SCH Downlink Shared Channel

EDGE Enhanced Data rates for Global Evolution

eNB Evolved Node B

EPC Evolved Packet Core

EPS Evolved Packet System

E-SMLC Evolved Serving Mobile Location Centre

ERT Extremely Randomized Tree

E-UTRA Evolved UMTS Terrestrial Radio Access

E-UTRAN Evolved UMTS Terrestrial Radio Access Network

FDMA Frequency Division Multiple Access

FFT Fast Fourier Transform

FN False Negative

FP False Positive

FTP File Transfer Protocol

GB Gradient Boost

GERAN GSM EDGE Radio Access Network

GMLC Gateway Mobile Location Centre

GPRS General Packet Radio Service

GSM Global System for Mobile Communications

GTP GPRS Tunneling Protocol

GW Gateway

HARQ Hybrid Automatic Repeat Request

HSPA High Speed Packet Access

HSDPA High Speed Downlink Packet Access

HSS Home Subscriber Server

HSUPA High Speed Uplink Packet Access

ID Identity

IDFT Inverse Discrete Fourier Transform

IEEE Institute of Electrical and Electronics Engineers

IFFT Inverse Fast Fourier Transform

IP Internet Protocol

IQR Interquartile Range

ITU International Telecommunication Union

kNN k-Nearest Neighbor

KPI Key Performance Indicators

LCS LoCation Services

LSTM Long Short Term Memory

LTE Long Term Evolution

MAC Medium Access Control

MCH Multicast Channel

ME Mobile Equipment

MIB Master Information Block

MIMO Multiple-Input Multiple-Output

ML Machine Learning

MME Mobility Management Entity

MNO Mobile Network Operators

MT Mobile Termination

NaN Not a Number

NE Network Element

NR Network Resource

OAM Operations, Administration and Management

OFDM Orthogonal Frequency Division Multiplexing

OFDMA Orthogonal Frequency Division Multiple Access

OS Operations System

PAPR Peak-to-Average Power Ratio

PAR Peak-to-Average Ratio

PBCH Physical Broadcast Channel

PC Principal Component

PCA Principal Component Analysis

PCCC Parallel Concatenated Convolution Coding

PCH Paging Channel

PCI Physical Cell Identity

PCRF Policy Control and Charging Rules Function

PDCCH Physical Downlink Control Channel

PDN Packet Data Network

PDSCH Physical Downlink Shared Channel

PLMN Public Land Mobile Network

PM Performance Management

PMCH Physical Multicast Channel

PRACH Physical Random Access Channel

PRB Physical Resource Block

P-GW Packet Data Network Gateway

PR Precision-Recall

PS Packet-Switched

PS HO Packet-Switched Handover

PUCCH Physical Uplink Control Channel

PUSCH Physical Uplink Shared Channel

QAM Quadrature Amplitude Modulation

QoS Quality of Service

QPSK Quadrature Phase Shift Keying

RACH Random Access Channel

RAT Radio Access Technology

RBF Radial Basis Function

RBS Radio Base Station

RF Random Forest

RLC Radio Link Control

ROC Receiver Operating Characteristic

RRC Radio Resource Control

RSI Root Sequence Index

RSRP Reference Signal Received Power

RSRQ Reference Signal Received Quality

RSSI Received Signal Strength Indicator

SAE System Architecture Evolution

SAE GW SAE Gateway

SC-FDMA Single-Carrier Frequency Division Multiple Access

SDU Service Data Unit

S-GW Serving Gateway

SIB System Information Block

SIM Subscriber Identity Module

SNMP Simple Network Management Protocol

SNR Signal-to-Noise Ratio

SON Self-Organizing Network

SQL Structured Query Language

SVM Support Vector Machines

TDMA Time Division Multiple Access

TE Terminal Equipment

TMN Telecommunication Management Network

TN True Negative

TP True Positive

TTI Transmission Time Interval

UE User Equipment

UICC Universal Integrated Circuit Card

UL-SCH Uplink Shared Channel

UMTS Universal Mobile Telecommunications System

URSI International Union of Radio Science

USIM Universal Subscriber Identity Module

UTRAN UMTS Terrestrial Radio Access Network

V-MIMO Virtual Multiple-Input Multiple-Output

VoIP Voice over IP

WCDMA Wideband Code Division Multiple Access

WCNC Wireless Communications and Networking Conference

Chapter 1

Introduction

This chapter aims to deliver an overview of the presented work. It includes the context and motivation
that led to the development of this work, as well as its objectives and overall structure.

1.1 Motivation

Two of the major concerns of Mobile Network Operators (MNO) are to optimize and to maintain network
performance. However, maintaining performance has proved to be challenging, mainly for large and complex networks. In the long term, changes made in the networks may increase the number of internal conflicts and inconsistencies. These changes include modifications controlled by the MNOs, such as adjusting the antenna tilting or the cells' transmit power, as well as factors beyond their control, such as user mobility and radio channel fading.
In order to assess the network performance, quantifiable performance metrics, known as Key Perfor-
mance Indicators (KPI), are typically used. KPIs report performance aspects such as the handover
success rate and the channel interference averages of each cell, and are calculated periodically, result-
ing in time series.
In order to automatically detect the network fault causes, some work has been done by using KPI
measurements with unsupervised techniques, as in [1]. This thesis focuses on applying supervised
techniques for two known Long Term Evolution (LTE) network conflicts, namely Physical Cell Identity
(PCI) conflicts and Root Sequence Index (RSI) collisions.

1.2 Objectives

This thesis aims to create Machine Learning (ML) models that can correctly classify PCI conflicts and
RSI collisions with a minimum False Positive (FP) rate and with a near real time performance. To achieve
this goal, three hypotheses to obtain the best models were tested:

1. PCI conflicts and/or RSI collisions are better detected by using KPI measurements in the daily
peak traffic instant of each cell;

2. PCI conflicts and/or RSI collisions are better detected by extracting statistical calculations from
each KPI daily time series and using them as features;

3. PCI conflicts and/or RSI collisions are better detected by using each cell's KPI measurements in each day as an individual feature; the three feature constructions are sketched below.
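As an illustration only (this sketch is not part of the original methodology), the three feature constructions can be written down for a single cell-day of KPI measurements as follows; the KPI names, the hourly sampling and the pandas layout are assumptions made for the example:

import numpy as np
import pandas as pd

# One illustrative cell-day: 24 hourly samples of two hypothetical KPIs,
# plus a traffic counter used to locate the daily peak traffic instant.
rng = np.random.default_rng(0)
day = pd.DataFrame({
    "ho_success_rate": rng.uniform(90, 100, 24),  # hypothetical KPI
    "rach_setup_rate": rng.uniform(95, 100, 24),  # hypothetical KPI
    "traffic": rng.uniform(0.0, 1.0, 24),
})
kpis = ["ho_success_rate", "rach_setup_rate"]

# Hypothesis 1: KPI values taken at the daily peak traffic instant.
peak_features = day.loc[day["traffic"].idxmax(), kpis].to_numpy()

# Hypothesis 2: statistical summaries extracted from each daily KPI series.
stat_features = day[kpis].agg(["mean", "std", "min", "max"]).to_numpy().ravel()

# Hypothesis 3: every measurement of the day used as an individual feature.
raw_features = day[kpis].to_numpy().ravel()

Each construction yields one feature vector per cell and per day; stacking these vectors over all labeled cells produces the matrices on which the classifiers are trained.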

These three hypotheses were tested by taking into account the average Precisions and the peak
Precisions obtained from testing the models, as well as their training and testing durations. In order to
reduce bias in this study, five different classification algorithms were selected, namely Adaptive Boosting (AB), Gradient Boost (GB), Extremely Randomized Trees (ERT), Random Forest (RF) and Support Vector Machines (SVM). The aim of the classifiers was to label cells as either non-conflicting or conflicting, depending on the detection use case. The classification algorithm implementations were obtained from the Python Scikit-Learn library [2].
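For reference, a minimal sketch of how the five algorithms can be instantiated from Scikit-Learn is shown below; the hyper-parameters are library defaults rather than the values tuned in this work, and X_train, y_train and X_test are placeholders for the feature matrices and labels described above:

from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, RandomForestClassifier)
from sklearn.svm import SVC

# One estimator per tested algorithm (default hyper-parameters).
classifiers = {
    "AB": AdaBoostClassifier(),
    "GB": GradientBoostingClassifier(),
    "ERT": ExtraTreesClassifier(),
    "RF": RandomForestClassifier(),
    "SVM": SVC(probability=True),  # scores needed for Precision-Recall curves
}

# Illustrative training and scoring loop:
# for name, clf in classifiers.items():
#     clf.fit(X_train, y_train)
#     conflict_scores = clf.predict_proba(X_test)[:, 1]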

1.3 Structure

This work is divided into four main chapters. Chapter 2 presents a technical background of LTE and
Chapter 3 addresses ML concepts as well as more specific ones, such as how time series can be
classified to reach the thesis’ objectives and a technical overview of the proposed classification algo-
rithms. These two aforementioned chapters deliver the necessary background to understand the work
in Chapters 4 and 5.
Chapter 4 introduces the LTE PCI network parameter and how PCI conflicts can occur, performs the hypothesis testing, and presents the respective results. Additionally, it includes sections focused on data cleaning, KPI selection and preliminary conclusions. Chapter 5 has the same structure as Chapter
4, but it is focused on RSI collisions.
Finally, in Chapter 6, conclusions are drawn and future work is suggested.

1.4 Publications

Two scientific papers were written in the context of this Thesis, namely:

• "PCI and RSI Conflict Detection in a Real LTE Network Using Supervised Techniques", written by R. Veríssimo, P. Vieira, M. P. Queluz and A. Rodrigues. This paper was submitted to the 2018 Institute of Electrical and Electronics Engineers (IEEE) Wireless Communications and Networking Conference (WCNC), Barcelona, Spain, 15th-18th April 2018.

• "Deteção de Conflitos de PCI e de RSI Numa Rede Real LTE Utilizando Aprendizagem Automática", written by R. Veríssimo, P. Vieira, M. P. Queluz and A. Rodrigues. This paper was submitted to the 11th International Union of Radio Science (URSI) Congress, Lisbon, Portugal, 24th November 2017.

Chapter 2

LTE Background

This chapter provides an overview of the LTE standard [3], aiming for a better understanding of the
work that will be developed under the Thesis scope. Section 2.1 presents a brief introduction to LTE and
Section 2.2 delivers an architectural overview of this system. Section 2.3 presents a succinct overview of
the multiple access techniques that are used in LTE. The physical layer design is introduced in Section
2.4. Section 2.5 addresses how mobility is handled in LTE. Finally, Section 2.6 describes how data originating from telecommunication networks is typically collected and evaluated.
The content of this chapter is mainly based on the following references: [4, 5] in Section 2.1; [6, 7] in
Section 2.2; [6, 4, 5] in Section 2.3; [4, 5] in Section 2.4; [4] in Section 2.5; [8, 9] in Section 2.6.

2.1 Introduction to LTE

LTE is a Fourth Generation (4G) wireless communication standard developed by the Third Generation
Partnership Project (3GPP); it resulted from the development of a packet-only wideband radio system
with flat architecture, and was specified for the first time in the 3GPP Release 8 document series.
The downlink in LTE uses Orthogonal Frequency Division Multiple Access (OFDMA) as its multiple
access scheme and the uplink uses Single-Carrier Frequency Division Multiple Access (SC-FDMA).
Both of these solutions result in orthogonality between the users, diminishing the interference and en-
hancing the network capacity. The resource allocation in both uplink and downlink is done in the fre-
quency domain, with a resolution of 180 kHz, consisting of twelve sub-carriers of 15 kHz each. The
high capacity of LTE is due to its packet scheduling being carried out in the frequency domain. The
main difference between the resource allocation on the uplink and on the downlink is that the former
is continuous, in order to enable single carrier transmission, whereas the latter can freely use resource
blocks from different parts of the spectrum. Resource blocks are frequency and time resources that
occupy 12 subcarriers of 15 kHz each and one time slot of 0.5 ms. By adopting the uplink single carrier
solution, LTE enables efficient terminal power amplifier design, which is essential for the terminal battery
life. Depending on the available spectrum, LTE allows spectrum flexibility that can range from 1.4 MHz
up to 20 MHz. In ideal conditions, the 20 MHz bandwidth can provide up to 172.8 Mbps downlink user

data rate with 2x2 Multiple-Input Multiple-Output (MIMO) and 340 Mbps with 4x4 MIMO; the uplink peak
data rate is 86.4 Mbps.
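As an illustrative consistency check, not taken from the original text, these peak rates can be reproduced by assuming that 12 of the 14 OFDM symbols in each 1 ms subframe carry user data, with the 20 MHz bandwidth providing 1200 sub-carriers and 64-QAM carrying 6 bits per symbol:

R_stream = 1200 sub-carriers × 12 symbols/ms × 6 bits/symbol = 86.4 Mbps,
R_2x2 = 2 × R_stream = 172.8 Mbps,
R_4x4 = 4 × R_stream = 345.6 Mbps (about 340 Mbps after additional overhead).

The single-stream figure also matches the quoted uplink peak data rate.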

2.2 LTE Architecture

In contrast to the Circuit-Switched (CS) model of previous cellular systems, LTE is designed to only
support Packet-Switched (PS) services, aiming to provide seamless Internet Protocol (IP) connectivity
between the User Equipment (UE) and the Packet Data Network (PDN), without disrupting the end users’
applications during mobility. LTE corresponds to the evolution of radio access through the Evolved UMTS
Terrestrial Radio Access Network (E-UTRAN) alongside an evolution of the non-radio aspects, named
System Architecture Evolution (SAE), which includes the Evolved Packet Core (EPC) network. The
combination of LTE and SAE forms the Evolved Packet System (EPS), which provides the user with IP
connectivity to a PDN for accessing the Internet, as well as running different services simultaneously,
such as File Transfer Protocol (FTP) and Voice over IP (VoIP).

The features offered by LTE are supported through several EPS network elements with different roles.
Figure 2.1 shows the global network architecture that encompasses both the network elements and the
standardized interfaces. The network comprises the core network (i.e. EPC) and the access network
(i.e. E-UTRAN). The access network consists of one node, the Evolved Node B (eNB), which connects
to the UEs. The network elements are inter-connected through interfaces that are standardized in order
to allow multivendor interoperability.


Figure 2.1: The EPS network elements (adapted from [6]).

The UE is the interface through which the subscriber is able to communicate with the E-UTRAN; it is
composed of the Mobile Equipment (ME) and the Universal Integrated Circuit Card (UICC). The ME
is essentially the radio equipment that is used to communicate; it can also be divided into both Mobile
Termination (MT) — which conducts all the communication functions — and Terminal Equipment (TE)
— that terminates the streams of data. The UICC is a smart card, informally known as the Subscriber
Identity Module (SIM) card; it runs the Universal Subscriber Identity Module (USIM), which is an appli-
cation that stores user-specific data (e.g. phone number and home network identity). Additionally, it also
employs security procedures through the security keys that are stored in the UICC.

2.2.1 Core Network Architecture

The EPC corresponds to the core network and its role is to control the UE and to establish the bearers
– paths that user traffic uses when passing an LTE transport network. The EPC has as main logical
nodes, the Mobility Management Entity (MME), the Packet Data Network Gateway (P-GW), the Serving
Gateway (S-GW) and the Evolved Serving Mobile Location Centre (E-SMLC). Furthermore, there are
other logical nodes that also belong to the EPC such as the Home Subscriber Server (HSS), the Gateway
Mobile Location Centre (GMLC) and the Policy Control and Charging Rules Function (PCRF). These
logical nodes are described in the following points:

• MME is the main control node in the EPC. It manages user mobility in the corresponding service
area through tracking, and also manages the user subscription profile and service connectivity by
cooperating with the HSS. Moreover, it is the sole responsible for security and authentication of
users in the network.

• P-GW is the node that interconnects the EPS with the PDNs. It acts as an IP attachment point
and allocates the IP addresses for the UE. Alternatively, this allocation can be performed by a PDN, in which case the P-GW tunnels traffic between the UE and that PDN. Moreover, it handles the traffic gating and filtering functions required for the services being used.

• S-GW is a network element that not only links user plane traffic between the eNB and the P-GW,
but also retains information about the bearers when the UE is in idle state.

• E-SMLC is responsible for managing the scheduling and coordination of the resources necessary to locate the UE. Furthermore, it estimates the UE speed, and the corresponding accuracy, from the final location estimate that it produces.

• HSS is a central database that holds information regarding all the network operator’s subscribers
such as their Quality of Service (QoS) profile and any access restrictions for roaming. It not only
holds information about the PDNs to which the user is able to connect, but also stores dynamic
information (e.g. the identity of the MME to which the user is currently attached or registered). Ad-
ditionally, the HSS may also integrate the Authentication Centre (AuC), which is responsible for generating the vectors used for both authentication and security keys.

• GMLC incorporates the fundamental functionalities to support LoCation Services (LCS). After
being authorized, it sends positioning requests to the MME and collects the final location estimates.

• PCRF is responsible for managing the users’ QoS and data charges. The PCRF is connected to
the P-GW and sends information to it for enforcement.

2.2.2 Radio Access Network Architecture

The E-UTRAN represents the radio component of the architecture. It is responsible for connecting the UEs to the EPC, thereby connecting UEs to each other and to PDNs (e.g. the Internet).

Composed solely of eNBs, the E-UTRAN is a mesh of interconnected eNBs through X2 interfaces
(that can be either physical or logical links). These nodes are intelligent radio base stations that cover
one or more cells and that are also capable of handling all the radio related protocols (e.g. handover).
Unlike in Universal Mobile Telecommunications System (UMTS), there is no centralized controller in
E-UTRAN for normal user traffic and hence its architecture is flat, which can be observed in Figure 2.2.

Figure 2.2: Overall E-UTRAN architecture (adapted from [6]).

The eNB has two main responsibilities: firstly, it sends radio transmissions to all its mobile devices
on the downlink and also receives transmissions from them on the uplink; secondly, it controls the low-
level operation of all its mobile devices through signalling messages (e.g. handover commands) that are
related to those same radio transmissions. The eNBs are normally connected with each other through an
interface called X2 and also to the EPC through the S1 interface. Additionally, the eNBs are connected
to the MME by means of the S1-MME interface and also to the S-GW through the S1-U interface.
The key functions of E-UTRAN can be summarized as:

• managing the radio link’s resources and controlling the radio bearers;

• compressing the IP headers;

• encrypting all data sent over the radio interface;

• routing user traffic towards the S-GW and delivering user traffic from the S-GW to the UE;

• providing the required measurements and additional data to the E-SMLC in order to find the UE
position;

• handling handover between connected eNBs through X2 interfaces;

• signalling towards the MME and also the bearer path towards the S-GW.

The eNBs are responsible for all these functions on the network side, where one single eNB can
manage multiple cells. One key differentiation factor from previous generations is that LTE assigns

the radio controller function to the eNB. This strategy reduces latency and improves the efficiency of
the network due to the closer interaction between the radio protocols and the radio access network.
There is no need for a centralized data-combining function in the network, as LTE does not support
soft-handovers. The removal of the centralized controller requires that, as the UE moves, the network transfers all information related to the UE towards another eNB.
The S1 interface has an important feature, known as S1-flex, that provides flexibility in linking the access network to the core network. This means that multiple core network nodes can serve a common
geographical area, being connected by a mesh network to the set of eNBs in that area. Thus, an eNB
can be served by multiple MME/S-GWs, as happens for the eNB#2 in Figure 2.2. This allows UEs in
the network to be shared between multiple core network nodes through an eNB, hence eliminating single points of failure for the core network nodes and also allowing for load sharing.

2.3 Multiple Access Techniques Overview

In order to fulfil all the requirements defined for LTE, advances were made to the underlying mobile radio technology, more specifically to both the multicarrier and the multiple-antenna technologies.
The first major design choice in LTE was to adopt a multicarrier approach. Regarding the downlink,
the candidate schemes were OFDMA and Multiple Wideband Code Division Multiple Access (WCDMA), with OFDMA being selected. Concerning the uplink, the candidate schemes were SC-FDMA, OFDMA and Multiple WCDMA, with SC-FDMA being selected. Both of the selected schemes introduced the frequency domain as a new dimension of flexibility, providing a potent new way not only to improve the system's spectral efficiency, but also to minimize both fading problems and inter-symbol interference. These two schemes are represented in Figure 2.3.

Figure 2.3: Frequency-domain view of the LTE multiple-access technologies (adapted from [6]).

Before delving into the basics of both OFDMA and SC-FDMA, it is important to present some basic
concepts first:

• for single carrier transmission in LTE, a single carrier is modulated in phase and/or amplitude. The
spectrum wave form is a filtered single carrier spectrum that is centered on the carrier frequency.

• in a digital system, the higher the data rate, the higher the symbol rate, and thus the larger the bandwidth required for the same modulation. In order to carry the desired number of bits per symbol, the modulation can be changed by the transmitter.

• in a Frequency Division Multiple Access (FDMA) system, different users can access the system simultaneously through the use of different carriers and sub-carriers. In such a system, it is crucial to avoid excessive interference between carriers without adopting long guard bands between users.

• in the search for even better spectral efficiency, multiple antenna technologies were considered
as a way to exploit another new dimension — the spatial domain. As such, the first LTE Release
led to the introduction of the MIMO operation that includes spatial multiplexing and also pre-coding
and transmit diversity. The basic principle of MIMO is presented in Figure 2.4 where different
streams of data are fed to the pre-coding operation and forwarded to signal mapping and OFDMA
signal generation.


Figure 2.4: MIMO principle with two-by-two antenna configuration (adapted from [4]).

2.3.1 OFDMA Basics

OFDMA consists of narrow and mutually orthogonal sub-carriers that are separated typically by 15 kHz
from adjacent sub-carriers, regardless of the total transmission bandwidth. Orthogonality is preserved because, at every sampling instant of a specific sub-carrier, all other sub-carriers have a zero value, as can be observed in Figure 2.5.

Figure 2.5: Preserving orthogonality between sub-carriers (adapted from [5]).

As stated in the beginning of Section 2.3, OFDMA was selected over Multiple WCDMA. The key
characteristics that led to that decision [7, 10, 11] are:

• low-complexity receivers even with severe channel conditions;

• robustness to time-dispersive radio channels;

• immunity to selective fading;

• resilience to narrow-band co-channel interference and both inter-symbol and inter-frame interfer-
ence;

• high spectral efficiency;

• efficient implementation with Fast Fourier Transform (FFT).

Meanwhile, OFDMA also presents some challenges, such as [7, 10, 11]:

• higher sensitivity to carrier frequency offset caused by leakage of the Discrete Fourier Transform
(DFT), relatively to single carrier systems;

• high Peak-to-Average Power Ratio (PAPR) of the transmitted signal, which requires high linearity
in the transmitter, resulting in poor power efficiency;

• sensitivity to Doppler shift, that was solved in LTE by choosing a sub-carrier spacing of 15 kHz and
hence providing a relatively large tolerance;

• sensitivity to frequency synchronization problems.

The OFDMA implementation is based on the use of both DFT and Inverse Discrete Fourier Transform
(IDFT) in order to move between time and frequency domain representation. Furthermore, the practical
implementation uses the FFT, which moves the signal from time to frequency domain representation;
the opposite operation is done through the Inverse Fast Fourier Transform (IFFT).
The transmitter used by an OFDMA system contains an IFFT block that acts on the modulated sub-carriers to convert the signal to the time domain. The input of this block results from the serial-to-parallel conversion of the data source. Finally, a cyclic extension is added to the output signal of the IFFT block, which aims to avoid inter-symbol interference. The inverse operations are implemented in the receiver, with the addition of an equalisation block between the FFT and the demodulation blocks. The architecture of the OFDMA transmitter and receiver is presented in Figure 2.6.
The cyclic extension is performed by copying the final part of the symbol to its beginning. This method is preferable to adding an empty guard interval because it makes the Orthogonal Frequency Division Multiplexing (OFDM) signal periodic. When the symbol is periodic, the impact of the channel on each sub-carrier corresponds to a multiplication by a scalar, assuming that the cyclic extension is long enough. Moreover, this periodicity of the signal allows for a discrete Fourier spectrum, enabling the use of the DFT and IDFT in the receiver and transmitter, respectively.
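The transmitter and receiver chains of Figure 2.6 can be condensed into a few lines of numpy; the sketch below is purely illustrative, using a small FFT size instead of the LTE numerology and an ideal, distortion-free channel:

import numpy as np

rng = np.random.default_rng(0)
n_sc, cp_len = 64, 16  # illustrative FFT size and cyclic extension length

# Map random bits to QPSK symbols, one per sub-carrier (frequency domain).
bits = rng.integers(0, 2, size=2 * n_sc)
symbols = ((1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])) / np.sqrt(2)

# The IFFT converts the symbol to the time domain; copying its tail to the
# front forms the cyclic extension that makes the symbol appear periodic.
time_signal = np.fft.ifft(symbols)
ofdm_symbol = np.concatenate([time_signal[-cp_len:], time_signal])

# Receiver over an ideal channel: drop the cyclic extension and FFT back
# to the frequency domain, recovering the sub-carrier symbols exactly.
recovered = np.fft.fft(ofdm_symbol[cp_len:])
assert np.allclose(recovered, symbols)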
An important advantage of the use of OFDMA in a base station transmitter is that it can allocate any
of its sub-carriers to users in the frequency domain, allowing the scheduler to benefit from frequency
diversity. Yet, the signalling overhead that a finer allocation resolution would cause prevents the allocation of a

Figure 2.6: OFDMA transmitter and receiver (adapted from [4]).

single sub-carrier, forcing the use of a Physical Resource Block (PRB) consisting of 12 sub-carriers.
As such, the minimum bandwidth that can be allocated is 180 kHz. This allocation in the time-domain
corresponds to 1 ms, also known as Transmission Time Interval (TTI), although each PRB only lasts for
0.5 ms. In LTE, each PRB can be modulated either through Quadrature Phase Shift Keying (QPSK) or
Quadrature Amplitude Modulation (QAM), namely 16-QAM and 64-QAM.

2.3.2 SC-FDMA Basics

Although OFDMA works well on the LTE downlink, it has one drawback: the transmitted signal power
is subject to large variations. This results in a high PAPR, which in turn can cause problems for the
transmitter’s power amplifier. In the downlink, the base station transmitters are large and expensive
devices that can use expensive power amplifiers. The same does not happen in the uplink, where the
mobile transmitter has to be cheap. This makes OFDMA unsuitable for the LTE uplink.
Hence, it was decided to use SC-FDMA for uplink multiple access. In its basic form, SC-FDMA can be seen as conventional QAM modulation, where each symbol is sent one at a time, similarly to Time Division Multiple Access (TDMA) systems such as the Global System for Mobile Communications (GSM). The
frequency domain generation of the signal, which can be observed in Figure 2.7, adds the OFDMA
property of good spectral waveform. This eliminates the need for guard bands between different users,
similarly to the OFDMA downlink. A cyclic extension is also added periodically to the signal, as in OFDMA, with the exception that it is not added after each symbol; this is due to the symbol rate being faster than in OFDMA. The added cyclic extension prevents inter-symbol interference between blocks of symbols and also simplifies the receiver design. The remaining inter-symbol interference is handled by running the equalizer in the receiver over a block of symbols, until reaching the cyclic prefix.
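The frequency domain signal generation of Figure 2.7 differs from OFDMA essentially by the DFT block placed before the sub-carrier mapping. A minimal numpy sketch of this DFT spreading, again with illustrative sizes and an ideal channel, is given below:

import numpy as np

rng = np.random.default_rng(1)
m_data, n_fft = 12, 64  # illustrative: 12 occupied sub-carriers in a 64-bin FFT

# QPSK data block, spread by a DFT before sub-carrier mapping; the spreading
# keeps the time domain waveform single-carrier-like, lowering the PAPR.
bits = rng.integers(0, 2, size=2 * m_data)
qpsk = ((1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])) / np.sqrt(2)
spread = np.fft.fft(qpsk) / np.sqrt(m_data)

# Map the DFT output onto a contiguous block of sub-carriers, then IFFT.
grid = np.zeros(n_fft, dtype=complex)
grid[:m_data] = spread
sc_fdma_signal = np.fft.ifft(grid)

# Receiver over an ideal channel: FFT, de-map, and invert the DFT spreading.
demapped = np.fft.fft(sc_fdma_signal)[:m_data]
recovered = np.fft.ifft(demapped) * np.sqrt(m_data)
assert np.allclose(recovered, qpsk)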
While the transmission occupies the whole spectrum allocated to the user in the frequency domain,
the system has a 1 ms resolution allocation. For instance, when the resource allocation is doubled,
so is the data rate, assuming the same level of overhead. Hence, the individual transmission gets shorter in the time domain, but wider in the frequency domain. The allocation does not need to remain at a fixed position in the frequency domain, but it must always consist of a contiguous set of frequency domain

Figure 2.7: SC-FDMA transmitter and receiver with frequency domain signal generation (adapted from
[4]).

resources. The number of 180 kHz resource blocks (the minimum resource allocation, based on the 15 kHz sub-carrier spacing of the OFDMA downlink) that can be allocated is defined by practical signaling constraints. The maximum allocated bandwidth can go up to 20 MHz, but tends to be smaller
as it is required to have a guard band towards the neighboring operator.
As the transmission is done as a single carrier in the time domain, the system retains its good envelope properties, and the waveform characteristics are highly dependent on the applied modulation method. Thus,
SC-FDMA is able to reach a very low signal Peak-to-Average Ratio (PAR). Moreover, it facilitates efficient
power amplifiers in the devices, saving battery life.
The base station receiver for SC-FDMA is slightly more complex than the OFDMA receiver, especially if it requires equalizers able to perform as well as OFDMA receivers. Yet, this disadvantage is far outweighed by the benefits in uplink range and device battery life that can be reached with SC-FDMA. Furthermore, dynamic resource usage with a 1 ms resolution means that no base-band receiver has to be kept on standby for every UE; only the UEs that have data to transmit use the base station receiver, in a dynamic fashion. Lastly, as data rates increase, the most resource consuming process in both the uplink and downlink receiver chains is the channel decoding.

2.3.3 MIMO Basics

The MIMO operation is one of the fundamental technologies that the first LTE release brought, despite
being included earlier in the WCDMA specifications [5]. However, MIMO in WCDMA operates differently from LTE, as a spreading operation is applied there.
In the first LTE release, MIMO includes spatial multiplexing, pre-coding and transmit diversity. Spatial multiplexing consists of transmitting different data streams from two or more antennas, with subsequent separation through signal processing at the receiver. Thus, in theory, a 2-by-2 antenna configuration doubles the peak data rate, or quadruples it if applied with a 4-by-4 antenna
configuration. Pre-coding handles the weighting of the signals transmitted from different antennas, in
order to maximize the received Signal-to-Noise Ratio (SNR). Lastly, transmit diversity is used to exploit

the gains from independent fading between different antennas through the transmission of the same
signal from various antennas with some coding.
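As a generic illustration of the spatial multiplexing principle (the actual LTE receiver processing is not detailed in this text), the numpy sketch below transmits two independent streams over a random 2-by-2 flat-fading channel and separates them with a simple zero-forcing receiver:

import numpy as np

rng = np.random.default_rng(2)
n_symbols = 100

# Two independent QPSK streams, one per transmit antenna.
x = (rng.choice([-1.0, 1.0], size=(2, n_symbols))
     + 1j * rng.choice([-1.0, 1.0], size=(2, n_symbols))) / np.sqrt(2)

# Random 2x2 flat-fading channel and a small amount of receiver noise.
H = (rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))) / np.sqrt(2)
noise = 0.01 * (rng.normal(size=x.shape) + 1j * rng.normal(size=x.shape))
y = H @ x + noise  # mixture observed on the two receive antennas

# Zero-forcing separation: invert the channel, which in practice is
# estimated from the per-antenna reference symbols described next.
x_hat = np.linalg.inv(H) @ y
print("residual error:", np.mean(np.abs(x_hat - x) ** 2))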

Figure 2.8: OFDMA reference symbols to support two eNB transmit antennas (adapted from [4]).

In order to allow the separation, at the receiver, of the MIMO streams transmitted by different antennas, unique reference symbols are assigned to each antenna. Because the reference symbols sent by each antenna are unique, the channel estimation for one antenna cannot be corrupted by the transmissions of another. This principle can be observed in Figure 2.8; it can be applied to two or more antennas, with the first LTE Release specifying up to four. Furthermore, as the number of antennas increases, so do the required SNR, the complexity of the transmitters and receivers, and the reference symbol overhead.
MIMO can also be used in the LTE uplink, although the single user data rate cannot be increased for mobile devices that have only a single antenna. Yet, the cell level maximum data rate can be doubled through the allocation of two devices with orthogonal reference signals, i.e. Virtual Multiple-Input Multiple-Output (V-MIMO). Accordingly, the base station handles this transmission as a MIMO transmission, separating the data streams by means of the MIMO receiver. This operation does not bring any major implementation complexity from the device perspective, as only the reference signal sequence is altered. On the other hand, additional processing is required on the network side in order to separate the different users. Lastly, it is also important to mention that SC-FDMA is well compatible with MIMO, as users within the same cell are orthogonal to each other and the local SNR may be very high for users close to the base station.

2.4 Physical Layer Design

After covering the OFDMA and SC-FDMA principles, it is now possible to describe the physical layer of
LTE. This layer is characterized by the design principle of resource usage based solely on dynamically
allocated shared resources, instead of having dedicated resources reserved for a single user. Further-
more, it has a key role in defining the resulting capacity and thus allows for a comparison between
different systems for expected performance. This section will introduce the transport channels and how
they are mapped to the physical channels, the available modulation methods for both data and control
channels and the uplink/downlink data transmission.

2.4.1 Transport Channels

As there is no reservation of dedicated resources for single users, LTE contains only common transport
channels; these channels have the role of connecting the Medium Access Control (MAC) layer to the
physical layer. The physical channels carry the transport channel and it is the processing applied to
those physical channels that characterizes the transport channel. Moreover, the physical layer needs
to provide dynamic resource assignment both for data rate variation and for resource division between
users. The transport channels and their mapping to the physical channels are described in the following
points:

• Broadcast Channel (BCH) is a downlink broadcast channel that is used to broadcast the required system parameters, enabling devices to access the system.

• Downlink Shared Channel (DL-SCH) carries the user data for point-to-point connections in the
downlink direction. All the information transported in the DL-SCH is intended only for a single user
or UE in the RRC CONNECTED state.

• Paging Channel (PCH) transports the paging information in the downlink direction, aimed at the device, in order to move it from the RRC IDLE to the RRC CONNECTED state.

• Multicast Channel (MCH) is used in the downlink direction to carry multicast service content to
the UE.

• Uplink Shared Channel (UL-SCH) transfers both the user data and the control information from
the device in the uplink direction in the RRC CONNECTED state.

• Random Access Channel (RACH) acts in the uplink direction to answer the paging messages, as well as to initiate the move from or towards the RRC CONNECTED state, according to the UE data transmission needs.

The mentioned RRC IDLE and RRC CONNECTED states are described in Section 2.5.
In the uplink direction, the UL-SCH and RACH are respectively transported by the Physical Uplink
Shared Channel (PUSCH) and Physical Random Access Channel (PRACH).
In the downlink direction, the PCH and the BCH are mapped to the Physical Downlink Shared Chan-
nel (PDSCH) and the Physical Broadcast Channel (PBCH), respectively. Lastly, the DL-SCH is mapped
to the PDSCH and MCH is mapped to the Physical Multicast Channel (PMCH).

2.4.2 Modulation

Both the uplink and downlink directions use the QAM modulator, namely 4-QAM (also known as QPSK),
16-QAM and 64-QAM, whose symbol constellations can be observed in Figure 2.9. The first two are
available in all devices, while the support for 64-QAM in the uplink direction depends upon the UE class.
QPSK modulation is used when operating at full transmission power as it allows for good transmitter
power efficiency. For 16-QAM and 64-QAM modulations, the devices use a lower maximum transmitter
power.


Figure 2.9: LTE modulation constellations (adapted from [4]).

Binary Phase Shift Keying (BPSK) has been specified for the control channels, which can opt between BPSK or QPSK for control information transmission. Additionally, uplink control data is multiplexed along with the user data, with both types of data using the same modulation (i.e. QPSK, 16-QAM or 64-QAM).

2.4.3 Downlink User Data Transmission

The user data is carried on the PDSCH in the downlink direction with a 1 ms resource allocation. Moreover, the sub-carriers are grouped into resource units of 12 sub-carriers, totalling 180 kHz allocation units. Thus, the user data rate depends on the number of allocated sub-carriers; this allocation of resources is managed by the eNB and is based on the Channel Quality Indicator (CQI) obtained from the terminal. Similarly to what happens in the uplink, the resources are allocated in both the time and frequency domains, as can be observed in Figure 2.10. The bandwidth can be allocated between 0 and

Figure 2.10: Downlink resource allocation at eNB (adapted from [4]).

The Physical Downlink Control Channel (PDCCH) notifies the device about which resources are allocated to it, in a dynamic fashion and with a 1 ms allocation granularity. PDSCH data can occupy between 3 and 6 symbols per 0.5 ms slot, depending on both the PDCCH and on the cyclic prefix length (i.e. short or extended). In the 1 ms subframe, the control symbols (for PDCCH) are carried at the beginning of the first 0.5 ms slot, with the remaining symbols of that slot used for data; the second 0.5 ms slot is used solely for data symbols (for PDSCH) and can fit 7 symbols if a short cyclic prefix is used.
Not only are the available resources for user data reduced by the control symbols, but they also have to be shared with broadcast data and with reference and synchronization signals. The reference symbols are distributed evenly in the time and frequency domains in order to reduce the required overhead. This distribution of reference symbols requires rules to be defined so that both the receiver and the transmitter can understand the mapping. The common channels, such as the BCH, also need to be taken into account for the total resource allocation space.
The channel coding chosen for LTE user data was turbo coding, which uses the same Parallel Concatenated Convolutional Coding (PCCC) turbo encoder type as used in WCDMA/High Speed Packet Access (HSPA) [5]. The turbo interleaver of WCDMA was also modified to better fit the LTE properties and slot structures, as well as to allow higher flexibility for implementing parallel signal processing with increasing data rates. The channel coding consists in 1/3-rate turbo coding for user data in both uplink and downlink directions. To reduce the processing load, the maximum block size for turbo coding is limited to 6144 bits, and higher allocations are segmented into multiple encoding blocks. In the downlink, user data is not multiplexed onto the same physical layer resources as the PDCCH, as each has its own separate resources during the 1 ms subframe.
LTE uses physical layer retransmission combining, commonly referred to as Hybrid Automatic Repeat reQuest (HARQ). In such an operation, the receiver stores packets with failed Cyclic Redundancy Check (CRC) verification and combines each received packet with the previous one when a retransmission is received.
After the data is encoded, it is scrambled and then modulated. The scrambling is done in order to avoid cases where a device decodes data that is aimed at another device with the same resource allocation. The modulation mapper applies the intended modulation (i.e. QPSK, 16-QAM or 64-QAM) and the resulting symbols are fed to layer mapping and pre-coding. For multiple transmit antennas, the data is divided into two or four data streams (depending on whether two or four antennas are used) and then mapped to the resource elements available for PDSCH, followed by the OFDM signal generation. For a single antenna transmission, the layer mapping and pre-coding functionalities are not used.
Thus, the resulting instantaneous data rate for downlink depends on the:

• modulation method applied, with 2, 4 or 6 bits per modulated symbol for QPSK, 16-QAM and 64-QAM, respectively;

• allocated amount of sub-carriers;

• channel encoding rate;

• number of transmit antennas with independent streams and MIMO operation.

Assuming that all the resources are allocated to a single user and counting only the available physical layer resources, the instantaneous peak data rate for the downlink ranges between 0.9 and 86.4 Mbps with a single stream, which can rise up to 172.8 Mbps with 2 x 2 MIMO. For 4 x 4 MIMO, it can reach a theoretical instantaneous peak data rate of 340 Mbps. The single stream and 2 x 2 MIMO data rates can be observed in Table 2.1.

Table 2.1: Downlink peak data rates [5].

Peak bit rate [Mbps] per sub-carriers/bandwidth [MHz] combination

72/1.4 180/3.0 300/5.0 600/10 1200/20
QPSK 1/2 Single stream 0.9 2.2 3.6 7.2 14.4
16-QAM 1/2 Single stream 1.7 4.3 7.2 14.4 28.8
16-QAM 3/4 Single stream 2.6 6.5 10.8 21.6 43.2
64-QAM 3/4 Single stream 3.9 9.7 16.2 32.4 64.8
64-QAM 4/4 Single stream 5.2 13.0 21.6 43.2 86.4
64-QAM 3/4 2 x 2 MIMO 7.8 19.4 32.4 64.8 129.6
64-QAM 4/4 2 x 2 MIMO 10.4 25.9 43.2 86.4 172.8
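
To make the dependency on these factors concrete, the following minimal Python sketch reproduces the values of Table 2.1 (the uplink rates of Table 2.2 follow from the same formula); it assumes 12 data symbols per 1 ms subframe, with the remaining symbols consumed by control signalling, and is purely illustrative rather than part of the LTE specification.

# Minimal sketch: instantaneous physical-layer peak data rate for a single user
# holding all resources. Assumes 12 data symbols per 1 ms subframe.
BITS_PER_SYMBOL = {"QPSK": 2, "16-QAM": 4, "64-QAM": 6}
DATA_SYMBOLS_PER_SUBFRAME = 12  # assumption: 14 symbols minus control overhead

def peak_rate_mbps(modulation, code_rate, n_subcarriers, n_streams=1):
    """Peak data rate in Mbps for one 1 ms subframe."""
    bits_per_ms = (DATA_SYMBOLS_PER_SUBFRAME * n_subcarriers
                   * BITS_PER_SYMBOL[modulation] * code_rate * n_streams)
    return bits_per_ms / 1000.0  # bits per ms -> Mbps

print(peak_rate_mbps("QPSK", 1/2, 72))         # ~0.9 Mbps (1.4 MHz)
print(peak_rate_mbps("64-QAM", 1.0, 1200))     # 86.4 Mbps (20 MHz)
print(peak_rate_mbps("64-QAM", 1.0, 1200, 2))  # 172.8 Mbps (2 x 2 MIMO)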

2.4.4 Uplink User Data Transmission

The user data in the uplink direction is carried on the PUSCH, which has a 10 ms frame structure and
is based on the allocation of time and frequency domain resources with 1 ms and 180 kHz resolution,
respectively. The scheduler that handles this allocation of resources is located in the eNB, as can be observed in Figure 2.11. Only random access resources can be used without prior signalling from the eNB, and there are no fixed resources for the devices. Accordingly, the device needs to provide information to the uplink scheduler about its transmission requirements, as well as its available transmission power resources.
The frame structure uses a 0.5 ms slot and an allocation period of two 0.5 ms slots (i.e. subframe).
Similarly to what was discussed in the previous subsection concerning the downlink direction, user data
has to share the data space with reference symbols and signalling. The bandwidth can be allocated between 0 and 20 MHz in continuous steps of 180 kHz, similarly to the downlink transmission. The slot bandwidth adjustment between consecutive TTIs can be observed in Figure 2.12, in which doubling the data rate results in also doubling the bandwidth being used. It should be noted that the reference signals always occupy the same space in the time domain; consequently, a higher data rate also corresponds to a higher data rate for the reference symbols.
The cyclic prefix used in the uplink can also be either short or extended, where the short cyclic prefix allows for a bigger data payload. The extended prefix is not frequently used, as the benefit of having seven data symbols is greater than the possible degradation resulting from inter-symbol interference when the channel delay spread exceeds the cyclic prefix.
The channel coding for user data in the uplink direction is also 1/3-rate turbo coding, the same as in
the downlink direction. Besides the turbo coding, the uplink also has the physical layer HARQ with the
same combining methods as in the downlink direction.

Figure 2.11: Uplink resource allocation controlled by eNB scheduler (adapted from [4]).

Figure 2.12: Data rate between TTIs in the uplink direction (adapted from [4]).

Thus, the resulting instantaneous uplink data rate depends on the:

• modulation method applied, with the same methods available in the downlink direction;

• bandwidth applied;

• channel coding rate;

• time domain resource allocation.

Similarly to the previous subsection, assuming that all the resources are allocated for a single user
and counting only the physical layer resources available, the instantaneous peak data rate for uplink
ranges between 900 kbps and 86.4 Mbps, as shown in Table 2.2. As discussed in subsection 2.3.3, the
cell or sector specific maximum total data throughput can be increased with V-MIMO.

Table 2.2: Uplink peak data rates [4].

Peak bit rate [Mbps] per sub-carriers/bandwidth [MHz] combination

72/1.4 180/3.0 300/5.0 600/10 1200/20
QPSK 1/2 Single stream 0.9 2.2 3.6 7.2 14.4
16-QAM 1/2 Single stream 1.7 4.3 7.2 14.4 28.8
16-QAM 3/4 Single stream 2.6 6.5 10.8 21.6 43.2
16-QAM 4/4 Single stream 3.5 8.6 14.4 28.8 57.6
64-QAM 3/4 Single stream 3.9 9.7 16.2 32.4 64.8
64-QAM 4/4 Single stream 5.2 13.0 21.6 43.2 86.4

2.5 Mobility

This section presents an overview of how LTE mobility is managed in Idle and Connected modes, as mobility is crucial in any telecommunications system; it has many clear benefits, such as maintaining low delay services (e.g. voice or real time video connections) while moving in high-speed transportation, and switching connections to the best serving cell in the areas between cells. However, this comes with an increased network complexity. That being said, the LTE radio network aims to provide seamless mobility while minimizing network complexity.

Table 2.3: Differences between both mobility modes.

RRC IDLE: cell reselections done autonomously by the UE; based on UE measurements; controlled by broadcasted parameters; different priorities can be assigned to frequency layers.
RRC CONNECTED: network controlled handovers; based on UE measurements.

Mobility procedures can be divided into idle and connected mode mobility. The former is based on the UE autonomously reselecting cells in accordance with parameters sent by the network, without being connected to it; in the latter, the UE is connected to the network (i.e. transmitting data) and the E-UTRAN decides whether or not to trigger a handover, according to the reports sent by the UE. These two states correspond, respectively, to the RRC IDLE and RRC CONNECTED modes, whose differences are summarized in Table 2.3.
It is also important to mention the following measurements, performed by the UE for mobility in LTE:

• Reference Signal Received Power (RSRP), which is the averaged power measured in a cell
across receiver branches of the resource elements that contain reference signals specific to the
cell;

• Reference Signal Received Quality (RSRQ), which is the ratio of the RSRP and the Evolved
UMTS Terrestrial Radio Access (E-UTRA) Received Signal Strength Indicator (RSSI) for the refer-
ence signals;

• RSSI, which is the total received wideband power on a specific frequency, including noise originating from interfering cells and other sources. It is not individually reported by the UE, yet it is used in calculating the RSRQ value inside the UE.

2.5.1 Idle Mode Mobility

In Idle mode, the UE chooses a suitable cell based on radio measurements (i.e. cell selection). Whenever a UE selects a cell, it camps on that same cell. The cell is required to have good radio quality and not be blacklisted. Specifically, it must fulfil the S-criterion:

Srxlev > 0, (2.1)

where

Srxlev = Qrxlevmeas − (Qrxlevmin + Qrxlevminoffset), (2.2)

and Srxlev corresponds to the Rx level value of the cell, Qrxlevmeas is the measured RSRP, Qrxlevmin is the minimum required level for cell camping and Qrxlevminoffset is an offset used when searching for a higher priority Public Land Mobile Network (PLMN), corresponding to preferred network operators. This offset is used because LTE allows setting priority levels for PLMNs, in order to specify preferred network operators in cases such as roaming.
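
As a minimal illustration of the S-criterion in Equations 2.1 and 2.2, the following Python sketch checks whether a cell is suitable for camping; the function name and the example values are assumptions made for illustration only.

# Minimal sketch of the S-criterion (Equations 2.1-2.2); values are illustrative.
def s_criterion_fulfilled(q_rxlevmeas_dbm, q_rxlevmin_dbm, q_rxlevminoffset_db=0.0):
    """True if the cell is suitable for camping, i.e. Srxlev > 0."""
    s_rxlev = q_rxlevmeas_dbm - (q_rxlevmin_dbm + q_rxlevminoffset_db)
    return s_rxlev > 0

# Example: a measured RSRP of -110 dBm against a minimum level of -120 dBm.
print(s_criterion_fulfilled(-110, -120))  # True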
While the UE stays camped on a cell, it continuously tries to find better candidate cells for reselection, in accordance with the reselection criteria. Furthermore, the network can also prevent the UE from considering specific cells for reselection (i.e. cell blacklisting). To reduce the amount of measurements, it was defined that if the Rx level value of the serving cell (i.e. SServingCell) is high enough, the UE does not need to make any intra-frequency, inter-frequency or inter-system measurements. The intra-frequency and inter-frequency measurements start once SServingCell ≤ Sintrasearch and SServingCell ≤ Snonintrasearch, respectively, where Sintrasearch and Snonintrasearch are the serving cell's Rx level thresholds for the UE to start making intra-frequency and inter-frequency/inter-system measurements.
For intra-frequency and equal priority E-UTRAN frequency cell selection, a cell ranking is made based on the Rs criterion for the serving cell and the Rn criterion for the neighboring cells:

Rs = Qmeas,s + Qhyst, (2.3)

Rn = Qmeas,n + Qoffset, (2.4)

where Qmeas is the RSRP measurement for cell re-selection, Qhyst is the power domain hysteresis used to avoid the ping-pong phenomenon between cells, and Qoffset is an offset control parameter to deal with different frequencies and/or cell specific characteristics (e.g. propagation properties and hierarchical cell structures). Reselection occurs to the highest ranked neighbor cell that is better ranked than the serving cell for longer than Treselection, in order to avoid overly frequent reselections. Through the hysteresis provided by Qhyst, a neighboring cell needs to be better than the serving cell by a configurable amount in order for reselection to be performed. Lastly, Qoffset allows biasing the reselection towards particular cells and/or frequencies.
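
The ranking rule of Equations 2.3 and 2.4 can be sketched in Python as follows; the cell names, measurements, hysteresis and offsets are illustrative assumptions, and the Treselection timer is only hinted at in a comment.

# Minimal sketch of intra-frequency cell ranking (Equations 2.3-2.4).
def rank_serving(q_meas_s, q_hyst):
    return q_meas_s + q_hyst          # Rs criterion

def rank_neighbor(q_meas_n, q_offset):
    return q_meas_n + q_offset        # Rn criterion

serving_rank = rank_serving(q_meas_s=-100.0, q_hyst=2.0)
neighbor_ranks = {"cell_a": rank_neighbor(-97.0, 0.0),
                  "cell_b": rank_neighbor(-101.0, 3.0)}

best_cell, best_rank = max(neighbor_ranks.items(), key=lambda kv: kv[1])
# In practice, the neighbor must stay better ranked for longer than Treselection.
if best_rank > serving_rank:
    print("Reselect to", best_cell)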
Regarding both inter-frequency and inter-system reselection in LTE, they are based on the method
labeled as layers. Layers were designed to allow the operators to control how the UE prioritizes camping
on different Radio Access Technologies (RATs) or frequencies. This method is known as absolute priority based reselection, where each layer is appointed a specific priority and the UE attempts to camp on the highest priority layer that can provide a decent service. The UE will camp on a higher priority layer if that layer stays above a network-defined threshold Threshhigh for longer than the Treselection period. Furthermore, the UE will camp on a layer with lower priority only if the higher priority layer drops below the aforementioned threshold and the lower priority layer overcomes the threshold Threshlow.

2.5.2 Intra-LTE Handovers

As mentioned previously, the UE mobility is only controlled by the handovers when the Radio Resource
Control (RRC) connection is established. The handovers are based on UE measurements and are also
controlled by the E-UTRAN, which decides when to perform the handover and what the target cell will
be. In order to perform lossless handovers, packet forwarding is used between the source and the target
eNB. In addition, the S1 connection in the core network is only updated once the radio handover is
completed (i.e. Late path switch) and the core network has no control over the handovers.

Figure 2.13: Intra-frequency handover procedure (adapted from [4]).

The intra-frequency handover operation can be observed in Figure 2.13. In the beginning, the UE
has a user plane connection to the source eNB and also to the SAE Gateway (SAE GW). Besides that, there is an S1 signalling connection between the MME and the eNB. Once the target cell fulfills the measurement threshold, the UE sends the measurement report to the source eNB, which will establish a signaling connection and a GPRS Tunneling Protocol (GTP) tunnel towards the target cell. When the target eNB has the required resources available, the source eNB sends a handover command towards the UE. Once that is done, the UE can switch from the source to the target eNB, resulting in a successful update of the core network connection.
Before the Late path switching is completed, there is a brief moment when the user plane packets
in downlink are forwarded from the source eNB towards the target eNB through the X2 interface. In
the uplink, the eNB forwards all successfully received uplink Radio Link Control (RLC) Service Data Units (SDUs) to the packet core and, furthermore, the UE re-transmits the unacknowledged RLC SDUs from the source eNB.
Regarding the handover measurements, the UE must identify the target cell through its synchroniza-
tion signals before it can send the measurement report. Once the reporting threshold is fulfilled, the UE
sends handover measurements to the source eNB.

Figure 2.14: Automatic intra-frequency neighbor identification (adapted from [4]).

The UE in E-UTRAN can detect intra-frequency neighbors automatically, which results in both simpler network management and better network quality. The correct use of this functionality is important, as call drops due to missing neighbors are common. The procedure can be observed in Figure 2.14, where the UE approaches a new cell and receives its PCI through the synchronization signals. The UE then sends a measurement report to the eNB once the handover report threshold has been reached. On the other hand, the eNB does not have an X2 connection to that cell, and the physical cell Identity (ID) is not enough to uniquely identify it, as the maximum number of physical cell IDs is only 504 and large networks can extend to tens of thousands of cells. Thereupon, the serving eNB requests the UE to
decode the global cell ID from the broadcast channel of the target cell, as it uniquely identifies that same
cell. Through the global cell ID and the information sent by the MME, the serving eNB can now find the transport layer address and, thus, set up a new X2 connection, allowing it to proceed with the handover.
The generation of the intra-frequency neighbor list is simpler than creating inter-frequency or inter-RAT neighbors, as the UE can easily identify all the cells within the same frequency. For inter-frequency and inter-RAT neighbor creation, the eNB must not only ask the UE to make specific measurements, but also schedule measurement gaps in the signal to allow the UE to carry them out.

2.5.3 Inter-system Handovers

LTE allows for inter-system handovers, also called inter-RAT handovers, between the E-UTRAN and
GSM EDGE Radio Access Network (GERAN), UMTS Terrestrial Radio Access Network (UTRAN) or cdma2000®. The inter-RAT handover is controlled by the source access system, which starts the measurements and decides whether or not to perform the handover. This handover is carried out backwards, as in a normal handover, with the resources being reserved in the target system before the handover command is sent to the UE. Regarding the GERAN system, it does not support Packet-Switched Handover (PS HO), as its resources are not reserved before the handover. The core network is responsible for the signalling, since there are no direct interfaces between these different radio systems.
The inter-RAT handover is similar to the intra-LTE case in which the packet core node is changed. The information from the target system is transported to the UE in a transparent fashion through the source system. To avoid the loss of user data, the user data can be forwarded from the source to the target system. The UE does not perform any signalling towards the core network, which speeds up the execution of the handover. Furthermore, the security and QoS context is transferred from the source to the target system. Additionally, the Serving Gateway (GW) can be used as the mobility anchor for inter-RAT handovers. An overview of the inter-system handover is represented in Figure 2.15.

Figure 2.15: Overview of the inter-RAT handover from E-UTRAN to UTRAN/GERAN (adapted from [4]).

2.6 Performance Data Collection

As telecommunication networks become more and more complex, new monitoring and management operations need to be developed. There is now a set of methods that allows collecting data from the networks. These methods not only enable better planning and optimization of the networks, but also make it possible to know whether they are delivering the required quality to the users.

2.6.1 Performance Management

Performance Management (PM) consists of evaluating and reporting both the behaviour and effectiveness of the network elements by gathering statistical information, maintaining and examining historical logs, determining system performance and modifying the system modes of operation [12]. It was one of the concepts added to the Telecommunication Management Network (TMN) framework, defined by the International Telecommunication Union (ITU) to manage telecommunication networks and services and to handle the growing complexity of the networks. The other concepts are security, fault, accounting and configuration.
PM involves the following:

• configuring data-collection methods and network testing;

• collecting performance data;

• optimizing network service and response time;

• proactive management and reporting;

• managing the consistency and quality of network services.

PM is the measurement of both network and application traffic in order to deliver a consistent and predictable level of service at a given instant and across a defined period of time. PM enables vendors and operators to detect deteriorating trends in advance and thus address potential threats, preventing faults [13]. The architecture of a PM system consists of four layers:

• Data Collection and Parsing Layer - where data is collected from Network Elements (NEs) using network specific protocols (e.g. FTP and Simple Network Management Protocol (SNMP));

• Data Storage and Management Layer - consisting of a data warehouse that stores the parsed
data;

• Application Layer - that processes the collected and stored data;

• Presentation Layer - which aims to provide a web-based user interface by presenting the gener-
ated PM results in the form of dashboards and real-time graphs and charts.

It is challenging to administer PM efficiently, as it consists of collecting, sorting, processing and aggregating massive volumes of performance measurement data over long time periods. Another challenge is that performance measurements do not have a unified structure, as each NE manufacturer uses proprietary protocols and data structures to gauge performance in their devices.

2.6.2 Key Performance Indicators

A KPI is a quantifiable metric of the performance of essential operations and/or processes in an organi-
zation. In other words, it consists of network performance measurements. KPIs result from statistical
calculations based on counters installed on NEs that can register several indicators (e.g. failed han-
dovers, handover types and number of voice calls). They assist in identifying the strategic value drivers
in PM analysis, and also in verifying whether all elements across the several levels of the network are using consistent strategies to achieve the shared goals. With careful analysis, this allows precisely identifying where an action must be taken in order to improve the network's performance [14]. When defining KPIs, it is crucial to understand the metrics that are going to be measured, as well as their measurement frequency, complexity and benchmark.
According to [15], KPIs can be divided into three types:

• MEAN - KPIs produced to reflect a mean measurement based on a number of sample results;

• RATIO - KPIs produced to reflect the percentage of occurrences of a specific case among all cases (a minimal computation sketch follows this list);

• CUM - KPIs produced to reflect a cumulative measurement which is always increasing.
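
As a minimal illustration of a RATIO-type KPI computed from raw counters, consider the following Python sketch; the counter names and values are hypothetical and not tied to any vendor's counter set.

# Minimal sketch: a RATIO KPI derived from hypothetical NE counters.
counters = {"call_setup_attempts": 1250, "call_setup_successes": 1231}

call_setup_success_rate = (100.0 * counters["call_setup_successes"]
                           / counters["call_setup_attempts"])
print("Call Setup Success Rate: %.1f%%" % call_setup_success_rate)  # 98.5%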

Table 2.4: Description of the KPI categories and KPI examples.

Accessibility - KPIs that show the probability of providing a service to an end-user at request. Examples: Call Setup Success Rate; Random Access Success Rate.
Retainability - KPIs that show how often an end-user abnormally loses a service connection across its duration. Examples: E-RAB Retainability; VoIP E-RAB Retainability.
Integrity - KPIs that show how much the services are impaired once established. Examples: Downlink Traffic [MBytes]; Uplink Traffic [MBytes].
Availability - KPIs that show the percentage of time that the cells are available. Example: Availability.
Mobility - KPIs that show how well handovers are being performed. Examples: LTE Intra Mobility Success Rate; Single Radio Voice Call Continuity.
Quality - KPIs that show how well the services are being delivered to the end-user. Examples: Average Uplink Power Resource Blocks; % of Uplink Power Resource Blocks.

KPIs specific to telecommunication networks can be classified into five categories - Accessibility, Retainability, Integrity, Availability and Mobility - in order to divide the measurements from distinct sectors [15]. Vendors can also have an additional category, Quality, according to vendor documentation. These categories are summarized in Table 2.4.

Table 2.5: Netherlands P3 KPI analysis done in 2016 [16].

Voice KPIs - Drive Test T-Mobile KPN Vodafone Tele2


Big Cities
Call Success Ratio (%) 99.7 99.3 98.8 99.2
Call Setup Time (s) 3.7 5.1 5.3 4.9
Speech Quality (MOS-LQ0) 3.7 3.5 3.6 2.8
Small Cities
Call Success Ratio (%) 99.2 99.1 98.7 99.2
Call Setup Time (s) 3.8 5.1 5.4 4.9
Speech Quality (MOS-LQ0) 3.6 3.6 3.5 2.8

The KPI values must be within defined thresholds, depending on the environment (i.e. urban, suburban or rural), in order to fulfil the network performance requirements as well as service level agreements. Take, for instance, the Call Setup Success Rate KPI, which indicates the percentage of successful call setups. Depending on the environment, the absolute number of successful call setups will vary - being lower in rural environments and higher in urban environments - slightly affecting the ratio. Thus, the KPI thresholds should not be the same for different environments.
There is an annual benchmark report comparing different operators in distinct countries, produced by a widely respected consultancy firm, P3. The benchmarks are produced through the use of KPIs from different environments. The firm did a study in the Netherlands for 2016, named The Mobile Network Test in the Netherlands [16], covering four mobile operators - T-Mobile, KPN, Vodafone and Tele2. Table 2.5 shows a comparison between those four operators, revealing a slight KPI difference between the two environments across all the tested operators' networks.

2.6.3 Configuration Management

Configuration Management (CM) provides the operator with the ability to assure correct and effective operation of the network as it evolves. CM actions aim to both control and monitor the active configuration of the NEs and Network Resources (NRs). These actions can be initiated by the operator or by functions in the Operations Systems (OS) or NEs.
CM actions can be taken as part of an implementation programme (e.g. additions and deletions), as part of an optimisation programme (e.g. modifications), or to maintain the overall QoS. These actions can either target a single NE of the network or several NEs, as part of a complex procedure [17].

CM Service Components
After a network is first installed and activated, it is subsequently enhanced and adapted to fulfill short and long term requirements and to satisfy customer needs. In order to cover these aspects, CM provides the operator with a set of capabilities, such as initial system installation, system operation to adapt the system to short term requirements, system update to overcome software bugs or equipment faults, and system upgrade to enhance or extend the network with new features or equipment.
These capabilities are provided by the management system through its service components – system modification and system monitoring. The former is used whenever it is necessary to adapt the system data to a new requirement due to optimisation or new network configurations, while the latter allows the operator to receive reports on the configuration of the entire network, or parts of it, whenever there is an autonomous change of its states or values.

CM Functions
The requirements of CM led to system modification functions, such as the creation, deletion and conditioning of NEs and NRs. All these functions obey the following requirements:

• minimal network disturbance, by only taking the affected resources out of service if needed;

• physical modifications should be independent from the related logical modifications;

• all the required actions should be finished before the resources are brought back to service;

• data consistency checks should be performed.

Chapter 3

Machine Learning Background

This chapter provides an overview of ML concepts and algorithms, allowing a better understanding of the work developed throughout the Thesis. Section 3.1 gives a brief introduction to ML, and its three main components – representation, evaluation and optimization – are presented and explained in Section 3.2. Section 3.3 focuses on the main goal of ML – generalization. Sections 3.4 and 3.5 present the three biggest challenges of ML – underfitting, overfitting and dimensionality – and how it is possible to mitigate them. Section 3.6 explains the concept of feature engineering and its importance for ML problems. Section 3.7 compares the benefits of having more data versus more intelligent algorithms. Section 3.8 addresses the classification problem and how it can be applied to time series using ML. Section 3.9 details the proposed classification algorithms. Lastly, Section 3.10 explains how ML classification models are evaluated.
The content of this chapter is mainly based on the following references: [18, 19] in Section 3.1; [18] in Sections 3.2 and 3.3; [18, 20] in Sections 3.4 and 3.5; [18] in Section 3.6; [21, 22, 23, 24] in Section 3.8; [25, 26, 27, 28, 29, 30, 21] in Section 3.9; [31] in Section 3.10.

3.1 Machine Learning Overview

To solve a problem on a computer, an algorithm is typically needed. However, it is not possible to build an algorithm for some tasks, such as differentiating spam emails from legitimate emails. In this case, both the input – email documents consisting of files with characters – and the output – a yes or no, indicating whether the message is spam – are known; what is not known is how to transform the input into the output.
It is believed that there is a process that explains the observed data, but it is not possible to identify it completely. What can be done is to find a good and useful approximation, which may not explain everything, but can at least explain some part of the data. This is done through the detection of certain patterns or regularities, which can help to understand the process or to make predictions. These predictions are made under the assumption that the near future will not be much different from the past in which the sample data was collected, so that future predictions can also be right.
ML uses the theory of statistics to build mathematical models for making inferences from data samples. The procedure starts with an initial model with some pre-defined parameters, which are optimized through a learning algorithm using a set of training data. The model may be:

• supervised – to make predictions in the future;

• unsupervised – to gain knowledge from data;

• semi-supervised – to gain knowledge from data in order to perform predictions.

It is important to mention two aspects of using ML: first, in training, efficient algorithms are needed
to solve the optimization problem, as well as to process the massive amount of data that generally is
available; second, once a model is learned, its representation and algorithm solution for inference needs
to be efficient. In specific applications, the efficiency of the learning or inference algorithms (i.e. its
memory space and time complexity) may be as important as its predictive accuracy.
The previous example of email spam differentiation is a mature type of ML, called classification, and it is a case of supervised learning. A classifier is a system that typically accepts a vector of discrete and/or continuous feature values and outputs a single discrete value – the class; a feature is an individual measurable property of an observed phenomenon. The spam filter classifies the email messages into "spam" or "not spam" and its input might be a Boolean vector x = (x1, ..., xj, ..., xd), where xj = 1 if the jth word in the dictionary appears in the email and xj = 0 otherwise. A classifier learns from a training set of examples (xi, yi), where xi = (xi,1, ..., xi,d) is an observed input corresponding to an email's Boolean word vector and yi is the corresponding output stating whether that email is spam, resulting in a model as output. The classifier is then tested on whether the model produces the correct output yt for future examples xt. For the spam filter, this means testing whether it correctly classifies unseen emails as "spam" or "not spam".
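
A minimal Python sketch of this Boolean word-vector representation is given below; the toy dictionary and message are illustrative assumptions.

# Minimal sketch of the Boolean word-vector representation described above.
dictionary = ["free", "winner", "meeting", "project", "prize"]

def to_boolean_vector(email_text):
    words = set(email_text.lower().split())
    return [1 if word in words else 0 for word in dictionary]

print(to_boolean_vector("You are a WINNER claim your free prize"))
# [1, 1, 0, 0, 1]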

3.2 Machine Learning Components

The first problem faced when a possible application for ML is found is the large variety of available learning algorithms. Learning algorithms consist of combinations of three components:

• representation – a classifier must be represented in a formal language that is recognizable and manageable by a computer. Choosing a representation for a classifier is equivalent to choosing the set of classifiers that it can learn. This set is called the hypothesis space of the classifier and if a model is not in it, then it cannot be learned.

• evaluation – an evaluation function is needed to score different models. The evaluation function
used internally by the algorithm may be different from the external one, which the classifier tries to
optimize, for ease of implementation.

• optimization – the method to search for the highest-scoring model amongst other models. The
choice of an optimization technique is crucial to the efficiency of the classifier and also allows to
verify if the evaluation function in the produced model has more than one optimum solution.

Table 3.1: The three components of learning algorithms (adapted from [18]).

Representation: instances (k-nearest neighbor, support vector machines); hyperplanes (naive Bayes, logistic regression); decision trees; sets of rules (propositional rules, logic programs); neural networks; graphical models (Bayesian networks, conditional random fields).
Evaluation: accuracy/error rate; precision and recall; squared error; likelihood; posterior probability; information gain; K-L divergence; cost/utility; margin.
Optimization: combinatorial optimization (greedy search, beam search, branch-and-bound); continuous optimization, either unconstrained (gradient descent, conjugate gradient, quasi-Newton methods) or constrained (linear programming, quadratic programming).

Table 3.1 shows common examples of each of the three aforementioned components. For instance,
k-Nearest Neighbor (kNN) classifies a test example by finding the k most similar training examples and
predicting the majority class among them. Hyperplane-based methods form a linear combination of the
features per class and predict the class with the highest-valued combination. Decision trees test one
feature at each internal node, with one branch for each feature value, and have class predictions at the
leaves. It is also important to add that not all combinations of one component from each column of
Table 3.1 make equal sense, as discrete representations tend to go with combinatorial optimization and
continuous ones with continuous optimization.

3.3 Generalization

ML aims mainly to generalize beyond the examples in the training set, as it may be unlikely to find these
exact examples in testing sets. A common mistake that is made when beginning to study ML is to test on
the training data and have the illusion of success. In fact it is easy to have good results on training sets,
since the classifier just has to memorize the examples; if tested on new data, sometimes the results are
not better than random guessing.
In order to generalize an ML model beyond the examples in the training set, a common and simple procedure is to separate all the available data into two non-overlapping sets – the training and testing sets – with a size ratio of, for example, 4:1 or 9:1. These ratios depend on the amount of data available, since increasing the fraction corresponding to the training set allows the creation of a more generalized model, at the cost of having a smaller test set to validate the resulting model. However, the test data can influence the classifier indirectly, for instance when tuning the classifier parameters through the analysis of the test data results (classifier parameter tuning is a fundamental step in developing successful models). This leads to the need for a holdout set for classifier parameter tuning, at the cost of reducing the amount of available data for training. Thankfully, this penalty can be mitigated by
doing k-fold cross-validation. Through this method, the training data is randomly divided into k equally sized subsets, holding out each one while training on the remaining k − 1 subsets and testing each
learned classifier on the subset not used for training. After iterating k times, the k-fold cross validation
algorithm averages the results to evaluate the classifier parameter settings. For instance, three-fold
cross-validation is represented in Figure 3.1. This method can lead to even more reliable results by
running k-fold cross-validation multiple times and averaging the results at the end – Repeated k-fold
cross-validation. In this last method, the data is reshuffled and divided into k new subsets for each k-fold
cross-validation run [32].

Figure 3.1: Procedure of three-fold cross-validation (adapted from [32]).
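
As a concrete sketch of this procedure, the following Python example runs repeated k-fold cross-validation with scikit-learn on a synthetic dataset; the estimator and parameter choices are illustrative assumptions.

# Minimal sketch of (repeated) k-fold cross-validation with scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedKFold, cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)

# 3-fold cross-validation repeated 5 times with reshuffled splits.
cv = RepeatedKFold(n_splits=3, n_repeats=5, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print("accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))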

3.4 Underfitting and Overfitting

When the data and associated knowledge lead to a model with a lower than expected test accuracy score – the ratio of correct predictions to all predictions made – it is easy to erroneously create a model that is not grounded in reality. For instance, continuous parameter adjustments may lead to a 100% accuracy score on the training data, which seems to indicate a great model; however, when tested on test data, the model may reveal a worse accuracy score than in the beginning. This problem is called overfitting, and it happens when the model is tuned to the point where its test accuracy score starts to decrease, because it begins to learn random regularities contained in the set of training patterns. The opposite can also happen, when the model can still be further fine-tuned to an even better test accuracy score than before. This last case is called underfitting and occurs when the model is incapable of capturing the variability of the data.
Overfitting is a known problem and its many forms are not immediately obvious. Therefore, it is easier
to decompose generalization error into bias and variance [33]. Bias can be defined as a classifier’s
tendency to consistently learn the same wrong thing. Variance is the tendency to learn random things
irrespective of the real signal. Figure 3.2 illustrates Bias and Variance through an analogy with throwing
darts at a board. For instance, a linear model has high bias, because when the frontier between two
classes is not a hyperplane the model is unable to induce it. However, decision trees do not have this
problem as they are able to represent any Boolean function at the cost of high variance – decision trees
learned from different training sets generated by the same phenomenon are often very different, when
they should be the same. Similar reasoning can be applied to optimization methods – beam search has

Figure 3.2: Bias and variance in dart-throwing (adapted from [18]).

lower bias, but higher variance compared to greedy search as it tries more hypotheses.

Cross-validation can also be used to counter overfitting. To give an illustration, it can be used to choose the best size of decision tree to be learned, preventing it from being overly complex; hence, the tree generalizes better. However, too many parameter choices might lead to overfitting [34].

There are more methods to counter overfitting besides cross-validation. The most popular consists in adding a regularization term to the evaluation function, penalizing models with more structure and favoring smaller ones with less room to overfit. Ridge and Lasso regressions are two examples of popular
regularization methods [35]. An alternative is to perform a statistical significance test such as chi-square
before adding new structure to see how much the class changes with or without that new structure. That
being said, skepticism is needed towards claims of techniques that solve overfitting because it is easy
to avoid overfitting (high variance) by going towards the opposite problem of underfitting (high bias).

Figure 3.3: Bias and variance contributing to total error.

The ideal case of null variance and null bias is not achievable in practice, as there is a tradeoff between them. Under this tradeoff, the optimum model complexity is attained at the minimum of the total error contributed by bias and variance. This tradeoff is represented in Figure 3.3. At the optimum model complexity, the model starts to overfit if the complexity increases and starts to underfit if the complexity decreases.

3.5 Dimensionality

The biggest problem in ML after overfitting and underfitting is the ”curse of dimensionality” (i.e. the
number of used features). This expression was coined by Bellman in 1961 to express the fact that many
algorithms that work fine in low dimensions become unmanageable when the input is high-dimensional
[36]. In ML this problem is even bigger because generalizing correctly becomes exponentially harder as
the dimensionality of the examples grows. This is due to a fixed-size training set covering a very small
fraction of the input space.

Thankfully, there is an effect that partially counteracts this problem, which is non-uniformity of the
data. In most applications, examples are not spread uniformly throughout the instance space, but are
concentrated on or near a lower-dimensional manifold. Learners can then implicitly take advantage of
this lower effective dimension. There are also algorithms for reducing the dimensionality of the data [37].

In order to reduce dimensionality without losing much information, some analysis techniques were developed, such as Principal Component Analysis (PCA). PCA provides a roadmap for how to reduce a complex data set to a lower dimension, revealing the sometimes hidden, simplified dynamics that often underlie it [38]. The data noise is measured by the SNR, which is determined by calculating data variances. The variance between variables allows quantifying their redundancy by measuring the spread between them – the higher the spread, the lower the redundancy. With a = [a1 a2 ... an] and b = [b1 b2 ... bn] as two variable vectors, the covariance between the two, σ²ab, is represented by the following dot product:

σ²ab = (1/(n − 1)) abᵀ, (3.1)

where the leading term is a normalization constant. By building a covariance matrix between all
variables, the sum of its diagonal values yields the overall variability. PCA then replaces the original variables with new ones, called Principal Components. Principal Components are orthogonal and have variances (called eigenvalues) in decreasing order, while maintaining the overall variability from the covariance matrix. Thus, it is possible to explain all the variance in the data by retaining all the eigenvalues. In order to reduce the data's dimensionality and choose the most relevant Principal Components, one can select the Principal Components whose cumulative sum of eigenvalues satisfies a defined threshold. This sum of eigenvalues of Principal Components is a cumulative function called Cumulative Proportion of Variance Explained (CPVE). The CPVE of the first k Principal Components in a dataset with n variables is given as follows:

CPVEk = (λ1 + λ2 + ... + λk) / (λ1 + λ2 + ... + λn), (3.2)

where λi is the eigenvalue of the i-th Principal Component and k ≤ n. By defining a threshold of, say, 98%, it is possible to choose the Principal Components that reduce the data dimensionality while losing only a small fraction of the original variance.
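
The following minimal Python sketch applies this idea with scikit-learn, keeping the Principal Components whose CPVE reaches a 98% threshold; the synthetic data and the threshold value are illustrative assumptions.

# Minimal sketch: PCA dimensionality reduction via a CPVE threshold (Eq. 3.2).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[:, 5:] = X[:, :5] + 0.1 * rng.normal(size=(200, 5))  # redundant variables

pca = PCA().fit(X)
cpve = np.cumsum(pca.explained_variance_ratio_)  # CPVE_k for k = 1, ..., n
k = int(np.searchsorted(cpve, 0.98)) + 1         # smallest k with CPVE_k >= 0.98
X_reduced = PCA(n_components=k).fit_transform(X)
print(k, X_reduced.shape)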

3.6 Feature Engineering

Feature engineering is the process of using domain knowledge of the data to create features that make ML algorithms work. The most important factor in making an ML project succeed is the set of features used. If many independent variables that correlate well with the class are present, learning is easy. However, learning may not be possible if the class is a very complex function of the features. Most of the effort in ML projects goes into constructing features from raw data, because data in its raw form usually does not allow for learning to happen.
In order to have a successful ML project, the data should be relevant, well processed and plentiful. That is why most of the work and time invested in these projects goes into gathering, integrating, cleaning and pre-processing data, in addition to the trial and error that goes into feature design – evaluating models with different features and feature combinations. ML consists in an iterative process of running the classifier, analysing the results and modifying the data and/or the classifier. Learning is often the quickest part, while feature engineering is more difficult because it is domain-specific.

3.7 More Data and Cleverer Algorithms

If the best set of features has been obtained but the models are not as accurate as intended, then there are two ways to build better models – conceiving a better learning algorithm or gathering more data. ML researchers strive for the former; however, the quickest path to better classifiers is often to simply get more data. Pragmatically, a simple algorithm with enormous amounts of data can beat an intelligent one with a more modest amount of data. This brings another problem – scalability. In the 1980s, the main bottleneck was data, while today it is time. There are now enormous amounts of data available, but not enough time to process them within the desired requirements, so part of the data goes unused. In order to use more data in a shorter time window, faster ways to learn complex classifiers are being conceived [39].
Smarter algorithms have a small payoff because, as a first approximation, they all do the same. All
classifiers essentially work by grouping nearby examples into the same class. The key difference is their
meaning of ”nearby”. With non-uniformly distributed data, classifiers can produce very different class
separating planes, while still approximately making the same predictions (if enough training examples
are given).
As a rule of thumb, it is better to try the simplest classifiers first (e.g. naive Bayes before logistic
regression and kNN before SVM). More sophisticated classifiers are seductive, but usually harder to
use as they have more parameters to tune in order to get good results.
Classifiers can be divided into two types: those whose representation has a fixed size, such as
linear classifiers, and those whose representation can grow with the data, like decision trees. Fixed size
classifiers can only take relevant advantage of the amount of data up to a certain point (with more and
more data, their accuracy asymptotes to a certain value). Variable-size classifiers can in principle learn
any function given sufficient data, but not in practice, due to the algorithms' limitations (e.g., greedy search falls into locally optimal solutions, not returning the globally optimal solution) and to computational cost.
Moreover, there still is the curse of dimensionality where no existing amount of data may be enough.
Thus, clever algorithms often have a higher payoff if better designed. This is why ML projects have a
significant component of classifier design [40].
A good approach to test how much the obtained models would improve if more data was added to the
training set is through observation of learning curves. A learning curve shows a measure of predictive
performance on a given domain as a function of some measure of varying amounts of learning effort
[41]. Learning curves are most often presented as the predictive accuracy on the test samples as a function of the number of training examples, as in Figure 3.4.

Figure 3.4: A learning curve showing the model accuracy on test examples as a function of the number of
training examples.
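
A minimal sketch of obtaining such a curve with scikit-learn is given below; the synthetic dataset and the chosen estimator are illustrative assumptions.

# Minimal sketch: test accuracy as a function of the number of training examples.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
train_sizes, train_scores, test_scores = learning_curve(
    SVC(), X, y, cv=5, train_sizes=np.linspace(0.1, 1.0, 5))

for n, score in zip(train_sizes, test_scores.mean(axis=1)):
    print("%4d training examples -> test accuracy %.3f" % (n, score))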

3.8 Classification in Multivariate Time Series

In ML, classification is the problem of identifying to which set of categories or classes a new observation belongs, on the basis of a training set that consists of previous observations with known classes [19]. In other words, a classification task involves separating data into training and testing sets, where each observation has a target value, known as the class label, and one or more attributes, known as features. The classification is done through an algorithm called a classifier, whose role is to produce a model, based on the training set, that predicts the classes of the test data given only the features of the testing set.
A time series is a set of observations measured sequentially through time [42]. Time series analysis
has turned into one of the most popular branches of statistics. Recent developments in computing have
provided the basic infrastructure for fast access to vast amounts of data which allowed the analysis of
time series in various sectors, such as telecommunications, medical and financial sectors.
Time series can be either univariate or multivariate. The former refers to a time series of single
observations recorded sequentially through time while the latter consists of sequences of values of
several contemporaneous variables changing with time [43].

In order to perform time series classification, one has to choose between two options that depend on
the type of the data:

• apply the One Nearest Neighbor (1NN) classifier with a distance metric such as Euclidean Distance
or Dynamic Time Warping in order to classify a time series as the most similar one in the training
set;

• extract features from each time series and classify these features with an adequate classifier, such
as SVM.

While Dynamic Time Warping is known to be one of the best performing univariate time series classification techniques, it is also very slow, as each comparison between two time series of length N has O(N²) complexity [44]. For multivariate time series with M variables, K series in the testing set and a big training set with T multivariate time series, this results in a complexity of O(MTKN²), which is very slow since each time series is compared to the entire training set. Due to this limitation, Dynamic Time Warping is normally used with methods that reduce the number of time series to compare, at the cost of a slight reduction in accuracy.
The second option can also deliver good accuracy scores and is usually much faster than the first one. The most common approach is to apply statistical calculations in order to extract data, such as the mean and standard deviation, that can best characterize each time series with respect to its class. This extracted data is used as the features of each time series and is subsequently learned and classified by a chosen algorithm. One of the best algorithms for that process is SVM.
As this work involves large testing and training sets under time constraints, the focus will be on the second option, as it is the fastest one.
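
The following minimal Python sketch illustrates this second option: each multivariate time series is summarized by simple statistical features (here, the mean and standard deviation per variable) and the features are classified with an SVM; the synthetic data and the feature choice are illustrative assumptions.

# Minimal sketch: statistical feature extraction + SVM classification.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
series = rng.normal(size=(200, 3, 96))   # 200 series, 3 variables, 96 time steps
labels = rng.integers(0, 2, size=200)
series[labels == 1] += 0.5               # class 1 series have a shifted mean

# Mean and standard deviation per variable -> 6 features per time series.
features = np.concatenate([series.mean(axis=2), series.std(axis=2)], axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0)
clf = make_pipeline(StandardScaler(), SVC()).fit(X_train, y_train)
print("test accuracy: %.2f" % clf.score(X_test, y_test))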

3.9 Proposed Classification Algorithms

This section provides an in-depth explanation of each proposed classification algorithm used in this Thesis. The first four classifiers belong to the family of ensemble classifiers, more specifically tree ensemble classifiers, while the last one is a conventional classifier.

3.9.1 Adaptive Boosting

Ensemble methods are an ML approach based on the concept of creating a highly accurate classifier by combining several weak and inaccurate classifiers. One of the ensemble techniques is boosting, which consists in a two-step approach: first, it uses subsets of the original data to produce weakly performing models (high bias, low variance); then, it boosts their performance by combining them together based on a chosen cost function. AB, or AdaBoost, was the first practical boosting algorithm and remains one of the most used and studied classifiers [26].
The AB algorithm pseudocode is shown in Algorithm 1. There are m labeled training samples (x1, y1), ..., (xm, ym), where xi belongs to some domain X and yi ∈ {−1, +1}. A weight array Wt(i) is

Algorithm 1 The boosting algorithm AdaBoost (adapted from [25]).
Input: training data (x, y)
Output: the resulting model H
1: function AdaptiveBoosting(x, y)
2:   H ← Ø
3:   for i = 1, ..., m do
4:     W1(i) ← 1/m
5:   for t = 1, ..., T do
6:     randomly choose a data subset Dt from the training data set according to the samples' weights
7:     fit a weak learner ht(x, θt) from data subset Dt
8:     measure the performance of ht(x, θt) by its weighted error εt relative to Wt
9:     calculate the hypothesis weight αt
10:    for i = 1, ..., m do
11:      update Wt+1(i)
12:    H ← H ∪ {αt ht(x, θt)}
13:  return H

initialized over the $m$ training samples so that all samples start with the same weight. For each iteration $t = 1, \ldots, T$, a data subset $D_t$ is randomly drawn from the training set, considering the weight of each sample (samples with higher weights are more likely to be chosen). A weak classifier $h_t(x, \theta_t) : X \to \{-1, +1\}$, with $\theta_t$ being its parameters, is then fitted; the aim of the weak classifier is to find a weak hypothesis with low weighted error $\varepsilon_t$ relative to $W_t(i)$, where:

$$\varepsilon_t = \Pr_{i \sim W_t}\left[h_t(x_i, \theta_t) \neq y_i\right] = \sum_{i\,:\,h_t(x_i, \theta_t) \neq y_i} W_t(i). \tag{3.3}$$

Afterwards, a weight $\alpha_t$ is assigned to the resulting hypothesis from $h_t(x, \theta_t)$:

$$\alpha_t = \frac{1}{2} \ln\!\left(\frac{1 - \varepsilon_t}{\varepsilon_t}\right). \tag{3.4}$$

At the end of each iteration, the weight of each sample $W_t(i)$ is updated proportionally to its weighted classification error:

$$W_{t+1}(i) = \frac{W_t(i)\,\exp(-\alpha_t\, y_i\, h_t(x_i, \theta_t))}{Z_t}, \tag{3.5}$$

where $Z_t$ is a normalization factor. The final hypothesis obtained from model $H$ computes the sign of a weighted combination of weak hypotheses, working as a majority vote of the weak hypotheses $h_t(x, \theta_t)$ with weights $\alpha_t$:

$$H(x) = \mathrm{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x, \theta_t)\right). \tag{3.6}$$

The AB algorithm is often used with Decision Trees as weak learners. A Decision Tree classifies a sample according to the leaf node it reaches, after passing through several conditions at each branch split or decision point. These conditions are comparisons on the features of the sample, and they determine the path taken by the sample across the tree. A simple example of a Decision Tree is shown in Figure 3.5, where the decision whether a football match should be played depends on the weather conditions. Decision Trees are used because they are easily interpretable, they allow for nonlinear data classification, they assign different importances to features (performing feature selection), and they are fast at classifying data. However, Decision Trees have a key disadvantage, which shows up whenever a Decision Tree has no growth limit – it easily overfits the training data. Boosting a Decision Tree increases its resistance to overfitting, provided the weak learners' accuracy is higher than random guessing. An example of resistance to overfitting and one of occurrence of overfitting in boosting are shown in Figure 3.6.

[Decision tree diagram: the root node Weather splits into Rain, Overcast and Sunny; Rain leads to a Wind node (Weak → Yes, Strong → No), Overcast leads directly to Yes, and Sunny leads to a Humidity node (Normal → Yes, High → No).]

Figure 3.5: Example of a Decision Tree to decide whether a football match should be played based on
the weather (adapted from [45]).

If using Decision Trees, AB has as regularization parameters the maximum depth limit of each tree, the minimum number of samples required to create a leaf node, and the minimum number of samples required to split an internal node. AB can also use a further regularization parameter, the learning rate, which shrinks the contribution of each weak trained model to the ensemble model. This regularization technique is known as shrinkage, and it has been shown to dramatically increase test set accuracy, because the smaller steps allow the minimum of the loss function to be approached more precisely.
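As an illustration, a minimal Scikit-learn sketch of AB with Decision Tree weak learners could look as follows; the specific parameter values are arbitrary and not the configuration used in this Thesis:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

weak_learner = DecisionTreeClassifier(
    max_depth=2,           # maximum depth limit of each tree
    min_samples_leaf=5,    # minimum samples required to create a leaf node
    min_samples_split=10,  # minimum samples required to split an internal node
)
model = AdaBoostClassifier(
    estimator=weak_learner,  # base_estimator in older Scikit-learn releases
    n_estimators=100,        # number of boosting iterations T
    learning_rate=0.5,       # shrinkage of each weak learner's contribution
)
```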

Figure 3.6: Left: The training and test percent error rates using boosting on an Optical Character Recog-
nition dataset that do not show any signs of overfitting [25]. Right: The training and test percent error
rates on a heart-disease dataset that after five iterations reveal overfitting [25].

3.9.2 Gradient Boost

GB is another popular boosting algorithm for creating collections of classifiers. To make a quick distinction between GB and AB: the latter varies each classifier's training set towards samples with higher weighted error, in order to minimize the overall classification error, while GB calculates the negative gradient of a loss function (the direction of quickest improvement) and adds to the model the weak learner that is closest to that gradient [28].

Algorithm 2 The boosting algorithm Gradient Boost (adapted from [27]).
Input: training data (x, y)
Output: the function estimate $\hat{f}$
1: function GradientBoost(x, y)
2:   $\hat{f}_0$ ← Ø
3:   for t = 1, ..., T do
4:     compute the negative gradient gt(x)
5:     fit a new weak learner h(x, θt)
6:     find the best gradient descent step-size ρt
7:     update the function estimate $\hat{f}_t$
8:   return $\hat{f}$

The GB algorithm pseudocode is shown in Algorithm 2. Let $f$ be the unknown functional dependence $x \xrightarrow{f} y$, where $x$ is an input variable and $y$ its respective label. The goal is to obtain an estimate $\hat{f}$ (i.e. a model) that minimizes a loss function $\psi(y, f)$:

$$\hat{f}(x) = y, \qquad \hat{f}(x) = \underset{f(x)}{\arg\min}\; \psi(y, f(x)). \tag{3.7}$$

In order to minimize the loss function $\psi(y, f)$, what GB does is to choose the weak learner $h(x, \theta_t)$ that is closest to the negative gradient $g_t(x_i)$ along the training data in each iteration $t = 1, \ldots, T$:

$$g_t(x) = E_y\!\left[\left.\frac{\partial \psi(y, f(x))}{\partial f(x)}\,\right|\, x\right]_{f(x) = \hat{f}_{t-1}(x)}, \tag{3.8}$$

where $E_y$ is the expected $y$ loss. Instead of searching for the general solution for the boost increment in the function space, one can choose the new function increment to be the one most correlated with $-g_t(x)$. This allows the replacement of a potentially complex optimization task with the simple and classic least-squares minimization task:

$$(\rho_t, \theta_t) = \underset{\rho,\,\theta}{\arg\min} \sum_{i=1}^{M} \left[-g_t(x_i) + \rho\, h(x_i, \theta)\right]^2, \tag{3.9}$$

and

$$\rho_t = \underset{\rho}{\arg\min} \sum_{i=1}^{M} \psi\!\left[y_i,\, \hat{f}_{t-1}(x_i) + \rho\, h(x_i, \theta_t)\right], \tag{3.10}$$

where $M$ is the total number of samples in the training set and $\rho_t$ is the gradient step size of iteration $t$. At the end of each iteration, for instance iteration $t$, the function estimate $\hat{f}_t$ is updated as follows:

$$\hat{f}_t = \hat{f}_{t-1} + \rho_t\, h(x, \theta_t). \tag{3.11}$$

The performance of GB depends heavily on the chosen loss function $\psi(y, f(x))$ and weak learner $h(x, \theta)$. The weak learner chosen for this study will be Decision Trees, for the reasons explained in the previous section. A common loss function is the mean squared error:

$$\psi(y, f(x)) = \sum_{i=1}^{M} \left[y_i - \hat{y}_i\right]^2, \tag{3.12}$$

where $\hat{y}_i$ is the predicted output. Another popular loss function is the logistic loss, used for logistic regression:

$$\psi(y, f(x)) = \sum_{i=1}^{M} \left[y_i \ln(1 + \exp(-\hat{y}_i)) + (1 - y_i) \ln(1 + \exp(\hat{y}_i))\right]. \tag{3.13}$$

By using Decision Trees as weak learners, GB can avoid overfitting by having as regularization
parameters the maximum depth limit of the tree, the minimum number of samples required to create a
leaf node and the minimum number of samples required to split an internal node. Similarly to AB, GB
also allows for shrinkage and its learning rate can be changed to reduce overfitting to data.
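A minimal Scikit-learn sketch of GB with the regularization parameters discussed above could look as follows (arbitrary values, not the configuration used in this Thesis):

```python
from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier(
    loss="log_loss",       # logistic loss, as in (3.13); "deviance" in older releases
    n_estimators=100,      # number of boosting iterations T
    learning_rate=0.1,     # shrinkage
    max_depth=3,           # maximum depth limit of each tree
    min_samples_leaf=5,    # minimum samples required to create a leaf node
    min_samples_split=10,  # minimum samples required to split an internal node
)
```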

3.9.3 Extremely Randomized Trees

Inside the family of ensemble methods, there is another type of technique besides boosting: bagging-inspired algorithms. In general, these algorithms aim to control the generalization error through perturbation and averaging of weak learners (e.g. Decision Trees). One of those algorithms is ERT, which belongs to the family of tree ensemble algorithms and stands out by strongly randomizing both the feature and the cut-point choice when splitting a tree node. In the extreme case, it builds fully randomized, fully grown trees from the whole training set (low bias, high variance), whose structures are independent of the output values of the learning sample. The general classification procedure of a tree ensemble algorithm is shown in Figure 3.7, where a prediction y is obtained by a majority vote of all the generated Decision Trees with sample x as input.

Figure 3.7: A general tree ensemble algorithm classification procedure.

The ERT classifier builds an ensemble of unpruned decision trees according to the classical top-down procedure. ERT differs from other tree-based ensemble methods in that it splits nodes by choosing cut-points fully or partially at random, and it uses the whole training dataset to grow the trees.

Algorithm 3 The Extremely Randomized Trees splitting algorithm (adapted from [29]).
Input: the local learning subset S corresponding to the node we want to split
Output: a split [f < fc] or nothing
1: function SplitANode(S)
2:   if StopSplit(S) is TRUE then return NULL
3:   else
4:     select K features f1, ..., fK among all non-constant (in S) candidate features
5:     draw K splits s1, ..., sK, where si = PickARandomSplit(S, fi), ∀i = 1, ..., K
6:     return a split s* such that Score(s*, S) = max_{i=1,...,K} Score(si, S)

Inputs: a subset S and a feature f
Output: a split
1: function PickARandomSplit(S, f)
2:   let f_max^S and f_min^S denote the maximal and minimal value of f in S
3:   draw a random cut-point fc uniformly in [f_min^S, f_max^S]
4:   return the split [f < fc]

Inputs: a subset S
Output: a boolean
1: function StopSplit(S)
2:   if |S| < n_min then return TRUE
3:   else if all attributes are constant in S then return TRUE
4:   else if the output is constant in S then return TRUE
5:   else return FALSE

The ERT splitting algorithm pseudocode is shown in Algorithm 3. It has as main parameters: K, the number of randomly selected features at each node; n_min, the minimum sample size for splitting a node; and M, the total number of trees to grow in the ensemble. Each tree is grown using the full training set to generate the ensemble model. The K parameter determines the strength of the feature selection process, n_min the strength of averaging output noise, and M the effectiveness of the variance reduction of the ensemble model aggregation. In the end, the predictions of all trees are aggregated to return the final prediction through a majority vote.
The ERT classifier aims to strongly reduce variance through the full randomization of the cut-point and feature choice, combined with ensemble averaging, in contrast with the weaker randomization schemes used by other methods. By training each weak learner with the full training set instead of data subsets, ERT also minimizes bias. Regarding computational performance, the tree growing complexity is similar to that of a simple Decision Tree. However, as each node splitting procedure is totally random, ERT is expected to be faster than other tree ensemble methods, which locally optimize cut-points.
Being based on Decision Trees as weak learners, ERT can avoid overfitting by having as regulariza-
tion parameters the maximum depth limit of the tree, the minimum number of samples required to create
a leaf node and the minimum number of samples required to split an internal node.
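In Scikit-learn, ERT corresponds to the ExtraTreesClassifier; a minimal sketch mapping the parameters K, n_min and M discussed above could look as follows (arbitrary values):

```python
from sklearn.ensemble import ExtraTreesClassifier

model = ExtraTreesClassifier(
    n_estimators=100,      # M, the number of trees grown in the ensemble
    max_features="sqrt",   # K, the number of randomly selected features per split
    min_samples_split=2,   # n_min, the minimum sample size for splitting a node
    bootstrap=False,       # each tree is grown from the whole training set
)
```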

3.9.4 Random Forest

RF is another bagging-inspired algorithm in the family of tree ensemble algorithms. Similarly to ERT, the basic premise of RF is that building a small Decision Tree with few features is a computationally cheap process. Furthermore, several small and weak trees can be grown in parallel, and this set of Decision Trees then results in a strong classifier by averaging or by majority vote, which can be observed once more in Figure 3.7.

Algorithm 4 The Random Forest algorithm.
Input: the training set S, features F and number of trees in forest B
Output: the resulting model H
1: function RandomForest(S, F, B)
2:   H ← Ø
3:   for i ∈ 1, ..., B do
4:     Si ← a data subset from S
5:     hi ← RandomizedTreeLearn(Si, F)
6:     H ← H ∪ {hi}
7:   return H

Inputs: a subset Si and features F
Output: a learned tree hi
1: function RandomizedTreeLearn(Si, F)
2:   hi ← Ø
3:   for each generated node n in tree hi do
4:     f ← a very small subset of F
5:     split on the best feature in f
6:   return the learned tree hi

RF is similar to ERT with the exception of two steps. First, it uses data subsets for growing its trees (where ERT uses the whole training dataset). Second, it chooses the splitting feature of each node from a very small subset of features (where ERT selects random features from the full feature set).
The RF pseudocode is shown in Algorithm 4. The resulting RF model is first initialized; then, for each grown tree in the ensemble, a data subset Si for the i-th tree is drawn from the training set S. Each Decision Tree is grown using a modified learning algorithm in which only a small subset of all features, f ⊂ F, is used to perform a node split, where F is the total set of features. By limiting the split to a small subset of features, RF allows for drastically faster learning compared to standard Decision Trees, because node splitting is the most computationally expensive step in Decision Tree growing. Additionally, by using small subsets of features f, RF increases the chance of growing uncorrelated weak learners; ensembles of standard Decision Trees tend to split on the same features, resulting in more correlated outcomes. The more uncorrelated the weak learners, the better the ensemble algorithm is at predicting outcomes.
By using Decision Trees as weak learners, RF can also increase its resistance to overfitting by having
as regularization parameters the maximum depth limit of the tree, the minimum number of samples
required to create a leaf node and the minimum number of samples required to split an internal node.
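A minimal sketch of RF in Scikit-learn, on synthetic data, showing how the per-feature importances used later in this work can be read from the fitted ensemble:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a table of 12 KPI features
X, y = make_classification(n_samples=500, n_features=12, random_state=0)

model = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",  # the small random feature subset f used at each split
    max_depth=10,
    min_samples_leaf=5,
).fit(X, y)

# Impurity-based feature importances (they sum to 1)
print(np.round(model.feature_importances_, 3))
```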

3.9.5 Support Vector Machines

SVMs aim to separate data points of different classes through the use of hyperplanes that define decision boundaries. They are capable of handling both linear and non-linear classification tasks. The main idea behind SVMs is to map the original observations from the input space into a high-dimensional feature space such that the classification problem becomes simpler. The mapping is performed by a suitable choice of a kernel function and is represented in Figure 3.8.

Figure 3.8: Data mapping from the input space to a high-dimensional feature space to obtain a linear
separation (adapted from [21]).

Consider a training data set $\{x_i, y_i\}_{i=1}^{N}$, with $x_i \in \mathbb{R}^d$ being the input vectors and $y_i \in \{-1, +1\}$ the class labels. SVMs map the $d$-dimensional input vector $x$ from the input space to a $d_h$-dimensional feature space using a linear or nonlinear function $\varphi(\cdot) : \mathbb{R}^d \to \mathbb{R}^{d_h}$. The hyperplane that separates the classes in the feature space is defined as $w^T \varphi(x) + b = 0$, with $b \in \mathbb{R}$ and $w$ an unknown vector with the same dimensions as $\varphi(x)$. An observation $x$ is assigned to the first class if $f(x) = \mathrm{sign}(w^T \varphi(x) + b)$ equals $+1$, or to the second class if $f(x)$ equals $-1$.

Figure 3.9: The hyperplane constructed by SVMs that maximizes the margin (adapted from [21]).

SVMs are based on the maximum margin principle and aim at constructing a hyperplane with maximum distance between the two classes, which can be seen in Figure 3.9. However, in most real life applications the data of the two classes overlap, which makes a perfect linear separation impossible. Thus, a certain number of misclassifications around the margin must be tolerated. The resulting optimization problem for SVMs, in which the violation of the constraints is penalized, is written as

$$\min_{w,\,\xi,\,b}\; \mathcal{J}(w, \xi) = \frac{1}{2} w^T w + C \sum_{i=1}^{N} \xi_i, \tag{3.14}$$

such that

$$y_i (w^T \varphi(x_i) + b) \geq 1 - \xi_i, \quad i = 1, \ldots, N, \tag{3.15}$$

and

$$\xi_i \geq 0, \quad i = 1, \ldots, N, \tag{3.16}$$

where C is a positive regularization constant and ξi is a slack variable that states whether a sample
xi is between the margin and the correct side of the hyperplane or not. The regularization constant C
in the cost function defines the trade-off between a large margin and misclassification error. A low C
results in a smooth decision boundary whilst a high C aims at classifying all training examples correctly.
SVMs respect the principle of structural risk minimization that balances model complexity (i.e. first term
in (3.14)) and empirical error (i.e. second term in (3.14)), through regularization. Regarding the distance
of xi to the decision boundary:

• ξi ≥ 1 : yi (wT ϕ(xi ) + b) < 0 implies that the decision function and the target have different signs, meaning that xi is misclassified;

• 0 < ξi < 1 : xi is correctly classified, but is located inside the margin;

• ξi = 0 : xi is correctly classified and is either located outside the margin or on the margin boundary.

The optimization problem in (3.14) to (3.16) is typically referred to as the primal optimization problem.
The optimization problem for SVMs can be written in the dual space using the Lagrange multipliers
αi ≥ 0 for the first set of constraints (3.15). The solution for the Lagrange multipliers is obtained by
solving a quadratic programming problem, which leads to the SVM classifier taking the form

$$f(x) = \mathrm{sign}\!\left(\sum_{i=1}^{\#SV} \alpha_i\, y_i\, K(x, x_i) + b\right), \tag{3.17}$$

where $\#SV$ represents the number of support vectors and the kernel function $K(\cdot,\cdot)$ is positive definite and satisfies Mercer's condition, i.e. $K(x, x_i) = \varphi(x)^T \varphi(x_i)$. While solving the optimization problem, only $K(\cdot,\cdot)$ is used, which is related to $\varphi(\cdot)$. Accordingly, this allows SVMs to work in a high-dimensional feature space without performing calculations in it. One can choose one of several types of kernels, such as

• Linear SVM: $K(x, z) = x^T z$;

• Polynomial SVM of degree $d$: $K(x, z) = (\tau + x^T z)^d$, $\tau \geq 0$;

• Radial Basis Function (RBF): $K(x, z) = \exp\!\left(-\frac{\|x - z\|^2}{2\sigma^2}\right)$,

where $K(\cdot,\cdot)$ is positive definite for all $\sigma$ values in the RBF kernel case and for all $\tau \geq 0$ values in the polynomial case. In the RBF case, $\|x - z\|^2$ is the squared Euclidean distance between two feature vectors and $\sigma$ is a free parameter. Furthermore, the RBF kernel also has a simpler parameterization with $\gamma = \frac{1}{2\sigma^2}$, which defines how much influence a single training example has on the decision boundary. All the aforementioned kernels result in global and unique solutions for (3.14) to (3.16).
The SVM classifier has a notable property called sparseness, meaning that a number of the resulting Lagrange multipliers $\alpha_i$ equal zero. Therefore, the sum in (3.17) only runs over the nonzero $\alpha_i$ values (i.e. the support values) instead of all data points. The corresponding vectors $x_i$ are referred to as support vectors; these data points are located close to the decision boundary and aid in the construction of the separating hyperplane.
Succinctly, the main strengths of SVMs lie in their scalability to high dimensional data, their regularization parameters (to avoid over-fitting) and the ease of model training (absence of local optima), while their main weakness lies in their dependence on a suitable kernel to function properly [46].
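A minimal Scikit-learn sketch of an RBF-kernel SVM, including the standardization that SVMs require, could look as follows (arbitrary C and γ values):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

model = make_pipeline(
    StandardScaler(),         # SVMs expect standardized inputs
    SVC(kernel="rbf",
        C=1.0,                # trades margin width against misclassifications
        gamma=1e-3),          # gamma = 1/(2*sigma^2), per-example influence
)
```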

3.10 Classification Model Evaluation

Current research in ML has moved away from simply presenting accuracy results when empirically validating new algorithms. Accuracy scores are obtained by dividing the number of correct predictions of a classifier by the total number of examples in the test set – the closer to 1, the better.
It has been argued that accuracy scores can be misleading, and the use of Receiver Operating Characteristic (ROC) curves has been recommended for binary decision problems [47]. ROC curves show how the number of correctly classified positive examples varies with the number of incorrectly classified negative examples. However, ROC curves can give an overly optimistic view of an algorithm's performance if there is a large skew in the class distribution.
Precision-Recall (PR) curves, often used in Information Retrieval [48, 49], have been cited as an
alternative to ROC curves for tasks with a large skew in the class distribution [50, 51, 52, 53, 54, 55].
In a binary decision problem, a classifier labels examples as either positive or negative and its de-
cisions can be represented in a structure known as confusion matrix. The confusion matrix has four
categories:

• True Positive (TP) – positive examples correctly labeled as positives;

• False Positive (FP) – negative examples incorrectly labeled as positives;

• True Negative (TN) – negative examples correctly labeled as negatives;

• False Negative (FN) – positive examples incorrectly labeled as negatives.

The confusion matrix is shown in Table 3.2 and is useful to construct a point in PR space. The Recall,
Precision and Accuracy metrics are defined as:

Table 3.2: Confusion Matrix (adapted from [31]).

Actual positive Actual negative


Predicted positive TP FP
Predicted negative FN TN
Total TP + FN FP + TN

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3.18}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{3.19}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3.20}$$

where Recall measures the fraction of positive examples that are correctly labeled, Precision measures the fraction of examples classified as positive that are truly positive, and Accuracy measures the fraction of correctly classified examples. Precision can be thought of as a measure of a classifier's exactness – a low Precision indicates a large number of FPs – while Recall can be thought of as a measure of a classifier's completeness – a low Recall indicates many FNs.
Both the Precision and Recall metrics are often combined as their harmonic mean, known as the
F-measure [56], which can be formulated as follows:

$$F = \frac{(1 + \beta^2) \times \mathrm{Recall} \times \mathrm{Precision}}{(\beta^2 \times \mathrm{Precision}) + \mathrm{Recall}}, \tag{3.21}$$

where β allows to weight either Precision or Recall more heavily, with both being balanced when β = 1
[57]. For ML projects that want to minimize the number of FP at the cost of potentially more FN, then
(3.21) should have β < 1, weighting more heavily the Precision metric. However, for ML projects that
want to minimize the number of FN at the cost of potentially more FP, then (3.21) should have β > 1
instead, weighting more heavily the Recall metric.
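The metrics above map directly to library calls; a minimal sketch with toy labels could look as follows, with β = 0.5 weighting Precision more heavily, as required when FPs must be minimized:

```python
from sklearn.metrics import fbeta_score, precision_score, recall_score

y_true = [1, 1, 0, 0, 1, 0, 1, 0]  # toy ground-truth labels
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]  # toy predictions (TP=3, FP=1, FN=1, TN=3)

print(precision_score(y_true, y_pred))        # TP / (TP + FP) = 0.75
print(recall_score(y_true, y_pred))           # TP / (TP + FN) = 0.75
print(fbeta_score(y_true, y_pred, beta=0.5))  # beta < 1 weights Precision more
```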

Chapter 4

Physical Cell Identity Conflict Detection

4.1 Introduction

This chapter introduces the PCI conflict problem that can occur in LTE radio networks, along with its subcategories – confusions and collisions. Furthermore, it presents the steps taken towards the best approach to detect PCI conflicts, using ML models to analyse daily KPI measurements.
Each LTE cell has two identifiers, with different purposes – the Global Cell ID and the PCI. The Global Cell ID is used to identify the cell from an Operations, Administration and Management (OAM) perspective. The PCI has a value in the range of 0 to 503, and is used to scramble the data in order to allow mobile phones to separate information from different eNBs. Since an LTE network may contain many more cells than the 504 available PCI values, the same PCI must be reused by different cells. However, a UE, which is any device used directly by an end-user to communicate, cannot distinguish between two cells if both have the same PCI and frequency band; this phenomenon is called a PCI conflict.
PCI conflicts can be divided into two situations – PCI confusions and PCI collisions. A PCI confusion occurs whenever an E-UTRAN cell has two different neighbor E-UTRAN cells with equal PCI and frequency. A PCI collision happens whenever an E-UTRAN cell has a neighbor E-UTRAN cell with identical PCI and frequency. These two events are represented in Figure 4.1.
A good PCI plan can avoid most PCI conflicts. However, it can be difficult to devise such a plan without any PCI conflicts in a dense network. Moreover, network changes, namely increases in the power of a cell and radio channel fading, can lead to PCI conflicts, since they might result in a mobile phone detecting a cell different from the one in the PCI plan. PCI conflicts can lead to an increase in dropped calls due to failed handovers, as well as to increased channel interference.
This chapter is organised in five sections. After the introduction, Section 4.2 presents the chosen KPIs that are relevant for the PCI conflict classification task. Section 4.3 showcases a PCI conflict classification task based on a network vendor equipment feature for PCI conflict reporting. Section 4.4

Figure 4.1: PCI Confusion (left) and PCI Collision (right).

presents a new PCI conflict classification approach based on configured global cell relations and tests the following three hypotheses:

1. PCI conflicts are better detected by using KPI measurements in the daily peak traffic instant of
each cell.

2. PCI conflicts are better detected by extracting statistical calculations from each KPI daily time
series and using them as features.

3. PCI conflicts are better detected by using each cell’s KPI measurements in each day as an indi-
vidual feature.

Lastly, Section 4.5 presents the preliminary conclusions within this chapter. The overall PCI conflict
detection procedure using the configured global cell relations can be observed in Figure A.1.

4.2 Key Performance Indicator (KPI) Selection

The first step towards reaching the objective of this investigation was to gather a list of all the available
network vendor LTE KPIs and their respective documentation. In accordance with the theory behind
LTE and how PCIs are used, a new list containing the most relevant KPIs for PCI conflict detection was
obtained. These KPIs are represented in Tables 4.1 and 4.2. A brief time series analysis of these KPIs
regarding 4200 cells over a single day is also represented in Figure 4.2.

Table 4.1: Chosen Accessibility and Integrity KPIs.

Accessibility: RandomAcc Succ Rate
Integrity: DL Latency ms, DL Avg Cell Throughput Mbps, DL Avg UE Throughput Mbps

Regarding Accessibility, RandomAcc Succ Rate refers to the success rate of random access proce-
dures made through the PRACH, and it is relevant to detect PCI conflicts as PCIs are used for signal
synchronization and random access procedures. Thus, PCI conflicts can lead to the corruption of the
PRACH, reducing the success rate of random access procedures [58].

Table 4.2: Chosen Mobility, Quality and Retainability KPIs.

Mobility: IntraFreq Prep HO Succ Rate, IntraFreq Exec HO Succ Rate, ReEst during HO Succ Rate
Quality: Average CQI, UL PUCCH Interference Avg, UL PUSCH Interference Avg
Retainability: Service Drop Rate, Service Establish

In Integrity, DL Latency ms measures the average time it takes a small IP packet to travel from the UE to the Internet server and back. DL Latency ms is relevant to detect PCI conflicts, as processed handovers to unexpected PCI conflicting cells that are far away from the UE report higher downlink latency, due to the larger than normal distance to the target cell. The last two KPIs measure the average cell and UE downlink throughput, respectively. They were chosen because PCI values are related to the positioning of the reference signals, so PCI conflicts may result in reference signal collisions. These reference signal collisions result in lower average downlink throughput for both cells and UEs [59].

In Mobility, IntraFreq Prep HO Succ Rate measures the success rate of handover preparation between cells in the same frequency band, and IntraFreq Exec HO Succ Rate refers to the success rate of processed handovers between cells in the same frequency band. These KPIs are relevant for detecting PCI conflicts, as UEs may initiate handovers to a wrong cell with the same PCI and frequency band as the intended one, resulting in more frequent handover failures [58]. ReEst during HO Succ Rate measures the success rate of handover re-establishment to the target cell. It is relevant for detecting PCI conflicts, as processed handovers to unexpected cells may not be re-established due to low coverage by the target cell, reducing the handover re-establishment success rate.

In Quality, UL PUCCH Interference Avg and UL PUSCH Interference Avg measure the average noise
and interference power on the Physical Uplink Control Channel (PUCCH) and on the PUSCH, respec-
tively. These KPIs are relevant for PCI conflict detection because PCI conflicting cells have the same
frequency band and might have higher noise and interference. Average CQI measures the average
CQI, which is relevant to identify PCI conflicting cells because they have the same frequency band and
reference signals, resulting in a situation where the channel quality might be lower than normal.

Regarding Retainability, Service Drop Rate measures the drop rate of all services in a cell. It is relevant to detect PCI conflicts, as service drops can happen whenever a UE attempts a handover to a PCI conflicting cell and fails both the handover and the handover re-establishment. The last KPI, Service Establish, measures the total number of established services during a period and was chosen to differentiate cells with different amounts of traffic.

Regarding Figure 4.2, it represents the distribution of each KPI’s values of 4200 LTE cells over a
single day. The Interquartile Range (IQR) represents the values where 50% of the data is distributed.
The IQR can be obtained through IQR = Q3 − Q1 , where both Q3 and Q1 refer to the third and first
quartiles of each KPI measure across a single day. The Upper and Lower Fences correspond to Q3 +
1.5 × IQR and Q1 − 1.5 × IQR, respectively; these limits are used to check for outlier values, which are

[Figure: twelve panels, one per KPI (Average_CQI, UL_PUCCH_Interference_Avg, UL_PUSCH_Interference_Avg, Service_Establish, Service_Drop_Rate, DL_Avg_Cell_Throughput_Mbps, DL_Avg_UE_Throughput_Mbps, DL_Latency_ms, RandomAcc_Succ_Rate, IntraFreq_Exec_HO_Succ_Rate, IntraFreq_Prep_HO_Succ_Rate, ReEst_during_HO_Succ_Rate), each showing the median KPI values, interquartile range, lower and upper fences, and outlier KPI values across a 24-hour period.]
Figure 4.2: Time series analysis of KPI values regarding 4200 LTE cells over a single day.

all the values that are outside of the fences. The represented outlier KPI values refer to the minimum
and/or maximum KPI values registered for all cells that are outside of the Upper and Lower Fences. As
expected, the Service Establish KPI reflects the regular traffic of mobile networks across a single working day, with its maximum peaks in the lunch and late afternoon periods and minimum traffic during the night. The traffic has a visible effect on the remaining KPIs, with high traffic leading, for instance, to a lower CQI and a lower random access procedure success rate. It can be easily observed that,
with the exception of the Average CQI and RandomAcc Succ Rate KPIs, the outlier KPI values can go
well beyond the Upper and Lower Fences. It can also be noted that the IntraFreq Prep HO Succ Rate and ReEst during HO Succ Rate KPIs have median values very close to 1, but with outlier values that can go as low as 0, which can be explained by cell malfunctioning problems. Furthermore, the Service Drop Rate median values are very close to 0, but outlier values start around a ratio of 0.2 and can go up to a ratio of 1. This reveals the highly variable nature of all KPI values in mobile networks.
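A minimal sketch of the fence computation used in Figure 4.2, on a toy KPI series, could look as follows:

```python
import pandas as pd

def iqr_fences(kpi: pd.Series) -> tuple:
    """Return the lower and upper fences used to flag outlier KPI values."""
    q1, q3 = kpi.quantile(0.25), kpi.quantile(0.75)
    iqr = q3 - q1  # IQR = Q3 - Q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Toy daily Service_Drop_Rate measurements
drops = pd.Series([0.0, 0.01, 0.0, 0.02, 0.0, 0.9])
lower, upper = iqr_fences(drops)
outliers = drops[(drops < lower) | (drops > upper)]  # flags the 0.9 sample
```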

4.3 Network Vendor Feature Based Detection

The next step was to collect the selected KPI data from a real MNO's LTE network, measured by equipment of a network vendor. The data gathered from the mobile network operator had an average of 7% missing values in the KPI data and 45% missing values for the PCI conflict CM parameter. The missing KPI values can be due to failed measurements at the cells, while the missing PCI conflict CM parameter values suggest that the PCI Conflict Detection feature was inactive or unavailable in 45% of the cells. The PCI Conflict Detection feature was used to label each cell as conflicting or nonconflicting.
It was decided to use the Service Establish KPI in order to find the 15 minute time period of each cell
that had the highest amount of established services and thus higher traffic; additionally, the two previous
and two following 15 minute measurements were also recorded. When LTE cells have conflicts, the
conflicts are usually more noticeable through the evaluation of the cell’s KPIs in peak traffic instants,
leading to the aforementioned decision. This resulted in a total of 5 measurements of 15 minute periods
each, totalling a period of 1 hour and 15 minutes for each cell, with the most demanding 15 minute period
in the middle. The tsfresh feature extraction package was then used to apply statistical calculations to
all the KPI time series of all the cells from two consecutive days, and to retrieve the most important
results through hypothesis testing [60]. Tsfresh applies hypothesis testing to each statistical calculation
obtained from each KPI of each cell, based on the respective cell class and selects the most relevant
ones. Tsfresh selected 87 different features of statistical calculations that had the highest contribution
for the classification problem. These 87 selected features were statistical calculations obtained from the
Service ReEst Succ Rate and ReEst during HO Succ Rate KPIs.

Table 4.3: The obtained cumulative Confusion Matrix.

Actual PCI Conflicting Actual Nonconflicting


Predicted PCI Conflicting 1222 689
Predicted Nonconflicting 106478 183511
Total 107700 184200

The AB, GB, ERT, RF and SVM classifiers were applied to the 87 selected features due to their
known high classification performance [61, 62]. The highest Precision was obtained through the SVM
classifier. The best performing hyperparameters obtained from a 10-fold cross-validation were $C = 100$, $\gamma = 10^{-4}$ and a tolerance for the stopping criterion of $tol = 10^{-3}$ (the difference between an observation's distance to the previous iteration's margin and to the current one). The best performing kernel for the SVM classifier was the RBF kernel. Furthermore, the data needed to be standardized, as the SVM classifier expects the values to range over either $[-1, 1]$ or $[0, 1]$.
The evaluation results were obtained after applying 100 iterations of k-fold cross-validation with k = 10, followed by a reshuffling of the data in order to maximize the generalization of the results. The data consisted of 2919 cells, of which 1842 were nonconflicting cells and 1077 were PCI conflicting cells. The resulting confusion matrix with the sum of all FPs, FNs, TPs and TNs obtained in the iterations, as well as the obtained model evaluation metrics, are presented in Tables 4.3 and 4.4, respectively.

Table 4.4: The obtained Model Evaluation metrics.

Accuracy Precision Recall F-measure Training Duration Average Testing Duration Average
63.28% 63.95% 1.13% 2.23% 0.020 ms 987 ms

The Precision score was quite low, even with the very small Recall score; since FPs are required to be as low as possible, the Precision score needs to be as high as possible. The Accuracy score was marginally above a majority classifier's Accuracy score of

$$\frac{689 + 183511}{689 + 183511 + 1222 + 106478} \times 100 = 63.1\%, \tag{4.1}$$

where all cells are classified as nonconflicting. The F-measure was not very relevant with these results, as both the Recall and Precision scores were low. Regarding the Training and Testing Duration Averages, they were quite low, which was a positive point.
A possible reason for the low model performance could have been the selection of KPIs and data time periods, as well as the feature extraction methods used, which could have produced suboptimal results. However, this was the best performing approach applied, as using the total daily data or extracting simple statistical measurements like the mean and standard deviation led to worse outcomes.

4.4 Global Cell Neighbor Relations Based Detection

In light of the obtained results from the previous section, the best approach was to go back to the
most fundamental part of building ML models – check the quality of the data. The documentation of
the PCI Conflict Detection Feature of the network vendor that was used for labeling was confusing,
even for experienced engineers that work with the network vendor equipment. This fact created doubts
concerning the quality of the detection made by the feature and resulted in an investigation to verify
the quality of the labeling (done by the network vendor feature). Thanks to a product developed by
CELFINET, it is possible to know all the PCIs and frequency bands of all the configured neighbor cells
of each cell that use equipment from different vendors. Otherwise, it would be very difficult, if not
impossible, to confirm the network vendor feature detection quality.
Two Structured Query Language (SQL) scripts were developed: one to detect PCI confusions – checking for configured neighbor cells with equal PCI and frequency bands – and one to detect PCI collisions – checking for configured neighbor cells with the same PCI and frequency band as the source cell. It was found that the detection offered by the network vendor feature was very different from what was obtained from those scripts. Cells where one or more cases of PCI confusion were detected through the scripts were in fact labeled as nonconflicting by the network vendor feature, and the same was observed for PCI collisions.
In light of these results, it was decided not to use the network vendor feature, and to use instead the written scripts based on the global cell neighbor relations, since their results were more reliable. These scripts also allowed for detecting PCI collisions and confusions separately, as the network vendor feature was not able to distinguish between those two types of PCI conflicts. The wrong labeling done by the network vendor feature also explains the almost random results obtained in the previous section, as the labeling is crucial for a well functioning ML model. This new procedure to label cells required the collection of new data, as the global cell relations are updated in the database once per week. This decision led to another consequence: it became harder to collect large amounts of data, as it was only possible to collect data once per week. As a result, it was only possible to gather data for three days, due to time constraints.
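A minimal pandas sketch of the logic behind those scripts (written here in Python rather than SQL, with hypothetical column names such as source_cell and neighbor_pci) could look as follows:

```python
import pandas as pd

# relations: one row per configured neighbor relation; all column names
# (source_cell, source_pci, source_freq, neighbor_cell, neighbor_pci,
# neighbor_freq) are hypothetical placeholders.
def find_pci_confusions(relations: pd.DataFrame) -> pd.Series:
    # Confusion: a source cell with two or more *different* neighbors
    # sharing the same PCI and frequency band
    counts = relations.groupby(
        ["source_cell", "neighbor_pci", "neighbor_freq"]
    )["neighbor_cell"].nunique()
    return counts[counts > 1]

def find_pci_collisions(relations: pd.DataFrame) -> pd.DataFrame:
    # Collision: a neighbor with the same PCI and frequency band
    # as the source cell itself
    mask = (relations["source_pci"] == relations["neighbor_pci"]) & (
        relations["source_freq"] == relations["neighbor_freq"]
    )
    return relations[mask]
```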
By only using one classification algorithm, namely SVM as in the last section, it is possible that the reported results could be biased. Thus, it was decided to use a total of five different classification algorithms, introduced in Section 3.9, namely ERT, RF, SVM, AB and GB. These classifiers were used from the Scikit-learn library [2]. By registering the results from these five classification algorithms, it was possible to choose the best classification algorithm for each frequency band and PCI conflict type (i.e. collision and confusion).

1. visualising important aspects if applicable – observe, for instance, the distribution of daily KPI
null values of all cells to gain insight in order to proceed to data cleaning;

2. cleaning the data with Python Data Analysis Library – observe the occurrence of null values
of each KPI and of data artifacts, such as infinite values and numeric strings, and either correct or
discard cells with those values;

3. hyperparameter tuning – search for the optimal hyperparameters for each classification algorithm
for each frequency band and type of PCI conflict through tools from the Scikit-learn library;

4. evaluating obtained models – testing each classification algorithm on test data and registering
the obtained results.

4.4.1 Data Cleaning Considerations

The process of data cleaning consisted in the following five steps:

1. data visualization – observe the daily distribution of KPI null values of all cells and discard cells
with outlier total KPI null values;

2. data imputation – linearly interpolate missing values in each KPI of each cell;

3. data artifact correction – check and correct any data artifacts, such as strings in a continuous
variable;

4. data separation – separate the data into groups of cells with the same frequency bands;

5. dataset split – split each resulting data set into training and test sets to be used by the classifica-
tion algorithms.

It was considered that each cell in each day was independent from itself in different days, as the
used data consisted of three days in different weeks. The initial set of raw data had a total of 32750
nonconflicting cells, 3176 cells with PCI confusion and 6 cells with PCI collision. As the data consisted
of time series measured by several sensors, there were high chances that the data could contain null
values and other artifacts such as string values (e.g. errors, infinite values). Furthermore, in order to
successfully perform classification, it was required that each time series has few or even zero null values
and zero data artifacts. In order to reach this goal, it is required to perform data cleaning. The Python
Data Analysis Library, known as pandas, was used for this task [63].

Figure 4.3: Boxplots of total null value count for each cell per day for three KPIs.

The first step was to check the daily distribution of null values for each KPI in all cells. It was found that only three KPIs had a third quartile of null value counts higher than zero; these are illustrated in Figure 4.3 by boxplots with the null count distribution in the background. It was noticeable that ReEst during HO Succ Rate had high occurrences of high null value counts compared to the remaining KPIs, with a median count of 66 null values per cell. The remaining two KPIs are not as degraded as the aforementioned one, with a median of zero and a third quartile of 5 null value counts per cell. Either one of two things could have been done:

• remove the ReEst during HO Succ Rate KPI and delete all cell data with a sum of null values higher than 13, which is the upper outer fence of the two remaining highest null count KPIs, thus only eliminating outliers and keeping most of the data;

• keep the ReEst during HO Succ Rate KPI and delete all cell data with a total null count higher than its first quartile (i.e. 42 null value counts).

It was clear that the best choice would have been the former but, in order to study the importances of all KPIs for detecting PCI conflicts, it was chosen to perform the latter. In the next subsection, the correlations between the KPIs at peak traffic instants will be obtained, along with the feature importances given by the decision tree ensemble classifiers. The next subsection will thus give more insight into whether or not the ReEst during HO Succ Rate KPI should be discarded. After deleting all data with a sum of null values higher than 42, the data was greatly reduced, to 7124 nonconflicting cells, 1511 cells with PCI confusion and 6 cells with PCI collision.
The second step was to linearly interpolate missing values, as the data consisted of time series, followed by deleting any cell data that still had null values. The reason a fraction of the cell data still had null values after interpolating was that those null values occurred in the first daily measurements, making them impossible to interpolate. This step further reduced the data set to 5214 nonconflicting cells, 1176 cells with PCI confusion and 6 cells with PCI collision. No more data was deleted at this point. This big reduction from the initial data set was necessary to test the considered hypotheses in a more confident manner.
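A minimal pandas sketch of this interpolation step, on a toy KPI series, could look as follows:

```python
import numpy as np
import pandas as pd

# Toy daily KPI time series with missing measurements
kpi = pd.Series([np.nan, 0.95, np.nan, 0.80, 0.90])
kpi = kpi.interpolate(method="linear")  # fills the interior gap (index 2)
# The leading null (index 0) cannot be interpolated and remains;
# cells whose series still contain nulls are therefore discarded
drop_cell = kpi.isna().any()
```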
The third step was to replace any existing data artifacts, such as unexpected strings. It was verified
that both DL Avg Cell Throughput Mbps and DL Avg UE Throughput Mbps had a few occurrences of
infinite throughputs. These values of both KPIs were replaced by the maximum KPI value that each
cell had in that same day. No more data artifacts were present in the data. No outlier values were
deleted because, as the data consisted of time series, removing outlier values of time series also meant
removing the respective cell data which was already greatly reduced. Furthermore, since the great
majority of the classification algorithms are decision tree based, the outlier values will not affect their
performance as decision trees are robust to outliers.
The fourth step involved separating LTE cell data into its different frequency bands, namely 800,
1800, 2100 and 2600 MHz. Afterwards, it was decided to only analyse the 800 and 1800 MHz bands
as they represented about 91% of all the cell data. Furthermore, after the data cleaning, the 2100
and 2600 MHz frequency bands had no reported PCI conflicts. This choice of separating the cells
by frequency bands was taken in order to create more specific models as they are used for different
reasons. Low frequency bands, such as 800 MHz, cover bigger areas and are more used by an eNB
when its higher frequency bands already have high amounts of traffic. High frequency bands, such as
1800 MHz, provide higher traffic capacity and are used in more populated environments. The resulting
dataset is represented in Table 4.5. Interestingly, there were no PCI collisions in the 1800, 2100 and 2600 MHz frequency bands in either the raw or the cleaned data sets. This could be because cells that operate in low frequency bands, such as 800 MHz, cover bigger areas than higher frequency ones and are located in low traffic environments, which hinders the detection of collisions by the mobile network operators.

Table 4.5: Resulting dataset composition subsequent to data cleaning.

Cell 800 MHz Band 1800 MHz Band


Nonconflicting 3402 1737
PCI confusion 856 320
PCI collision 6 0

The fifth and last step consisted of splitting the entire data set into training and test sets. It was decided to assign 80% of the total data set to the training set and 20% to the test set. Due to the minimal amount of PCI collisions, it was decided to use 3 cells with PCI collision for both the training and test sets, even if the results would not have any statistical significance.

4.4.2 Classification Based on Peak Traffic Data

This subsection tests the hypothesis of whether PCI conflicts can be detected by analysing only the KPI values at the instant of highest traffic of each individual cell. This hypothesis was first proposed because radio network conflicts are most noticeable through KPI observation in busy traffic periods. Furthermore, analysing only one daily measurement per KPI in each cell considerably reduces the complexity and processing power needed to detect PCI conflicts, as the number of data rows per cell is highly reduced.

Figure 4.4: Absolute Pearson correlation heatmap of peak traffic KPI values and the PCI conflict detec-
tion label.

As the data in this subsection does not consist of time series, each KPI was considered as a feature. Therefore, it would be interesting to explore the relationships between KPIs and observe if there are highly correlated KPIs. Removing highly correlated features can reduce potential overfitting issues. It
was decided to remove features that would cause correlations of absolute values over 0.8. In order to
observe the correlations between KPIs, a Pearson correlation heatmap of absolute values was created
that can be observed in Figure 4.4. After analysing the heatmap, it was clear that the highest correlation
occurs between the UL PUSCH Interference Avg and UL PUCCH Interference Avg KPIs, which was
expected as the average interference power for both PUCCH and PUSCH are rather close, and behave
similarly. As the aforementioned correlation, which was the highest one, was marginally lower than
0.8, all features were kept. It was also interesting to observe that the second highest correlation was
between Average CQI and DL Avg UE Throughput Mbps, which was also expected as high throughputs
are related to higher channel quality. In Figure 4.4 there were also correlation values between each KPI
and the PCI conflict label (named as pciconflict) that identified each cell as either nonconflicting, with
PCI confusions or PCI collisions. Since the performance of classification algorithms is better when the features are highly correlated with the identification label, the three best KPIs in the dataset were Average CQI, DL Avg UE Throughput Mbps and RandomAcc Succ Rate, even though their correlations were very small. The most interesting insight that could be taken from this analysis was that the KPIs related to mobility were not the most correlated with the labelling, but were instead among the least correlated, which was unexpected. However, this could be due to the analysis not taking into account the whole daily KPI measurements. It was also noted that the KPI with the highest total count of null values, ReEst during HO Succ Rate, had the third lowest correlation, which strengthened the option of removing that KPI and repeating the data cleaning process.

After taking conclusions from the correlation heatmap, the next step was to transform the data through standardization, which mainly benefits SVM, allowing it to converge faster and deliver better predictions. Afterwards, 10-fold cross validation was applied on the training set to test several combinations of hyperparameters, selecting those that maximized Precision for each classifier. This process is known as grid search and was applied by resorting to the Scikit-learn library [2]. After a total of 3 hours of grid searching for all classifiers in parallel, the best hyperparameters were obtained.
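A minimal sketch of this grid search, shown here for the SVM classifier with a hypothetical parameter grid (X_train and y_train are placeholders for the standardized training data), could look as follows:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {"C": [1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]}  # example grid
search = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid,
    scoring="precision",  # hyperparameters are chosen to maximize Precision
    cv=10,                # 10-fold cross validation on the training set
    n_jobs=-1,            # evaluate candidate models in parallel
)
# search.fit(X_train, y_train); search.best_params_  # placeholder data
```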

Table 4.6: Average importance given to each KPI by each Decision Tree based classifier.

KPI ERT RF AB GB
Average CQI 0.118 0.120 0.170 0.121
UL PUCCH Interference Avg 0.090 0.100 0.110 0.096
UL PUSCH Interference Avg 0.086 0.094 0.105 0.100
Service Establish 0.098 0.105 0.125 0.115
Service Drop Rate 0.060 0.051 0.025 0.054
DL Avg Cell Throughput Mbps 0.086 0.090 0.095 0.090
DL Avg UE Throughput Mbps 0.112 0.110 0.100 0.105
DL Latency ms 0.080 0.094 0.065 0.101
RandomAcc Succ Rate 0.122 0.116 0.125 0.111
IntraFreq Exec HO Succ Rate 0.080 0.089 0.035 0.076
IntraFreq Prep HO Succ Rate 0.018 0.005 0.035 0.009
ReEst during HO Succ Rate 0.050 0.026 0.010 0.022

Afterwards, each classification algorithm was trained with the obtained hyperparameters on the training sets containing cells of different frequency bands (800 MHz and 1800 MHz). In order to further reduce
the data complexity, it was decided that features with less than 5% importance given by each tree based
classifier should be removed. The average feature importances that each decision tree based classifica-
tion algorithm gave were registered and are represented in Table 4.6. The obtained feature importances
allowed further exploration of the KPI contributions to classification. One of the most interesting insights retrieved from the aforementioned table was that, with the exception of the Service Establish KPI, the three KPIs with the highest importance were the ones with the highest correlation with the PCI conflict label. The high importance of the Service Establish KPI can be explained by the fact that the number of established services measures the amount of traffic impacting the remaining KPIs. The importance given by all classifiers to Mobility KPIs was average for the execution of handovers, but very small for IntraFreq Prep HO Succ Rate, which was below 5%. Additionally, as the latter was also one of the KPIs with the highest null value counts, it was discarded from the data set. As ReEst during HO Succ Rate was assigned the second lowest importance by all classifiers, with less than 5% of given importance, it was also discarded from the data set. Consequently, the data set was reduced from 12 KPIs to 10 KPIs.
The data cleaning was repeated, this time removing all cell data with a total sum of null values higher than 13 (i.e. the upper fence of the two KPIs with the highest null value counts). This new approach resulted in
a data set for the 800 MHz frequency band with 8666 nonconflicting cells, 1551 cells with PCI confusion
and 6 cells with PCI collision. The 1800 MHz frequency band data set changed to a total of 16675
nonconflicting cells, 1294 cells with PCI confusion and zero cells with PCI collision. Each data set was
divided once again, with 80% for training and 20% for testing. Once more, for the 800 MHz frequency
band, it was decided to use 3 cells with PCI collision for both training and test sets.

Table 4.7: Peak traffic PCI Confusion classification results.

800 MHz Band 1800 MHz Band


Model Accuracy Precision Recall Accuracy Precision Recall
ERT 84.94% NaN 00.00% 92.43% NaN 00.00%
RF 84.94% NaN 00.00% 92.43% NaN 00.00%
SVM 84.94% NaN 00.00% 92.43% NaN 00.00%
AB 84.94% NaN 00.00% 92.43% NaN 00.00%
GB 84.01% 29.41% 04.42% 92.13% 03.33% 02.87%

With the new and cleaned data sets, grid search with 10-fold cross validation was repeated once
again for each classifier. After 3 hours, the new hyperparameters that maximized the Precision were
obtained. Afterwards, each classification algorithm was trained on the training data set and was tested
on the test set. As each resulting model outputs probabilities and classifies as class A or class B
through a specified probability threshold, a default threshold of 50% was set. The classification results
for detecting PCI confusions are showcased in Table 4.7. It should be added that when a classifier
did not result in any TP and FP, the Precision is represented as a Not a Number (NaN), as it results
in a division by zero. It was clear that GB was the best performing classifier as it was the only one
that classified data samples with a certainty above 50%, but with low Precision and low Recall for both frequency bands. Nevertheless, the best Precision and Recall were delivered on the 800 MHz frequency band. The remaining models were unable to return any TPs and FPs, which may indicate that the data did not have enough information for this classification task.

Figure 4.5: Smoothed Precision-Recall curves for peak traffic PCI confusion detection.

As Table 4.7 did not present much information concerning the majority of the used classifiers, it was decided to calculate and plot their Precision-Recall curves on the test sets of both frequency bands. The resulting plots were smoothed through a moving average with a window of size 20 and are illustrated
in Figure 4.5. The area under each classifier’s curve is its average Precision. Through a close analysis
of the plot for the 800 MHz frequency band, it was clear that GB was the best performing classifier with
precision peaking at 35% until reaching 25% Recall. Thenceforth, RF and ERT show higher Precision,
with ERT having the highest average Precision of 0.24. Regarding the 1800 MHz frequency band, ERT
was the best performing one with higher average Precision and also with Precision as high as 80% until
reaching 20% Recall. From that point onwards, its performance was approximately tied with RF and
AB. For both cases, SVM was clearly the worst performing classifier with this data. The lower average
Precision comparatively to the 800 MHz frequency band could be due to the 1800 MHz frequency band
having a different cell class balance and also being commonly used over different environments, with
different amounts of traffic that can hinder the classification process.
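A minimal sketch of how such curves are computed from a classifier's positive-class probabilities (toy values shown) could look as follows:

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

y_test = np.array([0, 0, 1, 1, 0, 1])               # toy test labels
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])  # toy positive-class probabilities

precision, recall, thresholds = precision_recall_curve(y_test, scores)
avg_precision = average_precision_score(y_test, scores)  # area under the PR curve
```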
After analysing the classification results, the next step was to evaluate how much time each classifier took to train and to test for each frequency band, which is showcased in Table 4.8. In general, the fastest classifier was SVM, which was nevertheless the worst performing one. Additionally, the classifier with the fastest testing performance was GB, with real-time testing times of 0.1 and 0.2 seconds for test data sets with thousands of data samples. It should also be pointed out that the presented durations are highly influenced by the chosen number of iterations or estimators for each classifier. With these results, the GB classifier could be chosen for the 800 MHz frequency band due to its fast training and testing

Table 4.8: PCI Confusion classification training and testing times in seconds.

800 MHz Band 1800 MHz Band


Model Training time [s] Testing time [s] Training time [s] Testing time [s]
ERT 18.7 5.2 17.8 6
RF 27.6 4.7 59 7.7
SVM 1.5 0.1 8.3 0.4
AB 10.9 0.2 18.6 0.3
GB 11.8 0.1 31 0.2

times as well as good Precision scores for low Recall. For the 1800 MHz, it is harder to choose the best
performing classifier, but the ERT could be chosen as it was the one with highest average Precision.
In order to verify whether enough data was used for the classification task, learning curves were built and plotted for both frequency bands. The learning curves applied 5-fold cross validation for 5 different training set sizes and measured the Precision-Recall area, also known as the average Precision score. The resulting learning curves are illustrated in Figure 4.6. The main insight taken from the learning curves is that the average Precision scores had already approximately stabilized for the two last training set sizes on both frequency bands, indicating that the results would not improve significantly if more data were added. These results, while not practical for mobile network operators, show that it might be possible to classify PCI confusions through KPI analysis.
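A minimal sketch of how such learning curves can be computed with Scikit-learn is given below, using the average Precision score and 5-fold cross validation as described; the data set is an illustrative stand-in.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import learning_curve

# Synthetic stand-in for the labeled cell data.
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0)

# 5-fold cross validation over 5 training set sizes, scored by the
# Precision-Recall area (average Precision), as described in the text.
sizes, train_scores, val_scores = learning_curve(
    GradientBoostingClassifier(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5, scoring='average_precision')
print(sizes, val_scores.mean(axis=1))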

Figure 4.6: Learning curves for peak traffic PCI confusion detection.

Regarding PCI collision classification, as there are only 3 cases of it across the training and test sets, the classification results could not be significant. Nevertheless, grid search with 10-fold cross validation was performed for each classifier and the optimal hyperparameters were obtained after 3 hours. The classifiers were then trained on the training set and tested on the test set. No classifier could classify a sample as a PCI collision with more than 50% certainty, so it was decided not to show the table with the results. The Precision-Recall curves were obtained and plotted; it was chosen not to add the resulting plot to this work because the maximum Precision obtained was 6% at 33% Recall, for SVM, not adding much visually. Furthermore, the average Precision was 1% for ERT and AB. Due to the very small number of PCI collisions in the training set, it was not possible to obtain and plot the learning curves. Even with these non-significant results, the SVM classifier could be the best classification algorithm for PCI collision classification. These results, while not significant due to the very small number of PCI collisions, suggest that it is not possible to classify PCI collisions by only analysing the KPI measurements at daily peak traffic instants.

4.4.3 Classification Based on Feature Extraction

This subsection tests the hypothesis that PCI conflicts can be detected by extracting statistical measurements from each KPI's daily time series and using those measurements as features. This hypothesis was proposed as it is one of the main approaches to classifying time series.
For extracting statistical data from all KPI time series, it was decided to use tsfresh, a popular Python tool for this task [60]. It was first intended to extract those statistical measurements from the busiest 1 hour and 15 minute period of each cell, centered on the daily traffic peak. Unfortunately, the extraction had not finished even after 48 hours of running tsfresh. Thus, it was chosen to run tsfresh on the full daily KPI time series, since it was faster. More specifically, it took 5 hours to extract statistical data from the data relative to the 800 MHz and 1800 MHz frequency bands, in order to detect PCI confusions. It also took 5 hours to extract statistical data from the data relative to the 800 MHz frequency band, to detect PCI collisions. It should be mentioned that tsfresh did not find any statistical feature that was relevant to detect PCI collisions. Thus, all resulting statistical measurements were used as features for PCI collision detection, even though they were not statistically significant. Regarding PCI confusions, 798 and 909 features were extracted for the 800 MHz and 1800 MHz frequency bands, respectively. Concerning PCI collisions, 2200 features were extracted for the 800 MHz frequency band, none of which were selected through hypothesis testing, as mentioned above.
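As an illustration, the following sketch outlines how tsfresh can be applied to per-cell KPI time series; the column names and the toy data frame are assumptions made for the example, not the actual database schema.

import numpy as np
import pandas as pd
from tsfresh import extract_features, select_features
from tsfresh.utilities.dataframe_functions import impute

# Toy stand-in for the real data: one row per (cell, timestamp) with one KPI.
df = pd.DataFrame({
    'cell_id':   np.repeat([0, 1, 2, 3], 96),
    'timestamp': np.tile(np.arange(96), 4),
    'kpi':       np.random.rand(4 * 96),
})
labels = pd.Series([0, 1, 0, 1], index=[0, 1, 2, 3])  # per-cell conflict label

# Extract hundreds of statistical features from each cell's time series.
features = extract_features(df, column_id='cell_id', column_sort='timestamp')
impute(features)  # replace NaN/inf produced by some feature calculators

# Keep only the features deemed statistically relevant by hypothesis testing;
# for PCI collisions this selection returned no features, as noted above.
relevant = select_features(features, labels)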
Due to the high number of resulting features, this new data set brings dimensionality problems. Fortunately, decision tree based classifiers are resistant to these problems. The same cannot be said of SVM, for which the high dimensionality can result in hours of model training and possibly overfitting. Hence, concerning the use of SVM, it was decided to reduce the data's dimensionality through the application of PCA. It was defined that the number of Principal Components (PC) should be chosen so that the respective CPVE reaches 98%. This decision reduces dimensionality while retaining most of the information contained in the data. The data was first standardized, then PCA was applied and each PC's eigenvalue was retrieved. Each eigenvalue was divided by the sum of all eigenvalues, resulting in the cumulative proportion of variance functions. The resulting functions for both the 800 MHz and 1800 MHz frequency bands are illustrated in Figure 4.7. It was inferred that the data relative to the 800 MHz frequency band could be reduced to 273 PCs, and the data relative to the 1800 MHz frequency band to 284 PCs. The number of PCs differed between the two frequency bands as their data had different numbers of features. These PCs were used as new features for the SVM classifier, resulting in a dimensionality reduction of around 30% with only a 2% variance loss.

Figure 4.7: The CPVE for PCI confusion detection.
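The following sketch illustrates this CPVE computation with Scikit-learn; the random matrix stands in for the real tsfresh feature table, and the 98% threshold matches the one defined above.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Random stand-in for the tsfresh feature table (e.g. 798 features per cell).
X = np.random.rand(500, 798)

# Standardize, then obtain each PC's explained variance ratio (eigenvalue
# divided by the sum of all eigenvalues) and accumulate it into the CPVE.
X_std = StandardScaler().fit_transform(X)
pca = PCA().fit(X_std)
cpve = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of PCs whose CPVE reaches the 98% threshold.
n_pcs = int(np.searchsorted(cpve, 0.98) + 1)
X_reduced = PCA(n_components=n_pcs).fit_transform(X_std)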
As soon as the data sets were ready with the new features, grid search with 10-fold cross validation was performed for each classifier. After 11 hours, the new hyperparameters were obtained and were used to train and test each model. The obtained results for detecting PCI confusions are showcased in Table 4.9. With this new approach, two models were able to classify cells with PCI confusion with approximately 50% Precision and 2.5% Recall for the 800 MHz frequency band. The best performing model was obtained by AB; however, no model could classify a cell with PCI confusion for the 1800 MHz frequency band. This last fact could be due to the 1800 MHz frequency band having a different class balance and also due to its frequent use in different radio environments with different amounts of traffic.

Table 4.9: Statistical data based PCI confusion classification results.

                 800 MHz Band                       1800 MHz Band
Model   Accuracy   Precision   Recall    Accuracy   Precision   Recall
ERT     85.24%     NaN         00.00%    93.27%     NaN         00.00%
RF      85.24%     NaN         00.00%    93.27%     NaN         00.00%
SVM     85.24%     NaN         00.00%    93.27%     NaN         00.00%
AB      85.24%     50.00%      02.83%    93.27%     NaN         00.00%
GB      85.18%     46.00%      02.43%    93.27%     NaN         00.00%

As Table 4.9 did not give many insights, it was decided once again to calculate and plot the resulting Precision-Recall curves for each created model. The resulting plots were smoothed through a moving average with a window of size 20 and are illustrated in Figure 4.8. The average Precision for all models and frequency bands was overall slightly better than in the previous hypothesis, especially for the 800 MHz frequency band. Regarding the 800 MHz frequency band, ERT, RF and GB had the best overall performances, which were similar among themselves; however, AB performed better for Recall lower than 3%. Concerning the 1800 MHz frequency band, GB performed the best until reaching 17% Recall, where it started performing similarly to ERT. Once again, SVM was the worst performing classifier.

Figure 4.8: Smoothed Precision-Recall curves for statistical data based PCI confusion detection.

Afterwards, it was decided to evaluate the training and testing times of each classifier that led to the
presented results. All of those times are presented in Table 4.10. ERT and RF presented the lowest
training times for the 800 MHz and 1800 MHz frequency bands, respectively. GB had the lowest testing
times for both frequency bands, and was also the one with the best performance overall for PCI confusion
detection. Once more, the times presented are highly influenced by the chosen number of iterations or
estimators for each classifier.
In order to verify whether or not enough data was used for the classification task, learning curves
were built and plotted for both frequency bands. The resulting learning curves are illustrated in Figure
4.9. The main insight that could be taken from the resulting plot was that the classification performance
stabilized for the 800 MHz frequency band, while the performance was still slightly increasing for the
1800 MHz frequency band. However, the overall performance would not significantly increase with
more data for both frequency bands. These results show an improvement from the previous hypothesis
regarding PCI confusion detection. Furthermore, the results could be improved further if instead of
analysing daily measurements of each cell, periods of 48 or more hours were analysed. This approach
could retrieve more significant statistical features and thus result in higher classification performance.
Regarding PCI collision classification, and similarly to what was done for PCI confusion classification, PCA was applied to the data to be used by SVM, as it consisted of 2200 initial features. The chosen CPVE threshold was once again 98% and the resulting function is illustrated in Figure 4.10. It was found that the data could be reduced to 619 PCs, resulting in a dimensionality reduction of approximately 30% with only a 2% variance loss.

Table 4.10: Statistical data PCI confusion classification training and testing times in seconds.

                 800 MHz Band                        1800 MHz Band
Model   Training time [s]   Testing time [s]   Training time [s]   Testing time [s]
ERT     15.7                1.9                12.9                2.2
RF      51.3                1.6                5.7                 0.1
SVM     820                 7.2                1511                16.9
AB      18.9                0.1                13.1                0.1
GB      26.4                0.1                11.4                0.2

Figure 4.9: Learning curves for statistical data based PCI confusion detection.


Figure 4.10: The CPVE for PCI collision detection.

Afterwards, grid search was performed once again in order to obtain the optimal hyperparameters for training and testing each model, which took another 11 hours. Similarly to the previous subsection, no classifier was able to classify a PCI collision with more than 50% certainty, so it was decided not to show the table with the results. The Precision-Recall curves were obtained and plotted, showing a maximum Precision peak of 23% at 100% Recall for RF, while Precision was approximately zero for the remaining classifiers; the plot is not shown since the sample of PCI collisions in the dataset was not statistically significant. These results show an improvement relative to the last subsection and are of interest to mobile network operators, as PCI collisions are very rare. For instance, an analysis for PCI collisions over 15000 cells could be reduced to 15 cells, where 3 of them represented the entirety of the PCI collisions. With this data and these results, the RF classifier was the best suited for the PCI collision classification task.

4.4.4 Classification Based on Raw Cell Data

This subsection tests the hypothesis that PCI conflicts can be detected by using each KPI measurement, in each day, as an individual feature. This hypothesis was proposed in order to compare a more computationally intensive, but simpler, approach with the previous ones. Consequently, as there are 96 daily measurements per KPI in each cell, and a total of 10 KPIs, there is a total of 96 measurements × 10 KPIs = 960 features.
This approach used the same cells as the previous two subsections. The data was standardized and, in order to minimize noise in each KPI (due to several factors such as radio channel fading and user mobility), it was also smoothed with a simple moving average filter with a window of size 20. The aforementioned window size was chosen as it was the one that yielded the best results.
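A minimal sketch of this standardization and smoothing step, using pandas, is given below; the random series is an illustrative stand-in for a real KPI time series.

import numpy as np
import pandas as pd

# Random stand-in for one cell's 96 daily measurements of a single KPI.
kpi = pd.Series(np.random.rand(96))

# Standardize, then smooth with a simple moving average of window 20 to
# attenuate noise caused by radio channel fading and user mobility.
standardized = (kpi - kpi.mean()) / kpi.std()
smoothed = standardized.rolling(window=20, min_periods=1).mean()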
Similarly to the last subsection, due to the high number of features, it was decided to apply PCA once again in order to reduce the data dimensionality when using SVM. The CPVE threshold was defined as 98%, as previously. After applying PCA to the data, the CPVE function was obtained for the 800 MHz and 1800 MHz frequency bands and is illustrated in Figure 4.11. The number of PCs at the defined threshold was the same for both frequency bands, namely 634, as the CPVE functions were very similar. This step resulted in a 34% dimensionality reduction with a variance loss of 2%.
After reducing the data dimensionality for the SVM classifier, the next step was to apply grid search with 10-fold cross validation on the training set. The goal was to obtain once again the optimal hyperparameters that maximized Precision and Recall for detecting PCI confusions. The hyperparameters were obtained after approximately 13 hours of grid searching. With these hyperparameters, each classification algorithm was trained and tested; the results are presented in Table 4.11. With this approach, the results were notably better, as more classifiers successfully predicted PCI confusions. More specifically, regarding the 800 MHz frequency band, the GB classifier was the best performing one, with 75% Precision and 1.07% Recall. Additionally, GB presented slightly higher Accuracy than the RF, SVM and AB classifiers, which behaved as majority class classifiers. Concerning the 1800 MHz frequency band, the RF classifier was the best performing one in terms of Precision, with a Precision score of 100% and a Recall score of 0.9%. The GB classifier presented slightly different results, with higher Accuracy and Recall, but lower Precision.

Figure 4.11: The CPVE for PCI confusion detection.

Table 4.11: Raw cell data PCI confusion classification results.

                 800 MHz Band                       1800 MHz Band
Model   Accuracy   Precision   Recall    Accuracy   Precision   Recall
ERT     85.37%     22.22%      00.71%    93.57%     100%        00.45%
RF      85.63%     NaN         00.00%    93.60%     100%        00.90%
SVM     85.63%     NaN         00.00%    93.54%     NaN         00.00%
AB      85.63%     NaN         00.00%    93.54%     NaN         00.00%
GB      85.73%     75.00%      01.07%    93.63%     80.00%      01.80%

In order to obtain more insights regarding the performance of each model, it was decided once again to obtain and plot the Precision-Recall curves for each model. The resulting plots were smoothed with a window size of 20 and are illustrated in Figure 4.12. The increase in average Precision was notable for both frequency bands comparatively to the last two subsections. Regarding the 800 MHz frequency band, the best performing classifier was GB, with a smoothed Precision peak of 63% at 3% Recall. Both the GB and ERT classifiers registered a notable performance increase comparatively to the two previous subsections. Concerning the 1800 MHz frequency band, ERT was the best performing one overall, with an average Precision of 26%. Again, SVM had the worst performance, behaving closely to a majority class classifier, and sometimes even worse (see Recall from around 12% onwards).
Afterwards, in order to evaluate the models more deeply, the training and testing times were also registered, and are presented in Table 4.12. GB was the fastest classifier to train and to test on the 800 MHz frequency band data, with a real-time testing time of 0.2 seconds. For the 1800 MHz frequency band, SVM was the fastest to train, but the worst performing, while the fastest testing classifier was, once again, GB. With these results, one can say that GB was the best classification algorithm to apply to the 800 MHz frequency band data. The ERT classifier was the most suited for the 1800 MHz frequency band data, as its training and testing times were not much higher than those of SVM or GB, and also due to its best-in-band classification performance.

Figure 4.12: Smoothed Precision-Recall curves for raw cell data based PCI confusion detection.

Table 4.12: Raw cell data PCI confusion classification training and testing times in seconds.

                 800 MHz Band                        1800 MHz Band
Model   Training time [s]   Testing time [s]   Training time [s]   Testing time [s]
ERT     18.7                1.5                40.3                1.4
RF      61                  2.3                133                 1.5
SVM     74                  2.5                40.1                0.6
AB      503                 0.5                1286                1.6
GB      14.3                0.2                136                 0.2

The next step was to investigate whether or not the classification results could be improved if more
data was added. In order to do so, the learning curves for all models and frequency bands were obtained,
plotted and illustrated in Figure 4.13. It can easily be observed that the average Precision scores for both
frequency bands did not stabilize. More specifically, the 1800 MHz frequency band data registered a
higher increase of average Precision comparatively to the 800 MHz frequency band data. The obtained
insight is that the results could be further improved with more data.
Regarding PCI collision classification, the SVM classifier used 634 PCs as features, because the 800 MHz frequency band data set was the same as for PCI confusion detection. Grid search was performed, taking 13 hours once again, and the resulting optimal hyperparameters were used to train and test each classifier. Similarly to the last two subsections, no classifier was able to predict a PCI collision over the 50% probability threshold. This fact could be due to the class imbalance, as only 3 data samples in the training and test sets consist of PCI collisions.

Figure 4.13: Learning curves for raw cell data PCI confusion detection.

Figure 4.14: Precision-Recall curves for raw cell data PCI collision detection.

The Precision-Recall curves were obtained and plotted, and are shown in Figure 4.14. The approach allowed the AB classifier to correctly classify one PCI collision out of three with no FPs. However, for 100% Recall, the resulting models only achieved around 6% average Precision. With these results, the AB classifier was the best classifier for PCI collision classification.
With the obtained results for all hypotheses, it is possible to assert that the hypothesis proposed in this subsection was not only the simplest one, but also the one that led to the best results for both PCI confusions and collisions. This assertion was based on the fact that, even with low Recall, it was able to identify PCI conflicts with Precision scores near 100%. The obtained models' training was done in near real time, in the order of minutes, and the predictions were made in real time, in less than a second.

4.5 Preliminary Conclusions

The goal of this chapter was to study how the PCI is used in LTE networks in order to develop a super-
vised methodology to detect PCI conflicts with near real time performance.
In Section 4.2 the chosen KPIs were presented by stating their meaning and how they were relevant
for PCI conflict detection. A brief daily time series analysis of KPI values was presented as well, which
allowed for a better comprehension of their daily behaviours.
In Section 4.3 the network vendor PCI Conflict Detection feature was used to label each cell as either conflicting or nonconflicting. The Python tsfresh library was used to extract significant statistical measurements from the busiest 1 hour and 15 minutes of each individual cell, which were used as features for an SVM classifier. The obtained model presented a Precision of 63.95% for a Recall of 1.13%. The low model performance could be attributed to the vendor PCI Conflict Detection feature, the selected KPIs, or the chosen time periods.
In Section 4.4 a new cell labeling approach was applied by using global cell neighbor configurations to
detect PCI conflicts. This new labeling approach delivered a better labeling control, allowing a distinction
between PCI confusions and collisions, and proved that the network vendor PCI Detection feature was
not consistent with the newly obtained labels. The data was further analysed which led to the removal
of two KPIs that had high null count averages and low feature importances.
The three presented hypotheses were tested using five different classification algorithms, namely SVM, AB, GB, ERT and RF. All the results from all hypotheses delivered near real time performance, with training and testing times rarely going beyond 150 and 10 seconds, respectively. The hypothesis that led to the best results was using each KPI measurement in each day as an individual feature. Regarding the 800 MHz frequency band, the best model was obtained by GB, which led to an average Precision of 31% with a Precision peak of 60% for 3% Recall. Regarding the 1800 MHz frequency band, the best model was obtained by ERT, which delivered an average Precision of 26% with a Precision peak of 80% for 1% Recall. Additionally, the obtained learning curves showed that the results would significantly improve if more data were added to create the models. However, no more data was obtained, since obtaining one more day of data required waiting an additional week and, due to new security policies, it became more difficult to get access to new data.
The third hypothesis having delivered better results than the second one, which extracted statistical calculations from the KPIs, could be due to the latter losing information by compressing full daily periods into statistical measurements. More clearly, network problems are better revealed by KPIs at peak traffic instants, and that information was lost when it was compressed into statistics of full daily KPI series. As the third hypothesis used all the information in its raw form, the models were able to perform more effective classifications.

Chapter 5

Root Sequence Index Collision Detection

5.1 Introduction

This chapter introduces the RSI collision problem that can occur in LTE radio networks; furthermore, the steps taken towards achieving the best approach to detect RSI collisions, by using ML models to analyse daily KPI measurements, are described.
Whenever a UE is turned on, it starts scanning the radio network for frequencies corresponding to the respective network operator. After the UE has synchronized to a frequency, it checks whether it is connected to the right PLMN by reading the Master Information Block (MIB) as well as the System Information Blocks (SIB) 1 and 2. Namely, SIB 2 contains the RSI, which indicates the index of the logical root sequence used to derive the PRACH preamble sequences for the random access procedure. The random access procedure is used for service connection establishment and re-establishment, intra-system handovers and UE synchronization for uplink and downlink data transfers. The LTE random access procedure can be performed in two different ways: non-contention based or contention based access. PRACH preambles aim to differentiate requests coming from different UEs through the PRACH. Furthermore, each LTE cell uses 64 preambles, of which 24 are reserved to be chosen by the eNB for non-contention based access and the remaining 40 are randomly selected by the UEs for contention based access.
Whenever two or more neighbor cells operate in the same frequency band and have the same RSI, there is a higher occurrence of preamble collisions amongst the requests coming from different UEs. This problem is called an RSI collision, and it can lead to an increase in failed service establishments and re-establishments, as well as an increase in failed handovers.
In LTE there are 838 root sequences available for preambles, each with a length of 839 symbols. A UE can generate several preambles from one root sequence and a cyclic shift; the smaller the cyclic shift, the more preambles can be generated from a single root sequence. Knowing that the total number of PRACH preambles available in each LTE cell is 64, the number of root sequences needed to generate the 64 preambles in a given cell is:

\[ N_{rows} = \left\lceil \frac{64}{\left\lfloor \text{sequence length} \, / \, \text{cyclic shift} \right\rfloor} \right\rceil . \tag{5.1} \]

For instance, with an RSI of 200 and a cyclic shift of 110, the required number of root sequences to generate the 64 preambles is:

\[ N_{rows} = \left\lceil \frac{64}{\left\lfloor 839/110 \right\rfloor} \right\rceil = \left\lceil \frac{64}{7} \right\rceil = 10 . \tag{5.2} \]

Thus, for a correct RSI plan, if cell A has an RSI of 200, then the neighbor cells B and C must have RSIs of 210 and 220, respectively. This prevents neighbor cells from using the same preambles and, thus, avoids RSI collisions.
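As an illustration, the following Python sketch computes the RSI spacing implied by Equations (5.1) and (5.2); the function name is illustrative.

import math

# Illustrative helper implementing Equation (5.1): how many root sequences
# a cell consumes to generate its 64 PRACH preambles.
def rsi_spacing(sequence_length=839, cyclic_shift=110):
    preambles_per_sequence = sequence_length // cyclic_shift  # floor division
    return math.ceil(64 / preambles_per_sequence)

# With a cyclic shift of 110 each cell consumes 10 root sequences, so neighbor
# RSIs must be spaced at least 10 apart, e.g. 200, 210 and 220.
spacing = rsi_spacing()                       # -> 10
neighbor_rsis = [200, 200 + spacing, 200 + 2 * spacing]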
This chapter is organised in four sections. After the introduction, Section 5.2 presents the chosen KPIs that are relevant for the RSI collision classification task. Section 5.3 proposes an RSI collision classification task based on configured global cell relations and tests the following three hypotheses:

1. RSI collisions are best detected by using KPI measurements in the daily peak traffic instant of each cell.

2. RSI collisions are best detected by extracting statistical calculations from each KPI daily time series and using them as features.

3. RSI collisions are best detected by using each cell's KPI measurements in each day as individual features.

Lastly, Section 5.4 presents the preliminary conclusions of the work done in this chapter. The overall RSI collision detection procedure can be observed in Figure A.1.

5.2 Key Performance Indicator Selection

In accordance with the theory behind LTE and how RSIs are used, a new list containing the most relevant KPIs for RSI collision detection was obtained. These KPIs are represented in Tables 5.1 and 5.2. A brief time series analysis of these KPIs, regarding 23500 cells over a single day, is also represented in Figure 5.1.

Table 5.1: Chosen Accessibility and Mobility KPIs.

Accessibility            Mobility
RandomAcc Succ Rate      IntraFreq Exec HO Succ Rate
                         IntraFreq Prep HO Succ Rate

Regarding Accessibility, RandomAcc Succ Rate refers to the success rate of random access procedures made through the PRACH. This KPI is expected to be the most relevant one for detecting RSI collisions, as collisions strongly decrease the success rate of random access procedures.
In Mobility, IntraFreq Prep HO Succ Rate measures the success rate of handover preparation between cells in the same frequency band, and IntraFreq Exec HO Succ Rate refers to the success rate of executed handovers between cells in the same frequency band. These KPIs are relevant for detecting RSI collisions, as handovers require performing random access procedures and may use contention based access. With contention based access, there is a higher occurrence of two or more UEs simultaneously sending the same PRACH preamble when there is an RSI collision, resulting in more frequent handover failures.

Table 5.2: Chosen Quality and Retainability KPIs.

Quality                        Retainability
UL PUCCH Interference Avg      Service Establish
UL PUSCH Interference Avg      Service ReEst Succ Rate

In Quality, UL PUCCH Interference Avg and UL PUSCH Interference Avg measure the average noise and interference power on the PUCCH and the PUSCH, respectively. These KPIs are relevant for RSI collision detection because cells with RSI collisions operate in the same frequency band and might be located in high density traffic areas, thus experiencing increased interference.
Regarding Retainability, Service Establish measures the total number of established services during a period and was chosen to differentiate cells with different amounts of traffic. Service ReEst Succ Rate refers to the success rate of service re-establishments in a given cell. This KPI is relevant to detect RSI collisions because, when a UE suffers a service drop, it performs a service re-establishment request through a random access procedure. If there is an RSI collision, there will be more failed service re-establishments due to failed random access procedures.
Figure 5.1 represents the distribution of each KPI's values for 23500 LTE cells over a single day. The IQR represents the range where 50% of the data is distributed and can be obtained through IQR = Q3 − Q1, where Q3 and Q1 refer to the third and first quartiles of each KPI measure across a single day. The Upper and Lower Fences correspond to Q3 + 1.5 × IQR and Q1 − 1.5 × IQR, respectively, and are used to check for outliers, which are all the values outside the fences. The represented outlier KPI values refer to the minimum and/or maximum KPI values registered over all cells that fall outside the Upper and Lower Fences. As expected, Service Establish reflects the regular traffic of mobile networks across a single working day, with its maximum peaks at lunch and late afternoon periods and minimum traffic during the night. The traffic has a visible effect on the remaining KPIs, such as high traffic leading to lower service re-establishment and random access procedure success rates. It can easily be observed that, with the exception of the Service ReEst Succ Rate and RandomAcc Succ Rate KPIs, the outlier KPI values can go well beyond the Upper and Lower Fences. It can also be noted that the IntraFreq Prep HO Succ Rate and IntraFreq Exec HO Succ Rate KPIs have median values very close to 1, but with outlier values that can go as low as 0, which can be explained by cell malfunctioning problems. These observations reveal the highly variable nature of KPI values in mobile networks across whole countries.
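The following pandas sketch illustrates how the fences and outliers can be computed for one KPI; the random series stands in for the real measurements.

import numpy as np
import pandas as pd

# Random stand-in for one KPI measured across many cells at one time of day.
values = pd.Series(np.random.randn(23500))

q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1

# Values beyond the fences are treated as outliers, as in Figure 5.1.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]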


Figure 5.1: Time series analysis of KPI values regarding 23500 LTE cells over a single day.

5.3 Global Cell Neighbor Relations Based Detection

A SQL script was made to detect RSI collisions, by checking for configured neighbor cells with equal RSIs and equal frequency bands. Similarly to what was done in the previous chapter, it was decided to use a total of five different classification algorithms that were introduced in Section 3.9, namely ERT, RF, SVM, AB and GB; these classifiers were used from the Scikit-learn library [2]. By registering the results from these five classification algorithms, it is possible to choose the best RSI collision classification algorithm for each frequency band.
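While the detection itself was implemented in SQL, the following pandas sketch mirrors that logic under an assumed schema (the cells and relations tables, and their column names, are illustrative).

import pandas as pd

# Assumed (illustrative) schema: a cell table with each cell's RSI and band,
# plus one row per configured neighbor relation.
cells = pd.DataFrame({'cell_id': ['A', 'B', 'C'],
                      'rsi':     [200, 200, 210],
                      'band':    [800, 800, 800]})
relations = pd.DataFrame({'cell_id': ['A', 'A'], 'neighbor_id': ['B', 'C']})

# Join each relation with both endpoints' configuration and flag neighbor
# pairs sharing the same RSI and the same frequency band.
pairs = (relations
         .merge(cells, on='cell_id')
         .merge(cells, left_on='neighbor_id', right_on='cell_id',
                suffixes=('', '_nb')))
collisions = pairs[(pairs['rsi'] == pairs['rsi_nb']) &
                   (pairs['band'] == pairs['band_nb'])]
print(collisions[['cell_id', 'neighbor_id']])  # the A-B relation collides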
The general procedure to test the hypotheses in this section consisted in the following:

1. visualising important aspects if applicable – observe, for instance, the distribution of daily KPI
null values of all cells to gain insight in order to proceed to data cleaning;

2. cleaning the data with Python Data Analysis Library – observe the occurrence of null values
of each KPI and of data artifacts, such as infinite values and numeric strings, and either correct or
discard cells with those values;

3. hyperparameter tuning – search for the optimal hyperparameters for each classification algorithm
for each frequency band and type of PCI conflict through tools from the Scikit-learn library;

4. evaluating obtained models – testing each classification algorithm on test data and registering
the obtained results.

5.3.1 Data Cleaning Considerations

The process of data cleaning consisted in the following five steps:

1. data visualization – observe the daily distribution of KPI null values of all cells and discard cells
with outlier total KPI null values;

2. data imputation – linearly interpolate missing values in each KPI of each cell;

3. data artifact correction – check and correct any data artifacts, such as strings in a continuous
variable;

4. data separation – separate the data into groups of cells with the same frequency bands;

5. dataset split – split each resulting data set into training and test sets to be used by the classifica-
tion algorithms.

Each cell's data in a given day was considered independent from the same cell's data in other days, as the data used consisted of three days from different weeks. The initial set of raw data consisted of a total of 26596 nonconflicting cells and 14527 cells with RSI collisions. As the data consisted of time series measured by several sensors, there was a high chance that it contained null values and other artifacts, such as string values (e.g. errors, infinite values). Furthermore, in order to successfully perform classification, each time series must have few or even zero null values and no data artifacts, which requires data cleaning. The Python Data Analysis Library, known as pandas, was used for this task [63].
The first step was to check for the distribution of null values for each KPI in all cells in each day.
It was found that only three KPIs had a third quartile of null value counts higher than zero, which are
illustrated by boxplots with the null count distribution in background in Figure 5.2. It was notable that
Service ReEst Succ Rate had high occurrences of high null value counts compared to the remaining
KPIs, with a median count of 30 null values per cell. The remaining two KPIs were not as degraded as
the aforementioned one, with a median of zero and a third quartile of 5 null value counts per cell. Either
one of two things could have been done:

• remove the Service ReEst Succ Rate KPI and delete all data with cells with the sum of null values
higher than 13, which was the upper outer fence of the remaining two highest null count KPIs, thus
only eliminating outliers and keeping most of the data;

• keep the Service ReEst Succ Rate KPI and delete all data with more total null counts than its first
quartile (i.e. 13 null value counts).

Figure 5.2: Boxplots of total null value count for each cell per day for two KPIs.

It was clear that the best option was the first one. For the same threshold of allowed null values,
the first option removed a small portion of the data, while the second removed 75% of the data. Fur-
thermore, as seen in the previous chapter, the degradation of the Service ReEst Succ Rate was very
similar to the ReEst during HO Succ Rate which was removed from the data. After removing the Ser-
vice ReEst Succ Rate KPI and deleting all data with the sum of null values higher than 13, the data was
reduced to 17940 nonconflicting cells, 11131 cells with RSI collision.

The second step was to linearly interpolate missing values, as the data consisted of time series, followed by deleting any cell data that still had null values. The reason a fraction of the cell data still had null values after interpolating was that those null values were in the first daily measurements, making it impossible to interpolate them. This step further reduced the data set to 17906 nonconflicting cells and 11105 cells with RSI collisions. No more data was deleted at this point. This big reduction from the initial data set was necessary to test the following hypotheses in a more confident manner.
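A minimal pandas sketch of this interpolation step is given below; it also illustrates why cells with leading null values had to be discarded afterwards.

import numpy as np
import pandas as pd

# One cell's daily KPI series with missing measurements (NaN).
kpi = pd.Series([np.nan, 0.91, np.nan, 0.95, 0.97])

# Linear interpolation fills internal gaps; a leading NaN has no left
# neighbor and survives, which is why such cells were dropped afterwards.
filled = kpi.interpolate(method='linear')
print(filled.isna().any())  # True here, because of the leading NaN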

No data artifacts were present in the data and no outlier values were deleted because, as the data
consisted of time series, removing outlier values of time series also meant removing the respective
cell data which was already greatly reduced. Furthermore, as the great majority of the classification
algorithms are decision tree based, the outlier values will not affect their performance as decision trees
are robust to outliers.

The fourth step involved separating LTE cell data into its different frequency bands, namely 800,
1800, 2100 and 2600 MHz. Afterwards, it was decided to only analyse the 800 and 1800 MHz bands
as they represented about 91% of all the cell data. Furthermore, after the data cleaning, the 2100 and
2600 MHz frequency bands had only a total of 38 reported RSI collisions. This choice of separating the
cells by frequency bands was taken in order to create more specific models as they are used for different
reasons. Low frequency bands, such as 800 MHz, cover bigger areas and are more used by an eNB
when its higher frequency bands already have high amounts of traffic. High frequency bands, such as
1800 MHz, provide higher traffic capacity and are used in more populated environments. The 800 MHz frequency band data set thus consisted of 6302 nonconflicting cells and 4230 cells with RSI collisions. The 1800 MHz frequency band data set consisted of 10866 nonconflicting cells and 6837 cells with RSI collisions.
The fifth and last step consisted of splitting the entire data set into the training and test sets. It was
decided to assign 80% of the total data set to the training set and 20% of the total data set to the test
set.

5.3.2 Peak Traffic Data Based Classification

This subsection tests the hypothesis that RSI collisions can be detected by only analysing KPI values at the instant of highest traffic of each individual cell. Analysing only one daily measurement per KPI in each cell considerably reduces the complexity and processing power needed to detect RSI collisions, as the number of data rows per cell is greatly reduced.

Figure 5.3: Absolute Pearson correlation heatmap of peak traffic KPI values and the RSI collision detec-
tion label.

Similarly to PCI conflict detection, each KPI was considered as a feature in order to explore the relationships between KPIs. It was decided to remove features that would cause absolute correlations above 0.8. In order to observe the correlations between KPIs, a Pearson correlation heatmap of absolute values was created, which can be observed in Figure 5.3. After analysing the heatmap, it was clear once again that the highest correlation occurred between the UL PUSCH Interference Avg and UL PUCCH Interference Avg KPIs, which was expected, as explained in the last chapter. As this correlation, which was the highest one, was marginally lower than 0.8, all features were kept. Figure 5.3 also shows the correlation values between each KPI and the RSI collision label (named collision), which identifies each cell as either nonconflicting or with RSI collisions. Knowing that the performance of classification algorithms is stronger with variables that are highly correlated with the identification label, the three best KPIs would be RandomAcc Succ Rate, UL PUCCH Interference Avg and UL PUSCH Interference Avg, even if their correlations were very small. The most interesting insight taken from this analysis was that the mobility related KPIs had the lowest correlations with the label compared to the remaining KPIs. This could be due to the random access procedure being non-contention based for handovers, where the eNB chooses a reserved preamble for the UE to use.
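The following sketch illustrates how such a heatmap can be produced; the use of seaborn and the random stand-in data are assumptions made for the example.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Random stand-in for the peak-traffic KPI table plus the collision label.
columns = ['RandomAcc_Succ_Rate', 'UL_PUCCH_Interference_Avg',
           'UL_PUSCH_Interference_Avg', 'Service_Establish',
           'IntraFreq_Exec_HO_Succ_Rate', 'IntraFreq_Prep_HO_Succ_Rate',
           'collision']
data = pd.DataFrame(np.random.rand(100, len(columns)), columns=columns)

# Absolute Pearson correlations between every pair of features and the label.
corr = data.corr(method='pearson').abs()
sns.heatmap(corr, annot=True, cmap='Blues')
plt.show()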
After taking conclusions from the correlation heatmap, the next step was to transform the data through standardization, which mainly benefits SVM by making it converge faster and deliver better predictions. Afterwards, grid search was applied to obtain the optimal hyperparameters to create the models. After a total of 3 hours of grid searching for all classifiers in parallel, the best hyperparameters were obtained.
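A minimal sketch of this standardization and grid search step with Scikit-learn is given below; the grid values are illustrative, not the actual search space used.

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the six peak-traffic KPI features.
X, y = make_classification(n_samples=1000, n_features=6, random_state=0)

# Standardization is chained with the classifier so the grid search applies
# the same preprocessing as the final model; the grid values are illustrative.
pipe = Pipeline([('scale', StandardScaler()), ('svm', SVC(probability=True))])
grid = GridSearchCV(pipe,
                    param_grid={'svm__C': [0.1, 1, 10],
                                'svm__gamma': ['scale', 0.01, 0.1]},
                    scoring='precision', cv=10, n_jobs=-1)
grid.fit(X, y)
print(grid.best_params_)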

Table 5.3: Average importance given to each KPI by each Decision Tree based classifier.

KPI                            ERT     RF      AB      GB
RandomAcc Succ Rate            0.352   0.230   0.350   0.215
UL PUCCH Interference Avg      0.202   0.213   0.130   0.158
UL PUSCH Interference Avg      0.150   0.176   0.160   0.159
Service Establish              0.144   0.141   0.050   0.182
IntraFreq Exec HO Succ Rate    0.115   0.178   0.310   0.175
IntraFreq Prep HO Succ Rate    0.037   0.062   0       0.111

Afterwards, each classification algorithm was trained with the obtained hyperparameters on both frequency bands' training sets. In order to further reduce the data complexity, it was decided that features given less than 5% importance by the tree based classifiers would be removed. The average feature importances given by each decision tree based classification algorithm were registered and are represented in Table 5.3. The obtained feature importances allowed further exploration of each KPI's contribution to the classification. One of the most interesting insights retrieved from the table was that one of the mobility KPIs, namely IntraFreq Exec HO Succ Rate, was the second most important feature. This was the opposite of what was expected from the correlation matrix analysis, implying that some cells do perform contention based access for handovers. As expected, RandomAcc Succ Rate was the most important feature for all obtained models. However, the overall importance given to IntraFreq Prep HO Succ Rate was very small, with the KPI not being used at all by the AB model. As the mean of its importances was slightly higher than 5%, this KPI was not dropped from the dataset. Thus, no KPIs were dropped in this step.
As the dataset was not changed, the trained models were tested on the test set. As each resulting model outputs probabilities and classifies samples as class A or class B through a specified probability threshold, a default threshold of 50% was chosen again. The classification results for detecting RSI collisions are showcased in Table 5.4. At first glance, no major conclusions could be taken from the results, as the highest metrics were almost evenly distributed across the models. However, the highest Precisions were obtained by the ERT and RF models, at the cost of delivering the lowest Recall scores. This may have indicated that the data did not have enough information for this classification task.

Table 5.4: Peak traffic RSI collision classification results.

                 800 MHz Band                       1800 MHz Band
Model   Accuracy   Precision   Recall    Accuracy   Precision   Recall
ERT     62.04%     73.91%      02.79%    61.35%     75.00%      00.27%
RF      61.66%     58.06%      02.95%    61.63%     81.25%      01.19%
SVM     62.42%     54.75%      16.07%    62.19%     57.22%      09.41%
AB      62.67%     54.79%      19.67%    61.95%     66.67%      03.47%
GB      62.48%     52.22%      34.75%    62.09%     57.42%      08.14%

[Figure 5.4 average Precision per model: 800 MHz band ERT 0.50, RF 0.50, SVM 0.48, AB 0.53, GB 0.50; 1800 MHz band ERT 0.49, RF 0.51, SVM 0.48, AB 0.51, GB 0.51.]

Figure 5.4: Smoothed Precision-Recall curves for peak traffic RSI collision detection.

As Table 5.4 did not present much information concerning the majority of the used classifiers, it was decided to calculate and plot their Precision-Recall curves on both frequency bands' test sets. The obtained Precision-Recall curves are illustrated in Figure 5.4; the area under each classifier's curve is its average Precision. Regarding the 800 MHz frequency band, it is clear that the AB curve has a strange behaviour, which was due to the AB model assigning several cells the same probability values. For both frequency bands, there was no clear best performing model, as all behaved similarly. Additionally, for both cases, SVM was clearly the worst performing classifier with this data.
After analysing the classification results, the next step was to evaluate how much time each classifier took to train and to test for each frequency band; these times are showcased in Table 5.5. The fastest obtained model was GB, with training and testing times going as low as 0.1 seconds; furthermore, it was one of the best performing models. Again, it should be pointed out that the presented times are highly influenced by the chosen number of iterations or estimators for each classifier. With these results, ERT could be chosen for the 800 MHz frequency band due to its fast training and testing times, as well as its high Precision peak of around 75% for low Recall. For the 1800 MHz band, it is harder to choose the best performing classifier as the results were very similar; however, GB could be chosen as it was the fastest performing model.

Table 5.5: RSI collision classification training and testing times in seconds.

                 800 MHz Band                        1800 MHz Band
Model   Training time [s]   Testing time [s]   Training time [s]   Testing time [s]
ERT     0.4                 0.1                0.4                 0.1
RF      1.8                 0.5                3.7                 0.9
SVM     5.7                 0.1                34.4                0.8
AB      0.7                 0.1                0.5                 0.1
GB      0.7                 0.1                0.1                 0.1


Figure 5.5: Learning curves for peak traffic RSI collision detection.

In order to verify whether or not enough data was used for the classification task, the learning curves
for both frequency bands were obtained and plotted. The resulting learning curves are illustrated in
Figure 5.5. The main insight that could be taken from the learning curves is that the average Precision
scores were already approximately stabilized for the two last training set sizes for both frequency bands.
Thus, results would not be significantly better if more data was added. These results, while not practical
for mobile network operators, show that it is possible to classify RSI collisions through KPI analysis.

5.3.3 Feature Extraction Based Classification

This subsection tests the hypothesis that RSI collisions can be detected by extracting statistical measurements from each KPI's daily time series and using those measurements as features.
In order to extract statistical data from all KPI time series, it was decided to apply tsfresh once again [60]. Similarly to PCI confusion detection, it was chosen to run tsfresh on the full daily KPI time series as it ran faster. More specifically, it took 5 hours to extract statistical data from the data relative to the 800 MHz and 1800 MHz frequency bands in order to detect RSI collisions; 732 and 951 features were extracted for the 800 MHz and 1800 MHz frequency bands, respectively.


Figure 5.6: The CPVE for RSI collision detection.

Due to the high number of resulting features, this new data set brings dimensionality problems. Thus, the data was first standardized and then PCA was applied. The resulting CPVE functions for both the 800 MHz and 1800 MHz frequency bands are illustrated in Figure 5.6. It was inferred that the data could be reduced to 273 and 284 PCs for the 800 MHz and 1800 MHz frequency bands, respectively. The number of PCs differed between the two frequency bands as their data had different numbers of features. This operation led to a dimensionality reduction of around 35% with only a 2% variance loss.

Table 5.6: Statistical data based RSI collision classification results.

                 800 MHz Band                       1800 MHz Band
Model   Accuracy   Precision   Recall    Accuracy   Precision   Recall
ERT     60.32%     100%        00.48%    62.27%     72.97%      02.00%
RF      64.93%     61.30%      32.62%    64.13%     66.94%      12.12%
SVM     60.94%     54.80%      11.55%    61.79%     NaN         00.00%
AB      64.02%     56.79%      40.83%    66.37%     59.88%      36.29%
GB      66.87%     61.60%      44.88%    69.39%     63.97%      45.53%

As soon as the data sets were ready with the new features, grid search with 10-fold cross validation was performed for each classifier. After 11 hours, the new hyperparameters were obtained and were used to train and test each model. The obtained results for detecting RSI collisions are showcased in Table 5.6. The ERT model delivered the highest Precision for both frequency bands; however, GB had the highest Accuracy and Recall overall.

[Figure 5.7 average Precision per model: 800 MHz band ERT 0.54, RF 0.58, SVM 0.48, AB 0.60, GB 0.61; 1800 MHz band ERT 0.54, RF 0.55, SVM 0.43, AB 0.61, GB 0.61.]

Figure 5.7: Smoothed Precision-Recall curves for statistical data based RSI collision detection.

As Table 5.6 did not deliver many insights, it was decided once again to calculate and plot the resulting Precision-Recall curves for each created model. The resulting plots are illustrated in Figure 5.7. The GB model performed the best for both frequency bands, with a Precision peak of around 85% and an average Precision of 61%. The abnormal curve behaviour of the AB model was due to it assigning several cells the same probability values. Once again, SVM was the worst performing classifier.

Table 5.7: RSI collision classification training and testing times in seconds.

                 800 MHz Band                        1800 MHz Band
Model   Training time [s]   Testing time [s]   Training time [s]   Testing time [s]
ERT     3                   0.9                9                   3.1
RF      13                  3.7                8                   1.5
SVM     119                 2.9                87                  2.4
AB      14.2                0.1                22.9                0.1
GB      28.4                1                  246                 0.5

The training and testing times of each created model were also collected and are presented in Table 5.7. The GB model showed testing times lower than one second, but had one of the highest training times, reaching 28.4 and 246 seconds of training time for the 800 MHz and 1800 MHz frequency bands, respectively. Nonetheless, the GB model presented superior performance relative to the other obtained models, with near real time performance, thus being the best model overall.


Figure 5.8: Learning curves for statistical data based RSI collision detection.

In order to verify whether or not enough data was used for the classification task, learning curves
were built and plotted for both frequency bands. The resulting learning curves are illustrated in Figure
5.8. The obtained learning curves showed that the collected results would not significantly increase if
more data was added to the dataset.

5.3.4 Raw Cell Data Based Classification

This subsection tests the hypothesis that RSI collisions can be detected by using each KPI measurement in each day as an individual feature. As there are 96 daily measurements per KPI in each cell, and a total of 6 KPIs, there is a total of 96 measurements × 6 KPIs = 576 features.
The data was standardized and, in order to minimize noise in each KPI, it was also smoothed with a simple moving average with a window of size 20. The aforementioned window size was chosen as it was the one that yielded the best results.
Similarly to the last subsection, due to the high number of features, PCA was applied once again in order to reduce the data dimensionality for SVM. The CPVE threshold was defined as 98%, as previously. After applying PCA to the data, the CPVE function was obtained for the 800 MHz and 1800 MHz frequency bands and is illustrated in Figure 5.9. The number of PCs at the defined threshold was the same for both frequency bands, namely 332, as the CPVE functions were very similar. This step resulted in a 58% dimensionality reduction with a variance loss of 2%.


Figure 5.9: The CPVE for RSI collision detection.

Table 5.8: Raw cell data RSI collision classification results.

                 800 MHz Band                       1800 MHz Band
Model   Accuracy   Precision   Recall    Accuracy   Precision   Recall
ERT     59.49%     50.00%      00.83%    59.83%     75.00%      00.22%
RF      61.70%     62.64%      13.52%    65.55%     63.86%      33.07%
SVM     60.07%     52.24%      16.61%    59.25%     46.67%      09.14%
AB      64.73%     60.38%      37.60%    64.99%     59.59%      40.32%
GB      66.41%     60.84%      47.92%    66.22%     62.72%      39.52%

After reducing the data dimensionality for the SVM classifier, the next step was to apply grid search with 10-fold cross validation on the training set. The goal was to obtain once again the optimal hyperparameters that maximized Precision first and Recall second for detecting RSI collisions. The hyperparameters were obtained after approximately 10 hours of grid searching. With these hyperparameters, each classification algorithm was trained and tested, and the results are showcased in Table 5.8. Once again, the GB model had the highest Accuracy for both frequency bands. The RF and ERT models had the highest Precision for the 800 MHz and 1800 MHz frequency bands, respectively.
In order to obtain more insights regarding the performance of each model, the Precision-Recall curves were obtained and plotted. The resulting plots are illustrated in Figure 5.10. The GB model presented the highest average Precision, while the RF and ERT models showed slightly lower average Precisions.
Afterwards, in order to evaluate the models further, the training and testing times were also registered and are showcased in Table 5.9. The GB model showed testing times lower than one second and the third highest training times for both frequency bands; more specifically, it took 12.8 and 24.4 seconds to train on the 800 MHz and 1800 MHz frequency bands, respectively. However, the GB model performed in near real time and was thus the best performing model overall.
[Figure 5.10 average Precision per model: 800 MHz band ERT 0.54, RF 0.56, SVM 0.45, AB 0.60, GB 0.61; 1800 MHz band ERT 0.54, RF 0.58, SVM 0.44, AB 0.56, GB 0.60.]

Figure 5.10: Smoothed Precision-Recall curves for raw cell data RSI collision detection.

Table 5.9: RSI collision classification training and testing times in seconds.

                 800 MHz Band                        1800 MHz Band
Model   Training time [s]   Testing time [s]   Training time [s]   Testing time [s]
ERT     3.1                 0.9                0.4                 0.1
RF      1.9                 0.5                0.6                 0.2
SVM     189                 5.2                395                 17.9
AB      17.9                0.1                54.1                0.1
GB      12.8                0.1                24.4                0.2

The final step was to investigate whether or not the classification results could be improved if more data was added. Thus, the obtained learning curves are illustrated in Figure 5.11. The learning curves showed that the results would improve, especially for the GB model. The learning curves relative to SVM showed a big downward trend, which means the model was overfitting the data, while the remaining models were not.

Figure 5.11: Learning curves for raw cell data RSI collision detection.

5.4 Preliminary Conclusions

The goal of this chapter was to study how the RSI is used in LTE networks in order to develop a super-
vised methodology to detect RSI collisions with near real time performance.
In Section 5.2 the chosen KPIs were presented by stating their meaning and how they were relevant
for detecting RSI collisions. A brief daily KPI time series analysis was presented as well, which allowed
for a better understanding of their daily behaviours.
In Section 5.3 a cell labeling approach similar to the one presented in the last chapter was used once again, specifically by using cell neighbor configurations to detect RSI collisions. The data was further analysed, which led to the removal of one KPI that had high null count averages.
The three presented hypotheses were tested using five different classification algorithms, namely SVM, AB, GB, ERT and RF. As with PCI conflict detection, all hypotheses delivered near real time performance, with training and testing times rarely exceeding 250 and 5 seconds, respectively. The hypothesis that led to the best results used each daily KPI measurement as an individual feature, as was also the case for PCI conflict detection; a sketch of this feature construction is given below. For the 800 MHz frequency band, the best model was obtained by GB, leading to an average Precision of 61% with a Precision peak of about 85% for 3% Recall. For the 1800 MHz frequency band, the best model was again obtained by GB, delivering an average Precision of 60% with a Precision peak of about 85% for 1% Recall.
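For concreteness, a minimal sketch of this winning feature construction follows, assuming the KPI data is available in a hypothetical long-format pandas DataFrame df with columns cell, kpi, hour and value.

    # Third hypothesis sketch: each daily KPI measurement is a feature.
    import pandas as pd

    features = df.pivot_table(index="cell",
                              columns=["kpi", "hour"],
                              values="value")
    features.columns = [f"{kpi}_h{hour:02d}" for kpi, hour in features.columns]
    # Each row now holds, e.g., 7 KPIs x 24 hourly samples = 168 raw features.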
The chosen hypothesis produced results similar to those of the second hypothesis, which applied and extracted statistical calculations from the KPIs. The same was not observed for PCI conflict detection, which had a more pronounced class imbalance: PCI conflicts represented 10% of the total data, whereas RSI collisions represented around 40%. Additionally, RSI collisions are more easily identifiable than PCI conflicts, as their main symptom is the success rate of the random access procedure, which is easily measured. Lastly, the obtained learning curves showed that the results would improve if more data were used to build the models. These criteria resulted in the third hypothesis being chosen as the best of the three presented hypotheses.

Chapter 6

Conclusions

6.1 Summary

This Thesis aimed to create and test ML models able to classify PCI conflicts and RSI collisions with a minimum FP rate and near real time performance. To achieve this goal, three different hypotheses were proposed and tested.
Chapter 2 presented a general technical background of LTE radio technology, since it was important to understand how an LTE system operates and collects performance data.
Chapter 3 addressed general ML concepts as well as more specific ones, such as how time series can be classified to reach the Thesis' objectives. Furthermore, it presented a technical overview of the five classification algorithms applied to reduce the results' bias, namely AB, GB, ERT, RF and SVM.
Chapter 4 introduced the LTE PCI network parameter and how PCI conflicts can occur, tested three hypotheses for PCI conflict detection and presented their results.
It was shown that the PCI is used to scramble data, in order to help mobile phones separate information coming from different eNBs, and that it is limited to the range of 0 to 503. PCI conflicts happen when two or more neighbor cells operate in the same frequency band and share the same PCI, which can lead to service drops and failed handovers.
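A minimal sketch of this conflict condition is given below, assuming the configured neighbor pairs are available in a hypothetical pandas DataFrame relations holding the band and PCI of both cells in each pair.

    # PCI conflict condition sketch (hypothetical relations DataFrame with
    # columns cell_a, cell_b, band_a, band_b, pci_a, pci_b).
    same_band = relations["band_a"] == relations["band_b"]
    same_pci = relations["pci_a"] == relations["pci_b"]
    conflicts = relations[same_band & same_pci]
    conflicting_cells = set(conflicts["cell_a"]) | set(conflicts["cell_b"])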
The 12 KPIs proposed for PCI conflict detection were presented, together with the reasons for their relevance. A first approach was applied in which the network vendor's PCI Conflict Detection Feature labeled the cells as either nonconflicting or conflicting. Extracting statistical calculations from the daily KPIs and using them as features for an SVM classifier yielded poor results, with Precision and Recall scores of 63.28% and 1.13%, respectively. This revealed that the data in use had to be revised.
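A minimal sketch of this first approach is shown below; the long-format DataFrame df (columns cell, kpi and value) and the per-cell labels array are hypothetical placeholders.

    # First approach sketch: daily KPI statistics fed to an SVM.
    from sklearn.svm import SVC

    stats = df.pivot_table(index="cell", columns="kpi", values="value",
                           aggfunc=["mean", "std", "min", "max"])
    stats.columns = [f"{kpi}_{agg}" for agg, kpi in stats.columns]

    clf = SVC(kernel="rbf", probability=True)
    clf.fit(stats.values, labels)  # labels aligned with stats.index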
A new labeling approach was then applied, thanks to a CELFINET product that provides the configured cell relations and labels both PCI collisions and confusions. This new labeling proved to be superior and more auditable than the network vendor's PCI Conflict Detection Feature. The data cleaning procedure was presented and explained, as well as the reason for splitting the dataset into the 800 MHz and 1800 MHz frequency bands.

Upon testing the three hypotheses, all of them yielded near real time performance, with training and testing durations rarely exceeding 150 and 10 seconds, respectively. The third hypothesis, using each daily KPI measurement as an individual feature, yielded the best results. The best model for the 800 MHz frequency band was obtained by GB, having reached an average Precision of about 31% with a Precision peak of 80% for 3% Recall. For the 1800 MHz frequency band, the ERT model achieved the best results, with an average Precision of 26% and a Precision peak of 80% for 1% Recall.
Chapter 5 introduced the LTE RSI network parameter and how RSI collisions can occur, tested three hypotheses for RSI collision detection and presented their respective results.
It was shown that the RSI indicates the index of the logical root sequence from which the PRACH preamble sequences are derived to start the random access procedure. RSI collisions happen whenever two or more neighbor cells operate in the same frequency band and share the same RSI, leading to an increase in failed service establishments and re-establishments, as well as in failed handovers.
The approach taken to detect RSI collisions was the same as for PCI conflicts, but using only 7 KPIs. After testing the three hypotheses, all of them yielded near real time performance, with training and testing durations rarely exceeding 250 and 5 seconds, respectively. Once again, the third hypothesis yielded the best results. The best model was obtained by GB, having reached, for the 800 MHz frequency band, an average Precision of about 61% with a Precision peak of 85% for 3% Recall. For the 1800 MHz frequency band, the GB model achieved an average Precision of 60% with a Precision peak of 85% for 1% Recall.
The fact that the third hypothesis delivered better results than the second one, which applied and extracted statistical calculations from the KPIs, could be due to the information lost when compressing full daily periods into statistical summaries. Specifically, network problems are best detected by the KPIs at peak traffic instants, and that information could have been lost in statistical calculations over full daily KPIs. As the third hypothesis used all the information in its raw form, its models may have been able to perform more effective classifications.
The obtained results showed that, after testing a model on the data of a specific day, it is possible to select a probability threshold in order to vary the Recall and obtain the cells with the highest chances of having PCI conflicts or RSI collisions. This is because the lower the Recall, the higher the Precision, as seen in the Precision-Recall curves.
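A sketch of this threshold selection on a held-out day is given below; y_true and y_scores are hypothetical arrays with the true labels and the classifier's predicted probabilities.

    # Threshold selection sketch from the Precision-Recall curve.
    import numpy as np
    from sklearn.metrics import precision_recall_curve

    precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

    # Lowest threshold whose operating point reaches, e.g., 80% Precision;
    # cells scored above it are the most likely conflicting ones.
    idx = np.argmax(precision[:-1] >= 0.80)
    flagged = y_scores >= thresholds[idx]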
A point that should be stressed is that, due to several factors (e.g., distances between cells, cell radius and radio environment), there is no clear-cut distinction between a nonconflicting and a conflicting cell. A cell can be in conflict with a distant cell and yet show no problem in its KPIs, precisely because of the large distance between the two cells. Thus, a new approach taking distances into account should be pursued.
Several obstacles were encountered during this work, namely in the data gathering process and in the available processing power, which demanded a considerable time investment. For instance, gathering the data for a single day took between 1 and 2 hours. As the databases containing the configured relations were only updated once per week, six weeks were required to obtain data from 6 days (3 for PCI conflict detection and 3 for RSI collision detection). Additionally, the results for the third hypothesis showed that they would improve with more data. Regarding processing power, the limitations were most noticeable in the optimal hyperparameter search: overall, about 10 hours were required to obtain the optimal hyperparameters for each classification algorithm and each hypothesis.
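A search of this kind can be sketched as follows; the parameter grid shown is purely illustrative (not the one actually used), and X and y are hypothetical placeholders for the features and labels.

    # Hyperparameter search sketch (illustrative grid).
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GridSearchCV

    search = GridSearchCV(
        GradientBoostingClassifier(),
        param_grid={"n_estimators": [100, 300, 500],
                    "learning_rate": [0.01, 0.1],
                    "max_depth": [3, 5]},
        scoring="average_precision", cv=5, n_jobs=-1)
    search.fit(X, y)
    print(search.best_params_)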

6.2 Future Work

There is much to be explored for both PCI conflict and RSI collision detection. For instance, taking the distances between conflicting cells into account should be a priority, as conflicts between distant cells will not have a noticeable impact on the KPIs. The obtained distances should be studied to find an optimal distance threshold for relabeling cells initially reported as conflicting as either conflicting or nonconflicting. This optimal threshold is the distance beyond which the KPIs stop being significantly affected by conflicts. To obtain such a distance, an algorithm could be developed that takes into account the power emitted by the cell, the antenna tilt and its azimuth. Furthermore, more data should be added to the dataset.
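As a starting point for the distance-based relabeling suggested above, a simple geometric sketch is given below, using only the great-circle distance between cell sites; the DataFrame reported (with the coordinates of both cells in each reported conflict) and the threshold value are hypothetical.

    # Distance-based relabeling sketch (hypothetical reported DataFrame
    # with columns lat_a, lon_a, lat_b, lon_b).
    import numpy as np

    def haversine_km(lat1, lon1, lat2, lon2):
        # Great-circle distance in kilometres.
        lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
        a = (np.sin((lat2 - lat1) / 2) ** 2
             + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371.0 * np.arcsin(np.sqrt(a))

    D_MAX = 5.0  # illustrative threshold; ideally derived from the emitted
                 # power, antenna tilt and azimuth, as discussed above
    d = haversine_km(reported["lat_a"], reported["lon_a"],
                     reported["lat_b"], reported["lon_b"])
    relevant = reported[d <= D_MAX]  # conflicts kept as truly conflicting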
As a single KPI measurement is not independent of the previous ones, nor of the other KPI measurements taken at the same instant, deep learning can be applied. A popular deep learning architecture, the Long Short Term Memory (LSTM) network, explores the time dependency in time series by remembering values over arbitrary intervals. An LSTM network can be applied to raw multivariate time series, such as daily KPIs, in order to classify sequences as either conflicting or nonconflicting. Furthermore, another deep learning architecture, the Convolutional Neural Network (CNN), takes into account the interactions between simultaneous features; a CNN can thus also be applied to raw multivariate time series, such as daily KPIs, in order to explore the interactions between simultaneous KPIs.
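A minimal sketch of the suggested LSTM classifier is given below, assuming, for illustration, 24 hourly samples of 7 KPIs per cell and using the Keras API; shapes and layer sizes are illustrative, not tuned.

    # LSTM sequence classifier sketch for raw daily multivariate KPIs.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(24, 7)),                   # (time steps, KPIs)
        tf.keras.layers.LSTM(32),                        # temporal memory
        tf.keras.layers.Dense(1, activation="sigmoid"),  # conflict probability
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC(curve="PR")])
    # model.fit(X_seq, y, ...)  # X_seq: hypothetical (cells, 24, 7) array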

Appendix A

PCI and RSI Conflict Detection

Figure A.1: PCI and RSI Conflict Detection Flowchart.

