
Driver drowsiness monitoring using eye movement features derived from electrooculography

Von der Fakultät Informatik, Elektrotechnik und Informationstechnik der Universität Stuttgart
zur Erlangung der Würde eines Doktor-Ingenieurs (Dr.-Ing.) genehmigte Abhandlung

vorgelegt von

Parisa Ebrahim aus Teheran

Hauptberichter: Prof. Dr.-Ing. Bin Yang


Mitberichter: Assistant Prof. Dr. Dongpu Cao

Tag der mündlichen Prüfung: 06.06.2016

Institut für Signalverarbeitung und Systemtheorie


Universität Stuttgart

2016
Acknowledgement

I would like to express my genuine thanks to my supervisor, Prof. Dr.-Ing. Bin Yang, for his smart guidance, warm encouragement and helpful comments, starting from my Studienarbeit, continuing to my Diplomarbeit and finally to this thesis. He sharpened my mind towards a clear and scientific way of thinking and of developing new ideas. For all of this I owe him more than I can describe.
Next, I would like to thank my co-referee, Assistant Prof. Dr. Dongpu Cao, for agreeing to serve on the committee and for reviewing my thesis.
I am also grateful to Dr. Wolfgang Stolzmann for his great support starting from my Diplomarbeit at Daimler AG. I appreciate all his contributions of time, ideas and comments, which made for a productive and stimulating working experience.
My deep gratitude goes to Dr. Klaus-Peter Kuhn for providing a good atmosphere in his
team at Daimler AG. His precious comments have opened new doors for research
possibilities, from which my thesis benefited tremendously.
Much of the research in this thesis was carried out as part of the Attention Assist project at Daimler AG. I would like to acknowledge all of my colleagues in this project, who provided a great teamwork atmosphere. It was indeed enriching to be part of this project. I am deeply indebted to Dipl.-Ing. Alexander Fürsich and Dipl.-Ing. Peter Hermannstädter for proofreading, data collection and lending an ear to all my queries and problems, both scientific and otherwise. Their help, support and permanently positive attitude of collaboration showed me that friendship knows no boundaries. My special thanks go to my predecessor, Dipl.-Ing. Fabian Friedrichs, for his immense support and consultation.
I also greatly appreciate all of my colleagues at ISS for their help, feedback and emergency assistance.
My acknowledgments would not be complete without giving thanks to my lovely parents and my brother. They made all my dreams come true. I am so grateful for their loving support and admiration over the years. They have constantly backed my decisions and choices, and have had unwavering faith in me. Last, but certainly not least, my tremendous and deep thanks are extended to my love, my husband MJ, without whom I could not have completed this journey. His patience, support and tremendous belief in me were the main source of confidence and motivation to complete this thesis.
Contents

Notation and abbreviations v

Zusammenfassung xi

Abstract xv

1 Introduction 1
1.1 Problem statement and motivation....................................................................................1
1.2 Definition of drowsiness and inattention..........................................................................3
1.3 Countermeasures against drowsiness during driving.....................................................6
1.4 Driver drowsiness detection systems on the market.......................................................7
1.5 Thesis outline.........................................................................................................................9
1.6 Goals and new contributions of the thesis......................................................................10

2 Driver state measurement 13


2.1 Objective driver state measures........................................................................................13
2.1.1 Driving performance measures............................................................................13
2.1.2 Driver physiological measures.............................................................................16
2.2 Subjective estimation of the drowsiness..........................................................................26

3 Human visual system 31


3.1 Visual attention...................................................................................................................31
3.2 Structure of the human eye...............................................................................................31
3.3 Types of eye movements....................................................................................................32

4 In-vehicle usage of electrooculography and conducted experiments 37


4.1 Eye movement measurement during driving - a pilot study........................................37
4.1.1 Material....................................................................................................................37
4.1.2 Test tracks................................................................................................................38
4.1.3 Comparing baselines of track 1 and track 2.......................................................39
4.1.4 Bumps and eye movements: related or unrelated?...........................................41
4.1.5 Patterns of eye movements due to road curvature............................................43
4.2 Real road experiments........................................................................................................46
4.2.1 Daytime driving with no secondary tasks..........................................................46
4.2.2 Daytime driving with secondary tasks...............................................................46
4.2.3 Nighttime driving with no secondary tasks.......................................................48
4.3 Nighttime driving experiment in the driving simulator...............................................50

5 Eye movement event detection methods 53


5.1 Eye movement detection using the median filter-based method.................................53
5.2 Eye movement detection using the derivative-based method......................................55

5.3 Eye movement detection using the wavelet transform-based method........................61


5.3.1 Discrete Fourier transform....................................................................................61
5.3.2 Continuous wavelet transform.............................................................................62
5.3.3 Discrete wavelet transform....................................................................................71
5.4 Comparison of event detection methods.........................................................................83
5.4.1 Median filter-based versus derivative-based method.......................................84
5.4.2 Derivative-based method versus wavelet transform-based method................86

6 Blink behavior in distracted and undistracted driving 91


6.1 Time-on-task analysis of the saccade rate during the visuomotor task......................92
6.2 Time-on-task analysis of the blink rate...........................................................................92
6.3 Saccades time-locked to blinks during the visuomotor task........................................93
6.4 Direction-dependency of blinks time-locked to saccades.............................................93
6.5 Blink rate analysis during the secondary and primary tasks.......................................95
6.6 Impact of the visuomotor task on the blink behavior...................................................96
6.7 Amount of gaze shift vs. the occurrence of gaze shift-induced blinks........................97

7 Extraction and evaluation of the eye movement features 103


7.1 Preprocessing of eye movement features......................................................................103
7.1.1 KSS input-based feature aggregation.................................................................104
7.1.2 Drive time-based feature aggregation...............................................................105
7.1.3 Feature baselining.................................................................................................106
7.2 Eye blink features..............................................................................................................107
7.3 Saccade features................................................................................................................120
7.4 Event-based analysis of eye blink features...................................................................122
7.4.1 Event 1: Lane departure......................................................................................123
7.4.2 Event 2: Microsleep..............................................................................................127
7.5 Correlation-based analysis of eye blink features..........................................................130
7.5.1 Case 1: Correlation between a feature and KSS values......................................130
7.5.2 Case 2: Correlation between features................................................................134
7.6 Eye blink feature’s quality vs. sampling frequency.....................................................135

8 Driver state detection by machine learning methods 141


8.1 Introduction to machine learning...................................................................................141
8.1.1 Supervised classification......................................................................................142
8.1.2 Metrics for evaluating the performance of classifiers......................................144
8.1.3 Subject-dependent classification.........................................................................145
8.1.4 Subject-independent classification......................................................................145
8.1.5 Imbalanced class distributions............................................................................146
8.2 Artificial neural network classifier..................................................................................149
8.2.1 Network’s architecture.........................................................................................150
8.2.2 Training of the network.......................................................................................151
8.2.3 Classification results of subject-dependent data sets......................................153
8.2.4 Classification results of the subject-independent data sets.............................158
8.3 Support vector machine classifier...................................................................................159
8.3.1 Hard margin support vector machines.............................................................159
8.3.2 Soft margin support vector machines................................................................161
8.3.3 Kernel trick............................................................................................................162

8.3.4 Model construction...............................................................................................163


8.3.5 Multi-class classification approaches.................................................................165
8.3.6 Dealing with imbalanced data............................................................................167
8.3.7 Classification results of subject-dependent data sets......................................168
8.3.8 Classification results of the subject-independent data sets.............................171
8.4 k-nearest neighbors classifier...........................................................................................172
8.4.1 Background theory...............................................................................................172
8.4.2 Classification results of the subject-dependent data sets................................173
8.4.3 Classification results of the subject-independent data sets.............................175
8.5 Comparison of the supervised classifiers for driver state classification....................176
8.6 Features of the driving simulator versus real road driving........................................179
8.6.1 Generalization of the simulator data to real road driving..............................179
8.6.2 Classification of drop-outs under real road driving conditions.....................182
8.7 Feature dimension reduction...........................................................................................183
8.7.1 Sequential floating forward selection................................................................184
8.7.2 Margin influence analysis....................................................................................186
8.7.3 Correlation-based feature selection....................................................................187

9 Summary, conclusion and future work 191


9.1 Summary and conclusion................................................................................................191
9.2 Perspective of future work..............................................................................................194

A Derivation of sawtooth occurrence frequency during curve negotiation 197

B Description of a boxplot 199

C k-means clustering 201

D Statistical test 203


D.1 Paired-sample t-test.........................................................................................................203
D.2 Normal distribution test: Lilliefors test.........................................................................204
D.3 Test of significance for the Pearson correlation coefficient.........................................204
D.4 Comparison of two Pearson correlation coefficients....................................................205
D.5 One-way repeated measures ANOVA....................................................205
D.6 Homogeneity of variance: Levene’s test.......................................................................207
D.7 Wilcoxon signed-rank test...............................................................................................208
D.8 Pearson’s chi-square test..................................................................................................209

E Mother wavelets 211

F Additional results 213


F.1 Analysis of statistical measures of features..................................................................213
F.2 Boxplot of drive time-based features versus KSS values................................................214
F.3 Correlation between features using the Spearman’s rank correlation coefficient.....254
G Gradient descent approach for training the ANN 257

H On the understanding of the dual form of the optimization problem 259


H.1 Karush-Kuhn-Tucker theorem.........................................................................................259
H.2 Extraction of the dual problem for the soft margin classifier.....................................260

List of figures 263

List of tables 271

Bibliography 273
Notation and abbreviations

Notations

x scalar
x column vector
X matrix
X space
Z, R set of integer and real numbers

Mathematical operations

|x| absolute value of a scalar
||x|| 2-norm of a vector
x∗ complex conjugate
xT , XT transpose of a vector or matrix
A ⊕B direct sum of the vector spaces A and B
std(x) standard deviation of variable x
mean(x) average of variable x
cov(x) covariance matrix of random vector x
∈ is element of
F (.) discrete Fourier transform
W (.) continuous wavelet transform
rank(x)i the i-th rank of x

Symbols

a scale of the wavelet transform


A amplitude of a correctly detected blink
A1 closing amplitude of a blink
A2 opening amplitude of a blink
amp amplitude of detected potential blinks
ampem amplitude of a detected eye movement but not a fast blink
b translation of the wavelet transform
bi bias term of the i-th hyperplane
c true class label
cˆ estimated class label

cj approximation coefficient of the dwt at stage j
C margin parameter
Copt optimal margin parameter
C+ margin parameter of the majority class
C− margin parameter of the minority class
dxi,xj distance between sample xi and xj in the feature space
dj detail coefficient of the dwt at stage j
D dimension of the feature matrix
D˘ desired dimension of the feature matrix
E blink energy
f sampling frequency
F blink frequency
F feature matrix
F feature space
H(n) horizontal EOG signal
H a separating hyperplane in the features space
J cost function
kcfs desired dimension of the feature matrix for cfs method
K(xi, xj) kernel function
Lmaha(x, y) Mahalanobis distance between x and y
Lp(x, y) metric of Minkowski for x and y
Lwin window length of the stft
L Lagrangian function
Ld dual form of the Lagrangian function
m number of classes
Mi,j element of the i-th row and j-th column of the confusion matrix
M confusion matrix
MR evaluation metric of the cfs method
n sample
N number of observations or samples in the training set
N+ number of samples of the majority class
N− number of samples of the minority class
Nh number of neurons in a hidden layer
Nµ window size of the ewma
Nσ2 window size of the ewmvar
Nsmote number of samples to be added to the minority class in smote
r curve radius
R subset of features with kcfs number of features
S total data set
Svalidation validation set of the cross-validation
Strain training set of the cross-validation
s number of subjects
S covariance matrix
t time
t0 statistic value of the paired-sample t-test
tbaseline time interval for feature baselining
T blink duration

Tc closing duration of a blink


To opening duration of a blink
Tcl duration of the closed phase of a blink
Tro delay of reopening of a blink
T50 duration from 50% of the rise amplitude to 50% of the fall amplitude
T80 duration from 80% of the rise amplitude to 80% of the fall amplitude
T90 duration from 90% of the rise amplitude to 90% of the fall amplitude
thk-means amplitude threshold for distinguishing between clusters of the k-means method
thdenoising amplitude threshold for noise removal by the dwt
thvel amplitude threshold regarding the amplitude of the blink velocity
ths amplitude threshold for distinguishing between saccades and long eye closures
V (n) vertical EOG signal
V˙(n) first derivative of V (n)
Vmed(n) median filter processed V (n)
V1(n) denoised V (n) by the dwt
Vˇ(n) V (n) after drift removal by the dwt
xi i-th feature vector
w weight vector
wmed window size of the median filter
z0 statistic value of the Wilcoxon signed-rank test
α confidence level of a statistical test
αi Lagrange multiplier for xi
α Lagrange multiplier
γi geometrical margin of the i-th hyperplane in svm
γ parameter for a radial basis function kernel
γopt optimal parameter for a radial basis function kernel
δ angular displacement of the eyes
ζ random value for adding a new sample in smote
ϑ sensitivity of the ann
λ number of vanishing moment for a wavelet
λµ forgetting factor of the ewma
λσ2 forgetting factor of the ewmvar
µ mean value
ξi slack variable associated with xi
ρp Pearson product-moment correlation coefficient
ρs Spearman’s rank correlation coefficient
o residual
η learning rate of the ann
σ standard deviation value
µn exponentially weighted moving average at sample n
σ2n exponentially weighted moving variance at sample n
φ(t) wavelet scaling function
Φ(x) mapping function from x to the feature space
ψ(t) wavelet function
X dft of signal x
Xψ(a, b) continuous wavelet transform of x for scale a and translation b

Vehicle dynamics and sensors

Γ displacement of the vehicle


r curve radius
w wheel speed
ψ yaw angle
κ road curvature
v vehicle velocity
ψ˙ yaw angle rate

Abbreviations

acc Accuracy
acv Average closing velocity
adr Average detection rate
ann Artificial neural network
anova Analysis of variance
aov Average opening velocity
asr Alpha spindle rate
cfs Correlation based feature selection
cwt Continuous wavelet transform
dac Driver alert control
dca Driver cursory attention
dda Driver diverted attention
dft Discrete Fourier transform
dmpa Driver misprioritised attention
dna Driver neglected attention
dr Detection rate
dra Driver restricted attention
dwt Discrete wavelet transform
ECG Electrocardiography
EEG Electroencephalography
EOG Electrooculography
er Error rate
ewma Exponentially weighted moving average
ewmvar Exponentially weighted moving variance
ess Epworth sleepiness scale
fdr False detection rate
fnr False negative rate
fpr False positive rate
fwhm Full width at half maximum
gps Global positioning system
gsrd Generalization of the simulator data to real driving
hrv Heart rate variability
ir Infrared

kkt Karush-Kuhn-Tucker theorem


k-nn k-nearest neighbors
kss Karolinska sleepiness scale
mia Margin influence analysis
mcs Monte Carlo sampling
mcv Maximum closing velocity
mov Maximum opening velocity
nhtsa National highway traffic safety administration
okn Optokinetic nystagmus
perclos Percentage of eye closure
pc Precision
psd Power spectral density
rbf Radial basis function
rc Recall
rem Rapid eye movement
sdlp Standard deviation of lateral position
sem Slow eye movement
smote Synthetic minority oversampling technique
snr Signal-to-noise ratio
sbs Sequential backward selection
sffs Sequential floating forward selection
sfs Sequential forward selection
sss Stanford sleepiness scale
stft Short-time Fourier transform
svm Support vector machine
swt Stationary wavelet transform
tlc Time-to-line crossing
tp Tangent point
vez Virtual edge zone
vor Vestibulo-ocular reflex
wt Wavelet transform
Zusammenfassung

Die Zunahme müdigkeitsbedingter Verkehrsunfälle in den letzten Jahren verdeutlicht die Notwendigkeit, anhand geeigneter Referenzmaße Assistenzsysteme zur Erkennung von Müdigkeit zu entwickeln. Das Ziel der vorliegenden Arbeit ist daher die Klassifikation des Fahrerzustandes basierend auf den Augenbewegungen anhand von Elektrookulographie (EOG).
Um einen Einblick in die Zustände des Fahrens zu geben, die zu sicherheitskritischen Verkehrssituationen führen, werden zunächst die Konzepte von Fahrermüdigkeit und -ablenkung sowie verschiedene damit zusammenhängende Terminologien beschrieben. Anschließend werden Techniken betrachtet, um den Fahrer wach zu halten und somit Autounfälle zu verhindern. Da diese Techniken keine lang anhaltende Wirkung auf die Wachheit des Fahrers zeigen, sind intelligente Systeme zur Erkennung der Müdigkeit des Fahrers notwendig. In der Vergangenheit sind derartige Systeme bereits entwickelt worden, von denen einige in dieser Arbeit vorgestellt werden.
Wie auch in früheren Studien festgestellt wurde, ist der Fahrerzustand durch objektive und subjektive Maße quantifizierbar. Zur Erfassung objektiver Maße wird der Fahrer entweder direkt oder indirekt überwacht. Bei der indirekten Überwachung des Fahrers werden Messgrößen verwendet, die die Fahrleistung des Fahrers widerspiegeln, wie zum Beispiel die Spurhaltung oder Lenkradbewegungen. Im Gegensatz dazu umfasst die direkte Überwachung hauptsächlich physiologische Messgrößen wie Hirnaktivität, Herzfrequenz und Augenbewegungen. Um diese objektiven Messgrößen beurteilen zu können, sind subjektive Messgrößen wie eine Eigenbewertung durch den Fahrer notwendig. Die vorliegende Arbeit stellt diese Messgrößen vor und diskutiert die Bedenken hinsichtlich ihrer Interpretation und Zuverlässigkeit.
Die auf dem Markt existierenden Müdigkeitsassistenzsysteme stützen sich alle auf fahrleistungsbasierte Maße. Diese setzen voraus, dass das Fahrzeug ausschließlich durch den Fahrer selbst gelenkt wird. Solange andere Assistenzsysteme mit dem Ziel, das Fahrzeug in der Mitte der Fahrbahn zu halten, aktiviert sind, würden Maßzahlen zur Fahrleistung falsche Entscheidungen hinsichtlich einer Warnung treffen. Der Grund dafür ist, dass die Sensoren eine Kombination aus dem Verhalten des Fahrers und des aktivierten Assistenzsystems messen. Das Müdigkeitswarnsystem kann den Beitrag des Fahrers an der Fahraufgabe nicht bestimmen. Dies unterstreicht die Notwendigkeit einer direkten Fahrerüberwachung.
Frühere Arbeiten haben als Indikator für Müdigkeit den Abfall der Alpha Spindelrate (asr)
eingeführt. Hierbei handelt es sich um ein Merkmal, das aus den Hirnaktivitätssignalen
während der direkten Beobachtung des Fahrers extrahiert wird. Es wurde gezeigt, dass
anhand asr Ablenkung des Fahrers erfasst werden kann und insbesondere eine visuelle
Ablenkung dabei einen entgegenwirkenden Effekt hat. Basierend auf den
Augenbewegungen des Fahrers wurde ein Algorithmus entwickelt, um die negative
Auswirkung der visuellen Ablenkung des Fahrers auf die asr zu reduzieren. Der
Zusammenhang von asr und der Müdigkeit des Fahrers kann dadurch teilweise verbessert
werden.
Da der Fokus dieser Arbeit auf den Augenbewegungen des Fahrers liegt, wird das visuelle System des Menschen vorgestellt und die Idee des „Was“ und „Wo“ beschrieben, um visuelle Aufmerksamkeit zu definieren. Des Weiteren wird die Struktur des menschlichen Auges beschrieben und die relevanten Arten von Augenbewegungen während der Fahrt definiert. Außerdem werden Augenbewegungen in zwei Gruppen mit langsamen und schnellen Augenbewegungen kategorisiert. Es wird gezeigt, dass Lidschläge je nach Wachsamkeit des Fahrers zu beiden dieser Gruppen gehören können.
EOG als ein Werkzeug zur Messung der Augenbewegungen ermöglicht es uns, zwischen müdigkeits- bzw. ablenkungsbedingten Augenbewegungen und Augenbewegungen aufgrund der Fahrsituation zu unterscheiden. Aufgrund dessen wurde im Rahmen einer Pilotstudie ein Experiment unter vollständig kontrollierten Bedingungen auf einem Testgelände durchgeführt, um den Zusammenhang zwischen den Augenbewegungen des Fahrers und verschiedenen realen Fahrszenarien zu untersuchen. In diesem Experiment sind unerwünschte Kopfschwingungen in den EOG-Signalen und das Sägezahnmuster der Augen (optokinetischer Nystagmus, okn) als situationsabhängige Augenbewegungen erkannt worden. Kopfschwingungen treten aufgrund von Bodenanregungen auf, wohingegen okn in Kurven mit kleinen Radien (50 m) vorkam. Die statistische Untersuchung zeigt eine signifikante Veränderung in den EOG-Signalen durch unerwünschte Kopfschwingungen. Darüber hinaus wird ein analytisches Modell entwickelt, um den möglichen Zusammenhang zwischen okn und dem Tangentenpunkt der Kurve zu erklären. Das entwickelte Modell wird mit realen Daten einer Strecke mit hohen Krümmungen validiert.
Um alle relevanten Muster von Augenbewegungen während wacher Fahrten und Fahrten
unter Müdigkeit zu erfassen, werden in dieser Arbeit verschiedene Experimente —inklusive
Tag- und Nachtversuche— sowohl unter realen als auch simulierten Fahrbedingungen
durchgeführt.
Basierend auf den in den Experimenten erhobenen Signalen werden verschiedene Ansätze zur Detektion der Augenbewegungen untersucht. Zunächst wird die Detektion von Lidschlägen basierend auf der Medianfilterung beleuchtet und ihre Nachteile bei der Erkennung langsamer Lidschläge und Sakkaden aufgezeigt. Danach wird ein adaptiver Erkennungsansatz basierend auf der Ableitung der EOG-Signale vorgeschlagen, welcher nicht nur Lidschläge, sondern auch andere fahrrelevante Augenbewegungen wie Sakkaden und Sekundenschlafereignisse erkennt. Der vorgeschlagene Algorithmus unterscheidet darüber hinaus zwischen den häufig verwechselten fahrrelevanten Sakkaden und einer verringerten Lidschlagamplitude eines müden Fahrers, obwohl Müdigkeit die Augenbewegungsmuster beeinflusst. Die Auswertung der Ergebnisse zeigt, dass der dargestellte Algorithmus die bekannte Medianfilterungsmethode übertrifft, so dass schnelle Augenbewegungen während beider Phasen, wach und müde, korrekt erkannt werden.
Weiter befasst sich die vorliegende Arbeit mit der Erkennung der langsamen Lidschläge als typische Muster bei Müdigkeit durch die Anwendung einer kontinuierlichen Wavelet-Transformation auf EOG-Signale. In dem vorgeschlagenen Algorithmus werden die schnellen und langsamen Lidschläge durch die Einstellung der Parameter der Wavelet-Transformation gleichzeitig detektiert. Allerdings führt dieser Ansatz zu einer größeren Falscherkennungsrate im Vergleich zu der auf der Ableitung basierenden Methode. Deshalb wird in dieser Arbeit für die Lidschlagerkennung eine Kombination beider Verfahren angewandt. Um die Qualität der gesammelten EOG-Signale zu verbessern und Rauschen und Drift zu entfernen, wird die diskrete Wavelet-Transformation genutzt. Für die Rauschunterdrückung wird eine adaptive Schwellenstrategie für die diskrete Wavelet-Transformation vorgeschlagen.
Frühere Forschungsarbeiten haben gezeigt, dass die lidschlagbasierten Merkmale des Fahrers (Lidschlagfrequenz, -dauer, usw.) zu einem gewissen Grad mit Müdigkeit korreliert sind. Daher können diese —mit einer gewissen Unsicherheit— einen Beitrag zu Müdigkeitswarnsystemen leisten. Um diese Systeme zu verbessern, werden Eigenschaften der detektierten Lidschläge bezüglich ihrer unterschiedlichen Herkunft untersucht. Im Rahmen eines Experimentes unter realen Straßenbedingungen zeigte sich, dass Lidschläge sowohl spontan als auch aufgrund von Blickwechseln auftreten. Die Blickwechsel zwischen festen Positionen, die aufgrund der visuomotorischen Nebenaufgabe eingetreten sind, induzierten und modulierten das Auftreten von Lidschlägen. Die Ergebnisse eines weiteren Fahrsimulatorversuchs ohne Nebenaufgabe zeigen, dass die Menge der Blickwechsel (zwischen verschiedenen Positionen) mit der Wahrscheinlichkeit des Lidschlagsauftretens positiv korreliert. Aus diesem Grund wird für Müdigkeitswarnsysteme, die sich ausschließlich auf die Änderung der Lidschlagfrequenz stützen, empfohlen, durch Blickwechsel (z.B. während visueller Ablenkung) induzierte Lidschläge anders als spontan auftretende Lidschläge zu behandeln.
Nach der Analyse der Abhängigkeiten von Lidschlagfrequenz und Blickwechseln wurden aus jedem erkannten Lidschlag von 43 Probanden, gesammelt unter simulierten und realen Fahrbedingungen während 67 Stunden von Tages- und Nachtfahrten, 19 Merkmale extrahiert. Dies entspricht der größten Anzahl von lidschlagbasierten Merkmalen und der größten Anzahl an Probanden im Vergleich zu früheren Studien. Es werden zwei Ansätze zur Aggregation von Merkmalen vorgestellt, um ihren Zusammenhang mit der sich langsam entwickelnden Müdigkeit zu verbessern. Im ersten Ansatz wurden ausschließlich die Teile der gesammelten Daten untersucht, die am besten mit der subjektiven Selbstbewertung der Fahrer durch die Karolinska Sleepiness Scale korrelierten. Im zweiten Ansatz werden die gesamten Daten mit der maximalen Menge an Informationen zur Fahrermüdigkeit untersucht. Bei beiden Ansätzen wird die Abhängigkeit zwischen den einzelnen Merkmalen und der Müdigkeit statistisch unter Berechnung von Korrelationskoeffizienten betrachtet. Die Ergebnisse zeigen, dass sich die Müdigkeitsabhängigkeit der Merkmale in einem hohen Grad nicht-linear entwickelt. Darüber hinaus zeigte sich, dass für einige Merkmale bei unterschiedlichen Probanden verschiedene müdigkeitsabhängige Verläufe möglich sind. Daher stellen wir Warnsysteme in Frage, die sich nur auf ein einziges Merkmal für ihre Entscheidungsstrategie verlassen, und betonen, dass sie anfällig für hohe Falschalarmraten sind.
Um zu untersuchen, ob ein einzelnes Merkmal für die Vorhersage von sicherheitskritischen Ereignissen geeignet ist, untersuchen wir die Veränderung der Merkmale für alle Probanden kurz vor dem Auftreten des ersten unbeabsichtigten Verlassens der Fahrspur und des ersten unbeabsichtigten Sekundenschlafs im Vergleich zum Beginn der Fahrt. Basierend auf statistischen Tests ändern sich vor dem Verlassen der Fahrspur die meisten Merkmale signifikant. Daher rechtfertigen wir die Rolle der lidschlagbasierten Merkmale für die frühe Fahrermüdigkeitserkennung. Dies gilt jedoch nicht für die Variation der Merkmale vor Sekundenschlaf.
Weiter wurden alle 19 augenbewegungsbasierten Merkmale gleichzeitig näher betrachtet. Der Fahrerzustand wurde dabei durch künstliche neuronale Netzwerke, Support-Vektor-Maschinen und k-nearest-neighbour-Klassifikatoren beurteilt, sowohl für binäre als auch Multi-Class-Fälle. Die binären Klassifikatoren sind sowohl fahrerunabhängig als auch fahrerabhängig trainiert worden, um die Generalisierungsaspekte der Ergebnisse für ungesehene Daten zu adressieren. Die binäre Fahrerzustandsklassifikation (wach vs. müde) basierend auf Augenbewegungsmerkmalen ergab eine durchschnittliche Erkennungsrate von 83% für jeden Klassifikator. Für eine dreistufige Klassifizierung (wach vs. mittel vs. müde) betrug die Erkennungsrate lediglich 67%, möglicherweise aufgrund ungenauer Selbstbewertungen des Vigilanzzustandes. Darüber hinaus wurde die Frage der unausgeglichenen Daten mit klassifikatorabhängigen und -unabhängigen Ansätzen betrachtet. Wir zeigen, dass es für eine zuverlässige Fahrerzustandsklassifikation entscheidend ist, Ereignisse von beiden Phasen —wach und müde— in ausgeglichener Weise zu berücksichtigen. Grund hierfür ist, dass die vorgeschlagenen Lösungen früherer Untersuchungen zum Umgang mit unausgeglichenen Datensätzen die Klassifikatoren nicht verallgemeinern, sondern zu ihrer Überanpassung führen.
Der Nachteil von Fahrversuchen in Fahrsimulatoren im Vergleich zu realen Fahrversuchen wird ebenfalls dargestellt. Zu diesem Zweck wird zunächst eine Reduktion der Daten vorgenommen. Weiter wenden wir die von uns trainierten Klassifikatoren auf nicht in den Trainingsdaten enthaltene Daten müder Fahrer —gesammelt unter realen Fahrbedingungen— an, um zu untersuchen, ob die Müdigkeit in Fahrsimulatoren repräsentativ für die Müdigkeit unter realen Fahrbedingungen ist. Mit einer durchschnittlichen Erkennungsrate von über 68% für alle Klassifikatoren kann von der Vergleichbarkeit beider Experimentalumgebungen ausgegangen werden.
Schließlich werden Ansätze zur Dimensionsreduzierung der Merkmale diskutiert, um die Fahrzeugtauglichkeit der extrahierten Merkmale zu bestimmen. Aus diesem Grund wurden Filter- und Wrapper-Ansätze eingeführt und miteinander verglichen. Unsere Ergebnisse zeigen, dass die Wrapper-Ansätze die Filter-basierten Methoden übertreffen.
Abstract

The increase in vehicle accidents due to driver drowsiness over the last years highlights the need for developing reliable drowsiness assistance systems based on a reference drowsiness measure. Therefore, the thesis at hand is aimed at classifying the driver vigilance state based on eye movements measured using electrooculography (EOG).
In order to give an insight into the states of driving which lead to safety-critical situations, first, driver drowsiness, distraction and different terminologies in this context are described. Afterwards, countermeasures, i.e. techniques for keeping a driver awake and consequently preventing car crashes, are reviewed. Since countermeasures do not have a long-lasting effect on driver vigilance, intelligent driver drowsiness detection systems are needed. In the recent past, such systems have appeared on the market, some of which are introduced in this study.
As also stated in previous studies, the driver state is quantifiable by objective and subjective measures. The objective measures monitor the driver either directly or indirectly. For indirect monitoring of the driver, one uses driving performance measures such as the lane keeping behavior or steering wheel movements. In contrast, direct monitoring mainly comprises the driver's physiological measures such as brain activity, heart rate and eye movements. In order to assess these objective measures, subjective measures such as self-rating scores are required. This study introduces these measures and discusses the concerns about their interpretation and reliability.
The drowsiness assistance systems developed on the market are all based on driving performance measures. These measures presuppose that the vehicle is steered solely by the driver himself. As long as other assistance systems designed to keep the vehicle in the middle of the lane are activated, driving performance measures would make wrong decisions about warnings. The reason is that what the sensors measure is a combination of the driver's behavior and the activated assistance system. In fact, the drowsiness warning system cannot determine the contribution of the driver to the driving task. This underscores the need for direct monitoring of the driver.
Previous works have introduced the drop of the alpha spindle rate (asr) as a drowsiness indicator. This rate is a feature extracted from the brain activity signals during direct monitoring of the driver. Additionally, asr was shown to be sensitive to driver distraction, especially visual distraction, which has a counteracting effect. We develop an algorithm based on eye movements to reduce the negative effect of visual driver distraction on the asr. This helps to partially improve the association of asr with driver drowsiness.
Since the focus of this study is on driver eye movements, we introduce the human visual system and describe the idea of what and where to define visual attention. Further, the structure of the human eye and the relevant types of eye movements during driving are defined. We also categorize eye movements into the two groups of slow and fast eye movements. We show that blinks, in principle, can belong to both of these groups depending on the driver's vigilance state.
EOG as a tool to measure the driver's eye movements allows us to distinguish between drowsiness- or distraction-related eye movements and driving-situation-dependent eye movements. Thus, in a pilot study, an experiment under fully controlled conditions is carried out on a proving ground to investigate the relationship between driver eye movements and different real driving scenarios. In this experiment, unwanted head vibrations within EOG signals and the sawtooth pattern of the eyes (optokinetic nystagmus, okn) are identified as situation-dependent eye movements. The former occurs due to ground excitation and the latter happens during small-radius (50 m) curve negotiation. The statistical investigation shows a significant variation of EOG due to unwanted head vibrations. Moreover, an analytical model is developed to explain the possible relationship between okn and the tangent point of the curve. The developed model is validated against real data on a high-curvature track.
In order to cover all relevant eye movement patterns during awake and drowsy driving,
different experiments are conducted in this work including daytime and nighttime
experiments under real road and simulated driving conditions.
Based on the signals measured in the experiments, we study different eye movement detection approaches. We first investigate the conventional blink detection method based on median filtering and show its drawback in detecting slow blinks and saccades. Afterwards, an adaptive detection approach based on the derivative of the EOG signal is proposed to simultaneously detect not only eye blinks, but also other driving-relevant eye movements such as saccades and microsleep events. Moreover, in spite of the fact that drowsiness influences eye movement patterns, the proposed algorithm distinguishes between the often confused driving-related saccades and the decreased-amplitude blinks of a drowsy driver. The evaluation of results shows that the presented detection algorithm outperforms the common method based on median filtering, so that fast eye movements are detected correctly during both awake and drowsy phases.
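As a concrete illustration of the derivative-based idea, the following minimal Python sketch flags blink candidates in a vertical EOG channel by thresholding its first derivative. It is only a sketch under assumed names and parameters (signal array, sampling rate, threshold factor), not the thesis' tuned implementation, which additionally pairs closing and reopening strokes and separates saccades from blinks.

```python
import numpy as np

def blink_candidates(v_eog, fs, vel_factor=3.0):
    """Flag blink candidates in a vertical EOG trace via velocity thresholding.

    v_eog      -- vertical EOG samples in microvolts (1-D array)
    fs         -- sampling frequency in Hz
    vel_factor -- assumed heuristic: multiple of the robust velocity spread
    """
    vel = np.gradient(v_eog) * fs                    # first derivative in uV/s
    # Robust spread estimate (MAD), so slow drift barely inflates the threshold.
    th = vel_factor * np.median(np.abs(vel - np.median(vel))) / 0.6745
    fast = np.abs(vel) > th                          # fast eye movement samples
    onsets = np.flatnonzero(np.diff(fast.astype(int)) == 1)
    # A full detector would now pair an upward (lid-closing) stroke with the
    # following downward (reopening) stroke and apply amplitude/duration checks.
    return onsets, th

# Synthetic example: one smooth 'blink' bump on a noisy baseline.
fs = 250
t = np.arange(0, 2, 1 / fs)
v = 80 * np.exp(-((t - 1.0) / 0.05) ** 2) + np.random.randn(t.size)
onsets, th = blink_candidates(v, fs)
print(f"velocity threshold {th:.1f} uV/s, candidate onsets at {onsets / fs} s")
```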
Further, we address the detection of slower eye blinks, which are referred to as typical patterns of drowsiness, by applying the continuous wavelet transform to EOG signals. In our proposed algorithm, fast and slow blinks are detected simultaneously by adjusting the parameters of the wavelet transform. However, this approach suffers from a larger false detection rate in comparison to the derivative-based method. As a result, a combination of these two methods is applied for blink detection in this work. To improve the quality of the collected EOG signals, the discrete wavelet transform is employed to remove noise and drift. For the noise removal, an adaptive thresholding strategy within the discrete wavelet transform is proposed which avoids sacrificing noise removal for preserving blink amplitude or vice versa.
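The following sketch shows what such a level-adaptive DWT denoising step can look like in Python with PyWavelets. The wavelet choice, decomposition depth and per-level threshold rule are assumptions for illustration; the thesis proposes its own adaptive thresholding strategy, which this only approximates.

```python
import numpy as np
import pywt

def dwt_denoise(signal, wavelet="db4", level=5):
    """Denoise a 1-D EOG trace by soft-thresholding DWT detail coefficients."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    out = [coeffs[0]]                             # approximation: keeps the blink shape
    for d in coeffs[1:]:                          # one threshold per detail level
        sigma = np.median(np.abs(d)) / 0.6745     # robust per-level noise estimate
        th = sigma * np.sqrt(2 * np.log(len(d)))  # universal threshold (assumed rule)
        out.append(pywt.threshold(d, th, mode="soft"))
    return pywt.waverec(out, wavelet)[: len(signal)]
```

Soft thresholding shrinks the surviving coefficients gradually, which is what makes it possible to suppress noise without clipping the blink amplitude; for drift removal, the approximation coefficients would be manipulated instead.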
In previous research, driver eye blink features (blink frequency, duration, etc.) have been shown to correlate to some extent with drowsiness. Hence, within a level of uncertainty, they can contribute to driver drowsiness warning systems. In order to improve such systems, we investigate characteristics of the detected blinks with respect to their different origins. We observed in a real road experiment that blinks occur both spontaneously and due to gaze shifts. Gaze shifts between fixed positions, which occurred due to a secondary visuomotor task, induced and modulated the occurrence of blinks. Moreover, the direction of the gaze shifts affected the occurrence of such blinks. Based on the eye movements during another experiment in a driving simulator without a secondary task, we found that the amount of gaze shift (between various positions) is positively correlated with the probability of blink occurrence. Therefore, for drowsiness warning systems that rely solely on the variation of blink frequency as a driver state indicator, we recommend handling gaze shift-induced blinks (e.g. during visual distraction) differently from those occurring spontaneously.
After studying the dependencies between blink occurrence and gaze shifts, we extract 19 features out of each detected blink event of 43 subjects, collected under both simulated and real driving conditions during 67 hours of daytime and nighttime driving. This corresponds to the largest number of extracted eye blink features and the largest number of subjects among previous studies. We propose two approaches for aggregating features to improve their association with the slowly evolving drowsiness. In the first approach, we solely investigate the parts of the collected data which are best correlated with the subjective self-rating score, i.e. the Karolinska Sleepiness Scale. In the second approach, however, the entire data set with the maximum amount of information regarding driver drowsiness is scrutinized. For both approaches, the dependency between single features and drowsiness is studied statistically using correlation coefficients. The results show that the dependency of the features on drowsiness is to a large extent non-linear rather than linear. Moreover, we show that for some features, different trends with respect to drowsiness are possible among different subjects. Consequently, we challenge warning systems which rely on only a single feature for their decision strategy and underscore that they are prone to high false alarm rates.
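For illustration, this correlation analysis can be reproduced in miniature with SciPy, here on invented numbers (not data from the thesis): one aggregated blink-duration value per KSS self-rating.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

kss = np.array([2, 3, 3, 4, 5, 6, 6, 7, 8, 8, 9])    # self-ratings (scale 1-9)
blink_dur = np.array([110, 112, 118, 121, 130, 141,  # ms, purely illustrative
                      139, 155, 170, 168, 190])

rho_p, p_p = pearsonr(kss, blink_dur)    # linear association
rho_s, p_s = spearmanr(kss, blink_dur)   # monotonic, tolerates non-linearity
print(f"Pearson  rho = {rho_p:.2f} (p = {p_p:.4f})")
print(f"Spearman rho = {rho_s:.2f} (p = {p_s:.4f})")
# A Spearman coefficient clearly above the Pearson one would hint that the
# feature grows with drowsiness monotonically but non-linearly.
```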
In order to study whether a single feature is suitable for predicting safety-critical events, we study the overall variation of the features for all subjects shortly before the occurrence of the first unintentional lane departure and the first unintentional microsleep, in comparison to the beginning of the drive. Based on statistical tests, most of the features change significantly before the lane departure. Therefore, we justify the role of blink features for early driver drowsiness detection. However, this does not hold for the variation of the features before the microsleep.
We also consider all 19 blink-based features together as one set. We assess the driver state by artificial neural network, support vector machine and k-nearest neighbors classifiers for both binary and multi-class cases. The binary classifiers are trained both subject-independently and subject-dependently to address the generalization of the results to unseen data. For the binary driver state prediction (awake vs. drowsy) using blink features, we attained an average detection rate of 83% for each classifier separately. For the 3-class classification (awake vs. medium vs. drowsy), however, the result was only 67%, possibly due to inaccurate self-rated vigilance states. Moreover, the issue of imbalanced data is addressed using classifier-dependent and classifier-independent approaches. We show that for reliable driver state classification, it is crucial to have events of both awake and drowsy phases in the data set in a balanced manner. The reason is that the solutions proposed in previous research to deal with imbalanced data sets do not generalize the classifiers, but lead to their overfitting.
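A minimal scikit-learn sketch of this evaluation setup is given below. The data are random stand-ins with the same shape as the blink feature set (19 features, binary label); the point is the pattern of comparing classifiers with a class-balance-aware score, not the thesis' actual pipeline or numbers.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 19))                            # 19 stand-in blink features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=400)) > 0  # synthetic awake/drowsy label

models = {
    "svm":  make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "k-nn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}
for name, model in models.items():
    # Balanced accuracy averages per-class recall, so a skewed awake/drowsy
    # ratio cannot inflate the score the way plain accuracy can.
    scores = cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy")
    print(f"{name}: {scores.mean():.2f} +/- {scores.std():.2f}")
```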
The drawback of driving simulators in comparison to real driving is also discussed, and to this end we perform a data reduction approach as a first remedy. As the second approach, we apply our trained classifiers to unseen drowsy data collected under real driving conditions to investigate whether drowsiness in driving simulators is representative of drowsiness under real road conditions. With an average detection rate of about 68% for all classifiers, we conclude that the two environments are comparable.
Finally, we discuss feature dimension reduction approaches to determine the applicability of the extracted features for in-vehicle warning systems. On this account, filter and wrapper approaches are introduced and compared with each other. Our comparison results show that the wrapper approaches outperform the filter-based methods.
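The filter/wrapper distinction can be made concrete with a short sketch: a filter ranks features by a univariate statistic without consulting the classifier, while a wrapper greedily selects the subset that helps the classifier itself. The forward selector below is a simplified stand-in for the SFFS procedure treated in Chapter 8; data and parameters are invented.

```python
import numpy as np
from sklearn.feature_selection import (SelectKBest, SequentialFeatureSelector,
                                       f_classif)
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 19))                      # 19 stand-in blink features
y = (X[:, 2] - X[:, 7] + rng.normal(size=300)) > 0  # synthetic labels

# Filter approach: univariate F-test ranking, classifier-agnostic and fast.
filt = SelectKBest(f_classif, k=5).fit(X, y)

# Wrapper approach: add whichever feature most improves the SVM itself.
sfs = SequentialFeatureSelector(SVC(kernel="rbf"), n_features_to_select=5,
                                direction="forward", cv=3).fit(X, y)

print("filter picks :", filt.get_support(indices=True))
print("wrapper picks:", sfs.get_support(indices=True))
```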
1. Introduction

1.1. Problem statement and motivation

The numbers of injured and killed persons in traffic accidents in Germany, based on the statistics provided by the Federal Statistical Office of Germany (DESTATIS, 2013a), have decreased after 1970 (see Figure 1.1). This decrease is the result of introducing different safety regulations in the course of time; the milestones annotated in Figure 1.1 range from the urban speed limit of 1957 to the lowered alcohol limit of 1998. Despite the overall drop of both numbers after 1970, the fact that over 3,000 persons were killed and about 370,000 injured in 2013 should not be neglected.

Figure 1.1.: The evolution of the number of killed and injured persons in traffic accidents in Germany (DESTATIS, 2013a). Annotated milestones: 1957: urban speed limit of 50 km/h; 1972: highway speed limit of 100 km/h; 1973: 0.8‰ alcohol limit, oil crisis; 1974: recommended speed on highways; 1980: fine for not wearing a helmet; 1984: fine for not wearing a seat belt; 1998: 0.5‰ alcohol limit.

Car crashes, in general, occur for different reasons such as driver drowsiness, distraction, bad weather conditions, high speed, alcohol consumption, etc. Among these reasons, in Germany, sleepy drivers are responsible for about 25% of car crashes on highways (Zulley and Popp, 2012). In addition, one out of every six heavy road accidents with truck involvement is caused by a drowsy truck driver. Apart from that, an increase of 6% in the number of vehicle accidents involving personal injury due to driver drowsiness between 2008 and 2012 (see Figure 1.2) reveals that drowsiness contributes substantially to car accidents. Based on 14,268 crashes in the United States from 2009 to 2013 (Tefft, 2014), a drowsy driver was involved in the following percentages of crashes, by consequence:
• the vehicle was towed away from the scene: 6%
• a person was injured: 20%
• a person was killed: 21%.

Figure 1.2.: The evolution of the number of vehicle accidents due to driver drowsiness and the number of injured persons involved in them (DESTATIS, 2013b)
It is also mentioned by Brown et al. (2014) that in the United States, overall, more than 80,000 car crashes and 850 fatalities are the result of drowsy driving every year. In addition, based on the 100-car naturalistic driving study, 22–24% of crashes and near-crashes were caused by drowsy drivers (Klauer et al., 2006). According to the National Highway Traffic Safety Administration (nhtsa), young people, shift workers and people with sleep disorders are at very high risk of car crashes due to drowsiness (NHTSA, 1998). In the mentioned study, “slower reaction time”, “reduced vigilance” and “impaired information process” were noted as the consequences of drowsiness.
The aforementioned statistics underscore the necessity of developing drowsiness warning systems. In the ideal case, such assistance systems observe one or several drowsiness-related measures and warn the driver in order to prevent accidents. This is done either based on driving performance measures such as the steering or lane-keeping behavior, or based on direct monitoring of the driver by means of physiological measures like eye blinks, yawning frequency, etc. Since adding sensors to the vehicle for monitoring the driver's lane keeping and steering wheel movement behaviors was affordable and technologically much easier to develop, initially, most of the commercial products by car manufacturers and supplier companies were based on driving performance measures. Along with the development of such driver drowsiness detection systems, intelligent vehicle technologies have also grown rapidly in the field of driver assistance to prevent car crashes by means of other approaches. As an example, the Distronic Plus with Steering Assist of Mercedes-Benz (Daimler AG, 2014b) supports the driver in staying within the lane. In other words, the car steers to some extent together with the driver. Such a system, however, does not prevent driver drowsiness; on the contrary, it may lead to an insurmountable problem for the drowsiness warning system, if it is activated. The reason is that the resulting driving behavior is a combination of the driver's actions and the supporting system. Since the warning system is unable to determine to what extent the driver is contributing, it fails to detect a drowsy driver. Therefore, the new assistance systems deteriorate the performance of the classical warning systems. To be emphasized again, many of the driver assistance systems help the driver to improve the driving performance and to avoid severe crashes, but they are unable to eliminate crashes to the full extent. This fact highlights the need for new driver drowsiness detection systems even for cars equipped with a variety of assistance systems. A possibility to deal with this problem is to directly observe the driver. Hence, one can think of a driver observation camera.
The mentioned problem will become even worse as soon as autonomous cars are developed in the near future. Such cars drive (steer, brake or turn) independently and relieve the driver of full concentration on the driving task. However, at the moment that the car is unable to interpret the surrounding information correctly and a crash seems unavoidable, the driver should take over the driving task. Therefore, the car should inform the driver in a timely manner with respect to the driver's level of attention. Clearly, a driver who is distracted should be informed earlier than a driver who observes the scene ahead carefully. The driver's levels of vigilance and distraction cannot be determined based on the lane keeping and steering wheel movement behaviors, because the car itself is responsible for them. Therefore, an autonomous system should also directly observe the driver the whole time, in order to assess the driver's vigilance and attentional level. Again, one can think of a camera-based driver monitoring system.
A driver observation camera monitors one or several of the driver's physiological measures such as repeated yawning, slower reactions, difficulty in keeping the eyes open, etc. (Dong et al., 2011). Prior to employing a camera or developing an eye tracking algorithm, it should, however, be ascertained to what extent features of the corresponding biological measures reflect the driver state, especially under real driving conditions, which are the ultimate target of all assistance systems. In other words, a reliable reference drowsiness measure is needed, whose development is beneficial for evaluating any drowsiness detection system. On this account, this work concentrates on the analysis of driver eye movements based on a reference measuring system.

Before introducing different approaches for monitoring the driver and assessing the vigilance level, we must first define what exactly we are trying to measure and quantify. Therefore, the next section primarily concerns the definition of drowsiness and inattention.

1.2. Definition of drowsiness and inattention

The states of driving which lead to safety-critical situations have been described and distinguished by a variety of terminologies such as distraction, inattention, fatigue, exhaustion, sleepiness, and drowsiness. In addition, the proper states of driving are referred to as awareness, vigilance and alertness. Therefore, prior to settling on our terminology in this work, we first define and discuss the aforementioned terms.

Fatigue and sleepiness

Schmidt et al. (2011) stated that the terms fatigue and sleepiness are usually used as synonyms, although they are not identical. According to Hirshkowitz (2013), the term fatigue might have different meanings depending on the field of study. In civil engineering, as an example, it is defined as “a weakening or material breakdown over time produced by repeated exposure to stressors”. Human physical fatigue, however, refers to “weakness from repeated exertion or a decreased response of cells, tissues, or organs after excessive stimulation or activity”. Considering this definition, human physical fatigue is not necessarily associated with sleepiness, but improves after resting. As a result, it is mental fatigue which is highly correlated with sleepiness and car crashes and is relevant to the context of driving (May, 2011). In general, Hirshkowitz (2013) defined fatigue as “a sense of tiredness, exhaustion, or lack of energy” which is intensified by stress load or in the course of time, also called time-on-task.
According to Williamson et al. (2014), fatigue as a comprehensive terminology covers not only sleepiness and mental fatigue, but also fatigue due to illness. Therefore, they associated sleepiness only with effects such as time since awakening and time-of-day effects, while fatigue might occur due to both the duration and the workload of a task and also due to sleepiness or illness factors. This is in agreement with Philip et al. (2005), who also defined sleepiness as “difficulty in remaining awake, which disappears after sleep, but not after rest.”
Driver fatigue, as defined by May (2011), is the demotivation to continue the driving task, as well as sleepiness. She mentioned the following causes of fatigue, which are believed to affect each other as well:
• task-related fatigue and environmental factors (e.g. trip duration or weather/road condition)
• sleep-related fatigue (e.g. quality/quantity of sleep, circadian rhythm)
Moreover, May (2011) categorized task-related fatigue as either active fatigue or passive fatigue. The former, in general, is associated with an overload of the “attentional resource”, such as in the case of performing a secondary task during driving. The latter, however, is associated with the monotonicity of or familiarity with the route. It is also emphasized that time-on-task deteriorates the driving performance if it is combined with monotonous driving.
The circadian rhythm manages the timing of alertness and sleepiness. Based on the circadian rhythm, two peaks of sleepiness can be predicted (Čolić et al., 2014). For people who sleep at night, these peaks occur before the afternoon and at night. May (2011) referred to the circadian rhythm as the internal body clock with a high contribution to sleepiness and driving performance degradation. As a result, sleepiness is more probable for those people who are not synchronized with the circadian alerting process (Hirshkowitz, 2013).
Čolić et al. (2014) used the general term drowsiness for a factor which threatens road traffic safety and related it to sleepiness with the following subcategories: “sleep restriction or loss, sleep fragmentation and circadian factors”.

Distraction and inattention

Regan et al. (2011) suggested distinguishing between driver distraction and driver inattention
for a better comparison of research findings. Oxford dictionary (Oxford, 2014) defines
distraction as “a thing that prevents someone from concentrating on something else”. Regan et
al. (2011) summarized the following points for defining distraction in the driving context:
• “There is a diversion of attention away from driving or safe driving.”
• “Attention is diverted toward a competing activity, inside or outside the vehicle, which may
or may not be driving-related.”
• “The competing activity may compel or induce the driver to divert attention toward it.”
• “There is an implicit, or explicit, assumption that safe driving is adversely effected.”
Inattention, on the other hand, is defined as "lack of attention; failure to attend to one's
responsibilities; negligence" (Oxford, 2014). Hoel et al. (2010) categorized attentional dysfunction
as "inattention, attentional competition and distraction". Similar to distraction, inattentive
situations also involve interference with the driving task. Thus, inattention with respect to
the driving task occurs while performing a secondary non-driving-related task such as text
messaging. In contrast, Hoel et al. (2010) linked distraction to personal concerns like
daydreaming. Finally, performing secondary driving-related tasks in addition to the primary
driving task is considered as attentional competition (e.g. driving and navigating). Unlike
Hoel et al. (2010), Wallén Warner et al. (2008) (as cited by Regan et al., 2011) decomposed
inattention as:
• “driving-related distractors inside vehicle”: e.g. navigation system
• “driving-related distractors outside vehicle”: e.g. road signs
• “non driving-related distractors inside vehicle”: e.g. speaking to a passenger
• “non driving-related distractors outside vehicle”: e.g. a passenger on the pavement
• “thoughts/daydreaming”: e.g. personal problems.
According to Pettitt et al. (2005) (cited by Regan et al., 2011), distraction leads to inattentive
driving. Inattention, however, may have causes other than distraction.
Regan et al. (2011) defined driver inattention as “insufficient, or no attention, to activities
critical for safe driving” with following sub-categories:
• Driver restricted attention (dra) which is the result of a biological factor such as
blinking. During blinking the driver is not able to perceive any visual information.
• Driver misprioritised attention (dmpa) which, as its name states, occurs if
multiple driving-related tasks are not correctly prioritized. As a result, a task highly
relevant to safe driving is completely neglected, for example when a driver checks over the
shoulder while moving forward and thereby does not pay attention to the vehicle in front
for a timely braking reaction. Regan et al. (2011) emphasized that such inattention is
interpreted differently depending on driving experience.
• Driver neglected attention (dna) which happens, if the driver neglects an important
driving-related task. An example is a driver who does not expect a train at a railway
level crossing and, therefore, does not observe properly. In fact, “expectation and over-
familiarity” motivate this type of inattention.
• Driver cursory attention (dca) which is the result of hastiness during driving.
Consequently, driving-related tasks are not performed thoroughly.
• Driver diverted attention (dda) which is similar to the definition of driver
distraction and, compared to the other categories, has been studied most deeply for
understanding its safety impact. In this category, the secondary task, which diverts driver
attention, comprises not only internal or external activities, but also mental activities. Clearly,
dda is either driving-related, e.g. by the navigation system, similar to the attentional
competition studied by Hoel et al. (2010), or non-driving-related, like text messaging. In
addition, it might occur both voluntarily/involuntarily and internally/externally. This
category of driver inattention mainly covers unusual and unexpected tasks which drivers
can hardly ignore.
Despite the mentioned taxonomy of inattention, Regan et al. (2011) also raised open issues, such
as whether tools and methods, in general, are able to collect data for all mentioned categories of
driver inattention, or whether it is possible to distinguish between the mentioned categories
in a crash. It seems that except for dda, which has been studied systematically, the other
categories will need new instrumentation and new algorithms for assessing experimental data
in the future.

It is clear that in the taxonomy of driver inattention defined by Regan et al. (2011), the driver
state itself is also included as a factor which affects the inattention level. For example, severe
drowsiness results in longer eye closures and leads to dra. In addition, they also believe that
depending on the reaction of the driver to an event, different categories of inattention might
occur.
Regan et al. (2011) interpreted the phenomenon of looked but failed to see as resulting either from
drowsiness, and therefore dra, or from internal thoughts, i.e. dda. Moreover, daydreaming has
been categorized as dda, since it is also a non-driving-related task. They discriminated
daydreaming from internal unintentional thoughts in that daydreaming is
more fantasy-like, while internal unintentional thoughts are linked to current concerns. In most
cases, the driver recognizes the daydreaming only after it is finished.

Alertness and vigilance

According to Oxford dictionary (Oxford, 2014), vigilance is defined as “the action or state
of keeping careful watch for possible danger or difficulties”. Moreover, alertness is defined as
“the state of being quick to notice any unusual and potentially dangerous or difficult
circumstances”.
Dukas (1998) defined vigilance as “a general state of alertness that results in enhanced processing
of the information by the brain”. Thiffault and Bergeron (2003), however, categorized vigilance
into two groups. The first one deals with “information process and sustained attention”, while
the other one refers to “physiological processes underlying alertness or wakefulness”. Overall,
during driving, the lack of either of them results in safety-critical situations, as also mentioned
by Schmidt et al. (2009).

Terminologies of this work

In this work, the term drowsiness will be used to refer to a broader scope including both
sleepiness and fatigue. The term fatigue will not be used further to avoid ambiguity in its
definition. Since in this work the only interference with the primary driving task is the
performance of a secondary task, we only use the term distraction, or dda in the taxonomy of
Regan et al. (2011). The state in which a driver does not threaten road safety and is aware of
danger is called awake.

1.3. Countermeasures against drowsiness during driving

Countermeasures are techniques for keeping a driver awake and consequently preventing car
crashes. Examples of such countermeasures are taking a nap (43%), opening a window (26%),
drinking coffee (17%), pulling over or getting off the road (15%), turning on the radio (14%),
taking a walk or stretching (9%), changing drivers (6%), eating (3%) and asking passengers to
start a conversation or singing (3%), which are all "driver-initiated" (May, 2011). The
percentages represent how often each countermeasure was selected by the 4010 subjects
surveyed by Royal (2003); half of the subjects selected multiple countermeasures.
Anund (2009) studied the interaction between countermeasures and various factors such as
age, gender, driving experience, etc. and showed how often each countermeasure is used by
drowsy subjects. She found taking a nap to be a very typical countermeasure and showed that
it was used in particular by drivers who had experienced sleep-related crashes, professional
drivers, males and drivers aged 46–64 years.

Schmidt et al. (2011) implicitly analyzed the effect of conversing as a countermeasure. They
showed that a 1-min conversation for verbally assessing the drowsiness level led to an
increase of the vigilance state estimated by physiological measures. However, the activating
effect of this countermeasure lasted only up to 2 min in their experiment. Moreover, they
emphasized that their experiment allows no conclusions about the influence of the type of
conversation or about its generality as a countermeasure, since the conversation only comprised
the driver's self-estimation of the drowsiness level. All in all, the distracting effect of talking
to a passenger or on a cell phone during driving should not be ignored.
Gershon et al. (2011) explored drowsiness countermeasures in terms of their usage and perceived
effectiveness for professional and non-professional drivers regardless of the drivers' age. They
considered a driver professional if driving was part of his work requirements,
e.g. a taxi or bus driver, whereas non-professional drivers had primary jobs other than
driving. The results revealed that listening to the radio and opening the window were the
most frequently used countermeasures and were perceived as effective by both groups. They
believe that it is the accessibility of these two countermeasures which has made them so
popular. Non-professional drivers used talking to passengers as the second most common
countermeasure. Drinking coffee, however, was used more often by professional drivers. In
general, planning rest stops ahead, stopping for a short nap and drinking coffee were more
often used by professional drivers in comparison to the non-professional ones. Gershon et al.
(2011) justified this finding by the level at which each group counteracts drowsiness,
calling it the "tactical/maneuvering level" for non-professional drivers versus the
"strategic/planning level" for professional ones. The former, unlike the latter, tries to decrease
weariness and boredom without planning ahead.
Rumble strips located on the road shoulder or on the center line are another safety technique to
prevent car accidents due to lane departure (May, 2011). As soon as the driver crosses or
touches the rumble strips with a wheel, vibrations together with a loud noise are perceptible
inside the car and notify the driver of the lane departure. Damousis and Tzovaras (2008) and
Hu and Zheng (2009) used rumble strips in their experiments and defined rumble strip hits
as critical events for driver drowsiness detection systems whose occurrences
should not be missed. Anund (2009), however, studied "the effect of rumble strips on sleepy
drivers" and analyzed the 5 min windows preceding a rumble strip hit and shortly after the
hit. The results revealed an increase in the studied sleepiness indicators such as blink
duration and electroencephalography measures (explained in Section 2.1.2).
Interestingly, shortly after the hit the subjects were alert and enhanced performance was
observed. Nevertheless, these effects did not last more than 5 min and signs of sleepiness
returned. Therefore, as May (2011) also mentioned, rumble strips are only useful for
revealing the driver's drowsiness level; clearly, they do not eliminate drowsiness.

1.4. Driver drowsiness detection systems on the market

This section introduces driver drowsiness detection systems available on the market by car
companies and explains the idea behind their detection methods based on the review
provided by Čolić et al. (2014).

Mercedes-Benz In 2009, Daimler AG introduced the Attention Assist to warn drowsy
drivers. This system is mainly based on steering wheel movements and their velocities
(Daimler AG, 2008, 2014a). The idea behind the Attention Assist is that an alert driver steers
with small corrections
and moderate movements. On the contrary, a drowsy driver does not steer for a short time
followed by a fast large-amplitude steering activity. Hence, the transition between these two
phases is monitored for vehicle speeds from 80 to 180 km/h. Since 2013, this range has been
extended to 60–200 km/h in the new S- and E-Class series. The system determines whether
the driver is drowsy by comparing the current steering behavior with that of the beginning of
the driving session. If the difference between them exceeds a threshold, the driver is warned
both audibly and visually, i.e. by an audible signal and by displaying "Attention Assist: Break!"
in the instrument cluster. In some new models, the attention level is also displayed during the
drive in the form of a 5-level bargraph.
Figure 1.3 shows an example of a typical steering event which is detected by the Attention
Assist and its corresponding vehicle trajectory. In this figure, up to t = 6 s, there is no steering
activity which leads to lane departure. However, shortly after t = 6 s, the driver corrects his
lane departure by an abrupt steering wheel movement up to 10◦.
[Figure 1.3 comprises two panels over a 10 s interval: the vehicle trajectory as lane lateral
distance [m] versus time [s], and the steering wheel angle [◦] together with the steering wheel
velocity [◦/s] versus time [s].]
Figure 1.3.: A typical steering event detected by the Attention Assist as a drowsiness-related steering
wheel movement
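For illustration, the following minimal Python sketch flags this drowsiness-typical steering
pattern, a quiet phase followed by an abrupt large-amplitude correction. It is a hedged
illustration of the pattern described above, not Daimler's actual algorithm; the function name
and all thresholds are assumptions.

import numpy as np

def detect_steering_events(angle_deg, fs, pause_s=2.0,
                           pause_thresh=0.5, corr_thresh=30.0):
    # Steering wheel velocity [deg/s] from the sampled angle [deg].
    vel = np.gradient(angle_deg) * fs
    quiet = np.abs(vel) < pause_thresh   # phase without steering activity
    fast = np.abs(vel) > corr_thresh     # fast large-amplitude correction
    win = int(pause_s * fs)
    # Event: at least pause_s seconds of quiet immediately before a fast sample.
    return [n for n in range(win, len(vel))
            if fast[n] and quiet[n - win:n].all()]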

Attention Assist has several advantages. First of all, it is able to warn the driver early enough,
before the occurrence of microsleep events. Moreover, the parameters of the system are set
individually at the beginning of the drive. Therefore, regardless of the driving characteristics
of a specific driver, the system adapts itself to the current driver. In addition, weighting
factors such as time of day and driving time are considered in the warning strategy.
Longitudinal and transverse acceleration and vehicle speed also contribute to improving the
system. Since not all steering wheel movements are necessarily drowsiness-related, the
Attention Assist does not take all steering movements into account. Side wind, road bumps
and the operation of center console elements, e.g. the navigation system or turn signal indicator,
are examples of external and in-vehicle influences on steering wheel movements which are
filtered out when assessing the driver's level of drowsiness.

Volvo Driver Alert Control (dac) introduced by Volvo Group in 2007 works based on a lane
tracking camera which observes the lane keeping behavior (Volvo Group, 2014). It is located
between the rear-view mirror and the windshield. Moreover, steering wheel movements are taken
into consideration. The system is active, if the vehicle speed exceeds 65 km/h. During the
drive, a bargraph displayed in the instrument cluster shows the drowsiness level to the
driver. As soon as it decreases to one bar, the driver is warned both visually, i.e. by a
message on the speedometer, and acoustically, i.e. by an audible warning.

Ford The Ford Motor Company has also developed a system called Driver Alert, which,
similar to the system of Volvo, is based on the lane keeping behavior (Ford Motor Company,
2010). The lane tracking camera is located behind the rear-view mirror. Depending on the
detected lanes, the position of the vehicle is predicted and then compared with the true one.
If the difference exceeds a certain threshold, the driver is warned audibly and visually on the
instrument cluster. If the system determines that the alertness level is still decreasing, a
second warning is displayed which should be accepted by pressing a button. The system will
be reset after turning the engine off or after opening the driver’s door.

Volkswagen The Fatigue Detection system is developed by Volkswagen AG and is based on
steering movements and some other available signals (Volkswagen AG, 2014). It activates
after a minimum driving time of 15 min and for vehicle speeds exceeding 65 km/h. If
drowsiness is detected, the driver is warned both visually and acoustically. The warning is
repeated after 15 min if the driver keeps driving. Under some conditions, such as a "sporty
driving style, winding road and poor road surfaces", the system might not work properly.
In addition to the mentioned systems integrated into the vehicle, there are also some other
systems available which can be installed as additional equipment on a vehicle. More
details about such systems are provided by Kircher et al. (2002), Barr et al. (2005) and Čolić et
al. (2014).

1.5. Thesis outline

The outline of the thesis is pictorially shown in Figure 1.4. It starts with Chapter 2, in which
we discuss and review the approaches for objective and subjective driver state
measurements. In the context of objective measures, both driving performance and driver
physiological measures are covered. Since this thesis concentrates on driver eye movements
as a drowsiness indicator, the human visual system, its structure and relevant types of eye
movements are introduced in Chapter 3. Further, in Chapter 4 the measurement system used
in this work is investigated for in-vehicle applications. This chapter also describes the
conducted experiments for collecting eye movement data which will be used later to achieve
the goals of this study (data collection block in Figure 1.4). Our experiments are based on
daytime and nighttime drives both on real roads and in the driving simulator. After data
acquisition, Chapter 5 deals with the detection of different types of eye movements including
both simple and complex approaches like the median filter-based and wavelet transform
methods, respectively (event detection block in Figure 1.4). Afterwards, in Chapter 6, we
answer the question how the occurrence of the eye movements is associated with each other,
especially under distracted driving conditions. From the detected events in Chapter 5, we
introduce extracted blink features in Chapter 7, where we extensively review previous
studies in order to compare them with our findings (feature extraction block in Figure 1.4).

[Figure 1.4 depicts the tool chain of this thesis as four blocks: data collection
(electrooculography with blink, saccade and microsleep traces; vehicle data such as speed and
lane data), event detection, feature extraction (e.g. blink duration, amplitude, frequency) and
driver state classification (ANN, SVM, k-NN) with validation against the driver's
self-estimation.]
Figure 1.4.: Tool chain of this thesis

In addition, the relationship between extracted features and drowsiness
is investigated individually by event-based and correlation-based analyses. In Chapter 8, a
driver state classification is performed using the extracted features by applying three types of
classifiers. The classification results are compared with each other and the optimal classifier is
suggested (classification and validation blocks in Figure 1.4). This chapter ends by assessing
feature dimension reduction approaches. Finally, in Chapter 9 we summarize and conclude
the results of this work and give an outlook on future work.

1.6. Goals and new contributions of the thesis

The main goal of this thesis is to provide a ground truth-based eye movement analysis for
the future development of a driver observation camera with a focus on driver
drowsiness detection. In other words, this work targets the full coverage of requirements to
be fulfilled during the development of the camera and the warning system for a timely
detection of the onset of drowsiness. Therefore, the study, implementation and evaluation of
drowsiness-related eye movement features are the central focus. To this end, well-known
methods for event detection, feature extraction and classification need to be
investigated along with providing new ideas and approaches. These goals can be achieved by
designing daytime and nighttime experiments with representative driving scenarios.
In the following, the main contributions of this thesis towards driver drowsiness detection
are summarized.
• Based on a thorough literature review of the terminologies related to drowsiness, a
suitable term among the many alternatives (e.g. fatigue, sleepiness, etc.) is chosen which
best describes the driver state during driving. (Section 1.2)
• A new approach for enhancing the calculation of the alpha spindle rate is suggested
and evaluated against the initial calculation method. This idea benefits from the fusion
of eye movement activity information into the calculation of the alpha spindle rate.
(Section 2.1.2)
• Most of the previous studies collected eye movement data with the eye movement
measurement system used in this work (electrooculography) in laboratories or in fixed-base
driving simulators. In this work, however, the reliability and robustness of this system
for in-vehicle measurements is evaluated on a proving ground and under fully
controlled real road driving conditions. In addition, by using vehicle sensors, road-
dependent eye movements are thoroughly analyzed. (Section 4.1)
• The results and findings of this thesis are evaluated based on different experiments on
both real roads and driving simulators with a total number of 43 subjects. Moreover, by
designing both daytime and nighttime experiments, the collected data set contains all
vigilance levels during driving. In contrast, previous studies mostly explored drowsiness
detection under simulated driving (see Table 7.3). In addition, almost all previous
studies had a smaller number of participants than this study. Moreover, in most of the
previous works, sleep-deprived subjects participated in the experiments. This leads to an
imbalanced data set in terms of the availability of information about different levels of
driver drowsiness and vigilance. (Sections 4.2 and 4.3)
• In an experiment on the real road, where secondary tasks have been performed along
with the primary driving task, eye blink behavior is analyzed. Based on the findings, a
recommendation about task-induced blinks is made. (Chapter 6)
• In order to improve the performance of eye movement detection and the accuracy of the
extracted features, two preprocessing steps, drift and noise removal, are proposed. The
strength of our proposed noise removal approach is its flexibility in selecting the noise
removal threshold with respect to the amount of noise. (Section 5.3.3)
• For the detection of blinks and for distinguishing them from other eye movements, two
novel algorithms are proposed. The first algorithm is based on the derivative signal and is
suitable for the detection of fast eye movements. The second approach is based on the
continuous wavelet transform and covers the detection of both fast and slow eye
movements. Therefore, in contrast to other studies, this work addresses the detection of
all drowsiness-relevant eye movements. (Sections 5.2 and 5.3.2)
• This work is the most comprehensive study on eye blink features for in-vehicle
applications and under real driving conditions. By considering all inconsistent
definitions of features in previous studies, 19 features are well-defined and extracted
per blink. Afterwards, their evolution due to drowsiness is studied individually. In
addition, the findings are compared with previous studies which were mostly based on
restricted conditions. (Section 7.2)
• Clearly, the quality of driver observation cameras in detecting eye blinks will not be as
high as that of the measurement system used in this work. This issue is investigated
and possible peak amplitude loss is evaluated. (Section 7.6)
• Two feature aggregation approaches are suggested for investigating the relationship
between extracted eye movement-based features and drowsiness. In the first
approach, features are analyzed shortly before safety-critical events, and the lane-keeping
based and eye movement-based drowsiness detection methods are challenged against each
other. The second approach, in contrast, exploits quick changes of the drowsiness level
over time; here, feature values are extracted over time regardless of safety-critical events.
(Sections 7.1, 7.4 and 7.5)
• Since features based on physiological measures are highly individual and vary from
one subject to the next, we propose two baselining methods to minimize this
deteriorating effect. (Section 7.1.3)

• Apart from scrutinizing the features as separate and independent sources of information,
they are fused and investigated by different state-of-the-art classifiers. The performance
of these sophisticated classifiers is evaluated and compared for different
types of extracted features and different data division methods. (Chapter 8)
• On the one hand, data collection in driving simulators, either fixed or moving-base,
and on real roads is very expensive. On the other hand, for driver drowsiness
detection with a high detection rate and a low false alarm rate, a representative data set with
both awake and drowsy driving samples is essential. In this work, first, the imbalanced
data sets are explored and their issue is addressed by artificially balancing the data sets.
Moreover, it is investigated whether artificially balanced data sets can replace the data
collection of awake samples. (Chapter 8)
• This thesis provides new insights into the generalization of the data collected in the
driving simulator to that of real road driving at the feature fusion level. To this end,
two new approaches are studied. (Section 8.6)
• Finally, to address the concerns about computational resources in in-vehicle warning
systems, feature dimension reduction approaches are applied. (Section 8.7)
2. Driver state measurement

This chapter introduces approaches for measuring the driver state either objectively or
subjectively. The former deals with methods for developing an external measure to prevent
car crashes due to driver drowsiness. The latter, however, serves as the reference for
assessing the efficiency of an objective measure. In addition, previous studies, which have
introduced these measures, are reviewed. A novel idea for improving one of the objective
measures is also proposed and evaluated.

2.1. Objective driver state measures

Driver objective measures, as their name implies, are measures which are collected by a
measurement technique such as sensors, electrodes, etc. with no deliberate interference by the
driver. An objective measure is developed based on either driving performance
measures, driver physiological measures or their fusion, which is called hybrid measures.

2.1.1. Driving performance measures

Čolić et al. (2014) summarized car crashes due to drowsy driving with the following
characteristics, which were based on reports by the police or the driver himself:
• "Higher speed with little or no braking", i.e. the combination of high speed with a
delayed reaction due to drowsiness.
• "A vehicle leaves the roadway", also called a single-vehicle crash due to lane departure.
• "The crash occurs on a high-speed road", which might be due to the monotony of such roads.
• "The driver does not attempt to avoid crashing", which is the result of severe drowsiness
and falling asleep.
• "The driver is alone in the vehicle".
The common point of these characteristics is that they all reflect degraded driving
performance. As a result, by quantifying them by means of sensors installed in the car, it is
possible to develop a drowsiness indicator to prevent car crashes. Such measures
are called driving performance measures. A key characteristic is that they observe
the driver indirectly. Moreover, they do not measure the drowsiness itself, but its
consequences.
An advantage of such measures is that they can be obtained without any direct
contact with the driver, namely unobtrusively. These measures, which are all related to the
vehicle, mainly comprise steering wheel and lane keeping behaviors (Liu et al., 2009). In the
following, both of these behaviors and studies on them are introduced and discussed.

Driver’s lane keeping behavior

One of the measures reflecting driving performance is the lane keeping behavior, which is
extracted from the lane lateral distance. The lane lateral distance refers to the offset between the
middle of the lane and the middle of the vehicle. Analysis of this measure is mainly based on the
assumption that an alert driver, unlike a drowsy one, stays in the middle of the lane.
However, not staying in the middle of the lane is not necessarily a sign of a low vigilance
state of the driver. A counterexample is a driver who keeps more to the left side of the lane
for a better forward view while another car is in front of him. Therefore, the standard deviation
of lateral position (sdlp) is used instead to quantify to what extent the driver swings in the lane.
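As an illustration of this measure, the following minimal Python sketch computes the sdlp over
non-overlapping windows of the lane lateral distance signal; the 60 s window length is an
assumption made here for illustration only.

import numpy as np

def sdlp(lateral_m, fs, win_s=60.0):
    # Standard deviation of lateral position, one value per
    # non-overlapping window of win_s seconds.
    win = int(win_s * fs)
    n_win = len(lateral_m) // win
    segments = np.reshape(lateral_m[:n_win * win], (n_win, win))
    return segments.std(axis=1, ddof=1)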
Johns et al. (2007) and Damousis and Tzovaras (2008), who studied the relationship between
eyelid activity and drowsiness, used lane departure events as safety-critical phases of the
drive for evaluating their eyelid-based drowsiness measure. Damousis and Tzovaras (2008)
reported that the analysis of brain activity (will be explained in Section 2.1.2) did not
correlate well with these events. Similarly, Sommer and Golz (2010) also relied on the sdlp as
an objective reference measure which increased due to drowsiness in their experiment. They
defined 13% deviation of it as the threshold between mild and strong drowsiness.
Verwey and Zaidel (2000) defined the following lane keeping behavior occurrences as driving
errors:
• “road departure error: leaving the pavement with all four wheels”
• “moderate lane crossing error: leaving the pavement with one or two wheels”
• “minor lane crossing error: crossing the solid lane markings with one or two wheels”
• “time-to-line crossing (tlc): crossing the solid lane marking within 0.5 s, if no action is
taken (tlc < 0.5 s)”
They acknowledged the tlc as a reliable measure reflecting poor driving performance.
Skipper and Wierwille (1986) found a significant positive interaction between eyelid closure
and sdlp, whose variation improved the distinction between the alert and drowsy classes by a
discriminant analysis model. Åkerstedt et al. (2005) also studied lane departure events, defined
as “four wheels outside the left lane marking (accident) and two wheels outside the lane markings
(incident)”. In their study with shift workers, it was shown that the number of incidents
increased threefold due to drowsiness; the sdlp also increased from 18 cm to 43 cm. Otmani et al.
(2005), however, found no interaction between sdlp and sleep deprivation, although this
measure increased during their experiment for both sleep-deprived and non-sleep-deprived
subjects. In fact, it was the driving duration (time-on-task) which affected the sdlp. Ingre et
al. (2006) analyzed the relationship between sdlp and a subjective measure for shift workers
with both enough night sleep and no night sleep. They found that the significant relationship
between them is curvilinear, i.e. looks like a curved line. Moreover, they emphasized the large
between-subject differences in the values of the sdlp as an issue. Arnedt et al. (2005) reported
a tendency to left side lane keeping for subjects with prolonged wakefulness.
Wigh (2007) and Ebrahim (2011) studied event-based lane keeping behavior. In order to define
events, two zones, called virtual edge zones (vez), were defined on the right and left side of the
vehicle near each lane marking. The location and the width of the vez's were adjusted individually
based on the lane keeping behavior of the driver during driving. Hence, the zones were
adapted for a driver who tended to keep left or right in a lane without any problem. Each
entrance of a wheel into a zone was then weighted, i.e. the further the wheel entered the
zone, the higher was the corresponding weight. In the end, based on an incremental running
mean of the weights and its comparison with a threshold, it was decided whether to warn the
driver or not. In fact, the driver was considered drowsy with respect to how far and how often
he entered or got close to the defined zones, which were not necessarily the road lane
markings.
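The following Python sketch illustrates this weighting scheme. The fixed zone geometry, the
linear weighting and all numeric parameters are assumptions chosen for illustration; Wigh
(2007) and Ebrahim (2011) adapted the zones individually to the driver.

import numpy as np

def vez_warning(lateral_m, lane_half_width=1.75, zone_width=0.5,
                alpha=0.01, threshold=0.3):
    # Penetration depth into the left/right edge zone (0 outside the zone,
    # up to zone_width at the lane marking), normalized to a weight in [0, 1].
    inner = lane_half_width - zone_width
    penetration = np.clip(np.abs(lateral_m) - inner, 0.0, zone_width)
    weights = penetration / zone_width
    warn = np.zeros(len(weights), dtype=bool)
    mean = 0.0
    for n, w in enumerate(weights):
        mean += alpha * (w - mean)      # incremental running mean of the weights
        warn[n] = mean > threshold      # warn when the running mean exceeds a threshold
    return warn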

Working of Vehicle Based Systems

Vehicle-based systems use a driver modeling approach which addresses driver state
detection by means of steering movements and lane keeping behavior. By applying system
identification methods, a model is developed for predicting the steering wheel angle based on
the lane lateral position. Changes in the model parameters or the deviation of the measured
steering wheel angle from the predicted one are suggested as objective measures for driver
state identification by Pilutti and Ulsoy (1999). Hermannstädter and Yang (2013) also
distinguished between distracted and undistracted driving based on driver modeling. The
working of the two main vehicle-based approaches is discussed in detail below:
Steering Wheel Movement (SWM). These methods rely on measuring the steering wheel
angle using an angle sensor mounted on the steering column, which allows for the detection of
even the slightest steering wheel position changes. When the driver is drowsy, the number of
micro-corrections on the steering wheel is lower than in normal driving
conditions. A potential problem with this approach is the high number of false positives.
SWM-based systems can function reliably only in particular environments and are too
dependent on the geometric characteristics of the road and, to a lesser extent, on the kinetic
characteristics of the vehicle.

Standard Deviation of Lane Position (SDLP). Leaving a designated lane and crossing into
a lane of opposing traffic or going off the road are typical behaviors of a car driven by a driver
who has fallen asleep. The core idea behind SDLP is to monitor the car's relative position
within its lane with an externally mounted camera. Specialized software is used to analyze the
data acquired by the camera and to compute the car's position relative to the middle of the lane.
The limitations of SDLP-based systems are mostly tied to their dependence on external factors
such as road markings, weather and lighting conditions.

Working of Physiological Systems

Driver physiological measures, in general, are measures based on the direct observation of the
driver, which can be either intrusive like electrophysiological ones or non-intrusive like
cameras. Unlike the latter, the former require direct contact of the electrodes with the driver's
skin. These measurement techniques provide an objective measure to describe the driver state
(Simon et al., 2011) and are believed to outperform the other introduced measures in that
they detect drowsiness in its early phase.
In the following, the working of these measurement techniques is explained in detail.

Electroencephalography

Electroencephalography (EEG) is a physiological method for recording brain activity in terms
of electrical potentials by electrodes located at special positions on the head (Niedermeyer and
da Silva, 2005; Kincses et al., 2008). This measurement method has the advantage of being a
continuous recording method with a high temporal resolution (up to 512 Hz) (Kincses
et al., 2008).
Figure 2.1 shows a 32-electrode arrangement (excluding 4 electrodes for eye movement data
collection) of this measurement system which can also be used during driving by wearing a cap.

Figure 2.1.: 32-electrode arrangement of EEG (excluding 4 electrodes for eye movement data collection)

In order to improve the conductivity of the electrodes, a special paste should be used between
the electrodes and the skin. However, depending on the number of electrodes being used,
injecting the paste can be very time-consuming, and the subjects have to wash their hair after
data collection. Figure 2.2 shows an electrode, an EEG cap and the paste being injected, from
ActiCAP, Brain Products GmbH (2009).

Figure 2.2.: ActiCAP measurement system for EEG recording by Brain Products GmbH (2009)
The amplitude range of EEG waves is usually between 0 and 200 µV, which makes it difficult to
distinguish them from noise and artifacts (e.g. scratching the head) (Svensson, 2004;
Damousis and Tzovaras, 2008). Moreover, EEG is very sensitive to movement and muscle
artifacts. Even fast spontaneous eye blinks easily affect the waves. Therefore, if a suitable
artifact removal is not applied, the collected data should not be analyzed further. Simon
(2013) and Santillán-Guzmán (2014) studied different approaches for EEG artifact removal.
EEG waves can be analyzed either in the time domain or in the frequency domain. The former
includes calculating statistical values within an interval (Dong et al., 2011), whereas the
latter covers the analysis within the following frequency bands: δ (up to 3.5 Hz), θ (4–7 Hz),
α (7–13 Hz), β (14–30 Hz) and γ (35–100 Hz) (Niedermeyer and da Silva, 2005). Among these
bands, the α-band and especially α-bursts have been shown to be the most drowsiness-related
bands for detecting early phases of drowsiness (O'Hanlon and Kelley, 1977; Kecklund and
Åkerstedt, 1993; Eoh et al., 2005; Papadelis et al., 2007; Schmidt et al., 2009; Simon et al.,
2011). Moreover, α-waves are mainly dominant during eye closure (Saroj and Craig, 2001).
Figure 2.3(b) shows an EEG recording with closed eyes containing α-bursts. For a better
comparison, EEG signals recorded with open eyes are also shown in Figure 2.3(a). The bursts
occur with different amplitudes in different locations on the head and are very dominant in
the parieto-occipital electrodes, i.e. P3, Pz, P4, O1 and O2. This was also reported by
Simon et al. (2011) who evaluated α-bursts with respect to the driver drowsiness and called
them α-spindles.
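For reference, the frequency-domain analysis mentioned above can be sketched in a few lines
of Python. The Welch segment length and the lower δ edge of 0.5 Hz are assumptions made
here; the band edges otherwise follow the text.

import numpy as np
from scipy.signal import welch

# Band edges as given in the text (delta taken from 0.5 Hz here, an assumption).
BANDS = {"delta": (0.5, 3.5), "theta": (4, 7), "alpha": (7, 13),
         "beta": (14, 30), "gamma": (35, 100)}

def relative_band_power(eeg_uv, fs):
    # Welch power spectral density with 2 s segments (an arbitrary choice).
    f, pxx = welch(eeg_uv, fs=fs, nperseg=min(len(eeg_uv), int(2 * fs)))
    total = np.trapz(pxx, f)
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (f >= lo) & (f <= hi)
        powers[name] = np.trapz(pxx[mask], f[mask]) / total
    return powers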

[Figure 2.3 shows the EEG signals of electrodes F3, Fz, F4, C3, Cz, C4, P3, Pz, P4, O1 and O2
over 3 s: panel (a) with open eyes, panel (b) with closed eyes.]
Figure 2.3.: EEG signals showing α-bursts with closed eyes versus open eyes

Figure 2.4 shows the Fourier transform of the electrode O2 for both cases with open and
closed eyes. It can be seen that the bursts are within 8.8–12.5 Hz which corresponds to the
frequency range of the α-band.
[Figure 2.4: magnitude spectra |F(O2)| for open and closed eyes on a logarithmic scale over
0–40 Hz; with closed eyes, the spectrum peaks between 8.8 Hz and 12.5 Hz.]
Figure 2.4.: Frequency components of the α-bursts by applying the Fourier transform to the
wave of the O2 electrode shown in Figure 2.3

Simon et al. (2011) suggested a method for the identification of the mentioned spindles, as
explained in the following. First, a zero-mean 1-s segment of the EEG recording with
75% overlap is multiplied with a Hamming window and its fast Fourier transform is
calculated. If the maximum value of the calculated spectrum is located within the range of the
α-band, the full width at half maximum (fwhm) of the spectral peak is determined and
compared with twice the bandwidth of the Hamming window (BW_Hamming). Depending on
the result of this thresholding (desired: fwhm < 2 BW_Hamming), the
segment is subject to further investigation. BW_Hamming corresponds to the minimum
bandwidth of an oscillatory activity. This procedure is repeated for all zero-mean 1-s
segments. After calculating the signal-to-noise ratio (snr) (for more details see Simon et al.
(2011)), segments with acceptable snr values and with peak frequencies that deviate from each
other by no more than 10% are merged into an α-spindle. Different features such as
duration, spectral amplitude and peak frequency of the discrete α-spindle events are then
calculated by a moving average within 1 or 5 min windows with 50% to 80% overlap. Simon et
al. (2011) also introduced the alpha spindle rate (asr) as the number of α-spindle events
occurring within the mentioned moving average time intervals. Based on a statistical analysis,
they showed in their study that the α-spindle parameters, averaged over all subjects, increased
within the last 20 min of the drive in comparison to the first 20 min. In addition, the asr in
their study outperformed the common α-power for drowsiness detection.
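The per-segment test can be sketched as follows: a hedged Python illustration following the
description of Simon et al. (2011). The snr screening and the merging of neighbouring
segments into discrete spindle events are omitted, and the approximation of BW_Hamming by
the main-lobe width is an assumption.

import numpy as np

def spindle_candidate(segment, fs, alpha_band=(7.0, 13.0)):
    # Test one 1-s EEG segment (segments are taken with 75% overlap elsewhere).
    x = segment - np.mean(segment)            # zero-mean segment
    spec = np.abs(np.fft.rfft(x * np.hamming(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    k = int(np.argmax(spec))
    if not (alpha_band[0] <= freqs[k] <= alpha_band[1]):
        return False                          # spectral maximum not in the alpha band
    # Full width at half maximum of the contiguous region around the peak.
    half = spec[k] / 2.0
    lo, hi = k, k
    while lo > 0 and spec[lo - 1] >= half:
        lo -= 1
    while hi < len(spec) - 1 and spec[hi + 1] >= half:
        hi += 1
    fwhm = freqs[hi] - freqs[lo]
    bw_hamming = 4.0 * fs / len(x)            # rough Hamming main-lobe width [Hz]
    return fwhm < 2.0 * bw_hamming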

In agreement with the mentioned study, Schmidt et al. (2011) explored the variation of the asr,
blink duration, heart rate and reaction time in a monotonous daytime drive under real driving
conditions. The results showed a significant increase of the asr and a significant decrease of the
heart rate. Blink duration increased as well; however, the increase was not statistically
significant. On the contrary, Anund (2009) showed that the θ and α activity of the EEG do not
necessarily detect phases shortly before a safety-critical event such as lane departures and
hits of the rumble strips.

Unlike Schmidt et al. (2009) and Simon et al. (2011), who analyzed the long-term variation of
the asr, Sonnleitner et al. (2011, 2012) studied its short-term variation with respect to
driver distraction, i.e. while the driver performed secondary tasks. These studies showed that,
under both real and simulated driving conditions, the asr varies in opposite directions
depending on the type of the secondary task, namely auditory or visuomotor, performed in
addition to the primary driving task. That is, performing the auditory secondary task leads to
an increase of the asr, while performing the visuomotor secondary task results in a drop. Their
experiment under real road conditions is explained in Section 4.2.2. Figure 2.5(a) shows the asr
of a participant in the experiment; asr60 refers to the length of the moving average window,
i.e. 60 s, for counting the number of α-spindles. The phases indicating secondary tasks are also
shown. Sonnleitner et al. (2011) related these results to the "visual information process" whose
increase and decrease directly influences the value of the asr. Performing the visuomotor
secondary task in their study is very similar to entering data into the navigation system in
daily routines. On the other hand, the auditory secondary task in this study can be considered
a cognitive distraction similar to a cell phone conversation during driving. Therefore, it can
be concluded that the asr and, in general, the EEG is a very dynamic and sensitive measure,
which makes its interpretation ambiguous. A high value can be due to either drowsiness
or cognitive distraction and, similarly, a low value indicates either alertness or visual
distraction of the driver. To put it another way, not all α-spindles seem to be drowsiness-
related.

[Figure 2.5: (a) asr60 [1/minute] over a 160 min drive with the auditory and visuomotor
secondary task phases marked; (b) the corresponding number of horizontal saccades together
with a calculated threshold.]
Figure 2.5.: Sensitivity of the asr to auditory and visuomotor secondary tasks and the
corresponding number of horizontal saccades

Heart activity and Respiration

By adding an extra electrode to the EEG measurement system, similar to electrocardiography
(ECG), the heart rate and the heart rate variability (hrv) can be measured based on the so-called
R-to-R (beat-to-beat) interval. Some studies suggest that these measures are
drowsiness indicators. During prolonged night driving, Riemersma et al. (1977) showed a
decrease of the heart rate and an increase of the hrv, in agreement with O'Hanlon (1972) and
Lal (2001). In the study by Lal (2001), the drop of the heart rate was quantified as up to
6 beats/min in the driving simulator. Schmidt et al. (2011) also reported a similar result in
monotonous daytime driving. Moreover, the regularity of the heart rate seems to depend on
whether the subject is focusing on the task or not (Rosario et al., 2010).

Lal (2001) and Tran et al. (2009) analyzed the power spectrum of the hrv based on its low (lf,
0.05–0.1 Hz) and high (hf, 0.1–0.35 Hz) frequency components and found correlations between
these components, or their ratio (lf/hf), and drowsiness. Rosario et al. (2010) and Chua
et al. (2012) also suggested the hrv for the detection of attentional failures, especially if it is
fused with other measures for integration into safety systems.
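A minimal Python sketch of this ratio, assuming a series of R-to-R intervals in seconds and the
band edges quoted above; the 4 Hz resampling rate and the Welch settings are illustrative
assumptions, not taken from the cited studies.

import numpy as np
from scipy.signal import welch

def lf_hf_ratio(rr_s, fs_interp=4.0):
    # Resample the R-to-R interval series onto an even grid (assumed 4 Hz),
    # then integrate the lf (0.05-0.1 Hz) and hf (0.1-0.35 Hz) bands.
    t = np.cumsum(rr_s)                              # beat times [s]
    grid = np.arange(t[0], t[-1], 1.0 / fs_interp)
    rr_even = np.interp(grid, t, rr_s)               # evenly sampled tachogram
    f, pxx = welch(rr_even - rr_even.mean(), fs=fs_interp,
                   nperseg=min(len(rr_even), 256))
    lf_mask = (f >= 0.05) & (f < 0.1)
    hf_mask = (f >= 0.1) & (f <= 0.35)
    return np.trapz(pxx[lf_mask], f[lf_mask]) / np.trapz(pxx[hf_mask], f[hf_mask])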
In general, however, it is believed that the mentioned measures are also subject to variation
due to other factors such as stress or relaxation. Contrary to the mentioned studies,
Papadelis et al. (2007) did not find any statistically significant changes of the hrv in
sleep-deprived subjects. This result held even when comparing the first and the last parts of
the drive.
The correlation between respiration activity and drowsiness has also been studied. Rosario et
al. (2010) reported a 5% increase in the respiration amplitude during drowsy phases in
comparison to the awake phase. Moreover, a decreased respiratory rate due to drowsiness was
found by Dureman and Bodéén (1972).
Recently, new approaches have been introduced for a contactless measurement of the heart rate
and respiration, which are based on a camera, a radar signal, etc. (Bartula et al., 2013; Gault
and Farag, 2013). Kranjec et al. (2014) reviewed non-contact heart rate measurement
methods.

■ Electromagnetic coil system

The electromagnetic coil system, the search coil and the scleral contact lens are the most
precise methods for measuring eye movements, since they are directly attached to the eyes.
Like a contact lens, these measurement systems look like a ring which is placed over the cornea
and sclera; as a result, they are the most intrusive measurement systems. They are also
believed to alter some eye movements.
Blinking behavior can also be measured by search coils if they are placed around the eyes
(e.g. above and below) (Hargutt, 2003). Depending on the distance between the coils, which
corresponds to the distance between the eyelids, an electrical voltage is induced. Therefore, the
eyelid gap can be measured in units of voltage and then converted into millimeters.

■ Electrooculography

Electrooculography is a popular measurement system from the EEG and ECG family which
also comprises attaching electrodes directly around the eyes, as shown in Figure 2.8.
According to Fairclough and Gilleade (2014), EOG benefits from the fact that the eye can be
considered a dipole with its negative and positive poles at the retina and the cornea,
respectively. Therefore, the eye has a static potential field under the assumption that the
potential difference between its poles is fixed. As soon as the eyes move, the potential
measured by the electrodes varies. This is exactly what EOG measures: the "corneal-retinal
potential" (Stern et al., 2001).

Figure 2.8.: EOG electrodes attached around the eyes for collecting horizontal and vertical eye
movement data
As shown in Figure 2.8, for a bipolar electrode setup, in addition to reference and ground
electrodes1, four electrodes are needed. The two electrodes located at the right and left outer
canthi2 of the eyes collect the horizontal component of eye movements and the other two
located about 2 cm above and below the eye collect eye blinks and the vertical component of
the eye movements. In this arrangement of electrodes, it is assumed that the movements of
both eyes are synchronous. Therefore, it is sufficient to locate the electrodes only around one
eye. Since locating an electrode at the inner canthus of the eye is intrusive, the outer canthus
of the other eye is usually used. This corresponds to a dipole with the negative pole on one
eye and the positive pole on the other one. By moving the eyes, different poles get close to the
electrodes which leads to the potential variation. The measured voltage is the difference
between the potential measured at an active electrode and the reference electrode. The ground
electrode, however, is used for common mode rejection (Nunez and Srinivasan, 2006). In
addition to the vertical and horizontal eye movements, EOG can also record eye blinks. The
reason is that during blinking, the eye ball rotates upward which also leads to the change of
the dipole field. Consequently, blinks are only visible in the vertical component of the EOG.
Since the occurrence of involuntary blinks (see Section 3.3) is inevitable, they can also be
considered as artifacts in the capturing of voluntary eye movements.
An advantage of the EOG is its high sampling frequency (up to 1000 Hz), which makes it a
very suitable system for extracting the velocity of very rapid eye movements. In addition, it
provides a continuous recording, albeit with some artifacts. Its recording is independent of
almost all external factors such as wearing glasses, contact lenses or lighting conditions.
Unlike cameras, it can be used in darkness.
According to Straube and Büttner (2007), EOG is subject to three types of noise:
• inductive noise: the residential power line or any electromagnetic field affects the
recorded signal by induction and coupling into it. This noise, however, can be filtered
out in the preprocessing step.
• thermal noise: the skin resistance and the electrode’s input resistance generate this type
of noise which deteriorates the signal quality. Therefore, it is recommended to use
conductive paste and to clean the electrodes and the skin before starting the
measurement. Otherwise, drift might be visible in the recorded data as shown in Figure
2.9. We discuss methods for drift removal in Chapter 5; a minimal sketch follows Figure 2.9.
• capacitive noise: in electrical circuits, capacitive noise refers to the noise due to nearby
electronics. Similarly, in EOG, this noise corresponds to nearby muscle artifacts like
chewing.

1Reference and ground electrodes are located behind the ears.
2The corner of the eye.

[Figure 2.9 shows the vertical EOG component V(n) in µV over a 25 s interval, exhibiting a
slowly varying baseline drift.]
Figure 2.9.: An example of the drift in the collected EOG data (vertical component)
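As announced above, a minimal Python sketch of drift removal: subtracting a running-median
baseline from the vertical EOG component. This is only an illustration; the window length is
an assumption, and the drift and noise removal methods actually used in this work are
described in Chapter 5.

import numpy as np
from scipy.signal import medfilt

def remove_drift(eog_uv, fs, win_s=2.0):
    # Estimate the slowly varying baseline with a running median and
    # subtract it; medfilt requires an odd kernel length.
    k = int(win_s * fs) | 1
    return eog_uv - medfilt(eog_uv, kernel_size=k)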
Furthermore, another disadvantage of the EOG is its dependency on the location of the
electrodes. If the electrodes are placed very far from the eyes, the measured amplitudes will be
smaller. Therefore, for a specific person, different eye movement characteristics might be
measured at different recording times if the electrodes are placed differently than before.
Attaching electrodes directly around the eyes and injecting the paste also makes this
measurement system an obtrusive one. Moreover, since EOG measures the potential resulting
from eyeball movements, it can never measure the eyelid gap directly.
Since both the EOG and the electromagnetic coil system measure eye movements relative to
the head, they cannot account for head movements.
By conducting a pilot study on a proving ground under fully controlled real road conditions,
we investigated the robustness and reliability of the EOG measurement system for in-vehicle
eye movement data collection. Based on the achieved results and findings, which are
discussed in Section 4.1, the EOG measurement system has been used in all other experiments
conducted in this work for collecting eye movement data, especially under real driving
conditions.

2.2. Subjective estimation of the drowsiness

Subjective estimation of the drowsiness, as its name says, is based on the rating of subjects
about their vigilance or drowsiness level before, during and at the end of the experiment.
This estimation can be done either by the subject himself or by an investigator.
Williamson et al. (2014) studied 90 drivers in a driving simulator to answer the question:
“Are drivers aware of sleepiness and increasing crash risk while driving?”. According to this study,
drivers are aware of their drowsiness level based on access to their cognitive information.
Nevertheless, they are poor at judging the risk of crashes due to drowsiness. This
finding is in agreement with that of Baranski (2007), whose study with sleep-deprived subjects
showed that both subjective and objective measures were related to drowsiness. On the
contrary, Moller et al. (2006) found no interaction between these two measures and concluded
that subjects might lack full insight into their degraded performance. Verwey
and Zaidel (2000) also reported a dissociation between physiological and subjective measures.
Simon (2013) believes that since cognitive performance degrades as drowsiness increases,
a drowsy subject is also less able to estimate his own state correctly. In fact, self-rating
requires higher mental performance.
Clearly, due to its nature, subjective self-estimation of the drowsiness cannot be collected
very frequently, because it affects the driver state at the time of recording, especially the
monotony and drowsiness (Schmidt et al., 2011). It is possible to collect it either verbally, i.e.
an

investigator asks the driver to rate his drowsiness level based on a pre-defined scale, or via a
touchscreen on which the subject himself presses the desired scale value. Each of these variants
has its own shortcomings. As an example, Schmidt et al. (2011) studied how the verbal
assessment of the driver's state affects vigilance during monotonous daytime driving, i.e.
task-related passive drowsiness. Implicitly, this study also discussed to what extent
conversing with passengers can be considered a drowsiness countermeasure. The results
showed that the verbal assessment of the drowsiness level led to an improved vigilance state,
which, however, did not last longer than 2 min. Therefore, Schmidt et al. (2011) suggested
collecting the subjective self-rating at 5-min intervals as an effective way to avoid
contaminating the evolution of drowsiness. Using a touchscreen, on the other hand, involves
off-road gaze shifts and influences the driving performance similarly to a visual distraction.
Another concern about the subjective measure is its interpretation due to its discreteness. In
this context, the following issues arise:
• How should we compare the evolution of a continuous objective measure, which is
collected with a higher frequency, with a less frequently collected subjective self-rating?
• Is it allowed to assume that the subjective self-rating between successive inputs remains
constant or that it varies linearly/non-linearly?
Unfortunately, no consistent answers to these questions can be found in other studies. Our
assumptions about the aforementioned questions will be discussed in Sections 7.1 and 8.1.1.
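The two assumptions can be made concrete in a short Python sketch: either hold the subjective
rating constant between successive inputs, or interpolate linearly between them (as Schleicher
et al. (2008) did, see below). The function and parameter names are illustrative.

import numpy as np

def kss_to_timebase(kss_times_s, kss_values, t_s, linear=True):
    # Map sparse kss inputs onto the (dense) timebase t_s of an objective
    # measure: linear interpolation between inputs, or holding each rating
    # constant until the next input.
    if linear:
        return np.interp(t_s, kss_times_s, kss_values)
    idx = np.searchsorted(kss_times_s, t_s, side="right") - 1
    idx = np.clip(idx, 0, len(kss_values) - 1)
    return np.asarray(kss_values)[idx]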
In the following, different scales for a subjective self-rating are introduced and discussed.

Karolinska Sleepiness Scale

The Karolinska Sleepiness Scale (kss), as the most common self-rating scale (Dong et al., 2011),
was first introduced by Åkerstedt and Gillberg (1990) and has 9 levels, as shown in Table 2.1.

Table 2.1.: Karolinska Sleepiness Scale (kss)


KSS Description
1 Extremely alert
2 Very alert
3 Alert
4 Rather alert
5 Neither alert nor sleepy
6 Some signs of sleepiness
7 Sleepy, but no effort to keep alert
8 Sleepy, some effort to keep alert
9 Very sleepy, great effort to keep alert

According to Shahid et al. (2012b), this scale is sensitive to fluctuations and best reflects the
psycho-physical state in the 10 min preceding the self-estimation. Ingre et al. (2006) and Anund
(2009) as well as Sommer and Golz (2010) believed that parts of the drive with kss ≥ 7 are
mainly associated with safety-critical conditions.
Many studies have relied on kss values as a drowsiness reference for
evaluating other objective measures (Åkerstedt et al., 2005; Ingre et al., 2006; Fürsich, 2009;
Sommer and Golz, 2010; Friedrichs and Yang, 2010a; Friedrichs et al., 2010; Friedrichs and
Yang, 2010b; Pimenta, 2011). Belz et al. (2004), as an example, concluded from a correlation
analysis with the kss that their studied metrics, such as the minimum time to collision, were
not drowsiness indicators. Other studies, however, analyzed the correlation of objective
measures with the kss as an independent factor. Ingre et al. (2006) studied the relationship
between the kss and objective measures of the sdlp and blink duration. The results showed
that both measures were significantly related to the kss with a curvilinear effect. A similar
result was found by Åkerstedt et al. (2005) for shift workers. Kaida et al. (2006) validated the kss against EEG features and found significantly high correlations between them. Schmidt et al. (2009) studied physiological measures (e.g. EEG features and heart rate) under monotonous daytime driving while the subjects rated their drowsiness level based on the kss. According to their findings, the evolution of all measures was consistent, i.e. the asr increased in parallel with the subjective self-rating. Interestingly, in the last part of the drive, the kss decreased, although all physiological measures kept their previous trends. They believed that this improved vigilance level may be related to the circadian effect, the intensified traffic density, the joyful anticipation that the experiment would soon be over, or a combination of these reasons. On the other hand, by assuming that the physiological measures correctly reflected the drivers' state, they concluded that long monotonous driving (longer than 3 h) led to a deterioration of the self-rating ability due to a declined vigilance level.
Different lengths of the time intervals between successive kss inputs have been reported in the mentioned studies; they are listed in Table 2.2 and vary from 2 to 30 min. Schleicher et al. (2008) used a 30 min time interval, but suggested 15–20 min for future studies, and interpolated the values linearly.
Table 2.2.: Literature review of the length of time intervals between successive kss inputs
Author time interval
Sommer and Golz (2010) 2 min
Ingre et al. (2006) 5 min
Åkerstedt et al. (2005) 5 min
Friedrichs and Yang (2010a) 15 min
Friedrichs and Yang (2010b) 15 min
Schmidt et al. (2009) 20 min
Schleicher et al. (2008) 30 min
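
To make the two conventions discussed above concrete, the following minimal Python sketch (with hypothetical function and variable names) expands sparse kss inputs to the time base of a continuous objective measure, either held constant between inputs or interpolated linearly as in Schleicher et al. (2008):

```python
import numpy as np

def expand_kss(kss_times_s, kss_values, signal_times_s, mode="hold"):
    """Map sparse KSS self-ratings onto the time base of a continuous
    objective measure. 'hold' keeps the last rating constant until the
    next input; 'linear' interpolates between successive inputs."""
    kss_times_s = np.asarray(kss_times_s, dtype=float)
    kss_values = np.asarray(kss_values, dtype=float)
    if mode == "linear":
        return np.interp(signal_times_s, kss_times_s, kss_values)
    # zero-order hold: index of the last rating at or before each sample
    idx = np.searchsorted(kss_times_s, signal_times_s, side="right") - 1
    return kss_values[np.clip(idx, 0, len(kss_values) - 1)]

# Example: hypothetical ratings every 15 min mapped onto a 1 Hz measure
t_kss = [0, 900, 1800, 2700]        # rating times [s]
kss = [3, 5, 6, 8]                  # hypothetical KSS inputs
t_sig = np.arange(0.0, 2701.0)      # 1 Hz time base [s]
kss_lin = expand_kss(t_kss, kss, t_sig, mode="linear")
```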

According to Svensson (2004) and Sommer and Golz (2010), although each level of the kss is clearly defined, it is very probable that subjects interpret the levels inaccurately and relative to previous situations. This is indeed a disadvantage of the kss. In other words, for each kss input, it is very probable that the subjects compare their current state with the previous ones for a better self-rating. Anund (2009), as an example, even instructed the subjects to rate their drowsiness level with respect to their state in the last 5 min. Hence, depending on the preciseness of the first selected kss value, there might be a bias shift on the subsequently selected values until the end of the experiment. Furthermore, a subject who, due to the mentioned bias shift, reaches kss = 9 relatively early has no other value left to select during deeper phases of drowsiness.

Stanford Sleepiness Scale

The Stanford Sleepiness Scale (sss) has 7 levels, of which only one should be selected at the time of query (Hoddes et al., 1973; Shahid et al., 2012c). The sss is very similar to the kss and the description of its levels is listed in Table 2.3.

Table 2.3.: Stanford Sleepiness Scale (sss)


SSS Description
1 Feeling active, vital, alert, or wide awake
2 Functioning at high levels, but not at peak; able to concentrate
3 Awake, but relaxed; responsive but not fully alert
4 Somewhat foggy, let down
5 Foggy; losing interest in remaining awake; slowed down
6 Sleepy, woozy, fighting sleep; prefer to lie down
7 No longer fighting sleep, sleep onset soon; having dream-like thoughts

Epworth Sleepiness Scale

The Epworth Sleepiness Scale (ess), introduced by Johns (1991), is another subjective measure which summarizes the likelihood of falling asleep in 8 different situations, such as watching tv, sitting and reading, or riding as a passenger in a car for one hour without a break. Each situation is rated on a 0–3 scale, i.e. from 0 for "would never doze" to 3 for "high chance of dozing" (Shahid et al., 2012a). The overall score of a subject is thus between 0 (0 for all situations) and 24 (3 for all 8 situations). Scores smaller than 10 and higher than 15 are interpreted as awake and sleepy, respectively (Čolić et al., 2014). This scale is not useful if sleepiness is to be measured repeatedly (Anund, 2009). The reason is that, in contrast to the other introduced scales, which are situational, the ess gives insight into the general tendency toward sleepiness.
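
Since the ess is a simple sum score, its computation and the cited interpretation can be stated in a few lines. The following sketch uses a hypothetical function name and leaves the intermediate range uninterpreted, as the cut-offs above only cover scores below 10 and above 15:

```python
def ess_score(item_ratings):
    """Total Epworth Sleepiness Scale score from the eight item ratings,
    each between 0 ("would never doze") and 3 ("high chance of dozing").
    Interpretation follows the cut-offs cited above (Čolić et al., 2014)."""
    if len(item_ratings) != 8 or any(not 0 <= r <= 3 for r in item_ratings):
        raise ValueError("ESS requires eight ratings, each between 0 and 3")
    total = sum(item_ratings)
    if total < 10:
        return total, "awake"
    if total > 15:
        return total, "sleepy"
    return total, "intermediate"

print(ess_score([1, 0, 2, 1, 0, 1, 2, 0]))  # -> (7, 'awake')
```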
In addition to the mentioned scales, which are collected based on the self-rating of the subject himself, it is also possible to rely on an expert-rating or video-labeling. Although these are also subjective, they contain information about the subject's state of which the subject might not be aware, e.g. a microsleep.
Expert-rating is performed either online, i.e. during the experiment, or offline, i.e. based on the recorded video data. Both approaches mainly rely on observable drowsiness symptoms such as yawning, heavy eyelids, improper lane keeping, etc. A more reliable expert rating is achieved if more than one expert rates the experiment and its events and if the experts are trained using video examples to reach a common understanding (inter-rater reliability (Field, 2007)). In the end, majority voting determines the final rating. In general, however, the quality of offline expert ratings depends highly on the quality of the recorded video data.
Schleicher et al. (2008), who studied blink behavior as a drowsiness indicator, used, in addition to a subjective self-rating measure, an offline video-rating based on symptoms such as facial gestures, blink frequency, staring, etc. Damousis and Tzovaras (2008) also decided based on video-labeling whether lane departures occurred simultaneously with microsleep events. In the experiment conducted by Rosario et al. (2010), an observer recorded body and face movements online.
In Chapters 5 and 7, we also rely on an offline expert-rating as the ground truth for
evaluating the blink detection algorithm and the occurrence of safety-critical events, such as
the lane departure and microsleep.
There exist other subjective measures such as the Visual Analog Scale (vas) (Monk, 1989), the Crew Status Check Card and Sleep Survey Form (Morris and Miller, 1996) and the Karolinska drowsiness score (kds) (Jammes et al., 2008; Hu and Zheng, 2009). Čolić et al. (2014) have reviewed many of them.
3. Human visual system

In this chapter, first, visual attention and the way the human eye operates are described. Afterwards, different types of eye movements relevant during driving are defined. Here, we concentrate on drowsiness as a long-term distraction. Therefore, information processing in the human visual system remains outside the scope of this work; this topic is of importance for the interpretation of short-term distraction.

3.1. Visual attention

The material in this section is taken from Duchowski (2007). He explained visual attention based on "where" and "what". The idea of "where" defines visual attention as the eyes roaming in space (von Helmholtz, 1925). On the contrary, according to the definition of James (1981), visual attention means "focus of attention", i.e. "what". At first glance, these two ideas seem to be independent. However, they support each other in such a way that visual attention is only understandable if both definitions are considered. The idea of "where" occurs parafoveally, which means that, at first, something roughly attracts our attention as a whole in the entire visual field. It is similar to a low resolution image. Then, the idea of "what" leads to the collection of more detailed information through the foveal vision1. In fact, during this second step, the image is perceived as a high resolution image. It is believed that in both steps, when the eyes move and are not fixating, the attention is turned off. There exist also other ideas such as "how", which deals with the type of responses and the reaction of the eyes to stimuli. Such ideas are, however, out of our scope.
We discussed above the collection of fully detailed information using foveal vision, which covers about 2°. The fovea is a narrow area with the sharpest image. Nevertheless, the information outside of it can be seen and perceived within a certain area. The size of this larger area, which is called the functional visual field, varies depending on the task being performed (Holmqvist et al., 2011). As an example, its size decreases with increasing cognitive load during driving.

3.2. Structure of the human eye

Figure 3.1 shows a very simple structure of the human eye. The human eye, which has a spherical shape, receives the light reflected from objects in the environment. The light rays are bent by the cornea and refracted towards the lens. Afterwards, the lens focuses the rays on the retina, from where the image is received by the rod and cone cells. They convert the image into electrical nervous stimuli and send them to the visual cortex for processing and interpretation.
1 The fovea is the central region of the retina with the most acute perception and the sharpest vision.

Figure 3.1.: Structure of the human eye while transmitting the ray of light (labeled parts: cornea, lens, retina)

Six muscles in three pairs are responsible for moving the eye in different directions: the medial and lateral recti for sideways movements, the superior and inferior recti for up and down movements and the superior and inferior obliques for twisting. These muscles are all shown in Figure 3.2.
Figure 3.2.: Eye muscles

3.3. Types of eye movements

Eye movements can be categorized based on different characteristics. The first categorization is voluntary versus involuntary (reflexive) eye movements. Another categorization is based on the velocity of the eye movements, i.e. slow versus fast eye movements, as shown in Figure 3.3. Fast and slow eye movements are also called rapid eye movements (rem) and slow eye movements (sem), respectively.
Figure 3.3.: Different categories of eye movements based on their velocity. Fast eye movements comprise eye blinks and saccades; slow eye movements comprise smooth pursuit and eye blinks during drowsiness. The latter are subdivided into non-saccadic and saccadic blinks, each with either short eye closures (normal eye blinks) or long eye closures (eye closure with microsleep).

In the following, blinks, saccades, fixations, smooth pursuit and optokinetic nystagmus, the eye movements relevant during driving, are defined.

Eye blinks

Regular and rapid closing and opening of the human eyes is called an eye blink, which consists of three stages: closing, closed and opening (Hammoud, 2008). Blinking occurs either voluntarily or involuntarily. Involuntary blinks can further be divided into spontaneous and reflex blinks. The former includes eye blinks which occur regularly to protect the eyes against external particles. They also keep the eyes wet by spreading the "precorneal tear film" over the cornea (Records, 1979). Reflex blinks, however, are the result of an obvious, identifiable external stimulus like bright light or loud noise. Stern et al. (1984) used a similar taxonomy but under different terms: endogenous versus exogenous blinks. Exogenous blinks include reflex blinks, voluntary blinks and long eye closures such as microsleeps. Thus, endogenous blinks are equivalent to the mentioned spontaneous blinks. In this work, however, long eye closures and microsleeps are also considered as spontaneous blinks.
Characteristics of spontaneous blinks, like their frequency, can be influenced by factors such as vigilance, activity, emotion and tasks. Furthermore, air quality and cognitive processes also affect the occurrence rate of such blinks (Stern et al., 1984). In general, performing tasks which require visual attention, such as reading, decreases the frequency of such blinks. According to Stern et al. (1984), the amount of the drop depends on the nature of the task and how demanding it is. They also mentioned that while performing tasks, blinks are liable to occur when the attention has decreased. Another moment at which the occurrence of spontaneous blinks is very probable is the gaze shift. Gaze shifts are most often accompanied by spontaneous blinks, especially while redirecting the attention to a new object (Records, 1979). In Chapter 6, it will be discussed how performing different secondary tasks and gaze shifts affect the occurrence of blinks. More information about the characteristics of blinks, e.g. duration, frequency, etc., will be provided in Chapter 7.
In the EOG, blinking is only evident in the vertical component, as shown in Figure 3.4(a). As this figure shows, during the awake phase, blinks are very sharp. Therefore, they are categorized as fast eye movements in Figure 3.3. During the drowsy phase, however, two characteristics were observed in our experiments. The first one, shown in Figure 3.4(b), contains eye blinks which are still fast in the opening and closing motions but have a longer closed duration. In fact, the opening and closing phases are almost similar to the awake phase. In Figure 3.4(c), on the contrary, the blinks are much slower in the opening and closing phases with only slight changes in the closed duration. In Chapter 5, we will discuss different methods for detecting all types of blinks shown in Figure 3.4.

Saccades

We mentioned that the fovea covers only a very small area. Therefore, in order to see different objects sharply, their image should be projected onto the fovea. This is made possible by eye movements referred to as saccades. Saccades are fast movements of both eyes occurring due to a change of the looking direction in order to reposition the fovea from one image to another. They can be characterized by their amplitude and duration (typically 10 to 100 ms (Duchowski, 2007)), which depend on the rotation angle of the eyes. The amplitude of saccades can

Figure 3.4.: Representative examples of blinks measured by the vertical (V(n)) and horizontal (H(n)) components of the EOG: (a) blinks during the awake phase as fast movements, (b) blinks during the drowsy phase as fast movements, (c) blinks during the drowsy phase as slow movements

be considered linear in the gaze angle up to ±30° (Young and Sheena, 1975; Kumar and Poole, 2002). A voluntary saccade is a saccade used for scanning the visual field. An involuntary saccade, however, can be induced as a "corrective optokinetic or vestibular measure" (Duchowski, 2007). The very short duration of a saccade leads to a blurred image on the retina which cannot be perceived; in fact, during this period, we are blind. In addition, it is presumed that the distance to be traveled during a saccadic movement is preprogrammed and consequently cannot be altered after being determined.
Figure 3.5 shows three examples of saccades occurring in different directions measured by the EOG. For a saccade occurring in only one direction, just one component of the EOG varies remarkably, as shown in Figures 3.5(a) and 3.5(b). However, according to Figure 3.5(c), for diagonal saccades, both the H(n) and V(n) signals are informative. Such saccades correspond to glances at the mirrors during driving. Saccades similar to Figure 3.5(b) occur while looking at the speedometer.
In Figure 3.5(c), the first saccade in both H(n) and V(n) is followed by a remarkable overshoot.

Figure 3.5.: H(n) and V(n) representing different types of saccades due to (a) horizontal, (b) vertical and (c) diagonal eye movements

This happens if the eye movement is time-locked to a head rotation, which leads to the vestibulo-ocular reflex (vor) (Sağlam et al., 2011). The overshoot is the result of a backward movement of the eyes after having reached the destination, while the head movement is not yet finished due to its slower velocity. The second saccade is also time-locked to an eye blink, which is only present in V(n). For the rest of this study, such eye blinks occurring simultaneously with a saccade will be called saccadic eye blinks. Other examples of such saccades are shown in Figure 5.7.
Figure 3.5 indicates that the amplitude of vertical saccades is smaller than that of horizontal saccades. This is due to the fact that the horizontal extent of the human eye opening, i.e. from one corner to the other, is larger than the vertical one, i.e. from the upper lid to the lower lid. Therefore, the eyes travel a larger distance in the horizontal direction.
Comparing the long eye closures of Figure 3.4(b) with the saccades shown in Figure 3.5, it can be seen that both eye movements have a similar shape, although they occur in totally different situations. Unintended long eye closures are a drowsiness indicator, while saccades occur while scanning the visual field. In Chapter 5, we suggest a method to distinguish them from each other.

Fixation

The time interval between two successive saccades, during which the eyes fixate on a new location, is called a fixation (Figure 3.5(b)). Fixation is defined in ISO 15007 (2013) as the "alignment of the eyes so that the image of the fixated area of interest falls on the fovea for a given time period". Fixation on an object can also be interpreted as focusing the attention on that object, or visual intake. Nevertheless, this is not always the case, as during the "looked but failed to see" phenomenon (Holmqvist et al., 2011). A fixation can also be considered to consist of miniature or micro eye movements such as microsaccades.
During a fixation, the following tasks are processed: the analysis of the image on the fovea, i.e. processing the available visual information, the choice of the next fixation location and the pre-programming of the following saccade (ISO 15007, 2013). These tasks might not be completed thoroughly during the fixation period, which leads to some corrections by looking back to the previous location. As a result, a minimum duration of 100 or 150 ms has been assumed for a fixation.

At first glance, it seems that the fixation duration reflects the complexity of the task being performed and the depth of the cognitive process (Holmqvist et al., 2011). However, other factors like stress and daydreaming affect the fixation duration as well.

Smooth pursuit

Smooth pursuit describes the slow eye movements made while tracking a moving object with the same velocity as it moves (up to 30°/s) (Leigh and Zee, 1999). During driving, this eye movement occurs while fixating on any moving or non-moving object outside the vehicle.

Optokinetic nystagmus

The combination of a smooth pursuit followed by a saccade (without head movement) leads to a sawtooth-like pattern called optokinetic nystagmus (okn). According to Young and Sheena (1975), okn consists of phases with low and high velocities (slow and fast phases) in opposite directions. During the slow phase, the eyes fixate on a portion of a moving object while following it (smooth pursuit). During the fast phase, however, since that portion has moved out of the field of vision, the eyes move back to the previous position by a correcting saccadic jump in the opposite direction to fixate on a new portion of the moving field. This type of eye movement also occurs during driving and will be studied in Section 4.1.5. An example of okn is shown in Figure 4.8.
4. In-vehicle usage of electrooculography and conducted experiments

As mentioned in Section 2.1.2, EOG can be a suitable measurement system for collecting eye movement data. In contrast to cameras, it does not need any calibration and is not affected by varying lighting conditions. However, the electrodes have to be attached directly around the eyes, which makes EOG impractical as an in-vehicle product for sale.
In this chapter, first, a pilot study is explained in Section 4.1 which concentrates on the application of the EOG measurement system in the automotive field. This application is clearly different from using EOG in the laboratory or in fixed-base driving simulators. Hence, the pilot study evaluates the robustness of the EOG measurement system for in-vehicle applications by exploring road-dependent eye movements. Based on the results of this study, EOG is used in the other experiments of this work as well for collecting eye movement data. The conducted daytime and nighttime experiments for studying driver drowsiness detection are described in detail in Sections 4.2 and 4.3. These experiments have been designed such that they are representative of awake and drowsy driving scenarios in real life.

4.1. Eye movement measurement during driving - a pilot study

In this section, we introduce the conducted pilot study and discuss its results, which are mostly based on Ebrahim et al. (2013b). The goal is to determine whether the EOG measurement system is reliable and robust for collecting eye movement data under real driving conditions.
To explore the in-vehicle usage of the EOG measurement system, the relationship between driver eye movements and different real driving scenarios is investigated by means of the EOG signals in a fully controlled experiment. In order to be able to reproduce similar conditions several times, all data of the pilot study was collected on a proving ground with a total of eight expert drivers (29–58 years, mean: 39.9 years, all male). All subjects were accompanied by an investigator during the experiment. Based on the collected data, it is explored if and how driver eye movements can be influenced by the road structure, independently of the driver's drowsiness or distraction. In other words, we determine which eye movements during driving are road- or situation-dependent. Such results are, in general, of interest for any driver monitoring system concentrating on eye movements.

4.1.1. Material

Electrooculography measurement system

In the pilot study and all other experiments conducted within the framework of this work, EOG signals were collected at 250 Hz by the measurement system ActiCAP, Brain Products GmbH. The horizontal and vertical components H and V of the EOG were measured by four electrodes located around the eyes as shown in Figure 2.8. H and V were defined as follows:

V = electrode_above − electrode_below,   (4.1)
H = electrode_right − electrode_left.    (4.2)

Two further electrodes were used as reference and ground; they were located on the bone behind each earlobe. H(n) and V(n) refer to the collected signals of both components at sample n. For further analysis, both H(n) and V(n) were downsampled to 50 Hz.
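
A minimal sketch of this preprocessing step, assuming the four electrode signals are available as arrays sampled at 250 Hz; note that scipy's decimate() applies an anti-aliasing low-pass filter before downsampling, whereas the filtering details of the original processing chain are not specified here:

```python
import numpy as np
from scipy.signal import decimate

def eog_components(e_above, e_below, e_right, e_left, fs_in=250, fs_out=50):
    """Form the vertical and horizontal EOG components according to
    Eqs. (4.1) and (4.2) and downsample them from fs_in to fs_out."""
    V = np.asarray(e_above, float) - np.asarray(e_below, float)   # Eq. (4.1)
    H = np.asarray(e_right, float) - np.asarray(e_left, float)    # Eq. (4.2)
    q = fs_in // fs_out            # decimation factor, here 250/50 = 5
    return decimate(H, q), decimate(V, q)
```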

Vehicle

A Mercedes-Benz E-Class (W212) with an automatic gearbox, equipped with the explained EOG measurement system, was used for this experiment. In addition, a multitude of vehicle-related measures such as the vehicle speed, the lateral distance to the lane markings, the steering wheel angle and the global positioning system (gps) position were recorded synchronously at a sampling frequency of 50 Hz. During the entire experiment, four ir-cameras were installed in the car to synchronously record videos of the driver's face, the vehicle interior, and the road ahead and behind. Therefore, it was always possible to analyze the driving sections offline.

4.1.2. Test tracks

Since driver eye movements and the vehicle speed can be highly influenced by the traffic density during the experiment, we decided to collect our data in a fully controlled situation on the Applus+ idiada proving ground (Applus+ IDIADA, 2014). This also helped to omit disturbing maneuvers, e.g. lane changes and takeover maneuvers. In addition, in a fully controlled experiment, it is easier to investigate the reproducibility of the results across subjects.
The selected tracks (see Figure 4.1), the tasks and the concepts behind them are explained below. All tracks were driven twice by each subject.

Figure 4.1.: Selected tracks of the Applus+ idiada proving ground (Applus+ IDIADA, 2014); tracks 1–4 with the baseline parts and the left and right curves marked

Track 1: General road

This track consists of two straight parts (P1, P2) which were used for baseline driving in our experiment. During the baselines, the subject was asked to look at the horizon and keep the head steady. The baselines were driven with the adaptive cruise control system set to 100 km/h. The parts of the track which are not marked as baselines in Figure 4.1 were not driven according to the mentioned conditions. The measurements of this track are our references for assessing the data of the other tracks.

Track 2: Fatigue1 track

Track 2 mimics a badly maintained road with many ground excitations leading to a lot of head movements during driving. This track was paved with setts and also contains a straight part which was used as a baseline condition similar to track 1. The whole track 2 was driven at a maximum speed of 50 km/h. The experiment on this track targets the question of whether and to what extent the mentioned head movements cause EOG signal degradation.

Track 3: Straight line braking surface

In contrast to track 2 with its permanent ground excitations, this track contains isolated speed bumps of different sizes and shapes. The different categories of bumps were each repeated 5 times in succession on the track. The straight parts of track 3 were driven with the adaptive cruise control system at 80 km/h. Based on the data collected on this track, we study whether hitting a bump leads to unwanted vertical saccades or blinks.

Track 4: Dry handling track

We also chose a curved track, driven with the adaptive cruise control system at 80 km/h, to study the dependency of eye movements on the road curvature. The minimum curve radius of the track was roughly 50 m. It should be mentioned that track 4 is a wide track, which makes turns at high speeds with larger radii than the curves' nominal radii possible.

4.1.3. Comparing baselines of track 1 and track 2

We consider the two baselines of track 1 as the reference measurement, which reflects the common behavior of the eyes and the intrinsic noise of our measurement system. The top and bottom plots of Figure 4.2 show 30 s of the H(n) and V(n) signals of subject S2 for baseline P1 of track 1, respectively. The peaks in V(n) are eye blinks. Figure 4.3 shows 30 s of H(n) and V(n) of the same subject for the baseline of track 2. The high frequency variations of the EOG signal in Figures 4.2 and 4.3 are due to the mentioned intrinsic noise of the measurement system. Visual inspection of these two figures reveals that there exist low frequency changes in both H(n) and V(n) of track 2. They correspond to the slight compensatory movements of the eyes trying to keep the gaze direction concentrated on the horizon. In fact, the subject experienced unwanted head vibrations while fixating with his eyes; in the EOG signals, however, it appears as if the eyes had moved.
For the frequency analysis of the baselines of tracks 1 and 2, we chose 20 s of V(n) of subject S4, who had the longest blink-free time span among the subjects. The spectrograms and power spectral densities (psd) of V(n) of these baselines are shown in Figure 4.4. The difference between the spectrograms/psds seems to be considerable within the 0.5–2.5 Hz range in comparison to other frequencies, which might correspond to the unwanted head vibrations of track 2.
1 The term fatigue refers to its definition in the field of civil engineering and not to driver drowsiness.

Figure 4.2.: H(n) and V(n) of subject S2 for baseline P1 of track 1
Figure 4.3.: H(n) and V(n) of subject S2 for the baseline of track 2

Moreover, the similarity of the two psds at higher frequencies implies that the ground excitation does not introduce any artifacts in this range, i.e. the electrodes do not move or vibrate despite the ground excitation.
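
In principle, this frequency analysis can be reproduced as follows; the signal below is a placeholder and the window lengths and overlaps are illustrative assumptions, not the settings of the original analysis:

```python
import numpy as np
from scipy.signal import spectrogram, welch

fs = 50                              # EOG sampling rate after downsampling [Hz]
v = np.random.randn(20 * fs)         # placeholder for 20 s of V(n) [µV]

f_spec, t_spec, Sxx = spectrogram(v, fs=fs, nperseg=128, noverlap=96)
f_psd, Pxx = welch(v, fs=fs, nperseg=256)

# Compare the average PSD in the 0.5-2.5 Hz band against the remaining bins
band = (f_psd >= 0.5) & (f_psd <= 2.5)
print("band power:", Pxx[band].mean(), "rest:", Pxx[~band].mean())
```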
In order to quantify the effect of the ground excitation versus the normal track for all subjects, a moving standard deviation filter with a window length of 0.5 s and an overlap length of 0.48 s was applied to H(n) to quantify the variations in both baselines. In other words, at each time sample n, the standard deviation of the samples within the last 0.5 s (25 samples at 50 Hz) was calculated. We used the moving standard deviation filter because it best describes the local variation of the EOG due to ground excitation. Afterwards, the mean of the calculated values is used for the further analysis of each baseline. The difference between the two mean values of the calculated standard deviations represents, for each subject, the contribution of the head vibration to the eye movement signal. In order to avoid a false calculation, possible saccades were first removed from H(n). This was done by applying the saccade detection algorithm which will be explained in Section 5.2. All calculated means of the standard deviations are listed in Table 4.1. The data of subject S3 was excluded, as he did not follow the baseline instructions. Figure 4.5 also shows the boxplots of the values listed in Table 4.1. Interestingly, most of the mean values of track 2 are larger than those of track 1 for all subjects. Therefore, it can be concluded that ground excitation leads to unwanted head vibrations which are measured by the EOG signals, although the eyes have not moved. Consequently, EOG is not a suitable measurement system for the collection of eye movements on roads with ground excitation.
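
For illustration, the moving standard deviation described above can be sketched as follows (hypothetical function name; the removal of saccades is assumed to have been applied to the input beforehand):

```python
import numpy as np

def moving_std(x, fs=50, win_s=0.5):
    """Standard deviation over the last 0.5 s (25 samples at 50 Hz) at each
    sample; with 0.48 s overlap the window advances one sample at a time."""
    w = int(win_s * fs)
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    for n in range(w - 1, len(x)):
        out[n] = np.std(x[n - w + 1 : n + 1])
    return out

# The per-baseline summary entering Table 4.1 is then np.nanmean(moving_std(h)).
```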

Figure 4.4.: Spectrogram and power spectral density (psd) of 20 s of V(n) of subject S4 for track 1 and track 2: (a) track 1, (b) track 2, (c) power spectral density

This was, however, not the case for the highways used in the other experiments of this work.

4.1.4. Bumps and eye movements: related or unrelated?

For assessing the impact of road bumps on eye movements, the exact locations of the bumps of track 3 have to be extracted. In order to determine them, the wheel speed sensor data was used, which was recorded synchronously with the EOG signals.

Table 4.1.: Means of moving standard deviations of H(n) for all rounds (R) and parts (P) of track 1 and
track 2, for all subjects (excluding subject S3)

Subject    Track 1: R1,P1   R1,P2   R2,P1   R2,P2    Track 2: R1    R2
1 7.3 8.7 9.5 9.2 10.8 8.9
2 6.3 7.4 6.5 6.6 9.8 9.9
4 5.5 5.8 6.8 5.3 7.0 6.4
5 6.3 7.9 6.8 5.2 7.3 7.4
6 6.2 6.3 5.8 6.3 7.9 7.2
7 5.7 7.6 5.5 6.6 14.1 11.7
8 8.3 11.2 7.4 8.5 14.3 12.0
Figure 4.5.: Boxplot of the values listed in Table 4.1 (mean of the moving standard deviation of H(n))

By calculating the exponentially weighted moving variance (ewvar) σ²(n) (Friedrichs and Yang, 2010a) of the wheel speed w(n) and applying a threshold (5 (rpm)²) to it, we distinguished between bumps and even sections of the track. σ²(n) is calculated as follows:

σ²(n) = λ_σ² σ²(n − 1) + (1 − λ_σ²)(w(n) − µ(n))²,   (4.3)
µ(n) = λ_µ µ(n − 1) + (1 − λ_µ) w(n),   (4.4)
λ_µ = (N_µ − 1)/N_µ,   λ_σ² = (N_σ² − 1)/N_σ²,   (4.5)

where µ(n) is the exponentially weighted moving average (ewma) of w(n). λ_σ² and λ_µ are the forgetting factors adjusted by the window sizes N_µ = N_σ² = 3 samples. The initial values µ(0) and σ²(1) were set to the average of w(n). Figure 4.6 shows the five detected large amplitude bumps with a height of about 8 cm based on σ²(n).
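
A direct transcription of Eqs. (4.3)–(4.5) into code, with the 5 (rpm)² threshold applied to flag bump samples; the initialization follows the text, and the function name is hypothetical:

```python
import numpy as np

def ewvar_bumps(w, N_mu=3, N_var=3, threshold=5.0):
    """Exponentially weighted moving variance of the wheel speed w(n),
    Eqs. (4.3)-(4.5); samples with sigma^2(n) > 5 (rpm)^2 are flagged."""
    w = np.asarray(w, dtype=float)
    lam_mu = (N_mu - 1) / N_mu                 # Eq. (4.5)
    lam_var = (N_var - 1) / N_var
    mu = np.empty_like(w)
    var = np.empty_like(w)
    mu[0] = w.mean()                           # mu(0) set to the average of w(n)
    var[0] = w.mean()                          # initial variance as stated in the text
    for n in range(1, len(w)):
        mu[n] = lam_mu * mu[n - 1] + (1 - lam_mu) * w[n]                      # Eq. (4.4)
        var[n] = lam_var * var[n - 1] + (1 - lam_var) * (w[n] - mu[n]) ** 2   # Eq. (4.3)
    return var, var > threshold
```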

Figure 4.6.: Detected large amplitude bumps based on the ewvar of the wheel speed sensor data

Figure 4.7(a) shows the detected time intervals of multiple single small amplitude bumps (with a height of about 5 cm) over V(n) for subject S2. Interestingly, as shown in this figure and valid for all subjects, we did not find any considerable distortion in the EOG signal. This can be justified by two reasons. One reason might be that in this experiment we used a vehicle with above-average damping and ride comfort, so that the disturbing effect of small amplitude road bumps was largely filtered out by the vehicle. Moreover, our body acts as a low pass filter (damper) which compensates for the excitation of such small and short-duration bumps. However, as Figure 4.7(b) shows, the influence of bumps with larger amplitudes and longer durations on the EOG is very similar to that of track 2, i.e. low-frequency components occur. This is due to the successive repetition of each bump category (5 times each).
Figure 4.7.: V(n) of subject S2 and S6 for (a) small and (b) large amplitude bumps of track 3, with the detected bump intervals marked

4.1.5. Patterns of eye movements due to road curvature

During the investigation of the EOG signals in the curves of track 4, we observed a peculiar pattern in the EOG signal (see Figure 4.8) which is similar to the sawtooth pattern of the okn described in Section 3.3. Figure 4.8 shows this pattern for a right and a left curve of track 4 (top and bottom plots, respectively). These two curves are also labeled in Figure 4.1. According to Figure 4.8, it can be concluded that the direction of the sawtooth pattern is related to the curve direction. Moreover, the occurrence frequency of the sawtooth pattern and the amplitudes of the fast phases (saccades) in the right curve are smaller than those of the left curve. These results, which are valid for all subjects, might be related to different vehicle speeds, different positions of the subjects in the vehicle or the radii of the two curves. The vehicle speed was almost the same for all subjects on this track due to the adaptive cruise control. In addition, we assume that the differences between the positions of the subjects in the car are negligible. The radii of the curves, however, differ to a larger extent and are roughly 150 m and 50 m for the right and the left curve, respectively. We observed that the sawtooth pattern was more pronounced for curves with higher curvature.
Many research studies have been devoted to interpreting the pattern of eye movements during curve negotiation.

Figure 4.8.: H(n) of subject S8 for track 4, top/bottom: right/left curve, with the measured sawtooth period ∆tm marked

Jürgensohn et al. (1991) stated that, similar to the okn, during curve negotiation the subject fixates on a point in front of the car for a certain period (slow phase). As soon as the point is close to the car, the subject chooses another fixation point (fast phase). According to Land and Lee (1994), this moving point can be referred to as the tangent point (tp) of the curve, or a point very close to it. As stated by Kandil et al. (2010) and Authié and Mestre (2011), this point is the intersection of a tangent through the vehicle and the inner lane marking. They also believed that subjects rely heavily on the tp and suggested it as a useful source of information for correct steering during curve negotiation. According to Authié and Mestre (2011), the location of this point depends on the lateral position of the vehicle within the lane and on the vehicle's orientation. Considering all of our results and observations, the sawtooth pattern rarely occurred for curves with larger radii, meaning that the subject does not necessarily follow the tp or any other point while negotiating curves with lower curvature.
Since our experiment was not equipped with a head-mounted camera, it is not clear where the subjects were looking during curve negotiation and whether the okn pattern of Figure 4.8 is the result of reliance on the tp as the informative point. This can be clarified if the characteristics of the sawtooth pattern, e.g. the average time interval of its occurrence ∆t, can be described as a function of the curve radius r under the assumption that the tp has been observed continuously during curve negotiation. In Appendix A, this relationship is investigated analytically for the left curve shown in Figure 4.1 with r = 50 m. We denote the analytically calculated values of ∆t by ∆tc. Furthermore, the value ∆tm is available from our measured data as shown in Figure 4.8. By evaluating the calculated ∆tc against the measured ∆tm, we can show whether subjects relied on the region of the tp during curve negotiation.
According to Appendix A, ∆tc is a function of the curve radius r and the angular displacement of the eyes δ, i.e. ∆tc = f(r, δ). Therefore, for the calculation of ∆tc, the value of δ is needed. In order to find δ, we conducted an experiment in a stationary position, i.e. without driving, to investigate the relationship between δ and the saccade amplitudes of the okn pattern. There, the subject was asked to look at points located horizontally between ±45° with 5° spacing at the height of the eyes (without head movement). 0° was calibrated with respect to the subject's position while looking straight ahead. The results agree with the findings of Young and Sheena (1975) and Kumar and Poole (2002), who stated that the amplitude of saccades can be considered linear in the gaze angle up to ±30°. We measured that the saccades within the amplitude range of the sawteeth of Figure 4.8 (roughly 50 µV) were caused by eye movements between 5° and 10°, i.e. 5° ≤ δ ≤ 10°.
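
Such a stationary calibration reduces to a linear fit within the range where the saccade amplitude is linear in the gaze angle; the following sketch (hypothetical function name) estimates the µV-per-degree gain, from which a 50 µV sawtooth amplitude can be converted back to an angle:

```python
import numpy as np

def eog_gain_uv_per_deg(angles_deg, amplitudes_uv, linear_range_deg=30.0):
    """Least-squares gain [µV/deg] from a stationary calibration with
    targets at known horizontal angles. Only the approximately linear
    range up to +/-30 deg (Young and Sheena, 1975) is used."""
    a = np.asarray(angles_deg, dtype=float)
    v = np.asarray(amplitudes_uv, dtype=float)
    mask = np.abs(a) <= linear_range_deg
    return np.polyfit(a[mask], v[mask], 1)[0]   # slope of the linear fit

# A saccade of roughly 50 µV then corresponds to about 50/gain degrees.
```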

Given δ = 4°, 6°, 10° and 50 m ≤ r ≤ 150 m, the possible values of ∆tc are shown in Figure 4.9. As mentioned before, to evaluate ∆tc, we compared it with ∆tm. According to Figure 4.8, ∆tm ≈ 0.34 s, which agrees with the calculated values of ∆tc, namely 0.3 s < ∆tc < 0.4 s for r = 50 m and 4° ≤ δ ≤ 10°. This implies that the subjects indeed relied on the region of the tp during the negotiation of the curve with r = 50 m at v = 80 km/h in our experiment. Moreover, this result indicates that tracking the tp is a possible explanation for the existence of the okn in our data.

Figure 4.9.: Calculated sawtooth occurrence time period ∆tc versus curve radius r for δ = 4°, 6°, 10°

By analyzing the V(n) signal, we realized that the okn was not visible in the vertical eye movements of all subjects. For example, during the occurrence of the thoroughly apparent sawtooth pattern in H(n) of subject S8 (Figure 4.8), no vertical eye movements were observed in V(n). For subject S4, on the contrary, the vertical and horizontal components of the EOG revealed similar patterns.

It is clear that the observed, inevitable sawtooth pattern due to curve negotiation is not related to the driver's inattention or drowsiness. Therefore, we suggest excluding tortuous road sections from further investigations of driver eye movements. Whether the features of the okn depend on the curve parameters and the vehicle velocity is, however, out of the scope of this work and leaves room for future work.

Conclusion

We studied the relationship between driver eye movements and different real driving scenarios by means of EOG signals in a fully controlled experiment. In this pilot study, we explored whether and how driver eye movements are influenced by the road properties, independently of the driver's drowsiness or distraction. In addition, the usage of the EOG measurement system for in-vehicle applications was studied.
It can be concluded that ground excitation and large amplitude bumps add an extra pattern to the EOG signals, which was characterized as a low frequency component. On the other hand, the monitoring of driver eye movements seems to be undisturbed by single small amplitude bumps. Moreover, the sawtooth patterns of the okn during curve negotiation are not drowsiness-related. Considering all results of the pilot study, we conclude that EOG is a robust and reliable measurement system for the collection of eye movement data on real roads and highways. Therefore, a similar measurement system has been used in the other experiments explained in the following sections, because the parts of the highways selected for conducting the real road experiments were all free of ground excitation, large amplitude bumps and very high curvature.

4.2. Real road experiments

The real road experiments are the experiments conducted on real roads and cover the
collection of data sets related to the awake and drowsy driving in this work.

4.2.1. Daytime driving with no secondary tasks

Two daytime drives with no secondary tasks were conducted at different times: the first in May 2012 and the second in September 2013. Since both experimental setups and procedures were the same, except for some additional measurement instruments, we combined the collected data and refer to it as the daytime drive with no secondary tasks.

Subjects

In total, 18 voluntary subjects, 3 females and 15 males, with an average age of 41.1 ± 10.7 years (27–62 years), participated in the experiment; all were employees of Daimler AG. All subjects were additionally trained in driving the experiment vehicle with its numerous measurement systems. These subjects are labeled S26 to S43 in Chapters 7 and 8.

Material and experiment procedure

The same Mercedes-Benz E-Class used for the pilot study was also used for this experiment. It was equipped with an EOG and an ECG measurement system and a head tracker. For 9 subjects, a vital camera for measuring the heart and respiration rate was additionally installed in the car, and for the other 9 subjects a driver observation camera. Only the EOG data has been analyzed in this work. In addition, numerous vehicle-related measures such as the vehicle speed, the lateral distance to the lane markings, the steering wheel angle and the gps position were recorded at a sampling frequency of 50 Hz. The kss was collected every 15 min via a touchscreen, prompted automatically by a beep tone. After rating their drowsiness level, the subjects also had to answer a question about the acceptance of a drowsiness warning with either correct, acceptable or false. During the whole experiment, four ir-cameras were installed in the car to synchronously record videos of the driver's face, the vehicle interior, and the road ahead and behind. Therefore, an offline analysis of all driving sections was always possible.
The A81 highway route in Germany was selected for this experiment as shown in Figure 4.10. Since our goal was to collect EOG data related to the alert and less drowsy phases of a drive, the experiment was conducted at 9 a.m. or at 1 p.m. to have fitter subjects and low highway traffic. On average, each subject drove about 260 km unaccompanied. All subjects were asked not to perform any secondary tasks, such as listening to the radio, operating the navigation system or talking on the mobile phone, to obey all traffic rules, and not to drive faster than 130 km/h. They were allowed to use the adaptive cruise control.

Figure 4.10.: Daytime drive experiment's route (ViaMichelin, 2014), about 130 km

4.2.2. Daytime driving with secondary tasks

This experiment was initially designed by Sonnleitner et al. (2011) for studying the variation of the asr during visuomotor and auditory secondary tasks under real driving conditions. Since the EOG data was collected as well, we use it in this work for studying gaze-shift induced blinks, as will be discussed in Chapter 6. The following explanations are derived from Sonnleitner et al. (2011) and Ebrahim et al. (2013c).

Subjects

A total of 26 voluntary employees of Daimler AG, 7 females and 19 males, with an average age of 43.7 ± 8.7 years (25–56 years), participated in the experiment. All subjects were additionally trained in driving the experiment vehicle. The participants of this experiment were partially different from those of the previous experiment. These subjects are labeled S1 to S26 in Chapter 6.

Vehicles

Two Mercedes-Benz S-Class vehicles (W221) and one E-Class vehicle (W212) were used in this experiment, equipped with an extra brake and gas pedal at the passenger seat similar to driving school vehicles. This was done for safety reasons during the performance of the secondary tasks. All vehicles were also equipped with EEG (16-electrode cap) and EOG measurement systems similar to the previous experiment. ir-cameras were installed as well, and vehicle-related measures were collected as explained before.

Experiment procedure

During the experiment, the subjects performed the primary driving task together with four blocks of secondary tasks lasting 40 min on the same highway route as in the previous experiment (see Figure 4.10). The secondary tasks comprised both visuomotor tasks (representative of navigation system demands) and auditory tasks (comparable to a mobile phone conversation). All subjects were instructed to always prioritize the primary task and to drive according to the official traffic regulations. The maximum speed allowed was 130 km/h. During the tasks, no overtaking maneuvers were allowed for safety reasons. Moreover, a trained investigator accompanied the subjects during the experiment to intervene in case of safety-critical situations using the extra pedals.
Each block contained 3 min of the visuomotor task, 1.5 min of driving with no secondary task, 3 min of the auditory task and finally 1.5 min of driving with no secondary task, as shown in Figure 4.11. Start and end markers of each block were recorded automatically. This means that the time gaps before the beginning and after the end of the blocks were discarded.

Figure 4.11.: One block of the daytime real road experiment with secondary tasks: 3 min visuomotor task, 1.5 min driving, 3 min auditory task, 1.5 min driving

Visuomotor secondary task

For the visuomotor task, a 2 × 2 matrix of four Landolt rings was shown on a display located at the central console to the right of the navigation system (see Figure 4.12). The subject had to determine which side of the screen (right or left) contained the ring with a different opening direction by pushing one of two adjacent buttons on an external number keypad (4: left, 6: right) located within the driver's reach on the lower central console. In this work, the number of correctly identified rings is not evaluated, because only the gaze shifts between the road and the screen are of interest, as will be discussed in Chapter 6.

Auditory secondary task

During this task, the subjects listened to an audio book and had to detect the German definite article "die" by pressing a button fitted to their left index finger. At the end of each block, the subjects answered a question about the content of the presented audio book. Again, the answers are not evaluated here.
Figure 4.12 shows the experimental setup of the mentioned secondary tasks.

4.2.3. Nighttime driving with no secondary tasks

This experiment was conducted in March 2010 to collect drowsiness-related data under real driving conditions for studying EEG-based features. Since EOG electrodes were used as well, the experiment is studied in this work.

Figure 4.12.: In-vehicle setup of the daytime experiment with secondary tasks: separate display, number keypad and response button (taken from Sonnleitner et al. (2011))

Subjects

In total, 46 voluntary subjects, all employees of Daimler AG, participated in this experiment. The data of 16 subjects had to be removed due to technical problems in collecting the EEG or other sensor data. Of the 30 remaining subjects, 14 aborted the experiment due to severe drowsiness, which can be considered an objective measure reflecting a deep level of drowsiness. Only the data of these 14 subjects is of interest in this work. Due to quality problems of the EOG signals, however, in the end only the data of 10 of these 14 subjects (1 female and 9 males), with an average age of 35.9 ± 10.1 years (24–57 years), is studied. These subjects did not participate in the other experiments of this work.

Material and experiment procedure

The vehicles used in this experiment were all equipped similarly to the previous experiments. The selected route, shown in Figure 4.13, led from Stuttgart to Würzburg via Ulm and directly back to Stuttgart (not on the same road as before).
The driving task started around 10 p.m. The Mercedes-Benz E-Class vehicle used in the previous experiment, together with a similarly equipped S-Class vehicle, was used for collecting the data. In addition, the secondary pedals were installed in the vehicles, because all subjects were accompanied by an investigator who was responsible for intervening and controlling the vehicle in case of safety-critical events. All subjects were told to abort the experiment if they felt drowsy. Every 15 min, kss data was collected as well. As a rule, as soon as a subject estimated his drowsiness level at kss = 9, or twice successively at kss = 8, the experiment was aborted. On average, a subject aborted the experiment after driving 244 km. Driving regulations similar to those explained in Section 4.2.1 applied in this experiment.

Figure 4.13.: Nighttime drive experiment’s route (ViaMichelin, 2014), about 450 km

4.3. Nighttime driving experiment in the driving simulator

Since severe drowsiness phases and the occurrence of microsleeps cannot be induced in real road experiments due to safety concerns, drowsy data was collected in the Mercedes-Benz moving-base driving simulator. With its 360° projection screen, it is to a large extent comparable to real driving (Zeeb, 2010). This experiment was conducted to collect the eye movements most relevant to drowsiness.

Subjects

25 employees of Daimler AG, 11 females and 14 males, with an average age of 33.9 ± 8.0 years (25–56 years), drove at night, starting either at 6 p.m. or 10 p.m. after a usual working day. No driving simulator sickness was reported. These subjects are labeled S1 to S25 in Chapters 7 and 8 and did not participate in the other experiments of this work1.

Material and experiment procedure

An S-Class Mercedes-Benz cabin and a highly monotonous, low-traffic highway driving scenario at night with two lanes were selected for this experiment. In addition to the EOG and ECG data, the kss and the acceptance of warnings were also collected every 15 min, prompted by a dong tone. In contrast to the real road experiments, the subjective self-rating of the drowsiness level was collected verbally. Similar to the other experiments, an ir-camera was installed in the car for recording the subject's face during driving. 14 subjects were additionally equipped with a head tracker and a head-mounted eye tracker. These subjects also performed a speech test right after alternate kss events: they were asked to repeat some sentences for about 4 min. These parts of the experiment, which were collected for other purposes, were removed from further analysis, since talking leads to noisy EOG data.
1 The subjects of the daytime driving with secondary tasks in Section 4.2.2, who are also labeled S1 to S26, were not the same as those who participated in the driving simulator experiment.
The very first minutes of the experiment were intended for getting familiar with the
simulator and accommodation of the eyes. Unlike the daytime experiments, where the
subject should have driven the whole route, in this experiment the subjects were asked to
drive as long as they could and even give effort to fight against drowsiness, if it was possible.
The length of the circular route was 200 km and it was repeated after completing one round
of it. On average, each subject drove 335 km with the maximum speed of 130 km/h. Two
construction sites were also included through the 200 km route at 62nd and 88th km to make
the driving scenario more realistic. In addition, some takeover maneuvers were also
designed. The subjects were allowed to activate the adaptive cruise control. They were asked
not to talk to the investigators in the control room who were responsible for observing the
subjects during the experiment and for documenting the experiment. In general, subjects
aborted the experiment due to sever drowsiness, either by themselves or suggested by the
investigators. We emphasize that the subjects were not necessarily drowsy during the entire
drive in this experiment.
Table 4.2 summarizes all conducted experiments.
Table 4.2.: Summary of experiments studied in this work
                            pilot     daytime,           daytime,        nighttime,       nighttime,
                            study     no sec. tasks      sec. tasks      no sec. tasks    no sec. tasks
real or simulated driving   real      real               real            real             simulated
number of subjects          8         18                 26              10               25
driven distance             –         260 km             260 km          244 km           355 km
driving duration [hh:mm]    –         2:30               2:30            2:10             2:40
starting time               –         9 a.m. or 1 p.m.   9 a.m.          10 p.m.          6 p.m. or 10 p.m.
EOG data collection         yes       yes                yes             yes              yes
KSS data collection         no        yes                no              yes              yes
accompanied by
an investigator?            yes       no                 yes             yes              no
5. Eye movement event detection methods

This chapter discusses eye movement detection methods. First, it is necessary to clarify why the precise detection of eye movements provides the foundation for successful driver drowsiness detection. If eye movement events are not detected properly, the features extracted from them obviously lack useful information for further analysis steps such as classification. Improper event detection means a high rate of missed events or false detections. Under these circumstances, the relationship between the features and driver drowsiness cannot be determined correctly. As a result, the detection of eye movements must be done with care and high precision, since it directly affects the results of the subsequent analysis steps.
It should also be mentioned that the correct detection of an eye movement event, e.g. a blink, includes the precise detection of its start and end points. Therefore, if a blink is only partly detected, it cannot be counted as an acceptably detected event.
In this chapter, first, a well-known method of blink detection based on median filtering is explained. After discussing the shortcomings of this method, our detection algorithms based on the derivative signal and the continuous wavelet transform are explained as alternatives to the median filter-based method. Moreover, it will be shown how our new approaches complement each other in terms of the detection of different eye movements: the proposed derivative-based detection algorithm is responsible for the detection of rapid eye movements, while the suggested wavelet transform-based algorithm detects slow eye movements. At the end of this chapter, all studied event detection methods are discussed and compared with each other.
Parts of the material in this chapter are drawn from Ebrahim et al. (2013a).

5.1. Eye movement detection using the median filter-based method

Some examples of blinks during the awake and drowsy phases were shown in Figure 3.4.
During the awake phase, in which a person does not suffer from sleep deprivation, blinks
often follow similar characteristics, i.e. their amplitude and duration do not change
remarkably. Some examples of such blinks are shown in the top plot of Figure 5.1. This
implies that blink detection can easily be performed by applying some fixed criteria, e.g. by
comparing the V (n) signal with a fixed threshold. However, as mentioned before, drift,
which is not related to any type of eye movements, is inevitable in the EOG signal (see top
plot of Figure 5.1). Therefore, a fixed threshold does not lead to the correct blink detection.
A conventional method for eliminating the drift in the EOG signal and improving the blink
detection is to apply a median filter to V (n) and then subtract the result from the original V
(n) signal. This method has been applied in several studies for blink detection, such as in
Hargutt (2003), Martínez et al. (2008), Krupiński and Mazurek (2010) and Huang et al. (2012).
As a result, we have
V̂(n) = V (n) − Vmed(n) ,    (5.1)
Figure 5.1.: Drift removal by applying a median filter to V (n) to improve blink detection. Top: awake phase, bottom: drowsy phase

where Vmed(n) refers to the median-filtered V (n) with an empirically chosen window size of wmed = 42 (f /50 Hz) + 1 samples, i.e. 43 samples at the sampling frequency f = 50 Hz used in this study. At sample n, Vmed(n) represents the median of V (n) calculated within the interval [n − wmed + 1, n]. The top plot of Figure 5.1 shows the result of applying this method. It is clear that due to the eliminated drift in V̂(n), the blinks can now be easily detected by applying an amplitude threshold like thamp = 100 µV.
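For illustration, a minimal Python sketch of (5.1) is given below. The function name remove_drift_median and the use of scipy.signal.medfilt are our assumptions for illustration; note that medfilt uses a centered window, whereas the interval above is causal, which for a sketch only shifts the drift estimate slightly.

```python
import numpy as np
from scipy.signal import medfilt

def remove_drift_median(v, w_med=43):
    """Sketch of (5.1): estimate the drift with a median filter and subtract it.

    v     : vertical EOG signal V(n) in microvolts (1-D array)
    w_med : odd window size in samples (43 samples at f = 50 Hz here)
    """
    v = np.asarray(v, dtype=float)
    v_med = medfilt(v, kernel_size=w_med)  # slowly varying drift estimate
    return v - v_med                       # V_hat(n) = V(n) - V_med(n)

# awake-phase blinks can then be found with a fixed amplitude threshold:
# blink_samples = remove_drift_median(v) > 100.0   # th_amp = 100 uV
```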
The bottom plot of Figure 5.1 shows the V (n) signal of the same subject during the drowsy phase. The subject's level of drowsiness was assessed based on the video analysis and the collected KSS values. As shown in this plot and Figure 3.4, blinks in the drowsy phase differ widely in amplitude and duration from those of the awake phase. The same median filter has been applied to process V (n) of the drowsy phase as well. In this case, not all blinks were detected correctly in the V̂(n) signal. Half of the fourth blink, for example, has been removed by the median filter, as magnified in Figure 5.2. In addition to the amplitude, the duration of this blink differs between V (n) and V̂(n).
Figure 5.2.: Information loss of slow blinks by the median filter method

Moreover, the blinks with longer durations (3rd, 5th, 9th and 10th events in Figure 5.1) almost disappeared after median filtering. Setting a small value for thamp does not help either, as saccades or noise might then be incorrectly detected as blinks. The problem is that the efficiency of the median filter-based method depends highly on the chosen wmed: the better wmed matches the blink duration, the less blink information is lost in V̂(n). Since the blink duration varies not only inter-individually, but also for an individual according to the level of drowsiness, applying a median filter with a fixed window size would not lead to successful blink detection for the drowsy phases. A further deficiency of this method is that saccade detection becomes

impossible, as all saccades shown in Figure 5.1 have been filtered out as drift. Therefore, a median filter is not suitable for our application: here, all fast eye movements (blinks and saccades) are of interest, and a median filter removes both the slowly varying drift and some blinks and saccades.

5.2. Eye movement detection using the derivative-based method

In the previous section, we saw that the median filter-based method is a powerful method for detecting the sharp and short blinks of the awake phase. However, during the drowsy phase, most of the events are missed or detected with low precision. To overcome this problem, this section describes a method which exploits the derivative of the EOG signal for blink detection. Previous studies such as Jammes et al. (2008), Hu and Zheng (2009) and Wei and Lu (2012) also used the derivative signal for detecting blinks. However, their proposed algorithms had some weaknesses. Jammes et al. (2008) mentioned that their detection algorithm is unable to distinguish between longer eye closures and vertical saccades occurring while looking at the dashboard, because they look very similar to each other. Distinguishing between these eye movements has been mentioned as an important issue for driver drowsiness detection in other studies as well (Svensson, 2004). In addition, Jammes et al. (2008) showed that their algorithm did not detect slow blinks. Wei and Lu (2012) also applied the blink detection method suggested by Jammes et al. (2008) and additionally used frequency-based methods (Fourier and wavelet transform) to extract the number of slow eye movements in the horizontal EOG signal.
A clear disadvantage of the mentioned studies is that they only concentrated on the detection of blinks in V (n). However, as mentioned in Sections 3.1 and 3.3, observing the scene ahead by saccadic eye movements is essential for safe driving. Unfortunately, distinguishing between saccades and blinks (either fast blinks or microsleep events) in V (n) of EOG signals has not been addressed in previous derivative-based algorithms. Here, we introduce a novel derivative-based approach which takes saccade detection into consideration as well, such that blinks and saccades are detected simultaneously as fast eye movements and are distinguished from each other. This property of our algorithm tackles the existing problem of the conventional derivative-based blink detection algorithms.
The derivative of the V (n) signal, V ′(n), calculated by the Savitzky-Golay filter (Savitzky and Golay, 1964) with polynomial order 1 and frame size 7, is shown in Figure 5.3. In the Savitzky-Golay filter approach, a polynomial of degree 1 is fitted to 7 samples of V (n) successively in the least-squares sense. V ′(n) at the midpoint of the 7 samples is obtained by performing the differentiation on the fitted polynomial rather than on the original V (n). The mentioned filter parameters are only used for the detection of blink events; for the sake of accuracy, another parameter set is used for the extraction of blink features.
In the following, our blink detection algorithm based on V ′(n) is explained.
1. Detecting potential blinks
According to Figure 5.3, potential blink events can be detected by setting an amplitude threshold thvel on the blink velocity V ′(n) to consider all of its peaks, namely

|V ′(n)| > thvel .    (5.2)

Afterwards, around all accepted peaks, three successive sign changes are searched for. These points define the start, middle and end points of a blink as a, b and c, respectively, as shown in Figure 5.3.
Figure 5.3.: V (n) and its derivative V ′(n) representing eye blinks during the awake phase
We have considered the first point after a sign change. Negative-to-positive transitions of V ′(n) are defined by a and c, while positive-to-negative transitions by b. These three respective points of V (n) are also marked as A, B and C in Figure 5.3. The a to b (A to B) and b to c (B to C) transitions describe the closing and opening of the eyes during a blink event. At the end of this step, all potential blinks are detected.
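To make Step 1 concrete, a much-simplified Python sketch is given below; it only handles positive (closing-phase) velocity peaks, and the threshold value th_vel as well as the function name detect_potential_blinks are illustrative assumptions, not the thesis' implementation.

```python
import numpy as np
from scipy.signal import savgol_filter

def detect_potential_blinks(v, fs=50, th_vel=300.0):
    """Simplified sketch of Step 1 for positive (closing-phase) peaks.

    th_vel in uV/s is an illustrative placeholder value.
    Returns (a, b, c) sample indices: start, middle and end of a blink.
    """
    # derivative via Savitzky-Golay: polynomial order 1, frame size 7
    v_prime = savgol_filter(np.asarray(v, float), window_length=7,
                            polyorder=1, deriv=1, delta=1.0 / fs)

    # first samples after a sign change of V'(n)
    changes = np.flatnonzero(np.diff(np.sign(v_prime)) != 0) + 1

    events = set()
    for p in np.flatnonzero(v_prime > th_vel):   # closing-phase peaks
        before = changes[changes <= p]           # candidate a
        after = changes[changes > p]             # candidates b and c
        if before.size >= 1 and after.size >= 2:
            events.add((before[-1], after[0], after[1]))
    return sorted(events)
```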
2. Calculating the blink amplitude of potential blinks
After detecting all potential blinks, the blink amplitude is extracted. This comprises both the closing and opening amplitudes, namely B − A and B − C. The difference between these amplitudes is negligible for normal blinks. Usually, the amplitudes measured at the beginning (A) and at the end (C) of a blink remain unchanged, as for the first blink in Figure 5.3. For saccadic blinks, however, this difference is non-zero and is equivalent to the amplitude of the saccade time-locked to the blink. Therefore, in order not to include the amplitude of the saccade in the blink amplitude, the blink amplitude of the i-th blink is defined as

ampi = min(Bi − Ai , Bi − Ci) .    (5.3)

Wei and Lu (2012), however, used ((Bi − Ai) + (Bi − Ci))/2 as the amplitude of the blink. Their definition ignores the difference between the amplitudes of saccadic and non-saccadic blinks.
3. Categorizing potential blinks with respect to their amplitude
Now, the question is whether all detected patterns are true eye blinks. In order to assess this, the histogram of the amplitude in (5.3) was analyzed, as shown in Figure 5.4 for 11 subjects. These subjects participated in the driving simulator experiment (see Section 4.3). The histograms are normalized with respect to the maximum number of occurrences for each subject separately. According to Figure 5.4, two amplitude clusters are distinguishable for almost all subjects. The question is what these clusters refer to. The cluster with the smaller amplitude describes the vertical saccades and microsleep events. This means that although the focus of the detection was on blinks, other eye movements have been detected as well. The reason is that the chosen thvel was small enough to capture other fast eye movements besides blinks. Figure 5.5 and the highlighted area in Figure 5.3 show saccades detected by the explained detection algorithm.
Figure 5.4.: Normalized histogram of all detected potential blinks and their clustering thresholds by the k-means clustering method for 11 subjects

Figure 5.5.: Simultaneous detection of saccades by the eye blink detection algorithm

As explained in Section 3.3, saccades and blinks with long eye closures have similar shape

and form (see Figures 3.4(b) and 3.5). Therefore, analogous to saccades, the opening and closing stages of such blinks are detected by the algorithm and assigned to the group with smaller amplitudes in Figure 5.4. After identifying the different clusters in Figure 5.4 (blinks versus saccades/microsleeps), a clustering method such as k-means clustering (see Appendix C) is required to find the exact border between them. At first sight, applying a 2-class clustering seems to be sufficient. However, in addition to saccades and microsleep events, the data includes blinks from both the awake and drowsy phases. In fact, three clusters are present: 1) saccades and microsleeps (similar to Figure 3.4(b)), 2) blinks during the drowsy phase or with longer eye closure and smaller amplitude due to drowsiness (similar to Figure 3.4(c)) and 3) blinks during the awake phase or with short eye closure (similar to Figure 3.4(a)). Therefore, applying a 3-class clustering algorithm is recommended. The thresholds of both 3-means, th3-means, and 2-means, th2-means, are shown as dashed and solid lines, respectively, in Figure 5.4. th3-means on the left side refers to the threshold between saccades/microsleeps (cluster 1) and blinks of the drowsy phase (cluster 2). The threshold on the right side discriminates between the second and the third clusters. Obviously, by

applying the 2-class clustering, many blinks with decreased amplitude due to drowsiness would be incorrectly clustered as saccades, or vice versa, as for subjects S18 and S22. Finally, all events fulfilling both B − A > th3-means,left and B − C > th3-means,left are accepted as blinks. It is noticeable that the thresholds differ from person to person, which implies that no fixed threshold should be applied for distinguishing between the clusters.
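A short Python sketch of the per-subject 3-means clustering of Step 3 might look as follows. Taking the midpoint between adjacent cluster centers as the border is one simple convention assumed here for illustration; the thesis does not prescribe it.

```python
import numpy as np
from sklearn.cluster import KMeans

def blink_amplitude_thresholds(amp):
    """Step 3 sketch: split blink amplitudes (5.3) into three clusters and
    return the left and right 3-means borders."""
    amp = np.asarray(amp, dtype=float).reshape(-1, 1)
    centers = np.sort(KMeans(n_clusters=3, n_init=10,
                             random_state=0).fit(amp).cluster_centers_.ravel())
    th_left = (centers[0] + centers[1]) / 2   # saccades/microsleeps vs drowsy blinks
    th_right = (centers[1] + centers[2]) / 2  # drowsy blinks vs awake blinks
    return th_left, th_right
```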
4. Distinguishing between vertical saccades and blinks with longer eye closure
The goal of this step is to distinguish between saccades and the other eye movements which were all assigned to a common group in the previous step. There, only the minimum of the amplitudes B − A and B − C was of interest, while now the actual amplitude of these events is studied. The amplitude of the i-th eye movement of this group, ampem,i, is defined as

ampem,i = |Ci − Ai| .    (5.4)

In fact, the relative variation of the amplitude is considered, which overcomes the overshoots in saccadic eye movements (see Figures 3.5(c) and 5.3). Similar to the previous step, the histograms of the amplitudes calculated by (5.4) are analyzed, as shown in Figure 5.6 for subjects S15, S19 and S22.
Figure 5.6.: Normalized histogram of all detected potential saccades and blinks with long eye closure. Their clustering thresholds (2-class clustering or 95th percentile) are also shown.

For subject S15, two classes are again distinguishable. The group with smaller amplitudes refers to vertical saccades, while the other group describes blinks with long eye closure. Due to the smaller vertical than horizontal range of human eye movements, the amplitude of vertical saccades is limited. As a result, vertical saccades with large amplitudes comparable to the microsleep events shown in Figure 3.4(b) do not occur during common driving tasks. On the other hand, it is clear that if microsleep events similar to the saccadic pattern shown in Figure 3.4(b) do not occur often, the clusters will not be as distinguishable as for subject S15. This is the case for subject S22, while for subject S19, the number of such events was so small that only one cluster can be distinguished. Therefore, for histograms similar to those of subjects S15 and S22, the k-means (k = 2) algorithm is applied, and for histograms without distinct clusters, the 95th percentile is used as the border between vertical saccades and microsleep events. The thresholds are shown in Figure 5.6.
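The following compact sketch illustrates this fallback rule in Python. The boolean flag bimodal is assumed to be decided beforehand (e.g. by inspecting the histogram), since the text above does not give an automatic criterion; the function name is likewise illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def saccade_microsleep_border(amp_em, bimodal):
    """Step 4 sketch: threshold on the |C - A| amplitudes of (5.4)."""
    amp_em = np.asarray(amp_em, dtype=float)
    if bimodal:  # two distinct clusters, as for subjects S15 and S22
        centers = np.sort(KMeans(n_clusters=2, n_init=10, random_state=0)
                          .fit(amp_em.reshape(-1, 1)).cluster_centers_.ravel())
        return centers.mean()              # midpoint between the two centers
    return np.percentile(amp_em, 95)       # single cluster, as for subject S19
```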
5. Plausibility check of detected fast eye movements
Since EOG signals are very sensitive to any muscle artifact around the electrodes, it is possible that some artifacts are confused with eye movements. A possible method for overcoming this problem is to ascertain which eye movements are related to each other and

to exclude unrelated ones. This is reasonable during driving, as it is assumed that the driver looks straight ahead most of the time. Therefore, a main looking direction can be defined. As a result, for every saccade representing looking away from the main looking direction, another saccade in the opposite direction should be present, as shown in Figure 3.5. All detected saccades which do not fulfill this criterion are discarded. For saccades occurring as saccadic blinks, a threshold ths is additionally required to avoid confusing them with non-saccadic (normal) blinks. In fact, the value of |A − C| in Figure 5.3 is crucial. Based on a similar argument, microsleep events can be checked as well, because eye closures should be followed by an eye opening during driving. It is obviously assumed that none of the subjects falls fully asleep before the end of the experiment. Finally, all detected eye movements which are not assigned to adjacent movements are considered as false detections and are removed from the data set of detected events. In order to find ths, the histogram of the |A − C| amplitude of the blinks occurring both before and after detected saccades is analyzed. The reason is that if a saccade is adjacent to a saccadic blink, four different combinations of them may occur:
• up-going saccade, down-going saccadic blink
• down-going saccade, up-going saccadic blink
• down-going saccadic blink, up-going saccade
• up-going saccadic blink, down-going saccade.
These pairs are shown in the first four plots from the left of Figure 5.7. ths is found by applying the k-means algorithm (k = 2) to the histogram of the |A − C| amplitude of these blinks to distinguish between two clusters: saccadic versus non-saccadic blinks. After finding the threshold, all blinks fulfilling |A − C| > ths are candidates to be considered as a pair together with a detected saccade. Moreover, it is also possible that two saccadic blinks are joined together, as shown in the first two plots from the right of Figure 5.7. Finally, only the saccades of pairs similar to those shown in Figure 5.7 are considered as correct eye movement detections.
Figure 5.7.: Possible combinations of two vertical saccades in V (n)
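The pairing idea of Step 5 can be sketched in Python as below. The time-sorted (time, signed amplitude) representation of the saccades and the maximum gap max_gap_s are assumptions for illustration; the thesis does not specify these details.

```python
import numpy as np

def plausibility_check(saccades, max_gap_s=5.0):
    """Step 5 sketch: keep only saccades with an opposite-direction partner,
    assuming the driver returns to a main looking direction.

    saccades : time-sorted list of (time_s, amplitude), amplitude = C - A
    """
    accepted = set()
    for i, (t_i, amp_i) in enumerate(saccades):
        for t_j, amp_j in saccades[i + 1:]:
            if t_j - t_i > max_gap_s:        # partner must occur soon after
                break
            if np.sign(amp_i) != np.sign(amp_j):  # away and back again
                accepted.add((t_i, amp_i))
                accepted.add((t_j, amp_j))
                break
    return sorted(accepted)
```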

6. Horizontal saccade detection
Similar to the vertical saccades, horizontal saccades are detected by comparing |H′(n)| with a threshold, as explained in Step 1. As blinks are not present in H(n), the detected patterns are either saccades or artifacts. Therefore, only the saccades which pass the mentioned plausibility check are considered as horizontal saccades in the end.
Figure 5.8 summarizes the described algorithm for blink detection.
A clear drawback of the suggested algorithm is that slow blinks might not be detected correctly. If both the closing and the opening phase of a blink are very slow, the derivative-based method definitely misses both of them. If thvel in (5.2) is selected to be very low in order to detect these slower blinks, noise and many non-relevant eye movements will be detected as well, and removing them would be very cumbersome.

Figure 5.8.: Flow chart of the derivative-based method for blink detection

For blinks which are fast in one phase and slow in the other, either the opening or the closing phase is detectable by our proposed algorithm, but not both. We call such events incomplete events. The problem is that, according to Step 5 of the explained detection algorithm, incomplete events do not pass the plausibility check and are consequently discarded. We observed that it is the closing phase which slows down due to drowsiness in comparison to the opening phase. A solution for detecting incomplete events as well is to adapt the threshold applied to V ′(n) in (5.2) for event detection. This approach was also applied in some cases in this study: as soon as an incomplete event is detected, the threshold in (5.2) is reduced in order to find its missing counterpart, namely its opening or closing phase. However, for blinks with a slow velocity in both the closing and the opening phase, another approach is needed, which is explained in the next section.
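The adaptive-threshold idea for incomplete events might be sketched as follows; the reduction factor, the minimum threshold and the 1 s search window are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def find_missing_phase(v_prime, idx, th_vel, fs=50, factor=0.5, th_min=50.0):
    """Sketch: after a lone closing (or opening) phase ending at sample idx,
    lower the threshold of (5.2) stepwise until the counterpart appears."""
    search = np.abs(v_prime[idx + 1:idx + fs])  # look up to 1 s after the phase
    th = th_vel
    while th > th_min:
        th *= factor                            # relax the threshold of (5.2)
        hits = np.flatnonzero(search > th)
        if hits.size:                           # e.g. a slow opening phase found
            return idx + 1 + int(hits[0])
    return None                                 # no pair found; event discarded
```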

5.3. Eye movement detection using the wavelet transform-based method

This section studies another method of eye movement detection, which is based on the wavelet transform (wt). The wavelet transform has been applied for the detection of fast blinks and saccades in brain-computer interface and activity recognition applications (Reddy et al., 2010; Bulling et al., 2011; Barea et al., 2012). Obviously, the eye movements relevant to such applications are fully controlled and differ from the spontaneous eye movements relevant to driver drowsiness. Barea et al. (2012) only targeted the detection of saccadic eye movements without discussing blink detection. In the studies by Reddy et al. (2010) and Bulling et al. (2011), only the detection of fast blinks was addressed. Therefore, their proposed algorithms are inapplicable to the detection of slow blinks due to drowsiness. The analysis of slower movements has been investigated in Magosso et al. (2006) and Wei and Lu (2012). The former concentrated on very slow eye movements which occur while the eyes are completely closed for a long time. Such eye movements are interesting for sleep disorder-related research studies and are outside the scope of driver drowsiness detection. The method suggested by Wei and Lu (2012), however, was used for extracting a feature based on slow horizontal eye movements. It did not target the detection of slow blinks in V (n).

Unfortunately, the wavelet transform-based algorithms introduced in previous studies are not applicable to driver drowsiness detection, since they only aimed to detect sharp blinks similar to those occurring during the awake phase. In this section, however, after highlighting the advantage of the wavelet transform over the Fourier transform, we will propose two new algorithms. The first algorithm targets the detection of both slow and fast blinks by applying the continuous wavelet transform in Section 5.3.2. On the one hand, this algorithm can be considered as a supplement to the algorithm explained in Section 5.2, additionally covering the detection of slow blinks. On the other hand, it can also be applied independently to detect all fast and slow blinks and saccades. Therefore, events are never incomplete if this method is applied. In fact, at the end of this section, all of the eye movements relevant for driver drowsiness detection are covered. The second algorithm is a preprocessing step which benefits from the properties of the discrete wavelet transform. First, it will be shown in Section 5.3.3 how this transform can be used to remove noise and drift in the EOG signals. Then, the second new algorithm adaptively removes noise in the collected data in order to avoid information loss. We applied both the noise and the drift removal approaches based on the wavelet transform to all EOG signals in this work before applying the eye movement detection algorithms. This helped to improve the performance of the event detection.

The background theories in this section are taken from Burrus et al. (1998), Niemann (2003),
Keller (2004), Mallat (2009), Poularikas (2009) and Soman et al. (2010).

5.3.1. Discrete Fourier transform

Fast and slow eye movements can be characterized based on their frequency components. Therefore, frequency analysis of the EOG signal is another approach for detecting different types of eye movements.
A discrete time signal x(n) is analyzed in the frequency domain by applying the discrete Fourier transform (dft) F{·} to it, defined as

F{x(n)} ≡ X(Ω) = ∑_{n=0}^{N−1} x(n) e^{−iΩn} ,    (5.5)

where Ω = 2πk/N with k = 0, 1, 2, · · · , N − 1 and N the number of time samples.

Since the dft does not provide time localization information, the short time Fourier transform (stft) was introduced as a solution to this problem. Unlike the dft, which considers the whole signal from the first to the last sample at once, the stft first multiplies the signal by a window function, which is non-zero only for a short time, and then calculates the dft. Therefore, the resulting dft only represents the frequency components of the windowed part of the original signal and consequently provides time localization information. As an example, a cosine signal x(n) with a frequency of 8 Hz (low frequency component) and one discontinuity from t = 1.5 s to 1.502 s (high frequency component) is analyzed as shown in Figure 5.9 (top left plot). The sampling frequency is 1000 Hz. The stft was applied with a Hamming window as the window function, considering different window lengths Lwin, namely Lwin = 0.02 s, 0.2 s and 2 s. It is clear that, depending on the Lwin value, the stft provides different frequency component information. The high frequency component of x(n) is most apparent, with the best time localization, for the shortest Lwin, e.g. Lwin = 0.02 s (top right plot). In this case, however, the 8 Hz frequency of the cosine as the low frequency component is smeared up to 100 Hz. By choosing a longer window, e.g. Lwin = 2 s, the frequency resolution for the cosine wave improves at the cost of less time localization for the discontinuity (bottom right plot). For Lwin = 2 s, the high frequency component is completely lost in the spectrogram.

All in all, although the stft is a good solution for adding time localization information to the dft, its efficiency in terms of eye movement detection depends to a large extent on the chosen window length. This becomes even more pronounced if non-periodic and stochastic signals like EOG signals are being analyzed.
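The resolution trade-off can be reproduced with a few lines of Python; the exact construction of the test signal (a 2 ms step as the discontinuity) is an assumption made for illustration.

```python
import numpy as np
from scipy.signal import stft

fs = 1000                                   # sampling frequency in Hz
t = np.arange(0, 3, 1 / fs)
x = np.cos(2 * np.pi * 8 * t)               # 8 Hz cosine (low frequency)
x[(t >= 1.5) & (t < 1.502)] += 1.0          # 2 ms discontinuity (high frequency)

# the same signal analyzed with three window lengths, as in Figure 5.9
for L_win in (0.02, 0.2, 2.0):
    f, tt, Zxx = stft(x, fs=fs, window='hamming', nperseg=int(L_win * fs))
    print(f"L_win = {L_win:5.2f} s -> frequency resolution {f[1]:6.2f} Hz, "
          f"time step {tt[1]:5.3f} s")
```

Shorter windows give a fine time grid but coarse frequency bins, and vice versa, which is exactly the dilemma described above.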

5.3.2. Continuous wavelet transform

Another possibility for providing time localization information in the frequency domain is the wavelet transform. Similar to the dft, which is based on a trigonometric function, this transform also uses special functions, called wavelets. According to Young (1993), a small wave ψ(t), which is oscillatory (zero average) and decays quickly, with the characteristic

cψ = ∫_{−∞}^{+∞} (|Ψ(Ω)|² / |Ω|) dΩ < ∞    (5.6)

is called a wavelet, where Ψ(Ω) is the Fourier transform of ψ(t). (5.6) is an important condition for the existence of the inverse wavelet transform (Niemann, 2003). ψ(t) is also referred to as the mother wavelet. Additionally, each mother wavelet has λ vanishing moments which fulfill the condition

∫_{−∞}^{+∞} t^k ψ(t) dt = 0 ,    (5.7)

Figure 5.9.: The impact of the window length Lwin on the efficiency of the stft

where 0 ≤ k ≤ λ − 1 and λ is a positive integer. Figure 5.10 shows examples of the Haar, Daubechies, Coiflets, Symlets, Mexican hat and Morlet mother wavelets. db2, coif2 and sym2 denote two vanishing moments of the corresponding mother wavelets. The Haar wavelet has only one vanishing moment (k = 0).
The wt can be performed either continuously or discretely. The continuous wavelet transform (cwt) of x(t) is defined as

W{x(t)} ≡ Xψ(a, b) = ∫_{−∞}^{+∞} x(t) (1/√a) ψ*((t − b)/a) dt = ∫_{−∞}^{+∞} x(t) ψ*_{a,b}(t) dt ,    (5.8)

where ψa,b(t) is the set of scaled (a ∈ R, a > 0) and translated (b ∈ R) wavelets originating from the mother wavelet ψ(t), and the asterisk denotes the complex conjugate. a is referred to as the scale, and the factor 1/√a normalizes the energy of the scaled wavelets. Obviously, in the cwt, continuous values of a and b can be selected. Thus, this transform provides a time-scale (two-dimensional) representation of the original signal x(t), since the variation of a and b results in the multiplication of all scaled and translated variants of ψ(t) with x(t) (see Figure 5.11). The resulting time-scale plane is called the scalogram. In Matlab, scalograms represent the absolute value of the calculated Xψ(a, b) and are scaled between 0 and 240. This representation is used in this work as well.

Figure 5.10.: Examples of typical mother wavelets (Haar, Daubechies db2, Coiflets coif2, Symlets sym2, Mexican hat and Morlet)

Figure 5.11.: Scaling and translation of the mother wavelet with varying a and b

A large value of a yields a stretched ψ(t) and emphasizes slow changes of x(t), namely low frequency components. On the contrary, a small scale a results in a compressed ψ(t), which is suited to highlighting rapid changes of x(t), namely high frequency components. The relationship between the scale a and frequency is shown in Figure 5.11.
Soman et al. (2010) relates the wt to correlation analysis, such that large transform values represent a good match between parts of the signal under investigation and the wavelets. In fact, the cwt measures the similarity between x(t) and the wavelet set. Poularikas (2009) defines the wt as the decomposition of x(t) into sets of basis functions ψa,b(t).
In order to calculate the cwt numerically, (5.8) should be discretized. Therefore, discrete values of a, b and t are needed. In addition, the integral is replaced by a summation, and the upper and lower limits of the integral are substituted by the upper and lower limits of the domain of x(t) and the selected ψ(t).
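A minimal numpy sketch of such a discretization for the Haar mother wavelet is given below. The sample-based interpretation of the scale (a support of about a samples) and the function name haar_cwt are assumptions for illustration, chosen to mimic Matlab's sample-based scales.

```python
import numpy as np

def haar_cwt(x, scales):
    """Numerical sketch of (5.8) with the Haar mother wavelet: for each
    scale a, correlate x with the scaled, energy-normalized wavelet."""
    x = np.asarray(x, dtype=float)
    X = np.zeros((len(scales), len(x)))
    for i, a in enumerate(scales):
        width = max(int(a), 2) // 2 * 2                 # even support length
        psi = np.concatenate([np.ones(width // 2),      # Haar: +1 then -1
                              -np.ones(width // 2)]) / np.sqrt(width)
        X[i] = np.correlate(x, psi, mode='same')        # one value per shift b
    return X  # X[i, b] approximates X_psi(a_i, b)

# e.g. X = haar_cwt(v, scales=[10, 30, 100])
```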


The scalogram in Figure 5.12 shows the cwt of the x(t) signal in Figure 5.9 for 1 ≤ a ≤ 256 with Haar as the mother wavelet (see Figure 5.10). In contrast to the stft, both frequency components are visible in the scalogram. Larger scales, e.g. a = 120, represent the low frequency component of x(t) (8 Hz), while the high frequency component (the discontinuity) shows up at lower scales. Therefore, the wt is a powerful method for analyzing signals such as the EOG with different frequency components at different time samples.
Figure 5.12.: Scalogram of the cwt for the x(t) signal shown in Figure 5.9, top plot: 1 ≤ a ≤ 256, bottom plot: 1 ≤ a ≤ 20

For the EOG signals, the goal is to detect blinks in all phases of the drive and to distinguish between saccades and blinks. To this end, Figure 5.13 shows the scalograms of the cwt with the Haar, Morlet, Daubechies and Coiflet mother wavelets for 20 s of the awake and drowsy phases of the drive. The scalogram was calculated based on (5.8). The scale a has been varied from 1 to 50 (1 ≤ a ≤ 50). A comparison of the scalograms for all mother wavelets indicates that the time localization quality of blinks is the same for all of them. The sharp blinks of the awake phase represent abrupt changes, which are highlighted at low scales. For the drowsy phase, however, the blinks are more visible at larger values of a than in the awake phase, due to their lower frequency and slow changes. This corresponds to the smaller values of Xψ(a, b) (darker color of the scalogram) at lower scales in Figure 5.13. Overall, the scalograms show that a > 15 does not provide accurate time localization information.
Figure 5.14 shows Xψ(a, b) of Figure 5.13 at a = 5, 10 and 15. It can be seen that for higher scales, the amplitude of the cwt is higher at the cost of worse time localization for all mother wavelets. In addition, the detection of peaks in the cwt with the Haar wavelet seems to be easier than with the other mother wavelets. This is due to the similarity between the pattern of blinks and the Haar wavelet.

Figure 5.13.: Scalograms of the cwt with different mother wavelets for the V (n) signal of the awake phase (left plots) and the drowsy phase (right plots) of the drive

As an example, considering a = 10 for the Morlet wavelet (third row in Figure 5.14), it is clearly difficult to identify the start and end of an eye blink. For the drowsy phase, as mentioned before, a = 5 seems not to be a suitable scale due to the low amplitude of Xψ(a, b). Overall, the detection of blinks can be performed by defining a threshold on Xψ(a, b) with

specific values of a, because Xψ(a, b) does not suffer from the drift problem introduced in Section 2.1.2. In fact, similar to the stft, due to the varying characteristics of eye movements, more than one scale is needed for the detection of all events.
The results with the Haar wavelet are very similar to the negative of the derivative of the EOG signal, −V ′(n) (Mallat, 2009; Barea et al., 2012). Mallat (2009, Chapter 6.1.2) proved mathematically that applying the wt with a mother wavelet with λ vanishing moments can be interpreted as the λ-th order derivative of the original signal. Hence, applying the Haar mother wavelet to a signal in the wt yields the first derivative of that signal at low scales. Figure 5.15 demonstrates −V ′(n) together with Xψ(a, b) with the Haar mother wavelet at a = 5, 10, 15, 30 and 100 for the same EOG signals shown in Figure 5.13. Except for the fact that at low scales Xψ(a, b) is much smoother than V ′(n), both signals provide recognizable peaks at low scales for the detection of fast blinks. However, slower blinks, e.g. around t = 10 s in the right plots of Figure 5.15 (shown with an arrow), are difficult to detect in Xψ(a, b) with a = 5, 10 and in V ′(n), because the amplitudes are not significantly high. Interestingly, the amplitude of Xψ(100, b) is very large around t = 10 s, which makes the wavelet transform at large scales suitable for the detection of slow eye movements. This fact is highlighted in Figure 5.16. In the top plot of this figure, the first and the last 20 s represent fast (high frequency) and slow (low frequency) blinks, respectively. The bottom plot shows Xψ(a, b) with a = 10, 30 and 100. At a = 10, only fast blinks are recognizable in Xψ(a, b), with accurate time localization information. At a = 30, some of the slower eye movements are also evident, such as at t ≈ 28 s. However, in Xψ(100, b), all of the slower movements can be detected due to their large amplitudes. In fact, the bottom plot of Figure 5.16 clarifies that larger amplitudes at higher scales represent low frequency components of the EOG signals. Although high frequency movements are also recognizable at a = 100, extracting the exact location of these blinks is not as easy as at lower scales.

Our approach for detection of fast and slow eye movements by CWT

If only the detection of fast eye movements based on the cwt signals is of interest, the event detection algorithm explained in Section 5.2 can be applied to e.g. Xψ(10, b), due to the similarity between the result of the cwt and the V ′(n) signal. For the detection of both fast and slow blinks, however, the following algorithm has been applied.
Similar to (5.2) in the previous algorithm, the Xψ(a, b) signal is compared with a threshold to detect relevant peaks. We have empirically selected a = 10, 30 and 100 to cover the following. a = 10 and 100 are suitable for the detection of fast and slow blinks, respectively. We have used a = 30 to improve the time localization of the slower blinks detected in Xψ(100, b), as explained in the next steps.
Figures 5.15 and 5.16 show that the amplitudes of the cwt signals have different ranges depending on the value of a. Hence, a different threshold should be set for event detection at each scale. The asterisks in Figure 5.17 show all peaks of Xψ(a, b) detected at each a by applying separate thresholds.
According to the bottom plot of Figure 5.16, an event might be detected at several scales in the Xψ(a, b) signals, depending on its velocity. First, we analyze the lowest scale, namely a = 10. For each detected peak of Xψ(10, b), we consider a ∆t time offset around its time index tpeak,10. Empirically, we selected ∆t = 0.3 s. If peaks at other scales, namely a = 30 and 100, are also detected in the time interval [tpeak,10 − ∆t, tpeak,10 + ∆t], they will be merged, since they refer to the same event which is already detected at a lower scale. Otherwise, that peak

Figure 5.14.: cwt with different mother wavelets for the V (n) signal of the awake phase (left plots) and the drowsy phase (right plots) of the drive with a = 5, 10 and 15

will be analyzed further. The same procedure is applied to peaks which are detected only at a = 30 or only at a = 100.

Figure 5.15.: Comparison of Xψ(a, b) with the Haar wavelet at a = 5, 10, 15, 30 and 100 with the negative of the derivative of the EOG signal −V ′(n) for the awake and drowsy phases of the drive

Mathematically, we have
• a = 10: if tpeak,a≠10 ∈ [tpeak,10 − ∆t, tpeak,10 + ∆t], then merge tpeak,10 and tpeak,a≠10. Otherwise, accept tpeak,a≠10 as a new peak.
• a = 30: if tpeak,a≠30 ∈ [tpeak,30 − ∆t, tpeak,30 + ∆t], then merge tpeak,30 and tpeak,a≠30. Otherwise, accept tpeak,a≠30 as a new peak.
• a = 100: if tpeak,a≠100 ∈ [tpeak,100 − ∆t, tpeak,100 + ∆t], then merge tpeak,100 and tpeak,a≠100. Otherwise, accept tpeak,a≠100 as a new peak.
It should be mentioned that positive and negative peaks are considered separately. Therefore, only peaks with the same sign are compared with each other. The circles in Figure 5.17 show the accepted peaks at each specific scale a, while their counterparts at other scales were discarded due to merging. For example, the first blink was detected in Xψ(10, b) by a maximum and a minimum peak.

Figure 5.16.: Comparison of the cwt at a = 10, 30 and 100 for the detection of fast (the first 20 s) and slow (the last 20 s) blinks

Figure 5.17.: Detected and accepted peaks at different scales of the Xψ(a, b) signals

Thus, the corresponding peaks detected in Xψ(30, b) and Xψ(100, b) were ignored. At t ≈ 27 s, however, a negative peak was detected only in Xψ(100, b). Figure 5.17 also shows that during the first 20 s, many peaks in Xψ(100, b) fulfilled our condition, although no blink in the EOG signal can be related to them. These peaks refer to the time intervals between blinks, which are all non-relevant. The events related to these peaks can then be discarded by imposing some constraints on the amplitude and duration of a detected event.
Figure 5.18 summarizes the proposed blink detection algorithm.
For all accepted peaks at a = 10, the start, middle and end points of a blink can be extracted either from Xψ(10, b) or from V ′(n) based on sign changes, as explained before. For all other peaks, we only use Xψ(30, b) for improving the time localization information. In fact, Xψ(100, b) is only used for roughly locating the occurrence of a potential slow blink.
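The multi-scale merging rule can be sketched compactly in Python; the dict-based peak representation and the helper name merge_peaks_across_scales are assumptions made for illustration.

```python
def merge_peaks_across_scales(peaks, dt=0.3):
    """Sketch of the merging rule: a peak is accepted at the lowest scale
    where it occurs; same-sign peaks of other scales within +/- dt s are
    merged into it (i.e. discarded as duplicates).

    peaks : dict mapping scale a -> list of (time_s, sign) tuples
    """
    accepted = []  # (time_s, sign, scale) of accepted peaks
    for a in sorted(peaks):                          # a = 10, then 30, then 100
        for t, s in peaks[a]:
            duplicate = any(abs(t - t0) <= dt and s == s0
                            for t0, s0, _ in accepted)
            if not duplicate:                        # not yet seen at a lower scale
                accepted.append((t, s, a))
    return accepted

# e.g. merge_peaks_across_scales({10: p10, 30: p30, 100: p100})
```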
In general, the efficiency of the cwt method in event detection applications highly depends on the selection of a suitable scale value. We saw that, depending on the value of a, an event might be missed entirely. Kadambe et al. (1999) found the suitable value of a for the detection of events in ECG signals by comparing the number and the locations of detected peaks at different scales. If a similar number of peaks at similar locations is detected at two successive scales, then increasing the scale value is suggested to be of no benefit. Otherwise, the scale should be increased. This method, however, is not suitable for our application, since different events are not necessarily recognizable at one specific scale. Therefore, in line with our algorithm in Figure 5.18, we suggest the use of three scales for the full coverage of all fast and slow blinks. The first scale should be low enough (5 ≤ afast ≤ 10) to pronounce fast blinks. On the contrary, the second scale needs to be large to highlight slow movements (aslow ≈ 10 afast). In order to improve the time localization information of the blinks detected by aslow, a middle-valued scale (2 afast ≤ atime-loc ≤ aslow/2) is needed as well.
The next step is the extraction of the start and end of detected events from the corresponding wavelet signal. We showed that Haar wavelet processing of the EOG signal is similar to high pass filtering, and consequently its results are very similar to the time derivative of the EOG signal. Thus, the detection algorithm explained in Section 5.2 can be applied to the wavelet-transformed EOG signal as well for extracting the start and end points of a detected event.

5.3.3. Discrete wavelet transform

The discrete wavelet transform (dwt) can be introduced based on two independent approaches. The first approach defines it as the discretization of a and b in the cwt, which is performed based on the dyadic grid as follows:

a = 2^{−j} ,   b = k τ0 2^{−j} ,    (5.9)

where j, k ∈ Z and τ0 is the dilation step. Consequently, ψa,b(t) changes to

ψj,k(t) = (1/√(2^{−j})) ψ((t − k τ0 2^{−j}) / 2^{−j}) = 2^{j/2} ψ(2^j t − k τ0) .    (5.10)

For simplicity, we consider τ0 = 1 here. Clearly, for j = k = 0, we have ψ0,0(t) = ψ(t).
In addition to the mentioned approach, which introduces the dwt as the counterpart of the cwt, the second approach defines the dwt independently, based on the Haar scaling and wavelet functions and the idea of nested spaces. In the following, this approach is studied. It will be shown how the dwt can be applied to EOG signals in the preprocessing step to perform noise and drift removal.

Figure 5.18.: Flow chart of the cwt-based method for blink detection

Haar scaling function

The Haar scaling function φ(t) is defined as

φ(t) = 1 for 0 ≤ t ≤ 1, and φ(t) = 0 otherwise.    (5.11)

It is clear that the product of φ(t − m) and φ(t − n) for m ≠ n and m, n ∈ Z is equal to zero for all values of t. This expresses the orthonormality of the set of time-shifts of φ(t), namely

∫_{−∞}^{+∞} φ(t − m) φ(t − n) dt = 0   for m ≠ n and m, n ∈ Z    (5.12)
∫_{−∞}^{+∞} φ(t − m) φ(t − m) dt = 1 .    (5.13)

For j = 0, the mentioned set spans the space V0 as

V0 ≡ Span_k{φ(t − k)} = Span_k{φ0,k} ,    (5.14)

where k ∈ Z and Span{·} denotes the closed span (the space containing the set of functions expressed by linear combinations of φ(t − k) is called the span of the basis set φ(t − k); if the space also comprises the limits of these expansions, it is a closed space). In addition to the previous space, for j = 1 the space V1 can be defined, which contains φ(t − k) scaled by 2, namely

V1 ≡ Span_k{√2 φ(2t − k)} = Span_k{φ1,k} .    (5.15)

The factor √2 is a normalization constant which keeps the energy of signals in both spaces similar. φ(t − k) in V0 can easily be represented by the shrunk φ(2t − k) in V1, leading to V0 ⊂ V1, e.g.

φ(t) = φ(2t) + φ(2t − 1)    (5.16)
φ(t − 1) = φ(2t − 2) + φ(2t − 3) .    (5.17)

Similarly, for the j-th scale, the space is defined as

Vj ≡ Span_k{2^{j/2} φ(2^j t − k)} = Span_k{φj,k} ,    (5.18)

which leads to

· · · ⊂ V−1 ⊂ V0 ⊂ V1 ⊂ · · · ⊂ V∞ .    (5.19)
This is depicted in Figure 5.19(a). Vj−1 ⊂ Vj implies that, in comparison to Vj, a part is missing in Vj−1. This missing part can be explained in terms of the Haar wavelet function. In general, V0 ⊂ V1 leads to the following expression of φ(t):

φ(t) = ∑_n h(n) √2 φ(2t − n) ,    (5.20)

(a) Scaling function space (b) Scaling and wavelet function spaces

Figure 5.19.: Schematic of spaces spanned by scaling and wavelet functions

where n ∈ Z and h(n) denotes the normalized coefficients. Accordingly, in example (5.16), we have h(0) = h(1) = 1/√2, such that φ(t) = (1/√2) √2 φ(2t) + (1/√2) √2 φ(2t − 1).

Haar wavelet function

The Haar wavelet function ψ(t) is defined as

ψ(t) = 1 for 0 ≤ t ≤ 1/2,   ψ(t) = −1 for 1/2 ≤ t ≤ 1,   and ψ(t) = 0 otherwise.    (5.21)
Obviously, the set of translates of ψ(t), namely ψ(t − k) with k ∈ Z, also contains orthonormal pairs. Similar to V0 and V1, the spaces W0 and W1 can be defined, which are spanned by ψ(t − k) and √2 ψ(2t − k), i.e.

W0 ≡ Span_k{ψ(t − k)}    (5.22)
W1 ≡ Span_k{√2 ψ(2t − k)} .    (5.23)

However, due to the characteristics of ψ(t), we have W0 ⊄ W1, which means that the translates of ψ(t) cannot be represented by translates of ψ(2t), although W0 ⊥ W1. Generalizing this point to the other spaces formed by the j-th scale of ψ(t) leads to

· · · W−1 ⊥ W0 ⊥ W1 ⊥ W2 · · · .    (5.24)

Previously, it was mentioned that since V0 ⊂ V1, the members of V0 can be represented by the members of V1. Now, however, by adding members of W0 to V0, it also becomes possible to represent the functions φ(2t − k) of V1 by members of W0 and V0, e.g. φ(2t) = (φ(t) + ψ(t))/2. In fact, W0 is the part missing in V0 which makes V0 a subset of V1. Mathematically, this is expressed as V1 = V0 ⊕ W0 with V0 ⊥ W0, where ⊕ denotes the direct sum of vector spaces. Similarly, for V2 we have V2 = V1 ⊕ W1. This is shown schematically in Figure 5.19(b). Finally, the generalization

of this relation to the j-th scale expresses Vj as

Vj = Vj−1 ⊕ Wj−1    (5.25)
   = Vj−2 ⊕ Wj−2 ⊕ Wj−1
   ⋮
   = V0 ⊕ W0 ⊕ · · · ⊕ Wj−2 ⊕ Wj−1 .

(5.19) and (5.25) make multi-resolution analysis by the dwt possible.

Since W0 is also a subset of V1, namely W0 ⊂ V1, ψ(t) can also be represented by functions of V1, such as ψ(t) = (1/√2) √2 φ(2t) − (1/√2) √2 φ(2t − 1). Consequently, similar to (5.20), ψ(t) can also be expressed as

ψ(t) = ∑_n g(n) √2 φ(2t − n) ,    (5.26)

where n ∈ Z and g(n) refers to the normalized coefficients. By scaling and translating φ(t) and ψ(t) in (5.20) and (5.26) via t → 2^j t − k, we have

φ(2^j t − k) = ∑_n h(n) √2 φ(2^{j+1} t − 2k − n)    (5.27)
ψ(2^j t − k) = ∑_n g(n) √2 φ(2^{j+1} t − 2k − n) .    (5.28)
By substituting m = 2k + n, we have

φ(2^j t − k) = ∑_m h(m − 2k) √2 φ(2^{j+1} t − m)    (5.29)
ψ(2^j t − k) = ∑_m g(m − 2k) √2 φ(2^{j+1} t − m) .    (5.30)
According to (5.18) and (5.20), a signal x(t) of the space Vj+1, namely x(t) ∈ Vj+1, is defined as

x(t) = ∑_k cj+1(k) 2^{(j+1)/2} φ(2^{j+1} t − k) ,    (5.31)

where cj+1(k) denotes the approximation coefficients. In fact, in (5.31), x(t) is approximated by a set of scaling functions. Based on (5.25), which implies Vj+1 = Vj ⊕ Wj, x(t) can also be expressed by the functions φ(t) and ψ(t) of the lower scale j as follows:


x(t) = ∑_k cj(k) 2^{j/2} φ(2^j t − k) + ∑_k dj(k) 2^{j/2} ψ(2^j t − k) ,    (5.32)

where dj(k) refers to the detail coefficients.

In order to calculate cj(k), the projection of x(t) onto 2^{j/2} φ(2^j t − k) is calculated, namely

cj(k) = ∫ x(t) 2^{j/2} φ(2^j t − k) dt .    (5.33)

By inserting (5.29) into the previous equation we have

cj(k) = ∑_m h(m − 2k) ∫ x(t) 2^{(j+1)/2} φ(2^{j+1} t − m) dt .    (5.34)

By comparing (5.34) with (5.33), cj(k) and similarly dj(k) are obtained as

cj(k) = ∑_m h(m − 2k) cj+1(m)    (5.35)
dj(k) = ∑_m g(m − 2k) cj+1(m) .    (5.36)

(5.35) and (5.36) are very similar to the definition of the discrete convolution and the concept of digital filtering, which is defined as

y(n) = ∑_k u(n − k) x(k) .    (5.37)

Consequently, according to Addison (2010), h and g in the previous equations perform low pass and high pass filtering, respectively. Accordingly, cj(k) and dj(k) are referred to as the coefficients of the low pass and high pass filters applied to x(t) in the framework of a digital filter bank. The index step of 2 in h(m − 2k) and g(m − 2k) leads to a downsampling of the signal x(t) by a factor of 2.
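One analysis stage of this filter bank can be written out directly from (5.35) and (5.36); the following numpy sketch uses the Haar filters from (5.20) and (5.26) and is meant as an illustration, not as the thesis' implementation.

```python
import numpy as np

# Haar analysis filters matching (5.20) and (5.26): h(0) = h(1) = 1/sqrt(2),
# g(0) = 1/sqrt(2), g(1) = -1/sqrt(2)
h = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low pass  -> approximation
g = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high pass -> detail

def dwt_stage(c_next):
    """One analysis stage of (5.35)/(5.36): filtering plus downsampling by 2."""
    c_next = np.asarray(c_next, dtype=float)
    n = len(c_next) // 2
    c = np.array([h @ c_next[2 * k:2 * k + 2] for k in range(n)])
    d = np.array([g @ c_next[2 * k:2 * k + 2] for k in range(n)])
    return c, d

# three stages, using the coefficient labels of Figures 5.21 and 5.22:
# c2, d2 = dwt_stage(v); c1, d1 = dwt_stage(c2); c0, d0 = dwt_stage(c1)
```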
In addition to the representations in (5.31) and (5.32), x(t) can also be represented at lower scales with regard to (5.25). Therefore, from an upper scale j = J down to the lowest scale j = 0 we have

x(t) = ∑_k cJ(k) φJ,k(t)    (5.38)
     = ∑_k cJ−1(k) φJ−1,k(t) + ∑_k dJ−1(k) ψJ−1,k(t)
     = ∑_k cJ−2(k) φJ−2,k(t) + ∑_k ∑_{j=J−2}^{J−1} dj(k) ψj,k(t)
     ⋮
     = ∑_k c0(k) φ0,k(t) + ∑_k ∑_{j=0}^{J−1} dj(k) ψj,k(t) .

Figure 5.20 pictorially shows the J-stage decomposition of (5.38). The box containing the downward arrow and 2 represents the downsampling by 2 associated with h and g in (5.35) and (5.36).

Figure 5.20.: J-stage decomposition tree



Figures 5.21 and 5.22 show the three-stage decomposition of the EOG signal with the Daubechies (db4, i.e. four vanishing moments) wavelet (see Appendix E) for the awake and the drowsy phase, respectively. Both figures show that by increasing the number of decomposition stages j (e.g. j = 3 instead of j = 2), the information loss in the approximation coefficient c0 increases, which is definitely helpful for noise reduction purposes. It can be seen that in the case of j = 3, the information loss in c0 is larger than that in c1 (c1 in the case of j = 3 corresponds to c0 in the case of j = 2). Therefore, depending on the application, one of the coefficients c2, c1 and c0 in Figure 5.22 is a desired denoised version of the original EOG signal. Noise removal by the dwt will be explained in this section.
Figure 5.21.: Three-stage decomposition of the EOG signal during the awake phase by the db4 wavelet

Reconstruction

As explained in the previous parts, the dwt decomposes the signal into approximation and detail coefficients at a coarser resolution. Similar to other transforms, this transform can be performed inversely to reconstruct the original signal. This is explained by considering (5.31) and (5.32) at scales j + 1 and j.

Figure 5.22.: Three-stage decomposition of the EOG signal during the drowsy phase by the db4 wavelet

By inserting (5.27) and (5.28) into (5.31) and (5.32), we have

x(t) = ∑_k cj+1(k) 2^{(j+1)/2} φ(2^{j+1} t − k)
     = ∑_k cj(k) 2^{j/2} φ(2^j t − k) + ∑_k dj(k) 2^{j/2} ψ(2^j t − k)
     = ∑_k ∑_n cj(k) h(n) 2^{(j+1)/2} φ(2^{j+1} t − 2k − n) + ∑_k ∑_n dj(k) g(n) 2^{(j+1)/2} φ(2^{j+1} t − 2k − n) .    (5.39)
We multiply both sides of the above equation by 2^{(j+1)/2} φ(2^{j+1} t − m) and integrate, which leads to (5.33) for j + 1, namely cj+1(m), on the left side. On the right side, all terms vanish due to the orthogonality of φ(·), except for m = 2k + n (or n = m − 2k), leading to

cj+1(m) = ∑_k cj(k) h(m − 2k) + ∑_k dj(k) g(m − 2k) .    (5.40)



The above equation differs from the convolution equation (5.37) in the 2k factor. In fact, for each value of k, only the odd or the even indexed h(m) and g(m) are used. This corresponds to an upsampling of c and d, e.g. by inserting zero values between the existing values, and then applying the h(m) and g(m) filters. This step can be repeated to calculate the coefficients of the next stage, as visualized in Figure 5.23. Therefore, if all coefficients are used without any changes, the signal under investigation is perfectly reconstructed. However, by removing some or parts of the coefficients c and d, which correspond to e.g. drift or noise in the signal, the reconstructed signal differs from the original one. This is a very advantageous property for noise or drift removal.

Figure 5.23.: J-stage reconstruction tree
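For illustration, the decomposition and reconstruction chain can be reproduced with the PyWavelets package; the placeholder signal v and the mapping of PyWavelets' output onto the coefficient labels c0, d0, d1, d2 of Figures 5.21 and 5.22 are our assumptions.

```python
import numpy as np
import pywt

v = np.random.default_rng(0).standard_normal(1024)  # placeholder for V(n)

# three-stage db4 decomposition; pywt returns the coefficients from the
# coarsest approximation to the finest detail, mapped here as [c0, d0, d1, d2]
c0, d0, d1, d2 = pywt.wavedec(v, 'db4', level=3)

# using all coefficients unchanged reconstructs the signal (perfect reconstruction)
v_rec = pywt.waverec([c0, d0, d1, d2], 'db4')

# zeroing the finest detail d2 removes mainly high frequency noise
v1 = pywt.waverec([c0, d0, d1, np.zeros_like(d2)], 'db4')
```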

Our approach for adaptive noise removal

As mentioned in the previous parts, multi-resolution analysis by wavelets makes the decomposition of signals possible. Therefore, by decomposing a noisy signal, the noise component can be extracted to a large extent. An example of denoising was shown in Figure 5.22. In this figure, the detail coefficients (especially d2), unlike the approximation coefficients, do not seem to contain relevant information and can consequently be considered as noise. By defining thresholds, the contribution of each decomposed coefficient to the reconstruction chain can be controlled. Clearly, the goal is to remove noise while preserving the useful parts of the signals, e.g. eye blink peaks, as well as possible.
As Figures 5.21 and 5.22 show, depending on the noise level, the d1 coefficient might also be involved in the noise reduction process. Figures 5.24 and 5.25 show examples of the EOG signals denoised by removing only d2 (Figures 5.24(a) and 5.25(a)) and by removing both d1 and d2 (Figures 5.24(b) and 5.25(b)) in the reconstruction step, respectively. In both figures, V1(n) and V2(n) represent the denoised versions of V (n). The denoising threshold of V1(n) was set such that the coefficient d2 was completely removed during the reconstruction, while all other coefficients contributed fully to the reconstruction step. Accordingly, the maximum value of d2 in Figures 5.21 and 5.22 has been considered as the corresponding threshold, i.e.

thdenoising,d2 = max(d2(n)) .    (5.41)

Hence, all values of d2 smaller than thdenoising,d2 were discarded during the reconstruction step. In contrast to V1(n), V2(n) was calculated by removing both d2 and d1, i.e.

thdenoising,d2 = max(d2(n))    (5.42)
thdenoising,d1 = max(d1(n)) .    (5.43)

The residuals are also plotted (third row in Figures 5.24 and 5.25) and are defined as the difference between the original and the denoised signals, namely

ε1(n) = V (n) − V1(n)    (5.44)
ε2(n) = V (n) − V2(n) .    (5.45)

The psds of the residuals, namely E1(f) and E2(f), are shown in the last rows of Figures 5.24 and 5.25. In the last plot of Figure 5.24(a), the highest power of ε1(n) is concentrated above 15 Hz, while in the last plot of Figure 5.24(b), the frequencies between 5 and 10 Hz show dominant power. In the time domain, this is evident in the larger values of ε2(n) at the blink locations, which corresponds to a larger information loss of blink peaks in V2(n) in comparison to V1(n) (compare ε1(n) and ε2(n) in Figure 5.24). As a result, removing the coefficient d1 does not seem to be necessary for this part of the EOG signal. However, the comparison of E1(f) and E2(f) in Figures 5.25(a) and 5.25(b) indicates that the information loss between 5 and 10 Hz is still smaller than the noise removed at higher frequencies. Correspondingly, the amplitude of ε2(n) is in the same range as that of ε1(n). Therefore, removing the coefficient d1 does not seem to be critical in this case.
Figure 5.24.: Example 1: denoising of the EOG signal by removing different coefficients during the reconstruction. (a) Reconstruction by removing d2, (b) reconstruction by removing d2 and d1.

In fact, these two examples show that an adaptive noise removal procedure is needed due to the different noise levels in different parts of the EOG signal. Otherwise, an inflexible reconstruction strategy causes the following problems:
• by removing both d1 and d2, information loss in the less noisy parts of the EOG signal is inevitable (Figure 5.24(b)).
• by removing only d2, the noisy parts of the EOG signals are sacrificed to save the blink peak information of the less noisy parts (Figure 5.25(a)).

Figure 5.25.: Example 2: denoising of the EOG signal by removing different coefficients during the reconstruction. (a) Reconstruction by removing d2; (b) reconstruction by removing d2 and d1.

Thus, in this work, the following compromise was made. The average values of E2(f) in two frequency bands, namely 5 Hz ≤ f ≤ 10 Hz as the low-frequency components and f ≥ 15 Hz as the high-frequency components, have been compared with each other and the following rules have been applied:

if Ē2(5 Hz ≤ f ≤ 10 Hz) ≥ Ē2(f ≥ 15 Hz) × 1.1, remove d2    (5.46)

if Ē2(5 Hz ≤ f ≤ 10 Hz) < Ē2(f ≥ 15 Hz) × 1.1, remove d2 and d1    (5.47)

where Ē2(f) denotes the average of E2(f). According to these rules, in Figure 5.24, only the coefficient d2 is removed in the reconstruction step, while in Figure 5.25 both the d1 and d2 coefficients are removed.
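To make the rule concrete, the following minimal Python sketch applies it with PyWavelets and SciPy; the sampling rate, the db4 wavelet and the Welch settings are illustrative assumptions and not values taken from this work.

import numpy as np
import pywt
from scipy.signal import welch

def adaptive_denoise(v, fs=50.0, wavelet='db4'):
    """Remove d2, or d2 and d1, depending on the residual spectrum,
    a sketch of rules (5.46)/(5.47); fs and wavelet are assumptions."""
    c2, d2, d1 = pywt.wavedec(v, wavelet, level=2)

    # Candidate V2(n): both detail bands removed
    v2 = pywt.waverec([c2, np.zeros_like(d2), np.zeros_like(d1)],
                      wavelet)[:len(v)]
    eps2 = v - v2                                 # residual epsilon_2(n)

    # Average PSD of the residual in the two bands of interest
    f, pxx = welch(eps2, fs=fs, nperseg=min(256, len(eps2)))
    low = pxx[(f >= 5) & (f <= 10)].mean()        # 5-10 Hz (blink band)
    high = pxx[f >= 15].mean()                    # >= 15 Hz (noise band)

    if low >= 1.1 * high:
        # Rule (5.46): too much blink energy would be lost, keep d1
        return pywt.waverec([c2, np.zeros_like(d2), d1], wavelet)[:len(v)]
    # Rule (5.47): information loss acceptable, remove d2 and d1
    return v2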

Figure 5.26 shows the scatter plot of ε1(n) and ε2(n) plotted in Figure 5.25. Although different coefficients were removed during the reconstruction step, the values of the residuals are very similar to each other. This underscores the strength of the proposed denoising procedure, i.e. the blink peaks have not been influenced very differently by removing different coefficients.

The introduced denoising procedure has been applied to all collected EOG signals in this
work in the framework of preprocessing before applying the proposed eye movement
detection methods.

Figure 5.26.: Scatter plot: ε1(n) versus ε2(n) shown in Figure 5.25 (best linear fit: y = x + 0.0013)

Our approach for drift removal

Similar to the noise removal, multi-resolution analysis makes drift removal possible. Here, the
question is which components should be taken into consideration in the reconstruction step
in order to only remove the drift. Tinati and Mozaffary (2006) suggested a drift removal method
for the ECG signals by calculating the energy of the wavelet decomposition coefficients at
different levels and comparing it with a threshold. If the calculated energy is higher than a
pre-defined threshold, then the current decomposition level is suitable and the signal can be
reconstructed. Otherwise, the number of decomposition stages should be increased.

Based on our observations of the EOG signals, drift is a low-frequency component which is usually located below 0.3 Hz. Therefore, 6 or 7 decomposition stages should be enough for reconstructing the drift signal. Figure 5.27 shows two examples of V(n) signals in the awake (top plot) and drowsy (bottom plot) phases. The drift removal in this figure is based on the same approach used for denoising. The only difference is the coefficients which are used for the reconstruction. For the 6- and 7-stage decompositions, we only used the approximation coefficients c5 and c6 for reconstruction, respectively, and ignored all other coefficients. The resulting reconstructed signals are the drift signals, shown in green (6-stage) and magenta (7-stage) in Figure 5.27. By subtracting the drift signal from the V(n) signal, the drift is removed; the result is called Vˇ(n).
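As an illustration, the approximation-only reconstruction can be sketched in a few lines of Python; the db4 wavelet family is an assumption, as only the decomposition depth is fixed above.

import numpy as np
import pywt

def remove_drift(v, wavelet='db4', level=7):
    """Reconstruct the drift from the coarsest approximation band only
    and subtract it from V(n); returns (V_check(n), drift). Sketch only."""
    coeffs = pywt.wavedec(v, wavelet, level=level)
    # Zero all detail coefficients, keep only the approximation band
    drift_coeffs = [coeffs[0]] + [np.zeros_like(d) for d in coeffs[1:]]
    drift = pywt.waverec(drift_coeffs, wavelet)[:len(v)]
    return v - drift, drift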

In the top plot of Figure 5.27, which refers to the awake phase, it seems that the number of decomposition stages has not deteriorated V(n). The form of the blinks in both Vˇ(n) signals is similar to V(n). However, in the bottom plot, which refers to the drowsy phase, the closed phase of the microsleep event located between t ≈ 5 s and t ≈ 8 s (shown with an arrow) suffers from a new unwanted deformation after the 6-stage decomposition. This phenomenon also occurred in other similar events and is a representative example. Therefore, we have used the 7-stage decomposition for the rest of this work and considered only c6 in the reconstruction for extracting the drift. Similar to the noise removal, drift removal was also applied to all collected EOG signals in this work in the framework of preprocessing before applying the proposed eye movement detection methods.

Figure 5.27.: Two examples of drift removal with the wavelet decomposition and reconstruction for awake (top) and drowsy (bottom) phases of V(n)

5.4. Comparison of event detection methods

In this section, the introduced event detection methods are evaluated by comparing them
with a reference. To this end, we have labeled eye movement events in our offline EOG signals
based on the synchronously recorded video data from the subjects’ face. By comparing the
detected events with the labeled ones, events are then assigned to these three categories: true
positive (tp), false positive (fp) and false negative (fn) according to Table 5.1. Clearly, true
negative (tn) cannot be assessed in this application.

Table 5.1.: Confusion matrix: events of video labeling versus those of the proposed detection methods

                                      detection
  video labeling       detected               not detected
  labeled              true positive (tp)     false negative (fn)
  not labeled          false positive (fp)    true negative (tn)

After counting all detected and missed events, the corresponding metrics for the evaluation of the detection methods are calculated. The metrics used in this work are recall (rc) and precision (pc), which are defined as follows:

RC = TP / (TP + FN) × 100    (5.48)

PC = TP / (TP + FP) × 100 .    (5.49)

rc describes the proportion of correctly detected events among the true ones (TP + FN), while pc represents the proportion of correctly identified events among all detected ones (TP + FP).
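In code, the two metrics reduce to a trivial helper; the counts in the usage example are arbitrary and only illustrate the computation.

def recall_precision(tp, fp, fn):
    """Recall (rc) and precision (pc) in percent, cf. (5.48)-(5.49)."""
    rc = 100.0 * tp / (tp + fn)
    pc = 100.0 * tp / (tp + fp)
    return rc, pc

# Example: 90 correctly detected blinks, 5 false alarms, 10 misses
print(recall_precision(90, 5, 10))   # (90.0, ~94.7)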

5.4.1. Median filter-based versus derivative-based method

First, we compare the performance of the event detection by the median filter-based method
introduced in Section 5.1 with that of the proposed derivative-based method in Section 5.2.
As mentioned before, both methods are only suitable for the detection of fast eye movements. Therefore, only the detection of these eye movements, i.e. fast blinks and vertical saccades, is compared. Figure 5.28 shows the rc and pc values for seven subjects.

Figure 5.28.: rc and pc of vertical saccade and blink detection for the derivative-based algorithm and the median filter-based method during the awake ((a), (c)) and drowsy ((b), (d)) phases

As the goal is the detection of saccades and blinks not only during the awake phase, but also during the drowsy phase of the drive, the rc and pc during these phases were calculated separately with regard to the collected kss values. This helps to highlight the efficiency of the detection methods when different forms of events are present in the data. For each phase, 10 min of the collected EOG is evaluated.¹ To this end, we defined the awake phase by kss ≤ 5 (Figures 5.28(a) and 5.28(c)) and the drowsy phase by kss ≥ 8 (Figures 5.28(b) and 5.28(d)). These definitions with the gap of 2 kss steps, namely kss = 6 and 7, emphasize the difference between the events of the two phases under investigation.

¹ Only 10 min of the collected EOG was labeled.
The calculated rates show that during the awake phase (Figures 5.28(a) and 5.28(c)), both the proposed algorithm and the median filter-based method detected all true blinks correctly (all rc values = 100% in Figure 5.28(a)). However, for all subjects, the median filter-based method always had a smaller pc (see Figure 5.28(c)). The reason is that most of the saccades combined with
head rotation were wrongly considered as blinks, especially for subject S21. During the
drowsy phase (Figures 5.28(b) and 5.28(d)), for subjects S18 and S24, the blink detection using
the proposed algorithm outperformed the median filter-based method by about 20%. For
saccade detection, the proposed algorithm achieved pc > 95% and rc > 80% for all subjects
during the awake phase. For subject S16 all existing saccades were detected correctly.
Moreover, lower pc values for saccade detection in Figure 5.28(d) in comparison to Figure
5.28(c) imply that the saccade detection during the drowsy phase is more difficult, since small
amplitude blinks due to drowsiness might be mistaken for saccades.
It should be mentioned that not only the detection of events but also the quality of the
extracted features out of the detected events plays a key role in assessing drowsiness. The
blink amplitude as defined in (5.3) and the duration as the time difference between points C
and A in Figure 5.3 were extracted from all detected blinks using the median filter-based
method and the proposed algorithm. The moving average of these features over 15-min
windows is shown in Figure 5.29 for subjects S15, S16 and S18 as representative examples. In
addition, the numbers of blinks per minute are shown in the last row. The background colors refer to the self-rated drowsiness level by the subjects: awake (kss ≤ 5), medium (6 ≤ kss ≤ 7) and drowsy (kss ≥ 8). In most of the plots, the ranges of the calculated features differ with respect to the applied detection method.
Figure 5.29.: Average duration (first row), amplitude (second row) and number of blinks (third row) versus self-estimated drowsiness level for subjects S15, S16 and S18 based on the derivative-based algorithm and the median filter-based method

For subject S15, the amplitudes of the detected blinks by both methods are very similar. Nevertheless, the evolution of the blink duration runs counter to the evolution of drowsiness using the median filter-based method for about one hour. This means that during this time, based on the blink detection results using the median filter-based method, the blink duration is negatively correlated with drowsiness, while the proposed algorithm shows a positive correlation. The reason is that
although the median filter-based method has detected some of the blinks during the drowsy phase, their start and end points were not extracted correctly, as also shown in Figure 5.2. Moreover, for about 30 min, a large number of blinks was not detected (see left plot of the last row in Figure 5.29). The differences in the evolution of blink amplitude, duration and number plotted for S15 are similar for subjects S19, S20, S21 and S24. For S16, however, it is the number of detected blinks which is similar for both methods. This is also shown in Figure 5.28, where rc and pc were very close to each other. Nevertheless, the increasing behavior of the blink duration is very weak using the median filter-based method in comparison to that of the proposed algorithm. The features of subjects S22 and S23 show the same differences as those of subject S16. For subject S18, despite an equal number of detected blinks during some parts of the drive using both methods, the extracted amplitudes and durations of these parts also deviate from one another. Such deviation is also the case for subjects S19 and S25.
According to Figure 5.29, the average number of blinks has increased for subjects S15 and S16, while for subject S18 it has decreased. One explanation is that the average duration of the blinks for subject S18 increased to a larger extent in the course of time in comparison to other subjects. Therefore, in comparison to other subjects, the eyes of subject S18 were closed for a longer time during the experiment, which leads to a smaller number of blinks.
All in all, according to Figure 5.29, the correlation between self-estimated drowsiness level
and extracted features of eye movements based on the proposed algorithm seems to be strong
enough to assess drowsiness, especially since similar results were also achieved in previous
studies (Dong et al., 2011) (decreased blink amplitude and increased blink duration in the
course of time). Extracting other features and applying complex classification methods are
the next steps toward drowsiness detection based on eye movements which will be studied in
Chapters 7 and 8.
Due to the unsatisfactory results of the median filter-based method for both blink and saccade detection and their corresponding features, it will not be analyzed further in this work. In addition, as mentioned before, the median filter-based method is unable to detect slow blinks.

5.4.2. Derivative-based method versus wavelet transform-based method

This section compares the rc and pc of the blinks detected by the derivative-based method
(Section 5.2) and continuous wavelet transform method (Section 5.3.2) for the 10 subjects
under study. We expect that for the awake phase, both algorithms perform similarly. On the
contrary, depending on the number of slow blinks during the drowsy phase, the wavelet
transform method is expected to outperform the derivative-based method.
Figure 5.30 shows the calculated rc and pc values for 10 subjects regarding both methods.
Subject S3 had no awake phase according to his kss values. These results are based on 20 min
of labeled EOG data per subject, i.e. 10 min for each phase. First, for each subject, based on
the kss values, the awake and drowsy segments of the drive were defined. Afterwards, for
each phase, 10 1-min segments were randomly chosen for labeling and further evaluation. On
average, for the awake phase 263 and for the drowsy phase 453 blinks were labeled per
subject.
As expected, during the awake phase, both methods are very similar in the detection of fast blinks (all rc values > 95%) and the differences are negligible. Interestingly, the good results of the wavelet transform method are obtained at the cost of lower pc values, which are the result of confusing saccades combined with head rotation with blinks. For the drowsy phase, however, the wavelet method has detected more blinks for most of the subjects, especially for subject S2. This subject showed lots of slow blinks during the drowsy phase. The pc values of this phase are also smaller after applying the wavelet transform method.
Figure 5.30.: rc and pc of blink detection for the derivative-based algorithm and the wavelet transform method during the awake ((a), (c)) and drowsy ((b), (d)) phases
It should be mentioned that according to our observations during the experiments, slow
blinks might not occur for all subjects to the same extent during drowsiness. For some of our
subjects, almost no slow blinks occurred during the experiment, although the subjects
experienced lots of microsleeps. For these subjects, only the duration of the closed phase
increased as shown in Figure 3.4(b). The velocity of the opening and closing phases varied to
a smaller extent. For some other subjects, however, as shown in Figure 3.4(c), it was the
velocity of the opening and closing phases which varied severely due to drowsiness in
comparison to the duration of the closed phase. Therefore, depending on the characteristics
of the blinks during the drowsy phase, different detection approaches should be applied.

Conclusions

All in all, we conclude that if the application is limited to the detection of sharp blinks, where
saccades are considered noisy eye movements to be ignored, the median filter-based method
is an appropriate detection method to be applied to the EOG or similar signals. As soon as
the vertical saccades should be recognized and distinguished from blinks, we suggest the
usage of the derivative-based approach. In addition, the derivative signal method is more
robust against the

duration of the event to be detected in comparison to the median filter-based method.


Therefore, for the detection of blinks, whose duration varies in the course of time, the median
filter-based method with a fixed window length is not suitable.
Further, it was shown to be possible to detect vertical saccades and blinks simultaneously in
vertical EOG signals based on the derivative signal. In addition, a 3-means clustering
algorithm is recommended to distinguish between saccades and blinks in those applications
where the data of both awake and drowsy phases are available. This helps to prevent
confusing a driver’s decreased amplitude blinks with saccades or other eye movements.
Moreover, blinks with long eye closure and microsleep events, whose patterns deviate from
those during the awake phase, were detected and distinguished from saccades based on the
statistical distribution of the amplitude. This method is a reliable approach as long as the
velocity of eye lid movements is larger than the velocity of non-relevant phases of the EOG
signal.
We saw that slow blinks are another issue if the derivative signal is used for their detection. In order to detect all fast and slow blinks simultaneously, we suggest the continuous wavelet transform method. This method covers the detection of all types of blinks by selecting appropriate scaling values a. However, by applying the wavelet transform approach, smaller values of the pc in both awake and drowsy phases are expected in comparison to the derivative-based detection method, as shown in Figures 5.30(c) and 5.30(d).

Online versus offline eye movement detection methods

Another aspect for comparing the studied methods with each other is whether they can be implemented online to be evaluated directly during the experiment. This is the case for the median filter-based method, which was implemented in Simulink to be applied online. The derivative-based method, however, has many parameters which should be adjusted individually based on the entire available data. The question is how many events are needed for finding a suitable threshold, e.g. for distinguishing between blinks and saccades (see Figure 5.4). To answer this question, we implemented the derivative-based detection algorithm online and updated the mentioned threshold for each subject after every 20 newly detected events. In other words, after detecting 20 new events, the threshold was recalculated with respect to all available detected events up to that instant of time. Figure 5.31 shows two representative results of the found online thresholds th2-means for subjects S36 and S39.
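A minimal sketch of this update step, substituting scikit-learn's k-means for the 2-means clustering used here (an assumption about the implementation), could look as follows:

import numpy as np
from sklearn.cluster import KMeans

def update_threshold(event_amplitudes):
    """Recompute th_2-means from all events detected so far: cluster the
    amplitudes into two groups (saccades vs. blinks) and place the
    threshold midway between the two cluster centers. Illustrative sketch."""
    amps = np.asarray(event_amplitudes, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(amps)
    return float(km.cluster_centers_.mean())

# Re-estimate after every 20 newly detected events, e.g.:
# if len(events) % 20 == 0:
#     th = update_threshold([e_amplitude for e_amplitude in events])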
These two subjects were awake during the whole drive and did not suffer from a lack of vigilance. This information is based on the collected kss data and the offline video analysis of the drives. The area of ±15 µV tolerance around the true threshold, namely the offline calculated th2-means, is also shown in dashed lines. For subject S36, the first values of the found online thresholds are not reliable, because they fluctuate to a large extent due to the lack of available events in the clusters, i.e. vertical saccades versus blinks. After detecting 160 events, the online calculated th2-means is very close to the true offline th2-means. On the contrary, for subject S39, the online th2-means does not fluctuate at the beginning, but starts with a large value and converges to the offline th2-means after roughly 2000 events.
It will be explained in Section 7.2 that the number of blinks for humans is between 15 and 20
blinks per minute. If we consider 5 vertical saccades per minute as well on average, in total 25
events should be available for detection per minute. As a result, the found values of 160 and
2000 events correspond to about 6 and 80 min of EOG data or drive time, respectively.

Figure 5.31.: Setting the threshold for distinguishing between blinks and vertical saccades in an online implementation of the detection method: (a) subject S36; (b) subject S39

The detection algorithm based on the continuous wavelet transform method is indeed more time-consuming in an online application, because several coefficients have to be calculated in addition to the peak detection procedure. Therefore, among these methods, the derivative-based eye movement detection method seems to detect events most precisely with regard to the time needed by the algorithm in its online implementation.

Final applied algorithms for preprocessing of EOG signals and eye movement detection

In the previous sections, we studied three eye movement detection methods and discussed their strengths and weaknesses in detail. In addition, due to the decomposition and reconstruction properties of the dwt, two preprocessing steps for EOG data were investigated. As mentioned before, prior to applying any of the detection methods, we first preprocessed all collected EOG data by applying the noise and drift removal algorithms to them. These steps improve the event detection procedure.
For the detection of fast eye movements, we applied the derivative-based algorithm explained in Section 5.2. In order to overcome the shortcoming of the derivative-based algorithm in the detection of slow blinks, we combined the wavelet transform method with the detection algorithm based on the derivative signal to improve the dr values. The reason is the smaller values of the pc in both awake and drowsy phases for the wavelet transform approach in comparison to the derivative-based detection method. The combination has been performed such that only slow blink events, which were detected in Xψ(30, b) and Xψ(100, b), were added to the detected events based on the derivative signal. All events detected by Xψ(10, b) were considered to be fast eye movements which were already detected using the derivative signal. In other words, a fast blink which was only detected using the wavelet transform was discarded.
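Schematically, the combination can be sketched as follows; the event representation (onset times and detection scales) is a hypothetical layout, not the data structure of this work:

def combine_detections(derivative_onsets, wavelet_events, tol=0.1):
    """Keep all derivative-based events and add only slow blinks found at
    the coarse wavelet scales a = 30 and a = 100; detections from scale
    a = 10 (fast blinks) are discarded. wavelet_events are (onset_s, scale)
    pairs; derivative_onsets are onset times in seconds. Sketch only."""
    onsets = sorted(derivative_onsets)

    def already_detected(t):
        # crude onset matching within a small tolerance (assumption)
        return any(abs(t - o) <= tol for o in onsets)

    extra = [t for (t, a) in wavelet_events
             if a in (30, 100) and not already_detected(t)]
    return sorted(onsets + extra)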
6. Blink behavior in distracted and undistracted driving

Among the different physiological symptoms for assessing driver drowsiness or inattention, drivers' eye movements, in particular blinks, are most associated with drowsiness (Hargutt, 2003; Schleicher et al., 2008; Dong et al., 2011). As an example, previous studies introduced increased blink frequency as a measure related to drowsiness over time (Stern et al., 1994; Summala et al., 1999; Sirevaag and Stern, 2000; Hargutt, 2003).
Can all blinks during driving be interpreted as drowsiness-related patterns? Or is the
occurrence of some blinks situation-dependent? Do all blinks during driving reflect the
alertness of the driver to the same extent? This chapter aims to answer these questions which
describe the situation dependency of the blink occurrence. If the occurrence of some blinks is
situation-dependent, then such blinks might (locally) affect the amount of correlation
between the extracted blink features and driver’s state of vigilance in warning systems.
Therefore, they should be taken out of consideration or handled differently for assessing
driver drowsiness. Phases of visual or cognitive distraction while entering data into the navigation system or during a mobile phone conversation are examples of the situation
dependency of the blink occurrence. In this context, Liang and Lee (2010) reported increased
blink frequency in a driving simulator during cognitive and combined cognitive and visual
distraction.
In order to investigate the aforementioned questions, first, it is necessary to know which type
of blinks is relevant for drowsiness detection. As mentioned in Section 3.3, blinking occurs
either voluntarily or involuntarily, i.e. spontaneously and reflexively. Among all blinks, we
have mentioned that spontaneous blinks are the relevant blinks to be explored for drowsiness
detection.
In this chapter, we study the blink behavior of subjects who participated in the experiment
described in Section 4.2.2. These subjects performed the primary driving task and the
visuomotor or auditory secondary tasks. This means that they were instructed to experience
fixed-position gaze shifts, namely gaze shifts between pre-defined fixed positions. These
gaze shifts were the result of the visuomotor secondary task which represents visual
distraction.
Evinger et al. (1994, page 342) stated based on Beideman and Stern (1977):
Gaze-evoked blinks rarely occur when making a saccade to a rear view mirror before
shifting lanes, but often accompany the saccade back to straight ahead position.
Inspired by this fact, it is also studied whether the occurrence of gaze shift-induced blinks is
direction-dependent.
Apart from the mentioned points, during undistracted driving, blinks mainly occur spontaneously or due to gaze shifts, e.g. during a look in the rear-view mirror. Moreover, we know
that the driver needs to scan the scene and the road ahead frequently to perform the driving
task properly. This raises a question: which gaze shifts (with respect to their amplitude) are
more probably accompanied by a blink during driving (without performing a secondary
task)?

This question will be investigated by following the experiment described in Section 4.3, in
which eye movements of drivers were collected by the EOG in a driving simulator. In this
experiment, we specifically study the relationship between the amount of gaze shift and the
blink occurrence.
The results of this chapter are taken from Ebrahim et al. (2013c). All blinks and saccades were
detected based on the algorithm described in Section 5.2 out of the EOG signals.

6.1. Time-on-task analysis of the saccade rate during the visuomotor task

In this section, it is studied whether the saccade rate (number of saccades per minute) during
the visuomotor secondary task changed over time, i.e. the four blocks of the experiment. For
investigation of the variable time-on-task, first, the number of all detected saccades out of the
H(n) signal during the visuomotor secondary task was calculated. As mentioned in Section
4.2.2, this task was repeated four times by each subject and each time for 3 min. The saccade
rate was calculated during each block for each subject separately. Figure 6.1(a) shows the
mean and the standard deviation of the calculated values for each block over all subjects.
Figure 6.1.: Saccade rate for the variable time-on-task (four blocks) for all subjects: (a) mean and standard deviation representation; (b) boxplot representation

By applying the analysis of variance (one-way repeated measures anova), explained in Appendix D.5, it is examined whether the saccade rate has changed significantly over time. It should be mentioned that all assumptions of the anova are fulfilled. They are as follows: independency¹ and normality of observations (see Appendix D.2) and the homogeneity of their variances, i.e. sphericity (see Appendix D.6). The corresponding H0 hypothesis states that the means of the different groups of measurements (blocks) are equal. Since the calculated p-value is 0.30 at the confidence level of 95% (α = 0.05) with F3,75 = 1.24, the H0 hypothesis is not rejected (0.30 > 0.05). Consequently, we cannot show a dependency of the number of gaze shifts during the visuomotor task upon the variable time-on-task considering all subjects and all blocks.

¹ We suppose that the occurrence of an eye movement, e.g. a saccade, at a time instant is independent of the occurrence of another saccade at a different instant.
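For reference, such a one-way repeated-measures anova can be reproduced with statsmodels; the DataFrame layout (columns subject, block and saccade_rate) is a hypothetical example, not the data format of this work:

import pandas as pd
from statsmodels.stats.anova import AnovaRM

def block_effect(df: pd.DataFrame):
    """One-way repeated-measures ANOVA of the saccade rate over the four
    blocks (H0: equal block means), as applied in Section 6.1."""
    res = AnovaRM(df, depvar='saccade_rate', subject='subject',
                  within=['block']).fit()
    return res.anova_table        # F value, num/den df, Pr > F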

6.2. Time-on-task analysis of the blink rate

Similar to the previous step, here we explore whether the blink rate (number of blinks per
minute) changed over time (four blocks) during the primary and secondary tasks separately.

The calculated values by applying the anova for repeated measures are F3,75 = 1.05 with p-value = 0.37 for the visuomotor task, F3,57 = 0.27 with p-value = 0.85 for the driving task and F3,75 = 1.08 with p-value = 0.36 for the auditory task. According to the p-values, which are all larger than α = 0.05, we cannot show that the differences between the means of the blink rates during the four blocks are significant.

6.3. Saccades time-locked to blinks during the visuomotor task

This section studies the number of saccades time-locked to blinks, i.e. saccades occurring simultaneously with blinks (see Figure 3.5(c)), during the visuomotor task. A saccade was considered as time-locked to a blink if it overlapped a blink within at least 80% of its duration. Therefore, all horizontal saccades and all blinks during the visuomotor task were first detected in the H(n) and V(n) signals, respectively. Then, the number of saccades accompanied by blinks (in percent) was calculated, as shown in Figure 6.2 for each subject and each block separately. On average over all blocks, for 14 subjects (2, 4, 6, 7, 8, 10, 12, 13, 14, 16, 21, 22, 23, and 25) over 80% of all saccades due to the visuomotor task were time-locked to blinks. For 9 subjects (1, 3, 5, 9, 15, 19, 20, 24 and 26), gaze shift-induced blinks accompanied 50% to 75% of the saccades. Interestingly, the calculated percentage was less than 50% for only 3 subjects (11, 17 and 18). All in all, for most of the subjects, gaze shifts induced the blink occurrence, which is in agreement with the statement of Records (1979).
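The 80% overlap criterion can be expressed as a small helper; representing events as (start, end) tuples in seconds is an assumption for illustration:

def is_time_locked(saccade, blink, min_overlap=0.8):
    """True if the saccade overlaps the blink within at least 80% of the
    saccade's duration (the criterion of Section 6.3). Sketch only."""
    s0, s1 = saccade
    b0, b1 = blink
    overlap = max(0.0, min(s1, b1) - max(s0, b0))
    return overlap >= min_overlap * (s1 - s0)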
Figure 6.2.: Percentage of saccades time-locked to blinks for all subjects and all blocks during the visuomotor task

6.4. Direction-dependency of blinks time-locked to saccades

As mentioned at the beginning of this chapter, inspired by Evinger et al. (1994), Figure 6.3
shows whether the occurrence of gaze shift-induced blinks depended on the direction of the
saccade. On this account, we considered only saccades either towards the road or towards the
screen displaying the Landolt rings during the visuomotor task, i.e. saccades occurring
between fixed positions. The bars of this figure represent the number of saccades time-locked
to blinks (in percent) with respect to their direction. The results of all blocks were combined,
because the saccade rate was shown to be independent of the variable time-on-task. With the
exception of subject S19, for all subjects the number of saccades time-locked to blinks was larger while moving the focus towards the road (average = 88% ± 16.7). However, for the other direction, namely towards the screen, we found different behaviors (average = 61% ± 31.1).
Figure 6.3.: Percentage of saccades time-locked to blinks with respect to saccade direction averaged over all blocks during the visuomotor task
In order to categorize the behaviors, the values of the bars of Figure 6.3 are plotted versus each other in Figure 6.4 (dark bars as the x-axis, light bars as the y-axis).

Figure 6.4.: Scatter plot: number of saccades accompanied by blinks with respect to their direction during the visuomotor task for all subjects. Ellipses show two clusters.

The red line in this plot refers to the y = x line and is plotted for better orientation. In fact, this line shows to what extent the number of saccades time-locked to blinks towards the road deviated from that of the equivalent saccades towards the screen. The ellipses show the distinguishable clusters A and B. Cluster A contains 14 subjects, for whom at least 65% of saccades induced the occurrence of blinks in both directions. On average, 95% ± 3.5 of saccades towards the road and 85% ± 9.0 of saccades towards the screen were accompanied by blinks for these subjects. Consequently, for this cluster, the occurrence of gaze shift-induced blinks was less direction-dependent. On the other hand, for the subjects of cluster B, the direction of saccades seems to affect the occurrence of blinks to a greater extent. For this cluster, on average, 93% ± 3.4 of saccades towards the road were accompanied by a blink, while only 27% ± 15.4 of saccades towards the screen induced the occurrence of blinks. Therefore, in one direction
(towards the road), the blink occurrence was more dominant. For subjects S15, S17 and S18,
similar to cluster B, the saccades towards the road have induced more blinks than those
towards the screen. However, the number of induced blinks for these subjects was less in
comparison to that of the cluster B.

6.5. Blink rate analysis during the secondary and primary tasks

Here, it is studied whether performing secondary tasks affected the blink rate in comparison
to the driving task. In order to analyze this, for all subjects, all detected blinks during each of
these tasks were considered, independent of whether they were gaze shift-induced
or not. Left plots of Figures 6.5(a) and 6.5(b) show the scatter plots of blink rates for driving
versus visuomotor and auditory tasks with the correlation values of ρp = 0.32 (p-value <
0.001) and ρp = 0.78 (p-value < 0.001), respectively. ρp denotes the Pearson correlation
coefficient, which will be explained in Section 7.5. The closer the value of ρp is to 1 (or 0), the smaller (or larger) is the impact of the corresponding secondary task on the blink rate. According
to the calculated ρp values and the scatter plots, it can be concluded that performing the
visuomotor task affected the blink behavior to a larger extent in comparison to the auditory
task (0.32 < 0.78). Moreover, based on the statistical test explained in Appendix D.4, the
correlation between blink rate during the auditory task and that of the driving task is
significantly different from the correlation between blink rate during the visuomotor task
and that of the driving task.
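Such ρp and p-values follow directly from SciPy; taking per-subject blink rates in 1/min as input arrays is a hypothetical layout:

from scipy.stats import pearsonr

def blink_rate_correlation(rate_secondary, rate_drive):
    """Pearson correlation between the blink rate during a secondary
    task and during the driving task, as reported in Section 6.5."""
    rho_p, p_value = pearsonr(rate_secondary, rate_drive)
    return rho_p, p_value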
Similar scatter plots for the subjects of clusters A and B are also shown in Figure 6.5, comparing the blink rates during both the visuomotor (middle and right plots in Figure 6.5(a)) and auditory tasks (middle and right plots in Figure 6.5(b)) versus the driving task. According to the correlation values, it seems that for cluster B with direction-dependent gaze shift-induced blinks, the blink rate was less affected during the visuomotor task in comparison to cluster A, because the ρp values are larger than those of cluster A.
Table 6.1 shows the results of applying the anova for repeated measures to investigate the significant differences of the blink rate between tasks. In fact, the difference between the blink rate during the driving task and each secondary task is studied. Overall, the mean of the blink rate for the visuomotor task is significantly different from that of the driving task (p-value = 0.012 < α = 0.05). This is also the case for the auditory versus the driving task (p-value = 0.011 < α). Thus, the results state that performing a secondary task (either visuomotor or auditory) affected the blink rate. According to the values related to cluster B with direction-dependent gaze shift-induced blinks, we cannot show that the blink rate changed significantly during the secondary tasks (p-value = 0.831 > α and p-value = 0.800 > α). However, for cluster A, it is the opposite, which means performing both secondary tasks led to a significant variation of the blink rate in comparison to that of the primary driving task.

Table 6.1.: Values of anova to assess the significant difference between means of blink rates for all tasks

                                 visuomotor vs. drive         auditory vs. drive
                                 test statistic   p-value     test statistic   p-value
  all subjects, all blocks       F1,25 = 7.26     0.012       F1,25 = 7.48     0.011
  subjects of cluster A          F1,13 = 14.35    0.002       F1,13 = 5.66     0.033
  subjects of cluster B          F1,7 = 0.05      0.831       F1,7 = 0.07      0.800
Figure 6.5.: Scatter plots of the blink rate for the visuomotor vs. driving task (a) and the auditory vs. driving task (b). Pearson correlation coefficients (ρp) and the corresponding p-values are provided as well.

6.6. Impact of the visuomotor task on the blink behavior

In this section, it is studied how the blink behavior was affected while performing the visuomotor task. Figure 6.6 shows what percentage of blinks during the visuomotor task was gaze shift-induced on average over all blocks. In other words, after detecting all blinks in the V(n) signal, only those blinks time-locked to saccades of the H(n) signal were considered. On average, 91% ± 8.7 of blinks occurred simultaneously with a gaze shift. Consequently, during the visuomotor task, the occurrence of spontaneous blinks was highly modulated by the gaze shift frequency. That means the subjects either did not blink or blinked simultaneously with the gaze shifts. To put it another way, the frequency of the gaze shifts during the visual distraction completely modulated the occurrence of blinks. This fact is shown in the left scatter plot of Figure 6.7, where the blink rate is plotted versus the saccade rate during the visuomotor task for all subjects. Middle and right plots of this figure show the same values for clusters A and B, respectively. It can be seen that the subjects of cluster A blinked as often as having gaze shifts. On the contrary, subjects of cluster B experienced a larger number of gaze shifts in comparison to the blinks.

Figure 6.6.: Percentage of blinks time-locked to saccades for all subjects averaged over all blocks during the visuomotor task

Figure 6.7.: Scatter plot: blink rate versus saccade rate during the visuomotor task

Figures 6.8 and 6.9 show two representative examples of the EOG signals during the visuomotor and driving tasks. In Figure 6.8(a) (subject S8), during the visuomotor task, not only was the blink frequency completely modulated by the saccade frequency, but the visual task also led to an
increase in the number of blinks in comparison to the driving task (Figure 6.8(b)). This subject
belongs to the cluster A with direction-independent gaze shift-induced blinks. On the
contrary, for subject S1 from cluster B with direction-dependent gaze shift-induced blinks, the
number of blinks during the visuomotor task was decreased in comparison to that of the
driving task (Figures 6.9(a) and 6.9(b)). Overall, it is clear that during the visuomotor task,
blink frequency depended thoroughly on the saccade frequency.

6.7. Amount of gaze shift vs. the occurrence of gaze shift-induced blinks

This section explores whether the occurrence of gaze shift-induced blinks was correlated with
the amount of the gaze shift. During the experiment in the driving simulator described in
Section 4.3, the subjects experienced gaze shifts between various positions without any
instruction. In order to show whether the occurrence of gaze shift-induced blinks was
independent of the fact that the subjects were instructed to have gaze shift (as in the previous
experiment) and whether this is positively/negatively correlated with the amount of gaze
shift, all horizontal saccades during the drive were studied. This analysis is performed for the
first 12 subjects of the corresponding experiment.
Figure 6.8.: EOG signals during the visuomotor and driving task for subject S8

Figure 6.9.: EOG signals during the visuomotor and driving task for subject S1

A single saccade, e.g. a gaze shift of some degrees to the right, measured by the EOG occurs with different ampem amplitudes in (5.4) from person to person. This measured value depends on
the skin type, its cleanliness, etc. Therefore, it is possible that gaze shifts and saccadic eye
movements of equal size are not detected for all subjects, because detected saccades with
similar amplitudes do not necessarily refer to the equal amount of gaze shift. To overcome
this, only horizontal saccades were considered whose amplitudes were equal or larger than
that of a look at the speedometer.¹ In fact, all glances at the speedometer out of the V(n) signal were extracted as the minimum detectable gaze shift for each subject. Then the mean of these amplitudes minus one standard deviation was used as the threshold for detecting saccades of the H(n) signal. Thus, the threshold for the horizontal saccade detection was chosen individually based on the vertical saccades of V(n), i.e. the amplitude of the glance at the speedometer. Figure 6.10 summarizes the explained algorithm.

¹ The amount of gaze shift while moving the focus to the speedometer also depends on the body size. Such gaze shifts are larger for a tall person with a large upper body in comparison to a shorter person. Nevertheless, we suppose that the difference between body sizes is negligible among our subjects.
Figure 6.10.: Algorithm for determining the threshold of horizontal saccade detection
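A minimal sketch of the thresholding step of Figure 6.10; the amplitude arrays (in µV) as input are an assumption about the data layout:

import numpy as np

def horizontal_saccade_filter(speedometer_amps, candidate_amps):
    """Per-subject threshold th = mean - std of the vertical saccadic
    amplitudes during speedometer glances (Figure 6.10); candidate
    horizontal saccades below th are discarded. Illustrative sketch."""
    amps = np.asarray(speedometer_amps, dtype=float)
    th = amps.mean() - amps.std()
    kept = [a for a in candidate_amps if a >= th]
    return kept, th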

Figure 6.11 shows the histogram of the absolute amplitude of all horizontal saccades fulfilling the criterion shown in Figure 6.10 for one of the subjects. It can be seen that small-amplitude saccades (e.g. ampem < 200 µV) occurred more often than the large-amplitude ones, which refer to the gaze shifts during glances at the side or rear-view mirror (e.g. ampem > 300 µV). First, we need a threshold to distinguish between small- and large-amplitude saccades. To this end, the k-means clustering algorithm (k = 2) (see Appendix C) was applied to divide the saccadic amplitudes into two categories: small- versus large-amplitude clusters. The threshold dividing the clusters is shown in Figure 6.11 as a solid line. For most of the subjects, the number of saccades belonging to the small-amplitude cluster (Ns) was larger than that of the other cluster (Nl), which makes the further analysis and the comparison of the clusters difficult. Therefore, as the second step, the numbers of occurrences of small- and large-amplitude saccades needed to be balanced. Consequently, out of the Ns saccades of the small-amplitude cluster, Nl were selected randomly. For subjects with Ns < Nl, the selection procedure was performed the other way around (see Figure 6.12). Afterwards, the occurrence of gaze shift-induced blinks with respect to the saccadic amplitude was studied. The selection of Nl out of Ns events (or vice versa) was repeated at least 100 times for each subject separately to ensure that the result does not depend on the particular saccades chosen from the small/large-amplitude cluster. The dark histograms of Figure 6.13 show the amplitude of the horizontal saccades of both clusters for the 100th iteration. The solid line also indicates the border between the clusters. After balancing the number of events in each cluster, it has been calculated how many of the saccades were accompanied by blinks. The histograms of the amplitudes of these saccades are shown in light color in Figure 6.13.
The scatter plot in Figure 6.14 quantifies the result of the histograms of Figure 6.13. The
numbers of small-amplitude saccades accompanied by blinks (x-axis) are plotted versus the
same values for the large-amplitude saccades (y-axis) in percent, averaged over the 100 repetitions of selecting Nl out of Ns (or vice versa). For all 12 subjects, the values are on the left side of the y =
x line. This implies that independent of the fact of whether the subjects were instructed to
carry out a visuomotor secondary task in addition to the driving task, they automatically
blinked more often during large-amplitude saccades in comparison to the small-amplitude
ones. In other words, for gaze shifts with larger amounts, the probability of a simultaneous
blink occurrence was higher.

Figure 6.11.: Histogram: absolute amplitude of saccades out of the H(n) signal for subject S1

Figure 6.12.: The algorithm for balancing the numbers of small- (Ns) and large-amplitude (Nl) saccades
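One iteration of this balancing step may be sketched as follows; the fixed random seed is an assumption added for reproducibility:

import numpy as np

def balance_clusters(small_amps, large_amps, seed=0):
    """Randomly subsample the larger cluster so that both contain equally
    many saccades (Figure 6.12); repeated at least 100 times per subject
    in the analysis. Illustrative sketch."""
    rng = np.random.default_rng(seed)
    ns, nl = len(small_amps), len(large_amps)
    if ns >= nl:
        small_amps = rng.choice(small_amps, size=nl, replace=False)
    else:
        large_amps = rng.choice(large_amps, size=ns, replace=False)
    return np.asarray(small_amps), np.asarray(large_amps)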

To quantify whether the categorical data “saccade amplitude: small/large” and “occurrence of gaze shift-induced blink: yes/no” were independent or not, the contingency table (cross tabulation) is studied. Table 6.2 shows the result for subject S1.

Table 6.2.: Contingency table: saccade amplitude versus occurrence of gaze shift-induced blinks, for subject S1, first selection procedure

                                            amplitude of saccades
  events                                    small    large    total
  occurrence of gaze         yes            151      321      472
  shift-induced blink        no             228      58       286
  total                                     379      379

By applying Pearson's chi-square test (see Appendix D.8), it can be shown whether the observed categorical data were related significantly to each other. Therefore, the H0 hypothesis is formulated as “there was no relationship between the two mentioned categories”. By considering the confidence level of 95% (α = 0.05), for all subjects (except for subject S11), the p-values were always smaller than 0.001, which leads to the rejection of the H0 hypothesis. Therefore, the amplitude of the gaze shift was responsible for inducing a blink, so that the larger the amount of the gaze shift, the more probable was the blink occurrence.
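The test itself is a single call in SciPy; the counts below reproduce Table 6.2 for subject S1:

from scipy.stats import chi2_contingency

# Rows: blink yes/no; columns: small/large saccade amplitude (Table 6.2)
table = [[151, 321],
         [228, 58]]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.2e}")   # p << 0.001, reject H0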

Summary

Figure 6.13.: Normalized histogram: amplitude of all horizontal saccades (dark bars) and those accompanied by blinks (light bars) for 12 subjects

Figure 6.14.: Scatter plot: number of saccades (in percent) time-locked to blinks with respect to their amplitude, i.e. small and large

In this chapter, we discussed the occurrence of gaze shift-induced versus spontaneous blinks in real road and simulated driving. It was shown that during a visuomotor secondary task performed in a real road scenario comparable with the navigation system's demand, gaze shifts
induced the occurrence of blinks. This was also unrelated to the variable time-on-task. For 14 subjects out of 26, this effect was independent of the gaze shift direction. On the other hand, for 8 subjects, the occurrence of blinks was more probable during gaze shifts towards the road. All in all, it seems that gaze shifts towards the road generally induce the occurrence of blinks to a larger extent.
Comparing the blink rate during the secondary tasks and the driving task, it was shown that the differences between the means were statistically significant, stating that performing a secondary task (either visuomotor or auditory) affects the blink rate. Moreover, it was shown that the
frequency of the gaze shifts during the visual distraction modulated the occurrence of blinks.
On the other hand, results of the experiment in a driving simulator led to a positive
correlation between the amount of gaze shifts and the occurrence of blinks in the case of no
secondary task. This means that the larger the amount of the gaze shift, the higher is the
probability of blink occurrence. Consequently, this study suggests that those who solely consider the blink rate as a drowsiness indicator should handle gaze shift-induced blinks differently from spontaneous ones, particularly if the driver is visually or cognitively distracted.
As mentioned in Sections 4.2 and 4.3, except for the experiment which was studied in this chapter, all other experiments were designed such that no secondary tasks were allowed to be performed during the driving task. Therefore, the number of gaze shift-induced blinks was much smaller in comparison to the dominant number of non-gaze shift-induced blinks. However, in real life, scenarios similar to our experiment with the visuomotor secondary task occur very often, e.g. while entering data into the navigation system. Therefore, the wrong interpretation of the changed blink frequency due to gaze shifts should be avoided by the drowsiness warning system. A possible solution for tackling this issue is to conditionally activate and deactivate the warning system. If the driver starts operating the center console elements, as an example, the warning system deactivates. Therefore, all induced blinks occurring due to gaze shifts between the center console and the road ahead are disregarded. As soon as the center console is no longer being operated, the warning system reactivates and considers the detected blinks.
7. Extraction and evaluation of the eye movement features

In Chapter 5, we explained how to detect the eye movements relevant to driver drowsiness, i.e. eye blinks and saccades, in the EOG signals. Based on the detected events, in Chapter 6, we investigated the relationship between the occurrences of the mentioned events. In this chapter, first, two approaches for aggregating attributes and features of the detected events are introduced and discussed. Our first aggregation approach is carried out with respect to the collected kss values. The second approach, however, benefits from quick changes of the drowsiness level in the course of time. Since features based on physiological measures are highly individual and vary from one subject to the next, we propose two baselining methods to deal with this issue. Afterwards, it is explained which features might be of interest for describing the driver's state of vigilance. Some of the features have not been defined consistently in previous studies. In this work, by considering all definitions, 19 well-defined features are introduced and extracted for each detected event. On this account, this work is one of the most comprehensive studies on eye blink features for in-vehicle applications and under real driving conditions. For the well-known features, a detailed literature review is provided and our findings regarding the features' evolution due to drowsiness are compared with those of other studies. In addition, it is shown whether the extracted features change significantly shortly before the occurrence of the first safety-critical event in comparison to the beginning of the drive. To this end, the lane-keeping-based and eye movement-based drowsiness detection methods are challenged. Afterwards, based on a correlation analysis, the linear and non-linear relationships between each feature and driver drowsiness are studied. Since the quality of driver observation cameras in detecting eye blinks will not be as high as that of the EOG, in the last section of this chapter, we investigate the possible peak amplitude loss of features for the case that a camera replaces the EOG.

7.1. Preprocessing of eye movement features

Before introducing the relevant features, we discuss here feature aggregation approaches in order to reduce the number of samples of the feature space, which is the space containing all extracted features. The reason is that drowsiness is a phenomenon which evolves over time. Therefore, it is unlikely to observe distinguishable characteristic differences between, e.g., two successive eye blinks. Consequently, for each feature a statistical measure, such as the mean over a specific time interval, is calculated. This approach was also used in several studies, as listed in Table 7.1. According to the table, the time interval for aggregating features varies from 1 s (Picot et al., 2010) to 6 min (Knipling and Wierwille, 1994) and the most frequently used statistical measure is the mean. In this work, we only used the mean, as it seems to be more informative regarding the kss values. An analysis of other statistical measures is provided in Appendix F.1.

Table 7.1.: Literature review of feature aggregation and the calculated statistical measure

  Author                           aggregation window    statistical measure
  Knipling and Wierwille (1994)    6 min                 mean
  Morris and Miller (1996)         task duration         mean
  Dinges and Grace (1998)          1 min                 –
  Caffier et al. (2003)            –                     mean
  Johns (2003)                     1 to 6 min            –
  Svensson (2004)                  20 s for EEG          –
  Åkerstedt et al. (2005)          5 min                 mean, standard deviation
  Bergasa et al. (2006)            30 s                  mean
  Ingre et al. (2006)              5 min                 mean, standard deviation
  Johns et al. (2007)              1 min                 mean, standard deviation
  Papadelis et al. (2007)          1 min; 20 s           mean; mean, sum, maximum
  Damousis and Tzovaras (2008)     5 min                 mean, standard deviation, median
  Schleicher et al. (2008)         20 s                  mean
  Hu and Zheng (2009)              0.5 Hz                mean, median, ewma, ewvar
  Friedrichs and Yang (2010a)      –                     –
  Rosario et al. (2010)            20 s                  mean
  Picot et al. (2010)              1 s                   mean, standard deviation
  Sommer and Golz (2010)           3 min; 8 s            mean, maximum, minimum
  Wei and Lu (2012)                –                     –

7.1.1. KSS input-based feature aggregation

One of the methods applied for aggregating features is based on the kss inputs. As
mentioned in Section 2.2, the time interval around each kss input is expected to be correlated
to the largest extent with the true driver’s state of vigilance. Therefore, as the first
aggregation method, we calculated the mean value of each feature over the last 5 min before a
kss input, i.e. over the time interval [tkss − 5 min, tkss], where tkss denotes the time instant at
which the kss value was collected. This is shown pictorially in Figure 7.1 for kss collection
in 15-min intervals.

Figure 7.1.: kss input-based feature aggregation method
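As a hedged illustration of this first aggregation method, the following sketch computes the 5-min mean before each kss input; the function and variable names are placeholders and not the code used in this work.

    import numpy as np

    def aggregate_kss_based(t_event, feature, t_kss, window_s=5 * 60):
        # Mean of a per-event feature over [t_kss - 5 min, t_kss] for each kss input.
        # t_event: time stamps of the detected events in seconds
        # feature: feature value of each event (same length as t_event)
        # t_kss:   time instants at which the kss values were collected
        t_event, feature = np.asarray(t_event), np.asarray(feature)
        means = []
        for tk in t_kss:
            mask = (t_event >= tk - window_s) & (t_event <= tk)
            # Without any event in the window, the aggregated sample is undefined.
            means.append(feature[mask].mean() if mask.any() else np.nan)
        return np.array(means)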

Clearly, an advantage of this method is that only parts of the drive, for which the self-rating
information is available, are analyzed further. However, a problem arises when applying this
approach: valuable and expensively collected data outside the mentioned time interval, i.e. the
intervals between kss inputs, are ignored. In fact, the resulting number of available

feature samples here depends on the number of times the kss data is collected. We mentioned
that kss data cannot be collected very frequently for the sake of monotonicity of the driving
condition. As a result, for each hour of driving, only a small number of kss values will be
available, namely only 4 values when collecting kss in 15-min intervals. Hence, this method is
not suitable if only a small number of kss values is available. Overall, this method might lead to a
set of features which is not very informative due to the lack of observation samples. Another
drawback of this method is its reliance on the collected kss values.
Schmidt et al. (2011) also used 5 min before a kss input for evaluating the short-term effect of
verbal assessment of driver vigilance which is in agreement with our approach.
In this work, in total 391 kss values were recorded and correspondingly 391 5-min windows
were available for extracting features. Figure 7.2(a) shows the distribution of the relative
frequency for each kss value in percent, i.e. the number of occurrences for each kss value with
respect to the total number of available kss values. The numbers on the top of the bars denote
the number of counts.

Figure 7.2.: Relative frequency of kss values for the two feature aggregation methods: (a) kss input-based method, (b) drive time-based method

7.1.2. Drive time-based feature aggregation

Another approach for aggregating features is considering a 1-min interval independent of the
kss inputs similar to Dinges and Grace (1998), Johns et al. (2007) and Papadelis et al. (2007).
This method has several advantages. First of all, all parts of the drive are analyzed, without
discarding any data. Moreover, on the contrary to the previous approach, for one hour of
driving, 60 feature values are extracted. Consequently, by analysis of a larger amount of data,
the resulting set of features is more informative.
In this work, averaging over 1-min intervals with no overlap was applied, beginning from the
first minute of driving and excluding noisy parts of the EOG data. Figure 7.3 pictorially
shows this feature aggregation approach.
Non-overlapping windows make the extracted feature less statistically dependent in time1.
Two major drawbacks of this method are the underlying assumptions about the kss values.
First, similar to the previous approach, this method also strongly relies on the preciseness of
the subjective measure. Secondly, in order to have a corresponding kss value for each feature
extraction interval, we have assumed that a kss value remains unchanged between two
successive
1
In Chapter 6, we supposed that the occurrence of an eye movement, e.g. a blink, is independent in time of that of
other blinks. Other features of adjacent blinks, however, might be correlated with each other.

Figure 7.3.: Drive time-based feature aggregation method

kss inputs (see Figure 7.3). On this account, a preceding kss value was used up to the next
input. In other words, we have assigned a specific self-rating value to those parts of the drive
for which the subject did not rate his level of vigilance.
Comparison of Figures 7.3 and 7.1 reveals that each of the introduced approaches assigns a
different kss value to the time interval of [tkss −5 min, tkss]. The first approach uses the
following kss value, while the second approach holds the preceding value up to the time
instant of the collection of a new kss value, namely tkss.
Similar to Figure 7.2(a), Figure 7.2(b) shows the relative frequency of kss values given the
drive time-based feature aggregation method. In total, 4021 samples are available for each
extracted feature, which is equal to 4021 min of driving (about 67 h).
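A corresponding sketch of the drive time-based aggregation, again with assumed variable names, computes non-overlapping 1-min means and holds the preceding kss value up to the next input:

    import numpy as np

    def aggregate_drive_time(t_event, feature, t_kss, kss, drive_end_s, window_s=60):
        # Non-overlapping 1-min means plus a kss label held from the preceding input;
        # windows before the first kss input are assigned the first collected value,
        # which is an assumption of this sketch.
        t_event, feature = np.asarray(t_event), np.asarray(feature)
        t_kss, kss = np.asarray(t_kss), np.asarray(kss)
        means, labels = [], []
        for t0 in np.arange(0.0, drive_end_s, window_s):
            mask = (t_event >= t0) & (t_event < t0 + window_s)
            means.append(feature[mask].mean() if mask.any() else np.nan)
            idx = np.searchsorted(t_kss, t0, side="right") - 1  # preceding kss input
            labels.append(kss[max(idx, 0)])
        return np.array(means), np.array(labels)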

7.1.3. Feature baselining

Since biological measures like blink characteristics are highly individual and vary from one
subject to the next (Dong et al., 2011), a baselining method is applied to suppress irrelevant
characteristics for further analysis. Assuming that all subjects were awake at the beginning of
the drive, which is not always the case in real life though, the average over the first tbaseline min
of each feature (e.g. tbaseline = 5 or 10 min) is used as the normalization factor for the rest of
that drive, namely

normalization factor = mean(feature([t = 0, · · · , t = tbaseline ])) . (7.1)

Therefore, for each sample xi of a feature we have

    xi,baselined = xi / normalization factor, xi ∈ feature. (7.2)

In addition to (7.2), the standard score defined as

    xi,z-score = (xi − µ) / σ (7.3)

can be explored as well, where µ and σ refer to the mean and standard deviation of the corresponding feature calculated over the entire set of samples. Simon (2013) suggested considering µawake and σawake, which are calculated with respect to the awake phase of the drive, e.g. the first 20 min of the drive.
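Both baselining variants are simple to express; a minimal sketch, assuming one aggregated sample per row and an optional awake-phase mask, could look as follows:

    import numpy as np

    def baseline_ratio(t, x, t_baseline_s=10 * 60):
        # Division by the mean over the first t_baseline minutes of the drive,
        # i.e. the normalization factor of (7.1) applied as in (7.2).
        t, x = np.asarray(t), np.asarray(x)
        normalization_factor = x[t <= t_baseline_s].mean()
        return x / normalization_factor

    def baseline_zscore(x, awake_mask=None):
        # Standard score (7.3); if awake_mask is given, mu and sigma are computed
        # over the awake phase only, as suggested by Simon (2013).
        x = np.asarray(x)
        ref = x[awake_mask] if awake_mask is not None else x
        return (x - ref.mean()) / ref.std()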
Among these methods, we obtained the highest correlation values between each of the
baselined features and kss values for the first approach with tbaseline = 10 min. Figures 7.4(a)
and 7.4(b)

show one of the extracted blink features (MOV will be defined in the next section) versus kss
values before and after baselining, respectively. Obviously, the growing trend of this feature
from kss 1 to 3 in Figure 7.4(a) is the result of individual differences in the values of this
feature and is consequently drowsiness-irrelevant. After baselining, this misleading trend is
filtered out. Hence, this preprocessing step of the extracted features makes a crucial contribution
towards improved drowsiness detection results, especially in the next step, which is
classification.


Figure 7.4.: MOV feature (a) before and (b) after baselining based on (7.2)

7.2. Eye blink features

This section introduces the features extracted in this work and discusses their association
with driver drowsiness based on the kss values. Additionally, for each feature a
comprehensive literature review is provided to give insight into the features and to compare
the findings of this study with those of previous studies. Some of these features were
named identically in previous studies despite their different definitions. Here, we take all
definitions into consideration for the sake of completeness.
Table 7.2 lists 19 features extracted in this study for detected events shown in Figure 7.5. In
the following, these features are defined and it is explained how to calculate them. All plots
refer to drive time-based features.
• A: Blink amplitude is defined as the minimum of the rise amplitude A1 and the fall
amplitude A2, i.e. A = min(A1, A2), where

A1 = V (middle) − V (start) (7.4)


A2 = |V (end) − V (middle)|. (7.5)

Start, middle and end points of a blink are shown in Figure 7.5. This feature was also
defined earlier in Chapter 5 in (5.3) for event detection. There, it was clarified why the
minimum value of A1 and A2 was used.
The study by Morris and Miller (1996) on sleep-deprived pilots showed a significant drop
in blink amplitude along with increase of pilot errors. They related this phenomenon
to “a lower starting position of the eyelid” which leads to a reduced distance between
eyelids to be traveled. This feature was also acknowledged as the best single predictor
of the performance error in their study. Jammes et al. (2008), on the contrary, reported
increased amplitude of very long blinks based on visual inspection. Svensson (2004),
who

Table 7.2.: Extracted blink features

No.         Feature   Description
1           A         Amplitude
2           E         Energy
3           MCV       Maximum closing velocity
4           MOV       Maximum opening velocity
5           A/MCV     Ratio between A and MCV
6           A/MOV     Ratio between A and MOV
7           ACV       Average closing velocity
8           AOV       Average opening velocity
9           F         Frequency
10          T         Duration
11          Tc        Closing duration
12          To        Opening duration
13          Tcl,1     Closed duration (first definition)
14          Tcl,2     Closed duration (second definition)
15          Tro       Delay of reopening
16          perclos   Ratio between Tcl,2 and T
17, 18, 19  Tx        Duration from x% of the rise amplitude to
                      x% of the fall amplitude (x = 50, 80, 90)

studied the relationship between A and the blink velocity, assumed that this feature
evolves linearly with drowsiness. The results showed a drop of this feature due to
drowsiness.
Figures F.2 and F.3 show the boxplots of the normalized A, namely A/max(A), versus kss values
for all subjects except subject S2; for this subject, all features are shown together
in Figure F.40. Moreover, Figure 7.6 shows all baselined drive time-based features and
their overall trend versus kss values regarding all subjects. According to these figures,
in our experiments, A has decreased for most of the subjects, as kss has increased (e.g.
subjects S1, S5, S13, S18 and S21). Nevertheless, for some subjects such as S4, S6 and
S10, an increasing trend of A is observable. This means that drowsiness led to blinks
with larger amplitude for these subjects as also reported by Jammes et al. (2008). The
reason might be that the subjects tried to keep themselves awake and to fight against
drowsiness by opening their eyelids to a larger degree. This results in larger blink
amplitudes. Interestingly, subjects with neither an increasing nor a decreasing trend in
the evolution of A (S29, S41, S40 and S43) have rated themselves mostly awake which is
highly plausible.
• E: Energy of a blink is defined as

    E = Σ_{n=start}^{end} (V (n) − V (start))² . (7.6)

Clearly, the energy of a blink strongly depends on the recorded V (n) values. Therefore,
the energy of two blinks with completely similar forms might differ depending on the
drift existing in the V (n) signal. In Chapter 5, approaches were introduced to remove
the drift of the EOG signals. Despite applying the drift removal step to all EOG signals
before event detection, subtracting the amplitude of the start value, namely V (start),
from all other values counteracts the negative effect of the drift in the calculation of E.
This feature was also used by Friedrichs and Yang (2010a). Picot et al. (2010) calculated E
only for the closing phase. In addition to the mentioned definition of E, another
approach is to calculate energy within different frequency bands of the EOG similar to
the analysis


Figure 7.5.: V (n) and its derivative V ′(n) representing eye blinks in the awake phase (panels (a), (c), (e)) and the drowsy phase (panels (b), (d), (f)) with the corresponding features

Figure 7.6.: Boxplot of normalized drive time-based features combined for all subjects versus kss values

of EEG. This was suggested by Wei and Lu (2012) who asserted that the ratio between
energy of low and high frequency bands of EOG is more important for assessing the
driver vigilance than analyzing each frequency band separately. The reason is that
unlike the high frequency eyelid movements, the low frequency movements occur more
often during the drowsy phases.
Figures F.4 and F.5 show the relationship between this feature and kss values. Similar
to A, the trend of E for each individual subject is decreasing as drowsiness increases.
However, the boxplot in Figure 7.6 regarding all subjects does not show any specific
overall trend. Nevertheless, the interquartile range and the difference between the
whiskers (see Appendix B) increase along with the kss values.
• MCV, MOV: Maximum closing/opening velocity is the maximum value of |V ′(n)| during
the closing and opening phases, as shown in Figures 7.5(c) and 7.5(d). In this work, we
extracted these features out of the V ′(n) signal calculated by the Savitzky-Golay filter with
empirically selected parameters of 5 for the polynomial order and 13 for the frame size.
According to Hargutt (2003) and Holmqvist et al. (2011), the closing phase of the eyes
occurs much faster than the opening phase. In our data, this was also the case, as shown
in Figure 7.7 which compares the distribution of the MCV with that of the MOV. These
features are shown in Figures F.6, F.7, F.8 and F.9 versus kss values. It can be seen that
the overall trends of the velocities are decreasing with increasing drowsiness, e.g. for
subjects S4, S5, S12, S13 and S15.



Figure 7.7.: Histogram of MCV and MOV with the estimated (est.) distribution (dist.) curves

• A/MCV, A/MOV: Johns (2003) introduced the amplitude-velocity ratio A/MCV as a
feature positively correlated with drowsiness. The ratio A/MOV was defined
analogously, because Johns and Tucker (2005) showed that it also increases
due to drowsiness. Johns et al. (2007) asserted that these two features are almost in the
same range for different subjects and that it is therefore not necessary to “adjust them for
individuals”. Damousis and Tzovaras (2008) used the A/MCV feature as an input to their
fuzzy system for drowsiness detection.
Figures F.10, F.11, F.12 and F.13 show these features versus kss values. For drivers who
felt drowsy at the end of the drive (kss ≥ 7), different trends are found. For subjects S4,
S6 and S10, as an example, these features increased, while for subjects S23 and S16 the
values of the features remained almost stable during the drives. It can be seen that the range
of both features is almost the same considering all subjects, which agrees with the
statement of Johns et al. (2007).

• ACV, AOV: The average closing and opening velocities were calculated by A1/Tc and
A2/To, respectively, where Tc and To refer to the closing and opening durations. For blinks
of the drowsy phase, the equivalent middle points were used, namely middle1 and middle2,
as shown in Figure 7.5(b). Figure 7.8 shows that, in addition to the MCV, the average
closing velocity of the eyelids is also, in general, larger than the average opening velocity.
Figures F.14, F.17, F.16 and F.15 show these two features versus kss values. Similar
to the trends of MCV and MOV, these two features also have overall decreasing values
versus kss, as also mentioned by Thorslund (2003). It should be mentioned that
Thorslund (2003) defined A/T as half of the blink velocity, which might not always be
valid, because it takes the duration of the closed phase of the blink into consideration as
well. Damousis and Tzovaras (2008) used the inverses of these two features together with
other features in a fuzzy logic-based system.



Figure 7.8.: Histogram of ACV and AOV with estimated (est.) distribution (dist.) curves. The
outliers (values > 10 mV/s) are not shown.

• F: Blink frequency is defined as the number of blinks per minute (or per another pre-defined
interval), which increases in the earlier phase of drowsiness and decreases as drowsiness
increases, i.e. similar to the shape of an inverse U (Platho et al., 2013).
Here, we calculated F within 1-min intervals. This feature is also called blink rate and,
since its value only depends on the detection of blinks (not on the corresponding start
and end points), it is referred to as the most easily measurable blink feature (Holmqvist et
al., 2011).
On average, human blink rates range from 15 to 20 per min regarding spontaneous
blinks (Records, 1979). This range decreases to 3 to 7 blinks per min during reading
(Holmqvist et al., 2011). As mentioned in Chapter 6, this feature varies differently, while
performing a secondary visuomotor task. According to Records (1979), the blink rate
decreases as visual attention increases. The reason is the prevention of information loss
during eye closure moments. Since in our drowsiness-related experiments it was not
allowed to perform any secondary tasks, we suppose that none of the occurring blinks
were task-related.
This feature is also highly dependent on the humidity of the vehicle interior. With the
air conditioning running or in a very dry condition, blink frequency might be different
(Friedrichs and Yang, 2010a). Friedrichs and Yang (2010a) also reported a large between-subject
variation of this feature. Moreover, Johns et al. (2007) believe that this feature is
“too task dependent and too subject-specific” to be considered for drowsiness detection. The
study by Hargutt (2003) also showed that this feature is more correlated with
information processing and with how demanding a task is (e.g. task duration and time-on-task).
Regarding these findings, some researchers have serious doubts about the
usefulness of blinks and their

corresponding features as a drowsiness indicator due to their variation and external


stimuli such as “road lighting, oncoming headlights, the air temperature and state of the
ventilation system” (according to Horne and Reyner (1996) cited by Liu et al., 2009). In a
study by Papadelis et al. (2007), also no significant difference of this feature between the
awake (the first 15 min) and the drowsy (the last 15 min) phases was found. Moreover,
driving time showed no significant interaction with F . Interestingly, right before a
driving error, however, F increased significantly in comparison to the beginning of the
measurement. Similarly, in a study on 60 subjects, blinks were measured at the
beginning and end of a working day (Caffier et al., 2003). Although no significant
difference between alert and drowsy phases was found, a slightly decreased blink rate
during drowsy phase was reported. However, these results contradict the findings of
Morris and Miller (1996) about the efficiency of F in a moving-base flight simulator with
10 sleep-deprived pilots.
Apart from the mentioned studies, which analyzed the interaction of F with drowsiness
separately, other studies showed that the combination of this feature with other features
leads to a satisfactory estimation of driver’s state of vigilance (Bergasa et al., 2006;
Suzuki et al., 2006; Friedrichs and Yang, 2010a).
Figures F.18 and F.19 show the boxplot of F versus kss values. By visual inspection of
the figures, it can be inferred that blink frequency has increased due to drowsiness for 8
subjects such as S3, S13 and S15. Nevertheless, it suffers from large individual
differences. For subjects S1 and S18, it has dropped, while for subjects S20, S21 and S22
it has not changed significantly. The inverse U-form shape mentioned in Platho et al.
(2013) is also not found in our data set.
• T : Blink duration is the time interval between the start and the end point of a blink.
Clearly, the value of this feature depends on the considered start and end points of the
blink which are not defined consistently among researchers. Our definition of start and
end points of a blink is in agreement with the points chosen by Hu and Zheng (2009)
and Wei and Lu (2012). On the contrary, Schleicher et al. (2008), as an example,
considered the point at which the maximum velocity of the eye opening phase occurs as
the end point. Other works also used our 17th feature, namely T50, as the blink duration
(Morris and Miller, 1996; Thorslund, 2003; Johns, 2003; Åkerstedt et al., 2005; Ingre et
al., 2006; Damousis and Tzovaras, 2008; Picot et al., 2010; Friedrichs and Yang, 2010a).
Caffier et al. (2003) assumed 50 ms as the shortest possible blink duration and blinks
with 300 ms < T < 500 ms as the “long-closure blink duration”. In their study with 60
subjects, they found a significant increase of the blink duration from before (T = 192 ± 39
ms) to after (T = 316 ± 62 ms) a usual working day using anova. Moreover, T was
significantly correlated with the subjective self-rating (Pearson correlation coefficient ρp =
−0.358).
Figures F.20 and F.21 show the relationship between feature T and kss values. The
overall observed trend of T is an increasing one for drowsy subjects. For some subjects,
however, the trend of T is neutral, such as subjects S16, S17, S19, S20 and S21, although
they rated themselves as drowsy. Surprisingly, for subject S8 the trend is negative,
meaning that the blinks became shorter as the subject felt drowsy. For awake subjects, the
feature remained almost stable.
• Tc, To: Durations of the closing and opening phases as shown in Figure 7.5. The
distributions of these two features are shown in Figure 7.9. In our experiments, on average
Tc = 120 ± 30 ms, while To = 247 ± 40 ms, i.e. the closing phase is about half as long as
the opening phase. In the study by Caffier et al. (2003), To was called the reopening time

and was shown to be highly correlated with T (ρp = 0.939), contrary to Tc (ρp = 0.310).
This is clearly due to the fact that To covers a larger part of T in comparison to Tc. In
their study, they found a 10% and 30% increase of Tc and To, respectively, from before to
after a working day. However, these findings were not significant for all of their
subjects. Based on the weak correlation of Tc with the subjective self-rating, they had
doubts about the performance of Tc as a single measure for drowsiness detection.
Moreover, they used Tc < 150 ms as an additional criterion for blink detection.
Damousis and Tzovaras (2008) aggregated these features over 20-s time intervals in a
study based on a fuzzy system.


Figure 7.9.: Histogram of Tc and To with the estimated (est.) distribution (dist.) curves

Figures F.22, F.23, F.24 and F.25 show Tc and To versus kss values. The plots indicate
that, overall, Tc increased while To varied differently during drowsiness. This agrees
with the overall boxplots shown in Figure 7.6. For S11, as an example (Figures F.22 and
F.24), despite the increasing trend of Tc, To remained almost constant.
By assuming that drowsiness affects both features linearly, we fitted two lines to the
scatter plot of these baselined features versus kss values as shown in Figure 7.10. They
show that from kss = 1 to kss = 9, Tc increased by a factor of up to 1.5, while To scaled
only by a factor of 1.1.

Figure 7.10.: Best linear fit to all baselined feature values of Tc (y = 0.054x + 0.82) and To (y = 0.013x + 0.98)
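The linear fits reported here and in Figure 7.12 correspond to an ordinary least-squares line over all baselined samples; a minimal sketch, with placeholder arrays taking the Tc fit from Figure 7.10 as example values:

    import numpy as np

    # Assumed arrays: one entry per aggregated, baselined feature sample.
    kss = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)  # placeholder
    tc_baselined = 0.054 * kss + 0.82                          # placeholder values
    slope, intercept = np.polyfit(kss, tc_baselined, deg=1)   # best linear fit
    # Growth factor from kss = 1 to kss = 9 (about 1.5 for Tc, cf. the text):
    growth = (slope * 9 + intercept) / (slope * 1 + intercept)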

• Tcl,1, Tcl,2: Closed duration is the time interval during which the eyes are closed. Two
definitions are used here. Tcl,1 is the time interval between the end of the closing phase
and the start of the opening phase. This definition is very similar to the plateau
duration defined by Friedrichs and Yang (2010a). Schleicher et al. (2008) called this
feature “delay of reopening”. The other definition is taken from Wei and Lu (2012) as Tcl,2 =
Tcl,1 + To. Caffier et al. (2003), however, introduced our 19th feature, namely T90, as the
closed time.

Figures F.26, F.27, F.28 and F.29 show both of the introduced features versus kss values.
It can be seen that during the awake phase, Tcl,1 did not change at all and remained the
same as one measured sample at 50 Hz, i.e. 20 ms. At kss≥7, larger values of Tcl,1 were
measured e.g. for S10, S15 and S16.

Figure 7.11 shows how each element of the blink duration evolved due to drowsiness
regarding all of our subjects. The top and bottom plots show the boxplots and the mean
values of Tc, Tcl,1 and To, respectively.


Figure 7.11.: Comparison between Tc, Tcl,1 and To during the awake and drowsy phases of the drive for all subjects: (a) boxplot representation, (b) calculated mean values

• Tro: Tro denotes the delay of reopening of the eye and is defined as the time interval
between the start of the opening phase and the point of maximum velocity during this
phase (Damousis and Tzovaras, 2008; Hu and Zheng, 2009) as shown in Figures 7.5(c)
and 7.5(d). As also mentioned before, Schleicher et al. (2008) used the stopping point of
this feature as the end point of a detected blink.

Figures F.30, F.31 and 7.6 show that this feature increased during the drowsy phases of
the drive. The best linear fit to the values of this feature shown in Figure 7.12 indicates
an increase of 1.6 times due to drowsiness.

• perclos: As one of the most popular features for drowsiness detection, it was first
introduced by Knipling and Wierwille (1994) and refers to the proportion of time for which the
eyes are more than 80% closed (percentage of eye closure). This feature is originally a
camera-related feature and is usually calculated accumulatively over a pre-defined
interval between 1 and 5 min (Sommer and Golz, 2010).

In this work, on the contrary to eye tracking cameras, this feature is calculated by duration-

Figure 7.12.: Best linear fit to all baselined feature values of Tro (y = 0.069x + 0.79)

based features of the EOG (instead of amplitude-based ones) (Wei and Lu, 2012), i.e.

    perclos = Tcl,2 / T. (7.7)

Another definition of perclos, which is mostly used given that the eye movement data
is collected by a camera (Li et al., 2011b; Horak, 2011), is as follows:

    perclos = (T80 / T20) × 100. (7.8)

Tx, x ∈ {20, 80}, corresponds to our last feature and refers to the duration between the following
points: from x% of the absolute rise amplitude |A1| to x% of the absolute fall amplitude |A2|.
In the following, the studies on the camera-based perclos are reviewed. Li et al. (2011b)
reported the perclos feature as the best indicator of drowsiness. Sigari (2009), who also
studied a camera-based drowsiness detection method, suggested comparing this feature
with a threshold for drowsiness detection, as its increase is expected due to drowsiness.
In agreement with the previous study, Bergasa et al. (2006) and He et al. (2010) also
reported an increase of perclos during drowsiness in both driving simulator and real
night drives. “The perclos measure indicated accumulative eye closure duration over
time, excluding the time spent on normal eye blinks” is the definition used by Bergasa et al.
(2006, page 68) and Rosario et al. (2010, page 283). Rosario et al. (2010) studied the combination
of perclos and EEG (power of θ-waves) aggregated over 20-s intervals. Based on the
high correlation found between them during drowsy driving, they suggested perclos
as a non-intrusive, reliable ground truth. Bergasa et al. (2006), however, calculated the
moving average of perclos within 30-s windows. In their study, although perclos
was the best single feature among the studied features, combining it with other
features and applying a fuzzy classifier achieved even better results than considering it
as a single feature.
Picot et al. (2009) showed that perclos can be measured with a 200-Hz video camera
with the same accuracy as with the EOG. In a later study, Picot et al. (2010) found
perclos as the best feature for drowsiness detection based on an experiment in a
driving simulator with 20 alert and sleep-deprived subjects who drove for 90 min. By
applying a fuzzy logic-based fusion method, the true positive rate increased only
negligibly, while the false positive rate improved by up to 8%. Dinges and Grace (1998)
also acknowledged this feature for its correlation with vigilance and categorized the
following metrics as perclos:
– perclos70 : proportion of time for which the eyes are more than 70% closed. This

feature was used by Friedrichs and Yang (2010a) and Li et al. (2011b) as well.
– perclos80 : as mentioned at the beginning.
– eyemeas (em): mean square percentage of the eyelid closure rating.
Interestingly, Papadelis et al. (2007) considered the “per minute averaged blink duration” as a
feature, called both perclose and perclos in their work. It is not clear whether this denotes a new
feature other than the common perclos, or whether they presented their own definition
of it, since both notations were used in their work. In their study with 20 sleep-deprived
subjects during a real night drive, this feature increased significantly when comparing the
first and the last 15 min of the drive based on anova. Moreover, they reported its
increase shortly before a lane-based driving error.
In addition to the mentioned works, which found perclos as a very reliable drowsiness
measure, there are also some works which criticized it. Sommer and Golz (2010)
believed that perclos does not take the decreased average/maximum velocity of eye
movements (for both closing and opening phases) into consideration, which is an observable
consequence of drowsiness. In addition, integrating this measure over a period makes it
less responsive to temporary changes. They compared a camera-based perclos with a
combination of EEG and EOG for drowsiness detection and found the former less
informative as a drowsiness indicator. In addition, the high local correlation,
which they achieved between perclos and kss, seemed to heavily depend on the length
of the segment under investigation.
Another disadvantage of this feature is that it correlates better with drowsiness in the
late phase than in the earlier phases (Bergasa et al., 2006; Friedrichs and Yang,
2010a). In general, for drivers whose eyes remain wide open despite severe drowsiness,
this feature is not a good solution.
Barr et al. (2005) reviewed and introduced drowsiness detection systems based on perclos.
Figures F.32 and F.33 show the relationship between kss and this feature. Contrary
to the findings of other studies, which asserted an increase of perclos due to
drowsiness, in our experiments we found a drop of it. This might be due to the
definition that we have used for perclos and the fact that this feature was originally
defined in the field of camera-based drowsiness detection rather than for the EOG.
• Tx, x ∈ {50, 80, 90}: Tx reflects the duration of the blink from x% of the absolute rise
amplitude |A1| to x% of the absolute fall amplitude |A2| of the blink (see Figures
7.5(e) and 7.5(f)). As mentioned before, some studies used T50 as the feature reflecting
the duration of the blink (Morris and Miller, 1996; Thorslund, 2003; Johns, 2003;
Åkerstedt et al., 2005; Ingre et al., 2006; Damousis and Tzovaras, 2008; Picot et al.,
2010; Friedrichs and Yang, 2010a). Caffier et al. (2003) also introduced T90 as the closed
time and showed that it is correlated with T and increases due to drowsiness. T80
was also used by Hu and Zheng (2009) together with other features as an input to
their classifier system.
Morris and Miller (1996) found the combination of T50 with A and long closure rate,
defined as the number of closures longer than 500 ms per minute, as the best 3-feature
combination for predicting pilot errors due to drowsiness. Damousis and Tzovaras (2008)
used an accumulated value of T50 over a 20-s window in a “fuzzy expert system” with
other features, but only for those eye blinks whose duration were longer than 0.2 s. In a
study by Anund (2009), the alteration of T50 for 17 sleep-deprived and non-sleep-
deprived subjects during free driving and car following situations was studied. In fact,
a complex

scenario was designed which forced the subjects to take over. During all situations, T50
was always higher for the sleep-deprived subjects, except for the takeover maneuvers,
for which it remained similar for both groups. Ingre et al. (2006) studied the interaction
between kss and T50 and found large individual differences between T50 values. They
asserted that this is to a large extent “independent of subjective sleepiness”. This finding is
in agreement with that of Friedrichs and Yang (2010a), who also observed a large between-subject
variation in T50 and consequently suggested applying a baselining step prior to
its analysis.
Similar analysis to the one performed for Tc and To in Figure 7.10 was also performed
for T50 with the best linear fit of y = 0.083 x + 0.69. This shows that in our study,
overall, T50 increased up to 1.95 times due to drowsiness. However, Åkerstedt et al.
(2005), who studied the influence of sleep deprivation due to night shift working on 10
subjects, found up to 1.4 times increase of T50 after 120 min of driving.
Figures showing the features T50, T80 and T90 versus kss are F.34, F.35, F.36, F.37, F.38
and F.39. It seems that all three features have similar increasing trends for each subject.
Moreover, if T50 has increased at higher kss values, T80 and T90 have also followed that
trend.
Figure 7.13 shows the scatter plot of T versus Tx, x ∈ {50, 80, 90}, with the best linear
fit, which underscores the relationship between Tx and T. The calculated Pearson
correlation coefficients (to be defined in Section 7.5) between them are as follows1:
ρT,T50 = 0.90, ρT,T80 = 0.82, ρT,T90 = 0.81 (all p-values < 0.001). According to these values,
the features are highly and significantly correlated with each other.
In addition to the features mentioned here, blink interval and the occurrence of blink flurries
are also mentioned in previous works as drowsiness indicator features. Wei and Lu (2012)
defined the blink interval as the time between two successive blinks. Blink flurries are at least
three blinks occurring within 1 s (Platho et al., 2013). It is clear that these features are highly
correlated with each other and with the blink frequency. Shorter blink time interval within 1
min is the result of increased number of blinks or the occurrence of flurries. Similarly, longer
blink intervals are the result of a lower blink frequency or the absence of blink flurries within
a window.
We mentioned in Sections 7.1.1 and 7.1.2 that all features have been extracted within a
pre-defined time window (1 min for drive time-based features and 5 min for kss input-based
features). As a result, the blink flurries (if they occur) might be either located within one
window or might totally be missed, if they are located on the window boundaries. In other
words, flurries which are distributed outside the window under investigation will be lost.
Moreover, the blink interval feature cannot be measured for the first and the last blinks of the
feature extraction window. Therefore, these two features are not explored in this work.
The 19 introduced features can be categorized into two groups: base versus non-base
features. We define base features as those whose values can be extracted directly from the
measured EOG signal or its derivative. In other words, a clear property of base features is
that they cannot be calculated through a combination of other features, since they can
only be measured. Here, the following features are categorized as base features: F, A
(including A1 and A2), MOV, MCV, Tc, Tcl,1, To and Tro. We consider Tx, x ∈ {50, 80, 90},
also as a base feature, although the values of A1 and A2 are required for measuring Tx. All
other features, namely E, A/MCV, A/MOV, ACV, AOV, T, perclos and Tcl,2, are non-base
features, because they are calculated (not measured) using one or two base features. This
categorization is of importance for the
1
The index p of the Pearson correlation coefficient ρp is not shown.

Figure 7.13.: Scatter plot of T versus Tx, x ∈ {50, 80, 90}. The red line shows the best linear fit: y = 0.58x + 93.27 for T50, y = 0.31x + 54.69 for T80 and y = 0.2x + 40.01 for T90.

implementation of an in-vehicle driver drowsiness detection system as a product for sale. In
fact, the measuring system, i.e. a driver observation camera, should be able to provide an
image quality which makes the extraction of the base features with acceptable precision
technically possible. If these features are of poor quality, the derived non-base features will
be even less precise.
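To make the distinction concrete, the following sketch extracts a few base features of a single detected blink directly from V (n) and its Savitzky-Golay derivative (polynomial order 5, frame size 13, as stated in Section 7.2) and derives some non-base features from them. Everything except these filter parameters and the 50-Hz sampling rate is an illustrative assumption, not the implementation used in this work; closed-phase features (Tcl,1, Tcl,2, perclos) would additionally require both middle points of a drowsy blink.

    import numpy as np
    from scipy.signal import savgol_filter

    FS = 50.0  # EOG sampling rate in Hz

    def blink_features(V, start, middle, end):
        # V: vertical EOG segment (at least 13 samples for the filter);
        # start/middle/end: sample indices of one detected blink, cf. Figure 7.5.
        V = np.asarray(V, dtype=float)
        dV = savgol_filter(V, window_length=13, polyorder=5, deriv=1, delta=1.0 / FS)

        # base features (measured directly from V and its derivative)
        A1 = V[middle] - V[start]                 # rise amplitude, (7.4)
        A2 = abs(V[end] - V[middle])              # fall amplitude, (7.5)
        A = min(A1, A2)                           # blink amplitude
        MCV = np.abs(dV[start:middle + 1]).max()  # maximum closing velocity
        MOV = np.abs(dV[middle:end + 1]).max()    # maximum opening velocity
        Tc = (middle - start) / FS                # closing duration
        To = (end - middle) / FS                  # opening duration

        # non-base features (calculated, not measured)
        E = np.sum((V[start:end + 1] - V[start]) ** 2)  # energy, (7.6)
        return dict(A=A, E=E, MCV=MCV, MOV=MOV,
                    A_MCV=A / MCV, A_MOV=A / MOV,
                    ACV=A1 / Tc, AOV=A2 / To,
                    Tc=Tc, To=To, T=Tc + To)      # T assumes a single middle point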

Table 7.3 summarizes the experimental setup of most of the works reviewed in this section.
Works for which none of this information was provided were excluded. According to
this table, drowsiness detection has mostly been explored under simulated driving. Only two
studies were conducted on real roads, namely Bergasa et al. (2006) and Friedrichs and Yang
(2010b). Except for the studies by Caffier et al. (2003) and Schleicher et al. (2008), all previous
studies had a smaller number of participants than ours. It should be mentioned that the study
by Caffier et al. (2003) was not designed as a driving task. Another issue is the participants'
state of vigilance prior to the start of the experiment. Although this information was not
provided in all studies, in most of them sleep-deprived subjects participated in the
experiments. This leads to an imbalanced data set in terms of the availability of information
about different levels of driver drowsiness and vigilance. Section 8.1.5 will provide an
in-depth discussion of this issue. As a result, the larger number of subjects, the consideration
of both real and simulated driving and the inclusion of different levels of driver vigilance in
the data set are the strengths of the experiments conducted in this work.

Table 7.3.: Literature review of the experiment setups. n. s.: not specified
Author                          nr. of subjects   real or simulated driving   sleep-deprived subjects?
Knipling and Wierwille (1994)   12                simulated                   yes
Morris and Miller (1996)        10                simulated                   n. s.
Dinges and Grace (1998)         14                none                        yes
Caffier et al. (2003)           60                none                        no
Johns (2003)                    12                simulated (only 4)          yes
Thorslund (2003)                10                simulated                   no
Svensson (2004)                 20                simulated                   yes
Åkerstedt et al. (2005)         10                simulated                   no
Johns and Tucker (2005)         5                 n. s.                       n. s.
Bergasa et al. (2006)           n. s.             real                        n. s.
Suzuki et al. (2006)            21                simulated                   n. s.
Ingre et al. (2006)             10                simulated                   no
Johns et al. (2007)             8                 simulated                   yes (not all)
Papadelis et al. (2007)         22                simulated                   yes
Damousis and Tzovaras (2008)    35                simulated                   n. s.
Jammes et al. (2008)            14                simulated                   n. s.
Schleicher et al. (2008)        129               simulated                   n. s.
Hu and Zheng (2009)             37                simulated                   yes
He et al. (2010)                n. s.             simulated                   n. s.
Friedrichs and Yang (2010a)     30                real                        no
Rosario et al. (2010)           20                simulated                   yes
Picot et al. (2010)             20                simulated                   yes
Sommer and Golz (2010)          16                simulated                   n. s.
Wei and Lu (2012)               5                 n. s.                       n. s.

A literature review of all introduced features is listed in Table 7.4. This table shows which
features have been analyzed in different studies and which trends have been found for them
with respect to drowsiness.

7.3. Saccade features

Saccades were defined in Section 3.3 and were characterized as essential movements for
performing the driving task properly. The detection method for the saccades and the
corresponding start and end points were defined in Section 5.2 according to Figure 5.5. Many
of the features extracted for blinks can also be extracted similarly for saccades, such as
frequency, amplitude, duration, maximum and average velocity. In the following, we define
these features based on H(n):
• frequency or saccade rate: defined as the number of saccades which occur within a
specified time interval. In Chapter 6, we calculated this feature within the time interval
of 1 min. As mentioned in Section 2.1.2 and shown in Figure 2.5, performing an auditory
secondary task, which is representative of a cognitive task, leads to a smaller number of
horizontal saccades. In other words, cognitive load shrinks the scanning scene which is
in agreement with the findings of Rantanen and Goldberg (1999).
• amplitude: this feature was defined in (5.4) as the difference between the amplitude
H(n) of the start and end points of a detected saccade. It also characterizes the
amount of
Table 7.4.: Literature review of the features introduced in this work. Trends versus drowsiness are either pos.: positive or neg.: negative. n. s.: the
feature was studied without its trend being specified. * reduced vigilance, ** before a driving error, *** based on another end point for blinks
Author A E MCV MOV A/MCV A/MOV ACV AOV F T Tc To Tcl,1 Tcl,2 Tro perclos T50 T80 T90
Knipling and Wierwille (1994) - - - - - - - - - - - - - - - pos. - - -
Morris and Miller (1996) neg. - - - - - - - pos. - - - - - - - n. s. - -
Dinges and Grace (1998) - - - - - - - - - - - - - - - pos. - - -
Caffier et al. (2003) - - - - - - - - neg. pos. pos. pos. - - - - - - pos.
Johns (2003) - - - - - - - - - - - - - - - - pos. - -
Hargutt (2003) - - - - pos. - - - pos.* pos. - - - - - - - - -
Thorslund (2003) neg. - - - - - - - pos. - - - - - - - pos. - -
Svensson (2004) neg. - - - - - - - pos. - - - - - - - pos. - -
Åkerstedt et al. (2005) - - - - - - - - - - - - - - - - pos. - -
Johns and Tucker (2005) - - - - pos. pos. - - - pos. pos. pos. pos. - - - pos. - -
Bergasa et al. (2006) - - - - - - - - pos. pos. - - - - - pos. - - -
Suzuki et al. (2006) - - - - - - - - n. s. - - - - - - - - - -
Ingre et al. (2006) - - - - - - - - - - - - - - - - pos. - -
Johns et al. (2007) - - - - n. s. n. s. - - - - - - - - - - n. s. - -
Papadelis et al. (2007) - - - - - - - - pos.** pos. - - - - - - - - -
Damousis and Tzovaras (2008) - - - - n. s. - - - - - n. s. - - - n. s. - n. s. - -
Jammes et al. (2008) pos. - - - - - - - - - - - - - - - - - -
Schleicher et al. (2008) - - - - - - - - - pos.*** - - pos. - - - - - -
Hu and Zheng (2009) n. s. - n. s. n. s. - - n. s. n. s. - n. s. n. s. n. s. - - n. s. - n. s. n. s. -
He et al. (2010) - - - - - - - - - - - - - - - pos. - - -
Sigari (2009) - - - - - - - - - - - - - - - pos. - - -
Friedrichs and Yang (2010a) n. s. pos. - - pos. - - - n. s. - - - - - - pos. n. s. - -
Rosario et al. (2010) - - - - - - - - - - - - - - - pos. - - -
Picot et al. (2010) - - - - n. s. - - - n. s. - - - - - - n. s. n. s. - -
Sommer and Golz (2010) - - - - - - - - - - - - - - - pos. - - -
Horak (2011) - - - - - - - - - - - - - - - n. s. - - -
Li et al. (2011b) - - - - - - - - - - - - - - - pos. - - -
Wei and Lu (2012) - n. s. n. s. n. s. - - n. s. n. s. n. s. n. s. n. s. n. s. - n. s. - n. s. - - -


gaze shift. However, the contribution of the head rotation remains undetermined. On
the contrary to blink amplitude, which is affected by drowsiness, saccadic amplitude is
a function of the angular distance to be traveled towards any destination angle. As a
result, it cannot be a drowsiness indicator in the general case.
• duration: defined as the time difference between start and end points of a saccade in
H(n). Schleicher et al. (2008) found that the standard deviation of the saccade duration
correlates best with the video-labeled drowsiness in a driving simulator study.
• maximum velocity: similar to MCV and MOV, this feature is calculated using the
derivative signal. Rowland et al. (2005) reported a drop of this feature due to sleep
deprivation.
• average velocity: defined as the ratio between amplitude and duration of a saccade.
Schleicher et al. (2008) believed that for the data collection and measurement of saccades, a
sampling rate between 500 and 1000 Hz is required. Moreover, they stated that drowsy
drivers scan the scene ahead “unsystematically” in comparison to the awake ones. According
to our observations in the conducted experiments, the occurrence of saccades depends most
of the time directly on the surrounding events. Under highly monotonous driving conditions,
saccades occur seldom and irregularly, because almost nothing outside of the vehicle attracts
driver’s attention. Therefore, in this case, a smaller number of saccades is irrelevant to
drowsiness. On the contrary, under real driving conditions with high traffic density or on
urban streets with lower vehicle speed, the drivers scan the environment more frequently
leading to a larger number of saccades, which is again independent of driver drowsiness. This
emphasizes that the assessment of the frequency and amplitude of saccades makes sense only
if the events occurring outside of the vehicle are to some extent quantifiable in terms of traffic
density, e.g. with sensors such as radars. In addition, reproducible driving scenarios, which are
only possible in driving simulators, are crucial for drawing conclusions from the conducted
experiments. Therefore, real road experiments are not suitable for assessing saccadic features
due to the varying traffic density for each subject. Here, the experiment conducted in the
driving simulator was designed to be monotonous in order to accelerate driver drowsiness.
Hence, a low saccade rate, as an example, cannot necessarily be the consequence of the
driver's low vigilance. Apart from the mentioned points, saccades are very fast eye
movements and, as a result, a higher sampling frequency is needed for assessing their
duration or velocity. On all these accounts, the saccadic features are not studied in this work.

7.4. Event-based analysis of eye blink features

In this section, the variation of the introduced features before the occurrence of an event, e.g.
a safety-critical one, is studied and compared with a reference condition. Afterwards, based
on a statistical test analysis, it is studied whether the variation of the corresponding feature,
i.e. its decrease or increase, is statistically significant or not. Such analyses provide
information about the predictability of safety-critical events based on the feature variation in
the course of time.
As mentioned in Chapter 2, Schmidt et al. (2011) considered the moments of verbal assessment
of the driver’s state of vigilance as an event and compared the variation of blink duration
during the event with a baseline and with the time interval after the event. In other studies, a
safety-critical event was investigated instead, such as a lane departure, defined as one or two
wheels outside the lane marking, hitting the rumble strip (Papadelis et al., 2007; Anund et al.,
2008) and eye closures longer than 500 ms (Schleicher et al., 2008). Afterwards, the variation
of physiological measures such as blink or EEG features were explored within the time
interval shortly before

the event occurrence in comparison to a baseline or to the time interval shortly after the event
occurrence.
In Sections 7.4.1 and 7.4.2, similarly, the moment of the first unintentional lane departure and
the first unintentional eye closure longer than 1 s are considered as safety-critical events and
the alteration of blink features shortly before their occurrences are studied.

7.4.1. Event 1: Lane departure

Here, a lane departure event was found by analyzing the lane lateral distance signal
measured by a multi-purpose camera system (Seekircher et al., 2009). This signal represented
the distance between the vehicle center and the middle of the lane. Since, in real life, not all lane
departures are unintentional, an offline video labeling was performed additionally to
validate detected lane departure events. Figure 7.14 shows examples of intentional and
unintentional lane departure events.

Figure 7.14.: Examples of intended (takeover maneuver) and unintended (lane departure) lane change events visible in the vehicle’s lateral distance signal: (a) takeover maneuver and the corresponding lane departure events, (b) lane departure event with no discontinuity

The top plot of Figure 7.14(a) depicts four discontinuity moments which occurred due to the
change to a new reference lane marking by the camera (Ebrahim, 2011). The red asterisks
show the moments when the vehicle center passed the lane marking. These discontinuities
were corrected in the bottom plot. The first two lane changes show a representative example
of a takeover maneuver lasting for about 10 s. However, based on the offline video analysis,
the third and the fourth asterisks represent an unintended lane departure due to drowsiness
with a length of 2.5 s. The durations are calculated from the time instant at which 40% of the
vehicle’s width crossed over a lane marking. This threshold, i.e. 40% of the vehicle’s width,
was found based on the 99th percentile of all measured lateral distances, excluding takeover
maneuver events1. On the other hand, offline video analysis of the drives showed that most
of the subjects did not interpret

1 Two succeeding lane changes, which occurred within at least 4 s and with a maximum gap of 10 s, were
considered as a potential takeover maneuver and were consequently excluded from the threshold calculation.
Moreover, the participants were instructed to keep right as much as possible during the experiment in accordance
with the German official traffic regulations. Therefore, lane changes always occurred in pairs.

crossing the lane marking within a certain limit as safety-critical, e.g. one tire over the lane
marking. This was evident, as their steering movements for correction were not large enough,
although they partly left the lane. In fact, the subjects steered with larger movements towards
the middle of the lane only after having reached a certain limit, not as soon as they slightly
crossed the lane marking with one tire. Since the width of a tire of the vehicle used in our
experiments (an S-Class) was 15% of the total vehicle width, a larger threshold was needed
for the definition of an unintended lane departure event. In addition to the given points, we
were looking for safety-critical lane departure events which did not occur frequently. For this
reason, the 99th percentile of all measured lateral distances is a representative value for the
threshold, which is equivalent to the moment at which about 40% of the vehicle crossed the lane
marking. The first time that the threshold was exceeded and validated by the offline video
analysis has been considered as a safety-critical lane departure event. Figure 7.14(b) also
shows an example of a lane departure, although no discontinuity is evident in the measured
lateral distance signal. Such lane departures were included in the detection of safety-critical
lane departure events, if they exceeded the mentioned threshold and were validated by the
offline video analysis.
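A hedged sketch of this event detection logic, with assumed signal names and without the indispensable offline video validation, could look as follows:

    import numpy as np

    def lane_departure_candidates(lat_dist, takeover_mask):
        # lat_dist:      corrected lateral distance of the vehicle center [m]
        # takeover_mask: True where an intended lane change (takeover) was labeled
        lat_dist = np.asarray(lat_dist, dtype=float)
        takeover_mask = np.asarray(takeover_mask, dtype=bool)
        # 99th percentile of all measured lateral distances, takeovers excluded
        threshold = np.percentile(np.abs(lat_dist[~takeover_mask]), 99)
        exceeded = (np.abs(lat_dist) > threshold) & ~takeover_mask
        # keep only the rising edges, i.e. the moments a departure starts
        onsets = np.flatnonzero(exceeded[1:] & ~exceeded[:-1]) + 1
        return threshold, onsets  # candidates; each still requires video validation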

Figure 7.15 shows the mean value of ewvar (Friedrichs and Yang, 2010a) (see (4.5)) of the
lateral distance for each kss value regarding 25 subjects (S1 to S25) who drove in the driving
simulator. The parameter Nσ2 was set to 250 samples. The error bars refer to the standard
deviation of the ewvar values. The moments of intended lane changes were excluded from the
calculation of ewvar. It can be seen that the variance of the lateral distance increased due to
drowsiness in our experiment, which agrees with the studies mentioned in Section 2.1.1.
Despite the fact that some subjects rated themselves as very drowsy, this feature does
not follow any trend for them (e.g. subjects S16, S21 and S23), or the trend is very weak
(subjects S6, S9, S10, S13 and S17).
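Since (4.5) is defined in Chapter 4 and not reproduced here, the following is only a common recursive form of an exponentially weighted variance; in particular, the mapping of Nσ2 = 250 samples to the forgetting factor is an assumption of this sketch.

    import numpy as np

    def ewvar(x, n_window=250):
        x = np.asarray(x, dtype=float)
        alpha = 1.0 / n_window          # assumed relation to N_sigma2
        m, v = x[0], 0.0                # running exponentially weighted mean/variance
        out = np.empty_like(x)
        for i, xi in enumerate(x):
            m = (1.0 - alpha) * m + alpha * xi
            v = (1.0 - alpha) * v + alpha * (xi - m) ** 2
            out[i] = v
        return out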

It is interesting to know at which subjective drowsiness level, i.e. kss, the first unintended
lane departure has occurred. These values are listed in Table 7.5. Two subjects are excluded,
because the offline video analysis was not possible for them. For subjects who never left the
lane unintentionally with respect to our threshold, the maximum rated subjective drowsiness
levels are listed. This clarifies two points: 1) If the maximum kss value for these subjects is
smaller than 6, then the nonexistence of the unintentional lane departure is due to high driver
vigilance, 2) If the maximum kss value for these subjects is larger than 6, the nonexistence of
the unintentional lane departure shows that the calculated feature is either not meaningful to
assess their drowsiness level or these subjects overestimated their drowsiness level.
According to the listed values, the lane departure event mostly occurred at a time at which the
subject also believed that he was drowsy (17 subjects). However, there are 6 subjects who
never left the lane according to our criteria, although they rated themselves as drowsy.

Table 7.5.: Left table: number of occurrences of kss values at the time of the first unintended lane departure
and number of occurrences of the maximum kss value, if no lane departure was
detected. Right table: confusion matrix.

KSS                    1  2  3  4  5  6  7  8  9
first lane departure   0  0  0  0  0  0  1  6  10
no lane departure      0  0  0  0  0  0  1  1  4

                       KSS awake   KSS drowsy   total
lane departure: yes    0           17           17
lane departure: no     0           6            6
total                  0           23           23

Figure 7.16 shows the boxplots which compare the mean value of a feature within the first 5 min

0.4 S1 S2 S3 S4 S5
ewvar of lateral ditsance
0.2

0
0.4 S6 S7 S8 S9 S1
0
0.2

0
0.4 S11 S12 S1 S14 S15
3
0.2
S18
0
0.4 S1 S1 S19 S2
6 7 0
0.2

0
0.4 S2 S2
1 2 S2 S24 S2
3 5
0.2

0
123456789
123456789 123456789 12345678 123456789
KSS
KSS KSS 9 KSS
KSS
Figure 7.15.: The mean of the ewvar of lateral distance versus kss values for 25 subjects who drove in
the driving simulator. The standard deviations are also shown.

of the drive1 with the 5-min interval before the lane departure for 23 subjects. Participants of
real road drives were not considered, because their collected data were mainly related to
awake driving. Therefore, no lane departure due to drowsiness occurred for them. For most
of the features, an increasing or a decreasing trend is apparent.
With a statistical test such as the paired-sample t-test (see Appendix D.1), it is possible to
show whether the difference between the means of the two groups (i.e. the first 5 min of the drive
and the last 5 min before the event) is significant (Field, 2007). If the assumption
of this test is not fulfilled, i.e. the normality of the differences between observations is not given,
the alternative non-parametric test, namely the Wilcoxon signed-rank test (see Appendix D.7), is
applied instead. The normality of the distribution of the differences between the groups was
analyzed with the Lilliefors test (see Appendix D.2).
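
A hedged sketch of this decision rule, assuming the Lilliefors implementation from statsmodels and scipy's paired tests, could look as follows:

import numpy as np
from scipy import stats

def compare_groups(first5, before_event, alpha=0.05):
    """Paired comparison of feature means: first 5 min vs. 5 min before
    the event. Uses the paired-sample t-test if the pairwise differences
    pass the Lilliefors normality test, and the Wilcoxon signed-rank
    test otherwise (a sketch of the procedure described above)."""
    from statsmodels.stats.diagnostic import lilliefors
    diff = np.asarray(before_event) - np.asarray(first5)
    _, p_norm = lilliefors(diff)                 # normality of differences
    if p_norm > alpha:                           # normal: use t-test
        stat, p = stats.ttest_rel(before_event, first5)
        return "t-test", stat, p
    stat, p = stats.wilcoxon(before_event, first5)
    return "wilcoxon", stat, p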
Table 7.6 summarizes the results of the mentioned tests, including the test significance value,
which is t0 for the paired-sample t-test or z0 for the Wilcoxon signed-rank test, and the
corresponding p-values with respect to the significance level of 5%. A test significance
value z0 of the Wilcoxon signed-rank test indicates that the differences between the means of
the corresponding feature in both groups were not normally distributed. According to the
results, except for E, all results are statistically significant with p-value < 0.05. As a result, we
conclude that A, MCV, MOV, ACV, AOV and perclos decreased significantly 5 min before
the lane departure in comparison to the first 5 min of the drives. On the contrary, A/MCV,
A/MOV, F, T, Tc, To, Tcl,1, Tcl,2, Tro, T50, T80 and T90 increased significantly before the
mentioned event.

¹ As mentioned in Section 4.3, in the experiment conducted in the driving simulator, the very first 3 min of
the drives were removed from the collected EOG data sets, because this was considered the phase in which the
eyes needed to accommodate. Therefore, the first 5 min studied here do not include this phase.

Figure 7.16.: The mean of all baselined features over the first 5 min of the drive and the last 5 min
before the first unintended lane departure event for 23 subjects who drove in the driving
simulator. [Boxplot panels for A, E, MCV, MOV, A/MCV, A/MOV, ACV, AOV, F, T, Tc,
To, Tcl,1, Tcl,2, Tro, PERCLOS, T50, T80 and T90 not reproduced.]

Now, the question is whether the found trends of the features are a consequence of
drowsiness or rather of time-on-task. To answer this question, a similar analysis was performed
for the 18 subjects who drove under real road conditions, which covered the EOG data collection
of the awake phase. Since none of them experienced a lane departure event, the last 5 min of the
drive were compared to the first 5 min. Almost all subjects rated their drowsiness level
with a higher kss value at the end of the drive than at the start of the drive.
Nevertheless, overall, none of them reported severe drowsiness.

Figure 7.17 shows the boxplots of all features for these subjects. Comparing this figure with
Figure 7.16, it can be concluded that, first of all, a larger overlap between the boxplots is
evident. Moreover, some features, such as T or Tc, show contradicting trends. Interestingly,
the decreasing trends of A, MCV, MOV, ACV and AOV are still visually distinguishable when
comparing the first and the last 5 min of the drives. However, a larger decrease of these
features occurred in the interval before the lane departure shown in Figure 7.16. Similar
to Table 7.6, Table 7.7 shows the statistical comparison of the mean values of the features
shown in Figure 7.17.

Table 7.6.: Results of the paired-sample t-test (t0) and the Wilcoxon signed-rank test (z0) for all
features shown in Figure 7.16; features with p-value > 0.05 are non-significant.

Feature   test significance   p-value
A         t0 = 4.07           < 0.05
E         z0 = −0.17          0.87
MCV       t0 = 10.81          < 0.05
MOV       t0 = 11.31          < 0.05
A/MCV     t0 = −6.69          < 0.05
A/MOV     t0 = −9.46          < 0.05
ACV       t0 = 12.52          < 0.05
AOV       t0 = 6.03           < 0.05
F         t0 = −2.82          < 0.05
T         z0 = −3.57          < 0.05
Tc        t0 = −6.51          < 0.05
To        t0 = −5.03          < 0.05
Tcl,1     t0 = −3.40          < 0.05
Tcl,2     z0 = −3.53          < 0.05
Tro       z0 = −3.62          < 0.05
perclos   t0 = 5.74           < 0.05
T50       t0 = −5.29          < 0.05
T80       t0 = −4.71          < 0.05
T90       z0 = −3.48          < 0.05

As can be seen, the difference between the means of the first and the last 5 min of the drives is
non-significant for most of the features (p-value > 0.05). However, the mentioned features related
to the amplitude of a blink, namely A, E, ACV, AOV, MCV and MOV, show a significant
decrease. This might be due to the fact that these features are subject to time-on-task to a
larger extent than the duration-based features. Considering these findings, we conclude that
the significant variation of the features before the unintended lane departure event is mostly
due to driver drowsiness.
We emphasize that the goal of comparing Figures 7.16 and 7.17 was to highlight the
contribution of drowsiness to the changes in the feature mean values before the occurrence of
the lane departure event. Studying the intrinsic difference between driving simulator and real
road conditions is outside the scope of this analysis and will be explored in the next chapter.

7.4.2. Event 2: Microsleep

Similar to the previous approach, an unintended eye closure longer than 1 s, called a
microsleep, was considered a safety-critical event. We investigate this event for the
following reason: in our driving simulator experiment, we observed that in the earlier phase
of drowsiness some subjects had longer eye closures, although they did not leave the lane. In
other words, not all long eye closures led to a lane departure, although their occurrence was
a consequence of driver drowsiness.
Figure 7.17.: The mean of all baselined features over the first and the last 5 min of the drive for 18
subjects who drove under real conditions. [Boxplot panels not reproduced.]

For the first 14 subjects who participated in the driving simulator experiment, an additional
eye tracking measurement system, the Dikablis glasses (Ergoneers GmbH, 2014), was used. It
provides a signal with either zero or non-zero values. Non-zero values of the signal are
not of interest in this work. However, zero values occur if the eye tracker does not detect the
pupil. The pupil is not detected either due to an eye closure (both intentional and
unintentional) or due to technical problems. Hence, zero sequences lasting longer than 1 s
refer to potential microsleep events. Since some long eye closures might have occurred intentionally or due to
technical problems, by offline video labeling, non-relevant events were discarded. At the end,
the first unintended eye closure lasting at least 1 s, namely a microsleep, was considered as a
safety-critical event. The microsleep event detection by the Dikablis glasses instead of the
EOG makes this analysis independent of the event detection approach in this work.
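
As an illustration, candidate microsleeps can be found by searching the eye-tracker signal for zero runs of at least 1 s. The following is a minimal sketch with illustrative names, not the thesis implementation:

import numpy as np

def microsleep_candidates(pupil_signal, fs, min_dur=1.0):
    """Find zero sequences of at least min_dur seconds; these are
    potential microsleeps to be validated by offline video labeling.

    pupil_signal : array that is zero while the pupil is not detected.
    fs           : sampling rate in Hz.
    Returns (start_index, end_index) pairs of qualifying zero runs.
    """
    is_zero = (np.asarray(pupil_signal) == 0).astype(int)
    padded = np.concatenate(([0], is_zero, [0]))
    starts = np.flatnonzero(np.diff(padded) == 1)    # run onsets
    ends = np.flatnonzero(np.diff(padded) == -1)     # run offsets
    min_len = int(min_dur * fs)
    return [(s, e) for s, e in zip(starts, ends) if e - s >= min_len]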

Similar to the lane departure events, we want to know at which subjective drowsiness level,
i.e. kss, the first microsleep occurred. It is also possible that some subjects never closed their
eyes unintentionally for at least 1 s. For these subjects, it is interesting to know what their
maximum subjective drowsiness level was, given that no microsleep was detected. Here, it
also clarifies whether the nonexistence of a microsleep was the result of high driver vigilance
or whether the subjects overestimated their drowsiness level. These values are shown in Table
7.8. Interestingly, it seems that two subjects underestimated their drowsiness level by
choosing kss = 4 and 5, although they unintentionally closed their eyes for longer than 1 s. This
was also validated by the offline video analysis. Moreover, three subjects rated themselves as
very drowsy, although no eye closure longer than 1 s was detected in their data. On the
contrary, nine subjects found themselves drowsy during the occurrence of the first
microsleep. No subject rated himself as awake during the entire drive given that no
microsleep was detected.

Table 7.7.: Results of the paired-sample t-test (t0) and the Wilcoxon signed-rank test (z0) for all
features shown in Figure 7.17; features with p-value > 0.05 are non-significant.

Feature   test significance   p-value
A         t0 = 4.61           < 0.05
E         t0 = 5.17           < 0.05
MCV       t0 = 5.69           < 0.05
MOV       t0 = 4.03           < 0.05
A/MCV     z0 = −0.37          0.71
A/MOV     t0 = −0.79          0.44
ACV       t0 = 5.32           < 0.05
AOV       t0 = 3.62           < 0.05
F         t0 = 2.44           < 0.05
T         t0 = 1.40           0.18
Tc        t0 = 0.69           0.50
To        t0 = 1.13           0.28
Tcl,1     z0 = −1.86          0.08
Tcl,2     t0 = 1.45           0.17
Tro       t0 = −0.45          0.66
perclos   t0 = 0.51           0.62
T50       t0 = −1.46          0.16
T80       t0 = −2.04          0.06
T90       t0 = −0.50          0.63

Table 7.8.: Left table: number of occurrences of kss values at the time of the first microsleep, and the
number of occurrences of the maximum kss value, if no microsleep was detected. Right
table: confusion matrix.

Left table:
KSS                1  2  3  4  5  6  7  8  9
first microsleep   0  0  0  1  1  0  1  6  2
none               0  0  0  0  0  0  1  0  2

Right table:
                   KSS awake   KSS drowsy   total
microsleep: yes    2           9            11
microsleep: no     0           3            3
total              2           12

Similar to Section 7.4.1, the mean value of the features within the first 5 min of the drive is
compared with the 5-min interval before the occurrence of the microsleep event, as shown in
Figure 7.18. It can be seen that the interquartile ranges of the boxplots (see Appendix B) do
not overlap for most of the features, except for A, E, AOV and Tcl,1. Comparing the plots of
this figure with those of Figure 7.16, it can be deduced that a lane departure occurs during
later phases of drowsiness. As an example, before a lane departure event, ACV decreased to
60% of its magnitude at the beginning of the drive. However, before the first microsleep, it
had only dropped to 75% of its initial value. This is also the case for MCV, MOV and Tc. In
general, the comparison of both figures should be done with care, because different numbers
of subjects were considered for the event analyses. Apart from that, on average, the first
unintended microsleep occurred after 110 ± 26 min of driving, while the first unintentional lane
departure occurred after 128 ± 40 min. These values agree with the conclusions drawn from
the alteration of the blink features. Moreover, the time difference between the occurrences of
the two safety-critical events implies that driver physiological measures outperform driver
performance measures in the early prediction of driver drowsiness.

Table 7.9 shows the results of the statistical tests. Most of the results are based on the
Wilcoxon signed-rank test, which means that most of the differences between the distributions
were not normally distributed. Clearly, this is due to the small number of available samples for
each feature.

Figure 7.18.: The mean of all baselined features over the first 5 min of the drive and the last 5 min
before an unintended microsleep for 11 subjects who drove in the driving simulator.
[Boxplot panels not reproduced.]

Except for A and E, all features varied significantly in comparison to the beginning of the drive.

7.5. Correlation-based analysis of eye blink features

In this section, the relationship between the extracted features is analyzed statistically. First,
based on a correlation analysis, it is shown to what extent each feature correlates with
the kss values. This is done for both drive time-based and kss input-based features to show
how informative they are as drowsiness indicators. Moreover, it is also studied
which features are correlated with each other. Such an analysis is, in general, important for
assessing the amount of redundant information in the extracted feature set.

7.5.1. Case 1: Correlation between a feature and KSS values

Since our goal is the detection of driver drowsiness, first of all, the relationship of each
feature with drowsiness is explored. The Pearson product-moment correlation coefficient and
Spearman's rank correlation coefficient are two possibilities for quantifying the linear and the
non-linear association between features and kss values, respectively. This analysis is also
called inter-correlation.

Table 7.9.: Results of the paired-sample t-test (t0) and the Wilcoxon signed-rank test (z0) for all
features shown in Figure 7.18; features with p-value > 0.05 are non-significant.

Feature   test significance   p-value
A         t0 = 1.15           0.26
E         z0 = −1.07          0.28
MCV       z0 = −5.74          < 0.05
MOV       z0 = −5.32          < 0.05
A/MCV     z0 = −6.45          < 0.05
A/MOV     z0 = −6.45          < 0.05
ACV       z0 = −5.99          < 0.05
AOV       z0 = 3.91           < 0.05
F         t0 = −7.01          < 0.05
T         t0 = −6.01          < 0.05
Tc        z0 = −6.45          < 0.05
To        z0 = −5.06          < 0.05
Tcl,1     z0 = −2.30          < 0.05
Tcl,2     z0 = −4.55          < 0.05
Tro       z0 = −6.32          < 0.05
perclos   z0 = 7.09           < 0.05
T50       z0 = −5.86          < 0.05
T80       z0 = −5.81          < 0.05
T90       z0 = −5.76          < 0.05

It should be mentioned that, in general, for stepwise and ordinal values like kss, the
mentioned coefficients are not necessarily an optimal evaluation method. The reason is that
the values of a feature might vary even though their corresponding kss values do not change.
This fact severely affects the values of the correlation coefficients and their interpretation.

Pearson product-moment correlation coefficient

The Pearson product-moment correlation coefficient ρp(x, y), which quantifies the linear
association between two vector variables x and y (Artusi et al., 2002; Field, 2007), is defined as
follows:

\rho_p(x, y) = \frac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}
             = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}
                    {\sqrt{\sum_{i=1}^{N} (x_i - \mu_x)^2} \, \sqrt{\sum_{i=1}^{N} (y_i - \mu_y)^2}} .   (7.9)

Here, vectors x and y denote all N samples of a feature and the corresponding kss values.
Cov(x, y) refers to the covariance calculated for x and y. µ and σ correspond to the mean and
standard deviation of x and y, respectively. Values of 0 < ρp(x, y) ≤ 1 indicate that x and y
are positively correlated. Similarly, values of −1 ≤ ρp(x, y) < 0 denote a negative relationship
between the variables. If two variables are not linearly related to each other, ρp(x, y) is close to
zero. The closer the value of ρp(x, y) is to ±1, the stronger is the linear association between x and
y. Field (2007) categorizes the values of ρp as follows: small effect for ρp = ±0.1, medium effect
for ρp = ±0.3 and large effect for ρp = ±0.5. However, depending on the field of research and
the issue being addressed, the value of ρp might be interpreted differently.


Based on a hypothesis test, namely the t-test, it is possible to analyze the significance of the
calculated ρp and to show whether ρp is significantly different from zero, as explained in
Appendix D.3.
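
As a hedged illustration (with made-up numbers), scipy implements both the coefficient in (7.9) and the corresponding t-test-based significance value:

import numpy as np
from scipy import stats

# Hypothetical baselined feature values and the matching KSS labels.
feature = np.array([1.0, 0.9, 0.8, 0.75, 0.6, 0.55])
kss = np.array([2, 3, 5, 6, 8, 9])

# pearsonr returns rho_p and the p-value of the t-test against rho_p = 0.
rho_p, p_value = stats.pearsonr(feature, kss)
print(f"rho_p = {rho_p:.2f}, p = {p_value:.3f}")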

Spearman’s rank correlation coefficient

The Spearman's rank correlation coefficient ρs quantifies the "general monotonicity of the
underlying relationship" between two variables (Artusi et al., 2002). This means that if one
variable increases, the other one synchronously increases or decreases. In this case, the
Spearman's rank correlation coefficient has a high value, independent of an existing or
non-existing linear relationship between the variables. Therefore, the Spearman's rank
correlation coefficient also quantifies the amount of non-linear relationship between two
variables. This is unlike the Pearson product-moment correlation coefficient, which quantifies
to what extent an existing relationship is close to a linear one.
Since the similarity between an arbitrary monotonic function and the underlying relationship
between the variables is sought, first, all samples of x and y are sorted in descending order.
ρs(x, y) is then calculated using the ranks of the sorted values as follows:

\rho_s(x, y) = 1 - \frac{6 \sum_{i=1}^{N} \left( \mathrm{rank}(x)_i - \mathrm{rank}(y)_i \right)^2}{N (N^2 - 1)} ,   (7.10)

where rank(x)_i and rank(y)_i denote the i-th rank of x and y. (7.10) is valid as long as there are
no identical values in the variables x and y. For identical rank values, called ties, the average of
the found ranks is used (rank_tie), and the calculation of ρs(x, y) becomes more complex:

\rho_s(x, y) = \frac{N(N^2 - 1)
                     - \frac{1}{2} \sum_{i=1}^{N_1} (r_{x,i}^3 - r_{x,i})
                     - \frac{1}{2} \sum_{i=1}^{N_2} (r_{y,i}^3 - r_{y,i})
                     - 6 \sum_{i=1}^{N} \left( \mathrm{rank}_{tie}(x)_i - \mathrm{rank}_{tie}(y)_i \right)^2}
                    {\sqrt{\left( N(N^2 - 1) - \sum_{i=1}^{N_1} (r_{x,i}^3 - r_{x,i}) \right)
                           \left( N(N^2 - 1) - \sum_{i=1}^{N_2} (r_{y,i}^3 - r_{y,i}) \right)}} .   (7.11)

In the above equation, N1 and N2 denote the numbers of elements in x and y excluding their
duplicate values. r_{x,i} and r_{y,i} refer to the numbers of observations with identical ranks.
Similar to ρp, ρs can also be tested for a significant difference from zero. More details are
provided in Crawshaw and Chambers (2001).
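
In practice, the tie-corrected coefficient of (7.11) need not be implemented by hand; for example, scipy's spearmanr handles ties automatically (the data below are hypothetical):

import numpy as np
from scipy import stats

feature = np.array([1.2, 1.2, 1.0, 0.8, 0.7, 0.7, 0.5])  # contains ties
kss = np.array([3, 4, 5, 6, 7, 8, 9])

# spearmanr applies the tie correction and also returns a p-value.
rho_s, p_value = stats.spearmanr(feature, kss)
print(f"rho_s = {rho_s:.2f}, p = {p_value:.3f}")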
In the following, the introduced correlation coefficients are calculated for the kss input-based
and the drive time-based features.

KSS input-based features: All baselined kss input-based features are shown in Figure 7.19
versus the kss values. The features are baselined with respect to the average of the first two
intervals before a kss input. The Spearman's rank correlation coefficient ρs and the Pearson
correlation coefficient ρp between these features and the kss values are listed in Table 7.10.
The values are
sorted with respect to |ρs|. As mentioned before, ρp shows the strength of the linear
relationship between the features and the kss values. However, drowsiness may also evolve
non-linearly in the course of time, which makes ρs a more suitable measure for studying the
relationship between features and kss values.

Figure 7.19.: Boxplots of the baselined kss input-based features for all subjects versus the kss values.
[Panels for A, E, MCV, MOV, A/MCV, A/MOV, ACV, AOV, F, T, Tc, To, Tcl,1, Tcl,2,
Tro, PERCLOS, T50, T80 and T90 not reproduced.]

For all features, except for To, the calculated correlation coefficients are significantly different
from zero, i.e. p-value < 0.05. For most of the features, |ρs| is also larger than |ρp|, which
underlines the non-linear relationship between a feature's evolution and the kss values.
Furthermore, the highest values of |ρp| and |ρs| occur for different features. The feature with
the highest amount of linear association with kss is ACV, while the highest |ρs| is achieved
for Tc. The negative signs show the inverse relationship between features and drowsiness. For
example, the drop of ACV occurred in parallel with the increase of kss, which means that the
subjects closed their eyes more slowly due to drowsiness (see Figure 7.19 for ACV).
Table 7.10.: Sorted Spearman's rank correlation coefficient ρs and Pearson correlation coefficient ρp
between all kss input-based features and kss values (N = 391). All p-values were smaller
than 0.05, except for To.

Feature   ρs      ρp      |  Feature   ρs      ρp
Tc        0.60    0.50    |  MOV      −0.44   −0.41
A/MCV     0.57    0.48    |  A/MOV     0.44    0.40
ACV      −0.56   −0.52    |  Tcl,2     0.40    0.35
T90       0.55    0.45    |  A        −0.36   −0.29
perclos  −0.53   −0.46    |  AOV      −0.33   −0.27
T50       0.53    0.45    |  Tcl,1     0.26    0.28
MCV      −0.53   −0.50    |  F         0.21    0.16
Tro       0.52    0.49    |  E         0.10    0.21
T80       0.50    0.43    |  To        0.06    0.05
T         0.45    0.40    |

Drive time-based features: Table 7.11 shows the sorted values of the ρs and ρp correlation
coefficients between all drive time-based features and kss values. Here, the values are also
sorted with respect to |ρs|. All calculated correlation coefficients are significantly different
from zero, i.e. p-value < 0.05. Moreover, it can be seen that for most of the features we have
|ρs| ≥ |ρp|. This confirms that the relationship between drive time-based features and kss
values is also to a larger extent non-linear. In addition, the features have different rankings
with respect to the absolute values of ρs and ρp. As an example, with respect to |ρp|, A/MOV
seems to be the feature with the strongest linear association with kss. However, Tro is the most
correlated with drowsiness based on |ρs|. Comparison of the correlation coefficient values in
Tables 7.10 and 7.11 shows that the kss input-based features are more correlated with the kss
values. This is in agreement with the hypothesis that kss values, at best, represent the driver's
vigilance level within a short time interval prior to their collection. However, it should be
mentioned that larger values of |ρs| do not guarantee that combining these features and using
all of them simultaneously for driver drowsiness detection yields better results. This will be
discussed in the next chapter.

7.5.2. Case 2: Correlation between features

In addition to the relevance of the extracted features to drowsiness and their informativeness,
it is also important to know whether they are redundant. The amount of correlation
between features can be used to analyze the degree of redundancy. In general, highly
correlated features are not desired, since they may carry the same information that is
already provided by another feature. The between-feature correlation analysis is also
referred to as intra-correlation.

Table 7.11.: Sorted Spearman's rank correlation coefficient ρs and Pearson correlation coefficient ρp
between all drive time-based features and kss values (N = 4021). All p-values were smaller
than 0.05.

Feature   ρs      ρp      |  Feature   ρs      ρp
Tro       0.51    0.46    |  F         0.36    0.37
A/MOV     0.50    0.48    |  T50       0.36    0.35
Tc        0.50    0.45    |  perclos  −0.33   −0.34
A/MCV     0.47    0.40    |  AOV      −0.30   −0.28
MOV      −0.43   −0.42    |  Tcl,1     0.26    0.26
ACV      −0.43   −0.43    |  To        0.23    0.25
T90       0.42    0.39    |  A        −0.23   −0.16
MCV      −0.41   −0.41    |  E         0.11    0.19
T80       0.38    0.37    |  Tcl,2     0.06    0.21
T         0.37    0.35    |

In the following, the correlation analysis for the kss input-based and the drive time-based
features is studied based on |ρp|. Figures F.41 and F.42 show the same analysis based on |ρs|.

KSS input-based features: Figure 7.20 shows |ρp| calculated between the kss input-based
features. The calculated |ρp| for feature pairs marked with a red × sign cannot be shown to be
significantly different from zero, because for these feature pairs we have p-value > 0.05.
According to this figure, for some feature pairs such as (ACV, MCV), (AOV, MOV), (Tc,
A/MCV), (Tcl,1, T), (T50, T), (T80, T50), (T90, T50) and (T90, T80), we have |ρp| > 0.9 with
p-value < 0.05. It is interesting that all highly linearly correlated features carry the same kind
of information, i.e. they are all related either to the velocity or to the duration of a blink. For
pairs which are only to a very small amount linearly associated with each other, namely
|ρp| < 0.1, we found p-value > 0.05 (red × sign in Figure 7.20). The very small amount of
association, i.e. 0.1 < |ρp| < 0.2, between pairs such as (F, A), (T50, A) and (F, MCV) is also
comprehensible and reasonable, since each of these features has a different underlying
mechanism.

Drive time-based features: The absolute Pearson correlation coefficients between the drive
time-based features are shown in Figure 7.21. Similar to Figure 7.20, feature pairs with
p-value > 0.05 are shown with a red × sign. Apart from these, pairs such as (F, MOV),
(F, ACV), (F, AOV), (T, F), (To, AOV), (To, ACV), (Tcl,2, To), (perclos, E), (T50, A) and
(T80, A) are all only to a very small amount linearly correlated with each other, namely
|ρp| < 0.1. As mentioned before, almost all of these feature pairs are based on different
underlying mechanisms. Interestingly, the drive time-based feature pairs with |ρp| > 0.9 are
exactly the same as the kss input-based feature pairs with |ρp| > 0.9, except for the pair
(Tc, A/MCV). Thus, we conclude that the feature aggregation method has not affected the
redundancy of the features.

7.6. Eye blink feature’s quality vs. sampling frequency

This section studies how the sampling frequency of the raw EOG signals affects the quality of
the extracted features. This analysis is important for evaluating features extracted from the
data provided by driver observation cameras rather than the EOG. In this context, Picot et al.
(2009) studied the correlation between EOG and a high frame rate camera.

Figure 7.20.: Absolute values of the Pearson correlation coefficient |ρp| calculated between the kss
input-based features. [Correlation matrix plot not reproduced; feature pairs with
p-value > 0.05 are marked with a red × sign.]
Figure 7.21.: Absolute values of the Pearson correlation coefficient |ρp| calculated between the drive
time-based features. [Correlation matrix plot not reproduced; feature pairs with
p-value > 0.05 are marked with a red × sign.]

In the experiments conducted with the EOG measuring system, a camera cannot be used
simultaneously, since the attached electrodes around the eyes disturb the image processing
task of the camera. Therefore, we downsampled the EOG signals to 40 Hz and 30 Hz, which is
comparable to the data of the driver observation cameras on the market. In fact, we first
artificially degraded the quality of the raw signals and then extracted all features as before.
Figures 7.22 and 7.23 show scatter plots of all 40 Hz and 30 Hz features versus those at 50 Hz.
The best least-squares linear fits are also plotted. According to the plots, for most of the
features a smaller sampling frequency leads to smaller feature values. It is also clear that a
smaller sampling frequency results in peak amplitude loss. This fact shows itself in
amplitude-based features to a larger extent, e.g. in MCV and MOV. Interestingly, T seems to
be resistant to the reduction of the sampling frequency down to 30 Hz.
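
A sketch of this degradation step, assuming polyphase resampling with an internal anti-aliasing filter (the exact downsampling method is not specified in the text, and the variable names are illustrative):

import numpy as np
from scipy.signal import resample_poly

eog_50hz = np.random.randn(50 * 60)   # one minute of dummy 50 Hz data

# 50 Hz * 4/5 = 40 Hz and 50 Hz * 3/5 = 30 Hz.
eog_40hz = resample_poly(eog_50hz, up=4, down=5)
eog_30hz = resample_poly(eog_50hz, up=3, down=5)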
Figure 7.22.: Scatter plots comparing the 50-Hz features with the 40- and 30-Hz features for the first
12 subjects (part 1: A, E, MCV, MOV, A/MCV, A/MOV, ACV, AOV and T). [Plots with
best least-squares linear fits not reproduced.]

Figure 7.23.: Scatter plots comparing the 50-Hz features with the 40- and 30-Hz features for the first
12 subjects (part 2: Tc, To, Tcl,1, Tcl,2, Tro, PERCLOS, T50, T80 and T90). [Plots with
best least-squares linear fits not reproduced.]

Figure 7.24 compares AOV and MOV with respect to the different sampling rates. As expected,
for smaller sampling frequencies, the MOV values are closer to the AOV values, which is a
result of the peak amplitude loss.
Figure 7.24.: Scatter plot comparing AOV and MOV extracted at 30, 40 and 50 Hz sampling rates.
The lines indicate the best linear fits. [Plot not reproduced.]
8. Driver state detection by machine learning methods

In contrast to the previous chapter, where the features were explored as separate and
independent sources of information, in this chapter we consider all extracted blink features of
Chapter 7 together. To this end, we introduce machine learning and different state-of-the-art
classifiers applied to the features for driver drowsiness detection. The artificial neural network
(ann), the support vector machine (svm) and k-nearest neighbors (k-nn) are the classifiers
used here. In order to evaluate the classification results, different metrics are also
introduced. In addition, we consider different data division approaches to study the
generalization aspects of the classifiers. Further, the issue of imbalanced data sets is
addressed for all classifiers by balancing the data sets. It is also investigated whether
artificially balanced data sets can replace the expensive and demanding data collection during
the awake phase. After comparing all classifiers with each other and suggesting an optimal
classifier for driver drowsiness detection, the generalization of the data collected in the
driving simulator to real road conditions is scrutinized and two new approaches are studied.
Finally, we discuss approaches for feature dimension reduction in order to address the issues
of an in-vehicle warning system.

8.1. Introduction to machine learning

Machine learning methods, as the name suggests, are algorithms and rules which
machines such as computers learn from representative data with different attributes called
features. By applying these learned rules to unseen data, it is possible to automatically
classify the data based on its similarities with the seen data. In fact, machine learning
methods are classification tools which divide the feature space into different regions by
means of decision boundaries. The decision boundary is either linear (e.g. a line or a
hyperplane) or non-linear, depending on the complexity of the problem.
Here, the goal is to classify driver state based on extracted eye blink features. In fact, the idea
is that it would be possible to learn drowsiness patterns in eye blink features and to use the
learned patterns for predicting driver drowsiness. Eskandarian et al. (2007), Liang et al. (2007),
Hu and Zheng (2009), Friedrichs and Yang (2010a,b) and Simon (2013) are examples of recent
studies carried out in the field of driver state classification, i.e. classifying vigilance,
drowsiness and attentiveness of the driver based on some physical or/and physiological
features. A detailed review is provided by Dong et al. (2011).
In this work, the machine learning method used for driver state classification is called a
classifier. The input of the classifier is the feature matrix F ∈ R^{D×N} containing N feature
vectors x_n, with n ∈ {1, · · · , N}, as follows:

\mathbf{F} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_N \end{bmatrix} \in \mathbb{R}^{D \times N} ,   (8.1)

where N is the number of samples. x_n ∈ R^D represents the D-dimensional feature vector of
the n-th sample, namely x_n = [x_{1,n}  x_{2,n}  · · ·  x_{D,n}]^T. Here, we have D = 19 features.

The output of the classifier is the class c, which corresponds to the driver state in this work.
Therefore, depending on the number of available states, we are dealing with a 2-class
(binary), a 3-class or, in general, an m-class classification problem. The classes are defined in
Section 8.1.1. Depending on the availability of the class membership for the samples of the
feature matrix ahead of the classification step, two types of classification methods can be
explored: supervised versus unsupervised classification. In this work, only supervised
classification is studied.

8.1.1. Supervised classification

Supervised classification is defined as a classification problem with available information
about the class membership of each sample in the feature matrix. In our work, the kss inputs
collected during the experiments are the class labels. Therefore, for each sample n of the
feature matrix F, namely x_n ∈ R^D, there exists a corresponding discrete-valued class label c_n.
The total data set S is then expressed as follows:

S = {(x_1, c_1), (x_2, c_2), · · · , (x_N, c_N)} .   (8.2)

The task of supervised classification begins with the division of the data set into two sets
called training and test sets. Based on the features and the corresponding classes belonging to
the training set S_train, the classifier is trained by learning rules. The complexity of these rules
depends on the complexity of the classifier and the relationship between features and classes.
Afterwards, the rules are applied to the features of the test set S_test, which is unknown to
the classifier, in order to estimate the class of its samples. Finally, the performance of the
classifier is evaluated by comparing the estimated class ĉ with the true class c of each sample.
A typical phenomenon, whose occurrence severely affects the performance of the classifier
during the training step, is either overfitting or underfitting of the classifier to the training data
set, as shown in Figure 8.1. The former occurs if the learned rules are highly adapted and
fitted to the samples in the training set, such that all samples in the training set are classified
correctly (Figure 8.1(a)). This leads to a zero training error rate, which is defined as the ratio of
wrongly classified samples during the training phase. However, this does not directly imply a
low number of errors for the test set as well. In fact, as soon as a new sample of the test set is
applied to the classifier, it fails to classify the unseen data correctly. This is also called a lack of
generalization of the classifier. The reason is that the classifier is fitted to the noise rather than
the data (Zamora, 2001). Therefore, a zero training error rate never guarantees a small test
error rate. Overall, the ultimate goal is to construct a general classifier which not only
classifies the training data correctly, but also classifies new unseen data with similar
performance.
In contrast to overfitting, underfitting (Figure 8.1(b)) occurs if the classifier is too
simple and even the training error rate is high. In fact, the classifier does not fit the
underlying structure of the data. Obviously, such a classifier cannot be expected to perform
better on unseen data. Consequently, as Figure 8.1(c) shows, a compromise between the two
phenomena should be made. Therefore, sometimes even wrong classifications in the
training phase are acceptable. Finally, in spite of the resulting training error rate, the
generalization of the classifier improves, because the classification result on the unseen test
set improves.
Figure 8.1.: Examples of three classification rules: (a) overfit, (b) underfit, (c) trade-off.
[Illustrations not reproduced.]

KSS values as the classes

In this work, the kss values collected during the experiments are used as the available labels for
the supervised classification task. As mentioned in Section 2.2, the self-estimation of
drowsiness during driving is highly subjective. Moreover, for each kss value, it is very
probable that the subjects compare their current state with the previous ones for a better
self-rating. Hence, depending on the preciseness of the first selected kss value, there might be
a bias shift on the other selected values until the end of the drive. Similar results concerning
the misjudgment of drowsiness after three hours of continuous monotonous daytime driving
were reported by Schmidt et al. (2009). To this end, we suppressed the probable inaccuracy of
the kss values by grouping them together to form 2-class (binary) and 3-class (multi-class)
problems as follows:

binary (2-class): awake (kss 1–6), drowsy (kss 7–9)
3-class: awake (kss 1–4), medium (kss 5–7), drowsy (kss 8–9)

We call these classes {awake, drowsy} for the binary case and {awake, medium, drowsy} in the
3-class case.

The distribution of the classes for the binary classification is awake = 55% versus drowsy =
45%, and for the 3-class case awake = 41%, medium = 29% and drowsy = 30%, considering
the drive time-based features. For the kss input-based features, we have awake = 46% versus
drowsy = 54%. Due to the small number of available samples for the kss input-based features,
the 3-class case is not studied for them in this work. Figure 8.2 summarizes all class
distributions and shows that in all cases we have balanced class distributions.
Figure 8.2.: Distribution of the classes for the kss input-based and drive time-based features.
[Pie charts not reproduced.]
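
For illustration, the grouping can be written as a small helper following the class definition above (the split points mirror that definition and are otherwise an assumption of this sketch):

def kss_to_class(kss, scheme="binary"):
    """Map a KSS self-rating (1..9) to a class label following the
    grouping above; a minimal sketch, not the thesis code."""
    if scheme == "binary":
        return "awake" if kss <= 6 else "drowsy"
    if kss <= 4:
        return "awake"
    return "medium" if kss <= 7 else "drowsy"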

8.1.2. Metrics for evaluating the performance of classifiers

As also mentioned in Section 5.4, the performance of a binary classifier can be evaluated by a
confusion matrix shown in Table 8.1.
Table 8.1.: Confusion matrix of a binary classifier

                        predicted state ĉ
given state c    awake                 drowsy
awake            True Positive (tp)    False Negative (fn)
drowsy           False Positive (fp)   True Negative (tn)

For each class (awake or drowsy), a detection rate dr is calculated based on Table 8.1 and the
values of tp, fp, tn and fn. The detection rates of the classes are

\mathrm{DR}_{awake} = \frac{TP}{TP + FN}   (8.3)

\mathrm{DR}_{drowsy} = \frac{TN}{TN + FP} .   (8.4)

They are also called sensitivity and specificity, which are equivalent to the probabilities
P(ĉ = awake | c = awake) and P(ĉ = drowsy | c = drowsy). For the confusion matrix introduced
in Section 5.4, tn is not defined and, as a result, (8.4) cannot be calculated. Therefore,
precision and recall were defined there instead. Conventionally, the metrics in (8.3) and (8.4) are
referred to as tpr and tnr, respectively. However, in this study, for a better readability of the
detection performance results, we use the term dr, which refers to both of these metrics.
Similarly, the rates of wrongly classified samples with respect to the number of available
samples in each class are

\mathrm{FNR} = \frac{FN}{FN + TP} = 1 - \mathrm{DR}_{awake} \;\Rightarrow\; P(ĉ = drowsy \mid c = awake)   (8.5)

\mathrm{FPR} = \frac{FP}{FP + TN} = 1 - \mathrm{DR}_{drowsy} \;\Rightarrow\; P(ĉ = awake \mid c = drowsy) ,   (8.6)

where fnr refers to the false negative rate or miss rate, and fpr is the false positive rate, also
called the false alarm rate. The term 1 − dr is used in this work to refer to both fnr and
fpr regardless of the classes.
Moreover, we define an average dr (adr) as the average of both dr values, namely

\mathrm{ADR} = \frac{\mathrm{DR}_{awake} + \mathrm{DR}_{drowsy}}{2} .   (8.7)

Balanced accuracy is another name for adr.
In addition to the above metrics, the accuracy (acc) and its complement, the error rate (er), are
defined as

\mathrm{ACC} = \frac{TP + TN}{TP + FP + TN + FN} \;\Rightarrow\; P(ĉ = c)   (8.8)

\mathrm{ER} = 1 - \mathrm{ACC} \;\Rightarrow\; P(ĉ \neq c) .   (8.9)

For a multi-class problem with m classes and confusion matrix M = [M_{i,j}]_{m×m}, these
metrics are calculated as follows (Kolo, 2011):

\mathrm{DR}_i = \frac{M_{i,i}}{\sum_{j=1}^{m} M_{i,j}}   (8.10)

\mathrm{ADR} = \frac{1}{m} \sum_{i=1}^{m} \mathrm{DR}_i   (8.11)

\mathrm{ACC} = \frac{\sum_{i=1}^{m} M_{i,i}}{\sum_{i=1}^{m} \sum_{j=1}^{m} M_{i,j}} , \qquad \mathrm{ER} = 1 - \mathrm{ACC} ,   (8.12)

where i in DR_i refers to the i-th class.
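
A compact sketch of (8.10) to (8.12) for an arbitrary m×m confusion matrix (the example numbers are made up):

import numpy as np

def classifier_metrics(M):
    """Compute per-class DR, ADR, ACC and ER from an m x m confusion
    matrix M, following (8.10)-(8.12); rows are the given classes,
    columns the predicted ones."""
    M = np.asarray(M, dtype=float)
    dr = np.diag(M) / M.sum(axis=1)     # (8.10): per-class detection rates
    adr = dr.mean()                     # (8.11): average DR (balanced accuracy)
    acc = np.trace(M) / M.sum()         # (8.12): overall accuracy
    return dr, adr, acc, 1.0 - acc      # ER = 1 - ACC

# Example with the binary convention of Table 8.1:
M = [[80, 20],   # awake:  TP = 80, FN = 20
     [10, 90]]   # drowsy: FP = 10, TN = 90
print(classifier_metrics(M))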

8.1.3. Subject-dependent classification

As mentioned before, in supervised classification, the division of the data into training and
test sets is performed prior to the training of the classifier. Typically, the training and test sets
contain 80% and 20% of the total samples in the feature matrix, respectively. In addition, the
feature matrix is split such that the class distributions of the training and test sets remain
similar. To put it another way, if, for example, 65% of the total samples in the feature matrix
belong to the awake class, then 65% of the samples in the training and test sets also belong to
this class. To keep the classification results independent of the samples selected for each set,
we randomly split the samples 100 times into 80% and 20% sets; this is called repeated random
sub-sampling validation. Due to the fact that the samples of the training and test sets are
randomly selected, both sets might contain samples of a specific subject. Obviously, if the
samples of a subject are statistically dependent or correlated, the classifier takes advantage of
this and inflated classification results are expected. Therefore, due to the dependency of the
classifier on the subjects in this type of data division, the classification problem is called a
subject-dependent (subj.-dep.) one. The final evaluation results are the average of the dr and
fdr values over all permutations.
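
A minimal sketch of this subject-dependent division, assuming a feature matrix X (samples in rows, i.e. transposed with respect to F) and labels y; the classifier choice is only a placeholder:

import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.neighbors import KNeighborsClassifier

X = np.random.randn(200, 19)           # dummy data, 19 features
y = np.random.randint(0, 2, 200)       # dummy binary labels

# 100 stratified random 80/20 splits (repeated random sub-sampling).
splitter = StratifiedShuffleSplit(n_splits=100, test_size=0.2, random_state=0)
scores = []
for train_idx, test_idx in splitter.split(X, y):
    clf = KNeighborsClassifier().fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))
print(np.mean(scores))                  # average over all permutations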

8.1.4. Subject-independent classification

Generally, the goal of all driver assistance systems is to be able to warn all drivers, even those
whose features are unknown to the warning system. Hence, another possibility for the data
division into training and test sets is to sort them according to the subjects, unlike the
subject-dependent case. In this type of data division, for a total number of s subjects, the
classifier is trained on the samples of s − 1 subjects and tested on the samples of the s-th
subject. By repeating this procedure s times, each subject appears exactly once in the test set.
This method is similar to leave-one-out cross validation (Duda et al., 2012). Since the
constructed classifier model is fully independent of the subjects in the test set, we call it a
subject-independent (subj.-indep.) classification.
In subject-independent classification, due to a varying number of samples in the test sets,
misclassified samples of each subject might be penalized differently, e.g. 1 misclassification
out of 5 samples corresponds to a rate of 20%, while 1 out of 10 corresponds to a rate of 10%.
Thus, the dr and fdr values of different subjects cannot be directly compared with each other.
On this account, the tp, fp, tn and fn values of all test sets are summed up to compute the
overall dr and fdr metrics.
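
The subject-independent division can be sketched analogously, with per-sample subject identifiers as the grouping variable and summed confusion matrices (all names illustrative):

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

X = np.random.randn(200, 19)
y = np.random.randint(0, 2, 200)
subjects = np.random.randint(0, 25, 200)   # subject id per sample

# Train on s-1 subjects, test on the left-out one; sum the confusion
# matrices so that subjects with few samples are not over-penalized.
M_total = np.zeros((2, 2))
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = KNeighborsClassifier().fit(X[train_idx], y[train_idx])
    M_total += confusion_matrix(y[test_idx], clf.predict(X[test_idx]),
                                labels=[0, 1])
print(M_total)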
At first glance, it seems that subject-dependent data division does not need to be taken
into consideration in our application for the sake of generality of the classification results.
However, in the following, it is explained why both subject-dependent and subject-independent
data divisions are studied in this work.
We mentioned in Section 7.2 that there exist subjects for whom the evolution of the feature
values due to drowsiness did not follow the overall trend found among the majority of
participants. This highly individual behavior is linked to the intrinsic nature of the
physiological measures. Clearly, if the data of these subjects are fed as the test set to the
subject-independent classifier, the result will not be satisfactory. The reason is that the
training data is not representative of the test data. Therefore, in order to cover individual
differences to a very large extent, a data set with a very large number of participants is
needed. This, however, is not a feasible solution in our application, because for the nighttime
drowsy data collection, preparing the participant and performing the task takes about 4
hours. Hence, subject-dependent data division can be considered a rough estimate of the case
in which the training set represents the test set well enough.
Another motivation is related to future intelligent vehicles. If future vehicles make user
adaptation possible, then each driver is a known user to the vehicle system. In other words,
the driver creates a user profile for himself, and the vehicle saves the settings of this user to
his profile, e.g. seat settings, mirror settings, etc. Each time the driver gets into the
vehicle, he signs in and the vehicle applies all saved settings. This idea can be extended to the
warning system as well. Accordingly, the warning system is able to learn the user's behavior
and benefits from his features for future use. This is similar to subject-dependent data
division, because the user is not thoroughly unknown to the warning system.

8.1.5. Imbalanced class distributions

A very important issue, which severely impacts the performance of a classifier, is the
distribution of the classes relative to each other. In other words, it is crucial whether the class
distributions are balanced or imbalanced. We showed in Section 8.1.1 that we have balanced
class distributions. The reason that we study imbalanced data sets here is that imbalance is
not only a problem from the theoretical point of view for the classifiers, but also for warning
systems which aim to prevent car crashes due to driver drowsiness. He and Garcia (2009)
reviewed and discussed the issue of imbalanced data, which is summarized here.
A data set is referred to as imbalanced if the proportion of one class to the other is of the
order of 100:1, 1000:1 or even larger. For 100:1, as an example, this means that the classes are
distributed such that one of them contains 100 times more samples than the other. The class
with the larger number of samples is called the majority class, and the class with the smaller
number of samples is referred to as the minority class. According to Abdellaoui (2013), in the
case of a multi-class problem, the class with the smallest number of samples is the minority
class, while all other classes are considered majority classes.
The phenomenon of imbalanced data is mostly highlighted in medical applications such as
distinguishing between ill and healthy patients as minority and majority classes, respectively.
There, the wrong classification of an ill patient as a healthy one is much more crucial than the
inverse

case. In our application, similarly, a drowsy driver should always be warned, otherwise a car
crash is inevitable.
In general, for drowsiness detection in real-life applications, most of the time sleep-deprived
or already drowsy subjects are preferred as participants in the experiments. Therefore, the
collected data sets are dominated by drowsy events. This is clearly due to two factors: cost
and time. A subject who is fully awake and fit for performing the driving task during the
experiment needs a longer time to feel drowsy, which is not desirable. In spite of this fact, the
state of an awake subject should be classified correctly to the same extent that a drowsy
subject is classified as drowsy. In other words, not warning an awake subject is also part of
the goals of driver state classification. Therefore, experiments conducted solely with less
awake subjects suffer from imbalanced classes, which also affects the classification task.
We mentioned in Chapter 4 that the driving simulator experiment (Section 4.3) and the real
road experiment (Section 4.2.1) covered the data collection of the drowsy and awake phases,
respectively. Therefore, in order to highlight the issue of an imbalanced data set for less awake
subjects, we considered only the kss input-based features collected in the driving
simulator experiment and removed the features of the real road drives from the feature
matrix. This led to a feature matrix with 261 samples. We used 20% of this feature matrix for
defining 100 test sets with balanced binary classes. Hence, each test set contained 52 samples
(26 samples for each class). The remaining 209 samples of the training sets had the following
imbalanced class distribution: awake = 24% and drowsy = 76%. In the end, this imbalanced
training set is used to train the classifiers.
Although balanced class distributions are always desired, in most real-life applications it is
not possible to collect such data, especially due to the high cost. Therefore, one solution to
tackle this problem is to artificially balance the data before applying a classifier, based on
some known methods. These approaches are applied to the imbalanced training set as well to
balance the class distributions. In our case, it is the awake class which should be balanced
artificially. If, for the unseen awake data, the performance of the classifier trained with
artificially balanced data is as good as that of the classifier trained with fully balanced data,
then we can save time and cost by choosing sleep-deprived and drowsy subjects for our
experiments and balancing the class distributions artificially afterwards. Otherwise,
conducting experiments for collecting the awake samples similar to the drowsy ones is
inevitable, despite taking a lot of time and effort. This issue is studied in Sections 8.2.3
and 8.3.7.
In the following, two methods for artificially balancing the data set are introduced.

Random undersampling and oversampling

A very simple method for artificially balancing the class distribution is to randomly remove
samples of the majority class until the class distributions are balanced. Clearly, the
random selection of samples to be removed should be repeated several times to guarantee
that the classification results do not depend on the respective selection of the majority class
samples. Although very straightforward, the random undersampling method has a
disadvantage: by randomly removing samples of the majority class, some information about
this class that is valuable to the classifier might be removed, which leads to poor
classification results. Moreover, for highly imbalanced data, a large number of samples
have to be removed. The undersampling method will be used as the solution for dealing with
imbalanced data in Section 8.6.1.
Similar to the undersampling method, oversampling is performed to deal with the
imbalanced data issue, such that randomly selected samples of the minority class are duplicated
in the data set. This method is motivated by the prevention of the information loss of the
undersampling approach. However, outlier samples could be selected to be added to the data,
and poor classification results are then expected. Moreover, in general, overfitting is
inevitable if multiple instances of a sample are added to the data. Some classifiers even
require unique samples for training, like the classifier which will be explained in Section 8.3.

Synthetic minority oversampling technique

An alternative to the oversampling method explained before is the synthetic minority
oversampling technique (smote) introduced by Chawla et al. (2002). According to this
algorithm, first N_smote must be defined, which indicates the number of samples to be added
to the minority class. Afterwards, for each sample x_i of the minority class (see
Figure 8.3(a)), the k-nearest neighbors (k-nn) are sought. Figure 8.3(b) shows these
neighbors for k = 3. Finally, by randomly selecting one of the k-nearest neighbors x̌_i, a new
synthetic sample x_new is added to the minority class by interpolation as follows:

\mathbf{x}_{new} = \mathbf{x}_i + (\check{\mathbf{x}}_i - \mathbf{x}_i) \, \zeta ,   (8.13)

where ζ ∈ [0, 1] is a random number. x_new is shown in Figure 8.3(c). In fact, new samples
are generated with respect to the sample under investigation and some of its neighbors. In a
multi-class problem with m classes, smote is repeated m − 1 times to balance all classes.

Figure 8.3.: Applying the smote to an imbalanced data set: (a) an imbalanced data set, (b) k-nn
(k = 3) for x_i, (c) a new synthetic sample. [Illustrations not reproduced.]
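
A minimal sketch of the interpolation step (8.13), assuming the minority-class samples are given as a matrix; this is not the reference implementation of Chawla et al. (2002):

import numpy as np

def smote(X_min, n_smote, k=3, rng=np.random.default_rng(0)):
    """Generate n_smote synthetic minority samples after (8.13):
    interpolate between a random minority sample and one of its k
    nearest neighbors within the minority class."""
    synthetic = []
    for _ in range(n_smote):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]       # skip the sample itself
        x_check = X_min[rng.choice(neighbors)]   # a random neighbor
        zeta = rng.random()                      # zeta in [0, 1]
        synthetic.append(X_min[i] + (x_check - X_min[i]) * zeta)
    return np.vstack(synthetic)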

As explained before, adding new samples to the data set might lead to overlapping samples
and consequently to overfitting (He and Garcia, 2009). Moreover, depending on the
classifier being used, the location of the new samples might affect the performance of the
classifier. Therefore, it is recommended to apply data cleaning techniques afterwards. In fact,
by removing some samples of the new data set, such cleaning techniques improve the
separability of the available clusters and, in turn, the classifier performance.
A typical data cleaning method, which usually follows smote, is the Tomek links method
(Tomek, 1976). Tomek links are pairs of samples to be removed, defined as samples
belonging to different classes but located closest to each other. In other words, the cleaning
algorithm comprises finding and removing all Tomek link pairs, because either one of the
samples of the pair is noise or both of them are close to the borderline (He and Garcia, 2009).
Mathematically, a pair (x_i, x_j), with x_i and x_j belonging to different classes and d_{x_i,x_j}
as the distance between them, is considered a Tomek link if there exists no sample x_t of the
opposite class satisfying d_{x_i,x_t} < d_{x_i,x_j} or d_{x_j,x_t} < d_{x_i,x_j}. The algorithm ends
when the nearest neighbor of each sample belongs to the same class. Figure 8.4 represents an
example where smote is first applied to imbalanced data and then, by removing the Tomek
links, the clusters become easier to distinguish. This method will be applied to our imbalanced
data set in Section 8.2.3.

Figure 8.4.: Applying the smote and the Tomek link cleaning technique to an imbalanced data set:
(a) initial data set, (b) data set after applying smote, (c) detected Tomek links, (d) data set
after removing the Tomek links. [Illustrations not reproduced.]
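
A simple O(n²) sketch of the Tomek link search (mutual nearest neighbors belonging to different classes); illustrative only:

import numpy as np

def tomek_links(X, y):
    """Return index pairs (i, j) that form Tomek links: samples of
    different classes that are each other's nearest neighbor."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)           # ignore self-distances
    nn = d.argmin(axis=1)                 # nearest neighbor of each sample
    return [(i, j) for i, j in enumerate(nn)
            if nn[j] == i and i < j and y[i] != y[j]]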

8.2. Artificial neural network classifier

The background theory of the artificial neural network classifier presented here is a summary
of Jain et al. (1996), Zamora (2001), Uhlich (2006) and Duda et al. (2012) on this topic.
Inspired by the human nervous system, and more specifically the human brain, which is
capable not only of learning and generalizing rules, but also of performing parallel tasks, the
artificial neural network (ann) also consists of elements called neurons. As a machine learning
method, it is capable of performing similar tasks. The first mathematical model of a simple
neuron was introduced by McCulloch and Pitts in 1943 (McCulloch and Pitts, 1943).
Eskandarian et al. (2007) and Friedrichs and Yang (2010a) also used this classifier for driver
state classification.
Figure 8.5 shows the architecture of an ann with three main layers: an input layer, a hidden
layer and an output layer. Since all connections are in one direction and there are no feed-back
loops between neurons, this architecture is called a feed-forward network. This network is
memoryless, because the output for a specific input does not depend on the previous state of
the network. Another variant of the network architecture, discussed in Duda et al. (2012), is
the recurrent or feed-back network.
Figure 8.5.: Architecture of a feed-forward neural network with 3 inputs, 3 neurons in one hidden
layer and 2 outputs. [Diagram not reproduced.]

The number of features and the number of classes determine the number of inputs and
outputs, respectively. Therefore, only the number of neurons and hidden layers are free
parameters to be selected. Too many neurons or hidden layers lead to overfitting of the
network and consequently a lack of generalization. On the contrary, too few of them prevent
the network from learning the rules adequately. The impact of the number of neurons on the
classification performance is discussed in the next sections.

8.2.1. Network’s architecture

As shown in Figure 8.6, the input layer sends the input values x_i to the hidden layer
without processing them. The hidden layer neurons calculate the weighted sum of the inputs,
called the net activation (net). These calculated values are then fed to a non-linear activation
function f(·), whose outputs y_j are the inputs to the next layer. Mathematically, we have

y_j = f(\mathrm{net}_j) = f\Big( \sum_{i=1}^{D} w_{ij}^{(1)} x_i + w_{0j}^{(1)} \Big) ,   (8.14)

where the index j refers to the j-th hidden neuron and w_{ij}^{(1)} corresponds to the
input-to-hidden neuron weights (see Figures 8.5 and 8.6).

Figure 8.6.: Mathematical representation of the input-to-hidden layer of a network. [Diagram not
reproduced.]

Similarly, the output layer also calculates the net activation and the final result corresponds to
the classifier output. Therefore, we have

\[ z_k = f(\mathrm{net}_k) = f\Big(\sum_{j=1}^{N_h} w_{jk}^{(2)} y_j + w_{0k}^{(2)}\Big) . \qquad (8.15) \]

In the above equation, index k denotes the k-th output unit (see Figure 8.5). Nh denotes the
number of neurons in the hidden layer.
In the case of a multi-class classification problem with m classes, the class with the maximum
value of zk will be selected as the final classification result by the ann classifier as follows

\[ \hat{c} = \arg\max_{k = 1, \dots, m} z_k . \qquad (8.16) \]

The overall output of the introduced three-layer network in Figure 8.5 can be represented as

\[ z_k = f\Big(\sum_{j=0}^{N_h} w_{jk}^{(2)} \, f\Big(\sum_{i=0}^{D} w_{ij}^{(1)} x_i\Big)\Big) , \qquad (8.17) \]
where z_k is given as a function of the input x_i by substituting (8.14) into (8.15) for y_j and setting x_0 = y_0 = 1. A generalization of (8.17) also allows a different activation function at the output layer than in the hidden layers.
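The following NumPy sketch illustrates the forward pass of (8.14)-(8.17) for a small feed-forward network like the one in Figure 8.5; the weight values are random placeholders, not trained parameters:

```python
# Forward pass of a 3-input, 3-hidden-neuron, 2-output feed-forward
# network as in (8.14)-(8.17); weights are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
D, Nh, m = 3, 3, 2                     # inputs, hidden neurons, outputs

W1 = rng.normal(size=(D, Nh))          # input-to-hidden weights w_ij^(1)
b1 = rng.normal(size=Nh)               # hidden biases w_0j^(1)
W2 = rng.normal(size=(Nh, m))          # hidden-to-output weights w_jk^(2)
b2 = rng.normal(size=m)                # output biases w_0k^(2)

def f(net):
    # tangent sigmoid of Figure 8.7: f(net) = 2 / (1 + exp(-2 net)) - 1,
    # which is mathematically identical to tanh(net)
    return 2.0 / (1.0 + np.exp(-2.0 * net)) - 1.0

x = rng.normal(size=D)                 # one feature vector
y = f(x @ W1 + b1)                     # hidden activations, eq. (8.14)
z = f(y @ W2 + b2)                     # outputs, eqs. (8.15)/(8.17)
c_hat = np.argmax(z)                   # predicted class, eq. (8.16)
print(z, c_hat)
```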
The non-linear activation function can be either a hard threshold function such as the sign function or a soft thresholding one such as the sigmoid function. The sigmoid function is popular for having the following properties, as shown in Figure 8.7 for the tangent sigmoid \( f(\mathrm{net}) = \frac{2}{1 + e^{-2\,\mathrm{net}}} - 1 \):
• It is non-linear.
• It saturates, which bounds the possible output values.
• It is continuous and differentiable.
It will be shown later that non-differentiable activation functions are, in general, not of
interest. Since f (.) is a non-linear function, ann is also a non-linear classifier and
consequently can handle complex rules between features and classes.
Figure 8.7.: Sigmoid activation function

8.2.2. Training of the network

After calculating the final outputs z of the network, which in our work correspond to the
driver state, they are compared to the desired driver states c. Since these desired states are
kss values and are available, we are dealing with a supervised classification. Obviously, the
goal is to minimize the difference between the estimated and the true states, i.e. the error, as
shown in Figure 8.8. To this end, the training error J should be calculated in terms of
mean squared

Figure 8.8.: Supervised classification by the ann

error as follows
\[ J = \frac{1}{2} \sum_{i=1}^{m} (c_i - z_i)^2 = \frac{1}{2} (\mathbf{c} - \mathbf{z})^T (\mathbf{c} - \mathbf{z}) \quad \text{given } \mathbf{x} . \qquad (8.18) \]

The error minimization goal is achieved by updating the weights and recalculating the outputs several times. In fact, the network learns the patterns of the training data: the weights are randomly initialized at the beginning of the algorithm and are updated iteratively based on an error minimization criterion. This kind of iterative learning is called the back-propagation algorithm and is performed by the gradient descent approach. In other words, the feed-forward property of the network sends the inputs from the input layer to the output layer, and the back-propagation property updates the weights to bring the outputs as close as possible to the desired ones. Therefore, mathematically, the partial derivative of the error with respect to the weights is calculated as follows
\[ \Delta w = -\eta \, \frac{\partial J}{\partial w} . \qquad (8.19) \]
The minus sign guarantees the reduction of error. η is called the learning rate and controls the
relative change in weights for optimizing the error (Duda et al., 2012). If it is set too high, the
final weights will be far from the optimal ones resulting in a poorly performing network. On
the other hand, a very small value of η yields a very time-consuming training process. In
Appendix G, it is explained how to train the network iteratively based on (8.19).
The iterative update rule for the τ-th iteration will be

\[ w(\tau + 1) = w(\tau) + \Delta w(\tau) . \qquad (8.20) \]

The initial values of weights are set randomly at the beginning. Depending on the number of
samples available during the error minimization steps, different learning strategies are
possible. In this work, all samples are considered at the same time which is called batch
learning.
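As an illustration, the following sketch performs one batch-learning iteration of (8.18)-(8.20) for a single-hidden-layer network with a linear output layer (as used later in Section 8.2.3); the data, weights and learning rate are placeholders:

```python
# One batch gradient-descent iteration of back-propagation, eqs. (8.18)-(8.20).
# Placeholder data; tanh hidden layer, linear output layer.
import numpy as np

rng = np.random.default_rng(0)
N, D, Nh, m = 100, 3, 5, 2
X = rng.normal(size=(N, D))            # feature matrix (placeholder)
C = rng.normal(size=(N, m))            # desired outputs c (placeholder)

W1, b1 = rng.normal(size=(D, Nh)), np.zeros(Nh)
W2, b2 = rng.normal(size=(Nh, m)), np.zeros(m)
eta = 0.01                             # learning rate

# Forward pass over the whole batch
A = np.tanh(X @ W1 + b1)               # hidden activations
Z = A @ W2 + b2                        # linear outputs
J = 0.5 * np.sum((C - Z) ** 2) / N     # training error, eq. (8.18), averaged

# Backward pass: partial derivatives of J w.r.t. the weights, eq. (8.19)
dZ = (Z - C) / N
dW2, db2 = A.T @ dZ, dZ.sum(axis=0)
dA = dZ @ W2.T
dNet = dA * (1.0 - A ** 2)             # tanh'(net) = 1 - tanh(net)^2
dW1, db1 = X.T @ dNet, dNet.sum(axis=0)

# Weight update, eq. (8.20): w(tau+1) = w(tau) - eta * dJ/dw
W1, b1 = W1 - eta * dW1, b1 - eta * db1
W2, b2 = W2 - eta * dW2, b2 - eta * db2
```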
In addition to the above simple back-propagation method, there exists a variety of algorithms which differ in their speed of optimizing the error J and in finding its global minimum instead of getting trapped in a local one. These algorithms are referred to as second-order methods, such as the conjugate gradient, Newton and Levenberg-Marquardt methods. They are all explained in detail in Duda et al. (2012). Unlike gradient descent, these methods avoid a zigzag path towards the minimum. According to Zamora (2001), the conjugate gradient method is superior to the gradient descent optimization method in having a non-constant step size in the negative gradient direction. In other words, as long as the local or global minimum is not reached, the error J always decreases at each iteration (Bishop, 2006). In this work, we used the scaled conjugate gradient (Moller, 1993) due to its high optimization speed, requiring fewer iterations to optimize the error J. Moreover, this method uses an approximation which avoids the full calculation of the Hessian in the conjugate gradient (Zamora, 2001).

Practical issues

Priddy and Keller (2005) and Duda et al. (2012) provide some practical advice for improving the training of the network, which is summarized here.
• Scaling the features: Features which differ widely in their numerical values will be handled differently by the network during the training phase, as if one feature were more important than another. Duda et al. (2012) calls this phenomenon non-uniform learning. There are several approaches to solve this problem. In this work, in addition to the baselining step discussed in Section 7.1.3, we mapped the feature values into the range [−1, 1] before feeding them to the ann classifier as follows

\[ x_{\text{normalized}} = \frac{x - \min(x)}{\max(x) - \min(x)} \left( \max_{\text{target}} - \min_{\text{target}} \right) + \min_{\text{target}} , \qquad (8.21) \]

where max_target = 1 and min_target = −1; a small sketch of this mapping follows after this list. For other scaling functions see Priddy and Keller (2005).
• Number of hidden layers: Both of the mentioned references state that one hidden layer is enough for learning any arbitrary function, given a sufficient number of neurons. As a result, in this work we only use one hidden layer. Adding a second hidden layer did not improve the classification results.
• Number of neurons: There exist some rules of thumb for selecting the number of neurons. In this work, however, we selected it based on the classification performance on the training and test sets.
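As referenced in the first item above, a minimal sketch of the feature scaling (8.21), applied column-wise to a placeholder feature matrix:

```python
# Feature scaling to [-1, 1] per eq. (8.21); X is a placeholder matrix
# with one feature per column.
import numpy as np

def scale_features(X, min_target=-1.0, max_target=1.0):
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # eq. (8.21), applied per feature column
    return (X - x_min) / (x_max - x_min) * (max_target - min_target) + min_target

X = np.array([[1.0, 200.0], [2.0, 400.0], [4.0, 300.0]])
print(scale_features(X))   # every column now lies in [-1, 1]
```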

8.2.3. Classification results of subject-dependent data sets

This section discusses the classification results of the ann classifier for the subject-dependent data division. Moreover, different classification issues such as feature aggregation types and imbalanced data sets are discussed. Here, we applied a feed-forward ann classifier with the scaled conjugate gradient back-propagation algorithm for adjusting the weights in one hidden layer. For the hidden and output layers, the tangent sigmoid function similar to the example in Figure 8.7 and the linear transfer function were used, respectively. The results were generated using the Neural Network Toolbox™ R2010b of matlab.
In the following, it is shown that the most critical parameter of the ann classifier, which
directly impacts the classification performance, is the number of neurons Nh.

Results of KSS input-based features

In Section 7.1.1, we explained how kss input-based features are extracted. These features were applied to a binary subject-dependent ann classifier with different numbers of neurons. With an 80%/20% data division into training and test sets, the sets contained 312 and 79 samples, respectively.
The adr values for Nh = 2, 3, 4, 5 and 10 are shown in Figure 8.9 for both training and test sets.
The error bars indicate the standard deviation with respect to all 100 permutations for
selecting training and test sets. According to this figure, increasing Nh from 2 to 10 neurons
improves the classification results of the training set as adr increases. However, for the test
set, the adr

varies only slightly. Therefore, increasing the number of neurons does not improve the generality of the network. The corresponding confusion matrix for Nh = 5, as an example, is shown in Table 8.2. About 80% of the samples of each class are classified correctly. Overall, 3 ≤ Nh ≤ 5 seems to be a sufficient number of neurons for the classification of the kss input-based features. Nh = 10 increases the complexity of the network without improving the results.

Figure 8.9.: adr of the training and test sets of the binary subject-dependent ann classifier for different
numbers of neurons. Feature type: kss input-based features. Bars refer to the standard
deviation of permutations.

Table 8.2.: Confusion matrix of the binary subject-dependent ann classifier ( Nh = 5). Feature type: kss
input-based features
predicted
driver state awake drowsy
awake 80.9% 19.1%
given
drowsy 19.0% 81.0%

Results of the drive time-based features

Drive time-based features were defined in Section 7.1.2. As mentioned before, on the one hand, we expect poor classification results for these features due to the assumption that the kss values do not change between two inputs. Moreover, we showed in the previous chapter that the correlation values of these features with kss values were lower in comparison to kss input-based features. On the other hand, since a larger number of samples is available for the training task, the classifier learns the rules from more information, which clearly improves the results.
Since in total 4021 samples are available for drive time-based features, as explained in Section 7.1.2, the training and test sets contain 3216 (80%) and 805 (20%) samples, respectively. Figure 8.10(a) depicts how varying the number of neurons affects the classification performance. The comparison of the results of drive time-based features in Figure 8.10(a) with those of the kss input-based ones shown in Figure 8.9 indicates that our assumption of constant kss values between successive kss inputs is justified. In fact, the larger number of available samples in the feature matrix counteracts the possible imprecision of the class labels. The reason is that a larger amount of information is provided to the classifier for learning the underlying relationship between kss and feature values. Again, we emphasize that higher correlation values were found between kss input-based features and kss values. This, however,

did not lead to better classification results for them. Moreover, the adr values shown in Figures 8.9 and 8.10(a) imply that for a larger number of samples in the feature matrix, a larger number of neurons is needed. In fact, the number of neurons is directly linked to the complexity of the training set. For kss input-based features, we saw that Nh ≤ 5 was a suitable number of neurons. However, for drive time-based features at least 10 neurons are needed. From Nh = 10 to 25, despite better classification results for the training set, no improvement for the test set is evident. In other words, increasing the number of neurons leads the classifier to learn more complex rules which do not necessarily generalize to the unseen data of the test set. The confusion matrix for the network with Nh = 10 is shown in Table 8.3. The awake class is detected much better in comparison to the case shown in Table 8.2 (87.9% vs. 80.9%) at the cost of a 3% drop in the dr value of the drowsy class (78.2% vs. 81.0%).

(a) binary (b) 3-class

Figure 8.10.: adr of the training and test sets of the binary and 3-class subject-dependent ann classifier
for different numbers of neurons. Feature type: drive time-based features. Bars refer to
the standard deviation of permutations.

Table 8.3.: Confusion matrices of the subject-dependent ann classifiers for the 2-class (Nh = 10) and 3-class (Nh = 20) cases. Feature type: drive time-based features

2-class (Nh = 10):
                        predicted
    given state     awake     drowsy
    awake           87.9%     12.1%
    drowsy          21.8%     78.2%

3-class (Nh = 20):
                        predicted
    given state     awake     medium    drowsy
    awake           84.2%     11.9%      3.9%
    medium          40.2%     42.8%     17.0%
    drowsy           9.8%     17.2%     73.0%

Now, we consider a 3-class subject-dependent classification case with respect to the kss boundaries explained in Section 8.1.1 for dividing kss values into 3 classes. adr values for this case are shown in Figure 8.10(b). The values are not as good as those of the binary classification. Similar to Figure 8.10(a), increasing the number of neurons initially gives better results (higher adr values), whereas from Nh = 20 on, due to overfitting, the test set results do not improve. As mentioned before, in the 3-class classification problem, the classifier needs to learn more complex rules, which consequently requires a larger number of neurons in comparison to the previously studied cases. Moreover, according to the confusion matrix for Nh = 20 shown in Table 8.3, as expected, the medium class is mixed up with the awake class most of the time (40.2%), while it is well distinguished from the drowsy class (17.0%). We showed in Figure 7.6 that the feature boxplots corresponding to different kss values overlap each other to a large extent. As a result, the ann classifier is also unable to find an acceptable rule for distinguishing them from each

other, especially for the awake versus medium class.

Results of the imbalanced KSS input-based features

In the previous sections, the feature matrix under investigation was based on almost equally distributed classes, i.e. balanced data sets. Now, we consider the imbalanced data set introduced in Section 8.1.5 and resolve the consequences of the imbalanced data issue. We mentioned that the training set was constructed by considering kss input-based features collected in the driving simulator experiment and removing features of the real road drives from the initial feature matrix. The new feature matrix with 261 samples was divided into a balanced test set with 26 samples per class and an imbalanced training set with 209 samples (24% awake and 76% drowsy samples).
We trained an ann classifier with different numbers of neurons based on 100 imbalanced
training sets. The corresponding adr values are shown in Figure 8.11(a). Although we
obtained adr values of about 75% for test sets, this does not imply that both of the classes are
similarly classified correctly. This figure only shows that increasing the number of neurons
does not improve the adr value for the test set and even deteriorates it. The confusion matrix
for Nh = 2 is shown in the left part of Table 8.4. As expected, unlike the drowsy class, the
awake class is classified close to random guessing due to lack of available information about it
in the imbalanced training set. This is in agreement with the statement of He and Garcia
(2009) regarding the drawbacks of an imbalanced data set.

(a) imbalanced features (b) balanced features by smote

Figure 8.11.: adr of the training and test sets of the binary subject-dependent ann classifier for different numbers of neurons. Feature type: imbalanced and balanced by smote kss input-based features of the driving simulator experiment. Bars refer to the standard deviation of permutations.

Table 8.4.: Confusion matrices of the binary subject-dependent ann classifier for kss input-based
features of the driving simulator experiment. Left: imbalanced features ( Nh = 2). Right:
balanced features by smote (Nh = 2)
Imbalanced (Nh = 2):
                        predicted
    given state     awake     drowsy
    awake           59.6%     40.4%
    drowsy           9.0%     91.0%

Balanced by smote (Nh = 2):
                        predicted
    given state     awake     drowsy
    awake           75.3%     24.7%
    drowsy          16.1%     83.9%

In Section 8.1.5, we introduced two known methods for dealing with imbalanced data sets. The

under-sampling method is not used here, because it results in a lower number of samples in
the feature matrix which was shown to degrade the ann classification results. Here, the smote
was applied by considering k = 5 neighbors for adding new samples to the training set. After
cleaning the new training set based on the Tomek link approach, we obtained nearly
balanced class distributions. The same balanced test sets as before were classified again based
on the retrained network with balanced classes. The results of calculated adr values are
shown in Figure 8.11(b). Interestingly, increasing the number of neurons does not improve the results in this case either. The confusion matrix for Nh = 2 is shown in the right part of Table 8.4. The comparison of both confusion matrices, i.e. before and after applying the smote, indicates that the smote improved the classifier performance, such that the same awake samples of the test sets were classified correctly up to 75.3% (Table 8.4, right) instead of only 59.6% (Table 8.4, left). Clearly, this is obtained at the cost of a 7% drop in the dr of the drowsy class (91.0% vs. 83.9%).
Now, the question is whether the network retrained on the data balanced by smote is able to classify unseen awake data correctly. In fact, the 75.3% dr for the awake class generated by smote does not necessarily guarantee that true awake samples are classified correctly to the same extent. The word true emphasizes that smote adds artificially generated awake samples to the training set, not measured ones. On this account, we applied the removed samples of the real road experiment, which mainly belong to the awake class, as a test set to the network trained on the smote-balanced driving simulator data. This clarifies how close the artificial awake samples are to the true ones. If we obtained high dr values, we could save time and cost by considering only sleep-deprived and drowsy subjects in our experiments and balancing the class distributions artificially afterwards.
Figure 8.12 shows the adr values for the kss input-based features of the real road experiment.
It can be seen that increasing the number of neurons does not improve the classification
results. It only leads to a better dr value for one of the classes at a cost of worse dr value for
the other class as shown in Table 8.5 for two choices Nh = 3 and Nh = 10. The comparison of the
confusion matrices clarifies why adr values do not change in Figure 8.12. Increasing Nh from
2 to 10 improves the dr value of the drowsy class to the same extent that it degrades the dr
value of the awake class, namely about 3%.
Figure 8.12.: adr of the kss input-based features for the real road experiment applied to the network
trained based on the smote. Bars refer to the standard deviation of permutations.

Regardless of the number of neurons, we conclude that adding artificial awake samples to the
imbalanced training set of the driving simulator data improves the classification result of this
class. Nevertheless, the retrained network results tend towards the awake class. The reason is
that the classifier failed to classify most of the unseen drowsy samples correctly. In other
words,

Table 8.5.: Confusion matrices of the binary subject-dependent ann classifier for kss input-based
features of the real road experiment applied to the network trained based on the smote.
Left: Nh = 3. Right: Nh = 10.
Nh = 3:
                        predicted
    given state     awake     drowsy
    awake           63.5%     36.5%
    drowsy          61.8%     38.2%

Nh = 10:
                        predicted
    given state     awake     drowsy
    awake           60.3%     39.7%
    drowsy          58.9%     41.1%

the network is overfitted to the new samples added to the training set and does not
generalize to the unseen data.

Based on the findings in this part, we suggest collecting both awake and drowsy data during the experiment. This is due to the fact that artificially generated samples lead to overfitted classifiers with a tendency towards the minority class and a lack of generalization.

8.2.4. Classification results of the subject-independent data sets

This section studies a binary subject-independent ann classifier considering drive time-based features of all conducted experiments. As explained in Section 8.1.4, with a total of 43 subjects, the network was trained on 42 subjects and tested on the remaining subject, who was excluded from the training set. The dr values are shown in Table 8.6 for Nh = 2. For other values of Nh, the network was overfitted. The small number of neurons for subject-independent data sets implies that the test set, namely the data of the unseen subject, differed to a large extent from the other 42 subjects. Consequently, a small number of neurons avoids overfitting and improves the generalization of the classifier.

Table 8.6.: Confusion matrix of the binary subject-independent ann classifier for drive time-based
features (Nh = 2)
predicted
driver state awake drowsy
awake 80.8% 19.2%
given
drowsy 37.4% 62.6%

Figure 8.13 compares these results with those of the subject-dependent case shown in Table 8.3. In this figure, dr refers to the dr of each class, and 100% − dr denotes the fnr and fpr as defined in (8.5) and (8.6). Clearly, the subject-dependent classifier performs better in the detection of both classes than the subject-independent one, because it has seen similar samples in the training set. As soon as totally new data is fed to the classifier, the classification results degrade by at least 7%. This shows that individual properties of the subjects were not filtered out thoroughly during the baselining step discussed in Section 7.1.3. Moreover, since the dr of the drowsy class decreased more severely (about 15%, from 78.2% to 62.6%) than that of the awake class, we conclude that drowsiness manifests itself differently in different subjects. In other words, the feature samples of each subject have certain characteristics which do not necessarily apply to all.

Figure 8.13.: Comparing confusion matrix of the binary subject-dependent ann classifier with that of the
subject-independent

8.3. Support vector machine classifier

The background theory of this section is based on a summary from Cristianini and Shawe-Taylor (2000), Schmieder (2009), Abe (2010), Duda et al. (2012) and Abdellaoui (2013). The support vector machine (svm) introduced by Vladimir Vapnik (Cortes and Vapnik, 1995) is another machine learning method with the capability of learning rules. In contrast to the ann classifier, which is sensitive to outliers and might get trapped in one of multiple local minima, the svm classifier is more robust to outliers and has a unique solution (Olson and Delen, 2008; Yang, 2014). In addition, in applications with svm, overfitting seems not to be a major issue (Olson and Delen, 2008). Hu and Zheng (2009) and Abdellaoui (2013) also used the svm classifier for driver state classification based on blink features.
In the following sections, the basics of the svm classifier are briefly introduced and it is explained how to train it by tuning its parameters.

8.3.1. Hard margin support vector machines

For a binary classification case with a linearly separable data set and class labels y_i ∈ {1, −1} as shown in Figure 8.14, the decision function is defined by the hyperplane H: w^T x + b = 0, where w and x refer to the weight and feature vectors, respectively, and b denotes the bias. Therefore, the classes can be distinguished from each other by the following inequality

\[ y_i \left( \mathbf{w}^T \mathbf{x}_i + b \right) \geq 1 . \qquad (8.22) \]

Generally, the distance between a sample x_i of the training set and the separating hyperplane is called the margin γ_i, as shown in Figure 8.14(a). It is defined as

\[ \gamma_i = \frac{y_i \left( \mathbf{w}^T \mathbf{x}_i + b \right)}{\lVert \mathbf{w} \rVert} , \qquad (8.23) \]

where ‖w‖ denotes the Euclidean norm of w. The goal of the support vector machine classifier is to find the hyperplane with the maximal margin for the training set among the different possible separating hyperplanes, as shown in Figures 8.14(a) and 8.14(b). This hyperplane is called the optimal separating hyperplane and is found by optimizing the value of γ_min. γ_min, which refers to the minimum distance between the separating hyperplane and all samples of the data set in

(a) maximal margin (b) suboptimal

Figure 8.14.: Different separating hyperplanes

each class, is defined as follows

\[ \gamma_{\min} = \frac{1}{\lVert \mathbf{w} \rVert} . \qquad (8.24) \]
The largest value of γ_min, called γ_opt, belongs to the optimal separating hyperplane, and a classifier based on it is called a maximal margin classifier. Thus, the goal is to find γ_opt, i.e.

\[ \max_{\mathbf{w}, b} \gamma_{\min} \;\Longrightarrow\; \min_{\mathbf{w}, b} \lVert \mathbf{w} \rVert \qquad (8.25) \]
\[ \text{subject to} \quad y_i \left( \mathbf{w}^T \mathbf{x}_i + b \right) \geq 1 , \quad i = 1, \dots, N , \]

which is equivalent to

\[ \max_{\mathbf{w}, b} \gamma_{\min} \;\Longrightarrow\; \min_{\mathbf{w}, b} \frac{1}{2} \lVert \mathbf{w} \rVert^2 \qquad (8.26) \]
\[ \text{subject to} \quad y_i \left( \mathbf{w}^T \mathbf{x}_i + b \right) \geq 1 , \quad i = 1, \dots, N . \]

In addition to (8.26), which is called the primal form of the optimization problem, there exists
also an alternative dual form which in comparison to the primal form is much easier to solve.
In fact, in the primal form, it is difficult to handle the inequality constraint.

The dual form of the optimization problem of (8.26) is based on the Lagrangian function L as follows

\[ L(\mathbf{w}, b, \boldsymbol{\alpha}) = \frac{1}{2} \lVert \mathbf{w} \rVert^2 - \sum_{i=1}^{N} \alpha_i \left( y_i \left( \mathbf{w}^T \mathbf{x}_i + b \right) - 1 \right) , \qquad (8.27) \]
where α = [α_1 α_2 ... α_N]^T is the vector of non-negative Lagrange multipliers. For optimizing the above equation, the extrema of L(w, b, α) are calculated by differentiation with respect to b and w. By substituting the differentiation results in (8.27), the dual problem is obtained as

follows

\[ \max_{\boldsymbol{\alpha}} L_d(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j \qquad (8.28) \]
\[ \text{subject to} \quad \sum_{i=1}^{N} \alpha_i y_i = 0 , \quad \alpha_i \geq 0 , \quad i = 1, \dots, N , \]

where L_d refers to the dual Lagrangian function. The above optimization is called hard margin svm and is clearly independent of the weight vector w.
A main drawback of the above optimization problem for the maximal margin classifier based on the hard margin is the requirement of a linearly separable data set. Clearly, such sets are not always the case in real-life data collection. Moreover, since the goal of such a classifier is to classify the training data with no training error, overfitting is unavoidable. Consequently, low generalization ability is expected. This issue will be addressed in the next section.

8.3.2. Soft margin support vector machines

Since in real-world applications not all data sets are linearly separable, the hard margin svm must be revised before applying it to linearly inseparable data sets. A possible solution is, contrary to the previous approach, to tolerate misclassification of the training data to some extent, as shown in Figure 8.15. It should be mentioned that it is the data set which might be linearly inseparable; the decision function is still a linear boundary.

Figure 8.15.: Example of a linearly inseparable data set

This goal is achieved by defining the slack variables ξi ≥ 0 which modify (8.22) as follows
\[ y_i \left( \mathbf{w}^T \mathbf{x}_i + b \right) \geq 1 - \xi_i . \qquad (8.29) \]

In the case of ξ_i = 0, x_i is classified correctly¹. For 0 < ξ_i < 1, x_i is classified correctly and is located within the selected margins, i.e. the selected margins are not the maximal ones (see Figure 8.15). However, for ξ_i ≥ 1, x_i is misclassified with respect to the selected optimal hyperplane (see Figure 8.15).
The hyperplane based on this approach is called the soft margin hyperplane. Accordingly, the
classifier is called the soft margin svm. The primal form of the optimization problem in (8.26)
¹ x_i is not necessarily located on the boundary.

for finding this hyperplane is reformulated as follows

\[ \min_{\mathbf{w}, b, \boldsymbol{\xi}} \; \frac{1}{2} \lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{N} \xi_i^l \qquad (8.30) \]
\[ \text{subject to} \quad y_i \left( \mathbf{w}^T \mathbf{x}_i + b \right) \geq 1 - \xi_i , \quad \xi_i \geq 0 , \quad i = 1, \dots, N , \]
where ξ = [ξ_1 ξ_2 ... ξ_N]^T and the parameter C controls the trade-off between minimizing the number of misclassified samples and maximizing the margin. In fact, the parameter C is responsible for penalizing samples within the margin or those which are classified wrongly. l is usually 1 or 2, and accordingly the problem is called either the L1-norm or the L2-norm svm. Although the soft margin solution does not suffer from the drawback of the hard margin in terms of linear separability of the data set, its performance depends on the choice of the parameter C.
Similar to (8.26), there also exists an alternative dual form for (8.30), whose derivation is provided in Appendix H.2. For the L1-norm, we have

\[ \max_{\boldsymbol{\alpha}} L_d(\boldsymbol{\alpha}) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j \qquad (8.31) \]
\[ \text{subject to} \quad \sum_{i=1}^{N} \alpha_i y_i = 0 , \quad 0 \leq \alpha_i \leq C , \quad i = 1, \dots, N . \]

In comparison to (8.28), α_i now has an upper bound in (8.31). Moreover, ξ_i is not directly involved in the dual form.
An important property of the resulting dual forms of the optimization problem for both hard and soft margin svm ((8.28) and (8.31)) is that finding the optimal separating hyperplane never depends directly on the individual values of the training data, but only on the inner products x_i^T x_j of the original feature vectors. It will be shown in the next section how we benefit from this property.

8.3.3. Kernel trick

Although the previous section discussed the solution for handling linearly inseparable data sets, the resulting optimal hyperplane might still suffer from a lack of generalization, depending on the amount of non-linearity. We explained in Section 8.2 that the ann classifier benefits from non-linear transfer functions to deal with non-linearities. For the svm, however, another approach addresses this problem by analyzing the space R^D in which the features lie. The reason is that the more complex the original space is, the more difficult it is to learn the underlying patterns. This motivates mapping the attributes to another space of higher dimension, called the feature space F, by a function Φ. As a result, the linear separation of a linearly inseparable data set becomes possible. Mathematically, the mapping is denoted as follows

\[ \Phi : \mathbb{R}^D \longrightarrow \mathcal{F} , \quad \mathbf{x} \longmapsto \Phi(\mathbf{x}) . \qquad (8.32) \]

Figure 8.16 shows an example of such a mapping, where increasing the dimension of the original space with a linearly inseparable data set leads to a linear classification problem in the feature space.

(a) original space (b) feature space

Figure 8.16.: An example of feature mapping for a linearly inseparable data set

Since after the mapping, Φ(x) contains linearly separable values, the linear decision function introduced in (8.22) is reformulated as

\[ \mathcal{H} : \mathbf{w}^T \Phi(\mathbf{x}_i) + b . \qquad (8.33) \]

Accordingly, in (8.28) and (8.31), the inner product x_i^T x_j is replaced by the kernel function K(., .), namely

\[ K(\mathbf{x}_i, \mathbf{x}_j) = \Phi(\mathbf{x}_i)^T \Phi(\mathbf{x}_j) . \qquad (8.34) \]
The advantage of the kernel function is that it allows the calculation of the inner product Φ(x_i)^T Φ(x_j) without explicitly calculating the mapped values. Moreover, the dimension of the feature space does not play any role in the calculation of the kernel function. Consequently, even a feature space with a very large dimension does not increase the computational complexity of the classification problem. Other properties of the kernels are discussed in Cristianini and Shawe-Taylor (2000) and Schmieder (2009).
Some well-known kernel functions are
• linear: K(x_i, x_j) = x_i^T x_j
• polynomial: K(x_i, x_j) = (a + x_i^T x_j)^d , d ∈ N , a ≥ 0
• radial basis function (rbf): K(x_i, x_j) = e^(−γ ‖x_i − x_j‖²) , γ > 0
• sigmoid: K(x_i, x_j) = tanh(κ x_i^T x_j + r) , κ > 0 , r < 0
As shown, some kernel functions also have a parameter which must be tuned in addition to the parameter C of the svm classifier. For example, the parameter γ of the rbf kernel is responsible for controlling under- and overfitting during the training phase (Asa and Weston, 2010).
In general, there is no known method which determines the type of the kernel function for a specific application. Thus, depending on the characteristics of the data set, different kernels might be appropriate or inappropriate. Since the rbf kernel has only one parameter to be optimized, it is usually the first choice (Chang and Lin, 2011).
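The following sketch computes an rbf kernel matrix and, as a sanity check of the kernel trick (8.34), verifies for the polynomial kernel with a = 1 and d = 2 that K(x, y) equals the inner product of an explicit mapping; all data values are toy placeholders:

```python
# RBF kernel (Gram) matrix and a kernel-trick sanity check; toy data only.
import numpy as np

def rbf_kernel(X, gamma):
    # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), computed for all pairs
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

X = np.random.default_rng(0).normal(size=(5, 2))
K = rbf_kernel(X, gamma=2.0)          # 5x5 kernel matrix

# Polynomial kernel with a = 1, d = 2 on 2-D inputs equals the inner
# product of the explicit mapping
# phi(x) = (1, sqrt(2)x1, sqrt(2)x2, x1^2, x2^2, sqrt(2)x1x2):
def phi(x):
    return np.array([1.0, np.sqrt(2) * x[0], np.sqrt(2) * x[1],
                     x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x, y = X[0], X[1]
lhs = (1.0 + x @ y) ** 2              # kernel value, no mapping needed
rhs = phi(x) @ phi(y)                 # explicit mapping, eq. (8.34)
assert np.isclose(lhs, rhs)
```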

8.3.4. Model construction

We mentioned in the previous section that the training phase of the svm classifier involves
optimization of two types of parameters: the parameter C for controlling the trade-off
between

the training error and the margin between classes and the kernel function parameter(s). Here,
we choose the rbf kernel which means the parameter pair (C, γ) must be optimized. A
common method for achieving this goal is applying the grid search method combined
with the cross validation as explained in the following.

Grid search

In general, the grid search method comprises searching for the optimal parameters guided by
a performance metric. For each parameter pair, a performance metric such as the accuracy
is calculated and the pair with the highest accuracy will be selected for constructing the svm
model. It should be noted that here, we use accuracy as the performance metric to guide the
grid search, because its classification results outperformed those of other metrics during the
training phase. By choosing the rbf kernel, first, the range of C and γ needs to be defined.
Since finding the best parameter pair is very time-consuming, the suggestion of Hsu et al.
(2003) is adapted. According to that, first a coarse grid search is applied to roughly find the
best values of C and γ, namely (C0, γ0), such that
\[ (C_0, \gamma_0) = \arg\max_{C, \gamma} \; \mathrm{acc}_{\mathrm{train}}\big(\mathrm{svm}(C, \gamma)\big) \qquad (8.35) \]
\[ C = 2^{x_C} , \quad x_C \in \{-5, -3, \dots, 15\} \qquad (8.36) \]
\[ \gamma = 2^{x_\gamma} , \quad x_\gamma \in \{-15, -13, \dots, 3\} , \qquad (8.37) \]
where svm(C, γ) denotes the svm model constructed using the parameters C and γ, and acc_train refers to the accuracy during the training phase. We suppose that (C_0, γ_0) is obtained from (x_C0, x_γ0). Afterwards, a fine grid search is performed around (C_0, γ_0) to determine the optimal parameter pair (C_opt, γ_opt) based on a new range of values for x_C and x_γ as follows
\[ (C_{\mathrm{opt}}, \gamma_{\mathrm{opt}}) = \arg\max_{C, \gamma} \; \mathrm{acc}_{\mathrm{train}}\big(\mathrm{svm}(C, \gamma)\big) \qquad (8.38) \]
\[ x_C \in \{x_{C_0} - 2, \; x_{C_0} - 1.75, \; \dots, \; x_{C_0} + 1.75, \; x_{C_0} + 2\} \qquad (8.39) \]
\[ x_\gamma \in \{x_{\gamma_0} - 2, \; x_{\gamma_0} - 1.75, \; \dots, \; x_{\gamma_0} + 1.75, \; x_{\gamma_0} + 2\} . \qquad (8.40) \]

Unlike the coarse search, where the step sizes of x_C and x_γ are set to 2, in the fine search they are set to 0.25. Figure 8.17 shows the results of the coarse and fine grid search with the highest accuracies of 83.5% and 84.0%, respectively. The selected parameters are (C_0, γ_0) = (32, 2) and (C_opt, γ_opt) = (16, 2).

Cross validation

Cross validation is a method for avoiding overfitting during the training phase. As mentioned in Section 8.1.1, prior to applying any classifier to the data set, a training and a test set are constructed. The cross validation method splits the training set a second time: one part is used only for training, while the other, called the validation set S_validation, serves as a test set (see Figure 8.18). Repeating this division j times is called j-fold cross validation, which randomly divides the N_train samples of S_train into j subsets of length N_train/j each. Each time, one of the j subsets is used as S_validation and the remaining j − 1 subsets are combined into the new training set. Clearly, by repeating this procedure j times, every subset appears exactly once as the validation set. A performance metric on S_validation is

(a) coarse search (b) fine search

Figure 8.17.: An example of the grid search for finding (C0, γ0) and (Copt, γopt)

calculated each time. The overall performance of the training phase is the average over all j calculated performance metrics. Figure 8.18 depicts this method. If, additionally, the samples of the different classes are equally distributed in the training and test sets, the method is called j-fold stratified cross validation. After constructing the final model, it is applied to the initial test set, which is totally new to the classifier.
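A minimal sketch of j-fold stratified cross validation, assuming scikit-learn; the classifier and data are placeholders, and (C_opt, γ_opt) = (16, 2) is taken from the grid search example above:

```python
# j-fold stratified cross validation; scikit-learn sketch with
# placeholder data and classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
j = 5
scores = []
for train_idx, val_idx in StratifiedKFold(n_splits=j, shuffle=True,
                                          random_state=0).split(X, y):
    model = SVC(kernel='rbf', C=16.0, gamma=2.0)        # (C_opt, gamma_opt)
    model.fit(X[train_idx], y[train_idx])               # train on j-1 folds
    scores.append(model.score(X[val_idx], y[val_idx]))  # validate on 1 fold

# Overall training-phase performance: average over the j folds
print(np.mean(scores))
```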

Figure 8.18.: j-fold cross validation method

8.3.5. Multi-class classification approaches

In order to extend the svm classifier to multi-class cases, different approaches exist, which are reviewed in Abe (2010). Here, we only explain the One-Against-One and One-Against-All strategies. Both decompose the original problem into multiple binary sets and then apply the introduced svm classifier to them, as explained in the following.

One-Against-All

This approach decomposes the original m-class problem into m binary classification problems, where sample x_i either belongs to class J ∈ {1, 2, ..., m} or does not belong to this class. Clearly, a sample which does not belong to class J is a member of one of the other m − 1 classes. In order to cover all m classes, the decision function must be calculated m times.

According to (8.33), for the J-th decision function, which classifies the J-th class, we have

\[ \mathbf{w}_J^T \Phi(\mathbf{x}_i) + b_J \;\; \begin{cases} \geq 1 & \mathbf{x}_i \text{ belongs to class } J \\ \leq -1 & \mathbf{x}_i \text{ belongs to the remaining classes,} \end{cases} \]

where w_J^T Φ(x_i) + b_J = 0 is the optimal separating hyperplane. The above decision function is a discrete one, since only its sign plays a role in the classification. A shortcoming of such a decision function is that a sample might be unclassifiable, as shown by the shaded areas in Figure 8.19(a). In this figure, a sample is classified as belonging to class J if w_J^T Φ(x_i) + b_J > 0 (shown with an arrow). Clearly, a sample is unclassifiable,
• if it satisfies w_J^T Φ(x_i) + b_J > 0 for several classes or
• if it does not satisfy w_J^T Φ(x_i) + b_J > 0 for any class J.
Therefore, instead of a discrete decision function, a continuous one is used, and the final predicted class is the one which maximizes the decision function as follows

\[ \arg\max_{J = 1, \dots, m} \left( \mathbf{w}_J^T \Phi(\mathbf{x}_i) + b_J \right) . \qquad (8.41) \]

In addition to the mentioned problem, another problem of the One-Against-All approach is


that the distribution of the classes in the training set is imbalanced. This is due to classifying
samples of one class against samples of all other classes. As an example, for five classes each
with equal numbers of samples, this approach leads to the class distribution of 20% versus
80% in the training set. In Section 8.1.5, we discussed this issue.

(a) One-Against-All (b) One-Against-One

Figure 8.19.: An example of a 3-class classification with shaded areas as the unclassifiable regions. The arrows show the positive sides of the hyperplanes. Decision functions for the One-Against-All approach: H: w_J^T Φ(x_i) + b_J = 0, J = 1, 2, 3, and for the One-Against-One approach: H: w_IJ^T Φ(x_i) + b_IJ = 0, I = 1, 2, 3, J = 1, 2, 3 and I ≠ J.

One-Against-One

Contrary to the previous approach, the One-Against-One approach decomposes the original multi-class problem into K = m(m−1)/2 binary cases, where m refers to the number of available classes. By applying the svm classifier to these binary problems, a sample x_i is then classified K times based on K decision functions as either a member of class I or of class J,

where I ≠ J. Consequently, this method performs the training phase with a smaller number of samples, namely only those belonging to the two classes under investigation, whereas the One-Against-All method considers all samples together. By applying the conventional svm classifier to the binary classes, we have

\[ \mathbf{w}_{IJ}^T \Phi(\mathbf{x}_i) + b_{IJ} \;\; \begin{cases} \geq 1 & \mathbf{x}_i \text{ belongs to class } I \\ \leq -1 & \mathbf{x}_i \text{ belongs to class } J , \end{cases} \]

where I = 1, 2, ..., m, J = 1, 2, ..., m, I ≠ J and w_IJ^T Φ(x_i) + b_IJ = 0 is the optimal separating hyperplane. The final predicted class of sample x_i corresponds to the class with the maximum number of votes after K classifications as follows

\[ \arg\max_{I = 1, \dots, m} \; \sum_{J = 1, J \neq I}^{m} \operatorname{sgn}\left( \mathbf{w}_{IJ}^T \Phi(\mathbf{x}_i) + b_{IJ} \right) . \qquad (8.42) \]

In fact, a sample x_i is classified as belonging to the I-th class if the above sum equals m − 1 for the I-th class and a smaller value for the other classes. If the value m − 1 is achieved for none of the classes, then x_i is unclassifiable, because multiple classes satisfy (8.42). Figure 8.19(b) shows an example of an unclassifiable area for this method as the shaded area. In this figure, w_IJ^T Φ(x_i) + b_IJ > 0 leads to the classification of x_i as belonging to class I (shown with arrows) and otherwise as belonging to class J. According to this figure, the advantage of this method over the One-Against-All method in Figure 8.19(a) is that the unclassifiable area is much smaller. Therefore, this approach is applied in this work.
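A sketch of the One-Against-One voting rule (8.42), built on top of binary svm classifiers; the decision values stand in for w_IJ^T Φ(x_i) + b_IJ, and the 3-class data set is a placeholder:

```python
# One-Against-One voting, eq. (8.42): train m(m-1)/2 binary SVMs and
# count sign votes per class; placeholder 3-class data.
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_informative=4,
                           n_classes=3, random_state=0)
m = 3
pairs = list(combinations(range(m), 2))       # K = m(m-1)/2 = 3 problems

models = {}
for I, J in pairs:
    mask = np.isin(y, (I, J))
    yi = np.where(y[mask] == I, 1, -1)        # class I -> +1, class J -> -1
    models[(I, J)] = SVC(kernel='rbf').fit(X[mask], yi)

def predict(x):
    votes = np.zeros(m)
    for (I, J), clf in models.items():
        s = clf.decision_function(x[None, :])[0]   # ~ w_IJ^T phi(x) + b_IJ
        votes[I if s > 0 else J] += 1
    return np.argmax(votes)                   # class with the most votes

print(predict(X[0]), y[0])
```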

8.3.6. Dealing with imbalanced data

As mentioned in Section 8.1.5, an imbalanced data set degrades the classifier performance. For the svm classifier with imbalanced data, the optimal hyperplane tends towards the minority class. The reason is that the closer it gets to the samples of the minority class, the larger the number of correctly classified samples of the majority class becomes in comparison to the number of misclassified samples of the minority class. In other words, the classifier prefers to classify a large number of samples of the majority class correctly without violating the margins instead of violating the margins to correctly classify a few samples of the minority class. As a result, most of the unseen samples of the test set will automatically be classified as belonging to the majority class due to the shifted optimal separating hyperplane.
We explained in Section 8.3.2 that the soft margin svm tolerates wrong classifications by
applying the parameter C in (8.30) which is also referred to as the misclassification cost. The
second term in (8.30) penalizes the errors of both classes equally, which is not desired, if the
available data set is imbalanced. Therefore, the following solution proposed by Veropoulos et
al. (1999) is used for the L1-norm svm which considers different values of C for each class,
namely
N
C ξi −→ C+ ξi + C− ξi . (8.43)
i=1 i∈N+ i∈N−

In the above equation, N+ and N− refer to the number of samples in the majority and minority
classes, respectively. Similarly, C+ and C− refer to different misclassification costs for each
class.

Accordingly, the upper bound of αi in (8.31) also becomes

\[ 0 \leq \alpha_i^{+} \leq C_{+} , \quad \mathbf{x}_i \in S_{+} \qquad (8.44) \]
\[ 0 \leq \alpha_i^{-} \leq C_{-} , \quad \mathbf{x}_i \in S_{-} . \qquad (8.45) \]

To reduce the negative effect of an imbalanced data set, misclassified samples of the minority
class must be penalized to a larger extent than those of the majority class, i.e. C+ < C−. Akbani
et al. (2004) empirically found good results by selecting the following ratio between C+ and
C−
\[ \frac{C_{-}}{C_{+}} = \frac{N_{+}}{N_{-}} \;\Longrightarrow\; C_{-} = C_{+} \frac{N_{+}}{N_{-}} . \qquad (8.46) \]
Finally, this leads to

\[ C_{-} = \frac{C}{N_{-}} \qquad (8.47) \]
\[ C_{+} = \frac{C}{N_{+}} . \qquad (8.48) \]
However, Schmieder (2009) suggested the following values

\[ C_{-} = \frac{C}{2 N_{-}} \qquad (8.49) \]
\[ C_{+} = \frac{C}{2 N_{+}} . \qquad (8.50) \]
In this work, we use the ratio in (8.46).
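In scikit-learn's SVC, a wrapper around the libsvm library used in this work, per-class misclassification costs correspond to the class_weight argument, which multiplies C per class; a sketch implementing the ratio (8.46) with the sample counts of Section 8.3.7 (N_− = 50 awake, N_+ = 159 drowsy) on placeholder data:

```python
# Soft margin SVM with per-class misclassification costs, eq. (8.46);
# scikit-learn's SVC wraps libsvm, and class_weight scales C per class.
# Labels: 0 = awake (minority, N_minus = 50),
#         1 = drowsy (majority, N_plus = 159) -- placeholder data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(1, 1, (159, 4))])
y = np.array([0] * 50 + [1] * 159)

N_minus, N_plus = 50, 159
C = 16.0                                   # base cost (placeholder value)
# eq. (8.46): C_minority / C_majority = N_plus / N_minus
weights = {0: N_plus / N_minus, 1: 1.0}    # effective C_i = C * weight_i

clf = SVC(kernel='rbf', C=C, gamma=2.0, class_weight=weights)
clf.fit(X, y)
```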

8.3.7. Classification results of subject-dependent data sets

In this work, the soft margin svm classifier was applied using the libsvm library (Chang and
Lin, 2011). As mentioned before, this classifier has two parameters to be tuned: the kernel pa-
rameter(s) and the classification error weight C for controlling the trade-off between the
training error and the margin between classes. The rbf kernel was selected here because of its
good clas- sification results within the shortest simulation runtime. The γ parameter of the
rbf and C were optimized by the grid search method and the 5-fold cross-validation as
explained in Section 8.3.4. The range of C and γ were also defined according to (8.36) and
(8.37). The performance metric for guiding the search algorithm was the accuracy as defined
in (8.8), because its classification results outperformed those of other metrics during the
training phase.

Results of the KSS input-based features

As mentioned before, the balanced test and training sets were defined by 100 permutations. Accordingly, 100 binary svm models were trained with the parameters shown in Figure 8.20 for the kss input-based features with balanced class distributions. The corresponding training and test accuracies are also shown in this figure. The confusion matrix, calculated as the average over all permutations, is shown in Table 8.7. It can be seen that both classes are classified with very similar dr values. The comparison of these results with those of the ann classifier in Table 8.2 indicates that the svm classifier achieved slightly better dr values, although the differences are only about 4%.

Figure 8.20.: Boxplot of C, γ, training and test accuracies for the balanced and imbalanced 2-class subject-dependent classification with the svm for all 100 permutations. Feature type: kss input-based features

Table 8.7.: Confusion matrix of the binary subject-dependent svm classifier. Feature type: kss input-
based features
predicted
driver state awake drowsy
awake 84.2% 15.8%
given
drowsy 16.4% 83.6%

Results of the drive time-based features

Similar to the ann classifier, 2-class and 3-class svm classifiers were trained with drive time-based features. We expect better results in comparison to those of the kss input-based features due to a feature matrix with a larger number of samples and a larger amount of information.

Figure 8.21 shows the trained parameters C and γ for both subject-dependent svm classifiers. Comparing these parameters with those of the kss input-based features does not reveal a large difference between them. Moreover, the parameters of the 2- and 3-class cases are also not very far from each other. Therefore, we conclude that in spite of different feature aggregation approaches, the optimized parameters do not differ to a large extent. The corresponding confusion matrices are shown in Table 8.8. Similar to the ann classifier, using drive time-based features improves the dr of the awake class by about 5% (89.0% vs. 84.2%). However, the dr of the drowsy class drops to a larger extent, namely by about 6% from 83.6% to 77.6%. The comparison of the performance of the svm with that of the ann in Table 8.3 indicates that for both the binary and 3-class cases, the type of the classifier did not affect the classification results. As an example, the medium class is also confused with the awake class by the svm classifier. This finding emphasizes two facts. First, the class labels might be imprecise. As a result, regardless of the type of classifier applied to the features, some classes are not distinguishable, such as the awake versus the medium class. Second, the features might not be informative enough for our task, which limits the possible improvement. Consequently, both classifiers performed similarly. Besides the mentioned points, we also conclude that the underlying driver state information in the extracted features is interpreted similarly by both classifiers. Since this holds independently of the classifier type, we consider it a strong point of the features.

Figure 8.21.: Boxplot of C, γ, training and test accuracies for the 2-class and 3-class subject-dependent classification with the svm for all 100 permutations. Feature type: drive time-based features

Table 8.8.: Confusion matrices of the subject-dependent svm classifiers for the 2-class and 3-class cases. Feature type: drive time-based features

2-class:
                        predicted
    given state     awake     drowsy
    awake           89.0%     11.0%
    drowsy          22.4%     77.6%

3-class:
                        predicted
    given state     awake     medium    drowsy
    awake           85.3%      9.8%      4.9%
    medium          40.2%     41.7%     18.1%
    drowsy          10.5%     14.9%     74.6%

Results of the imbalanced KSS input-based features

In this section, we study whether the svm classifier is sensitive to the distribution of classes in the training set. Therefore, similar to Section 8.2.3, we fed imbalanced kss input-based features of the driving simulator experiment to the binary subject-dependent svm classifier. The values of C, γ and the corresponding accuracies are depicted in Figure 8.20. Interestingly, in comparison to the balanced case, the value of C has a larger range. The calculated confusion matrix is shown in the left part of Table 8.9. The awake class is classified less correctly (63.0%) than the drowsy class (93.7%), which underscores the tendency of the classifier towards the majority class, namely the drowsy class. We explained in Section 8.3.6 that this happens due to equally penalizing the misclassified samples of both classes during the training phase. There, we also mentioned that this problem can be solved by considering different misclassification costs for each class. According to (8.46), we need the ratio between the number of samples in the minority and majority classes, which were N_− = 50 and N_+ = 159. The confusion matrix of the binary subject-dependent svm classifier with different misclassification costs is shown in the right part of Table 8.9.
Table 8.9.: Confusion matrices of the binary subject-dependent svm classifiers for kss input-based features of the driving simulator experiment. Left: imbalanced features. Right: balanced features by considering different misclassification costs

Imbalanced:
                        predicted
    given state     awake     drowsy
    awake           63.0%     37.0%
    drowsy           6.3%     93.7%

Balanced by misclassification costs:
                        predicted
    given state     awake     drowsy
    awake           73.3%     26.7%
    drowsy           7.5%     92.5%

According to the listed values, the classification of the awake class improves by about 10%. However, the dr value of the drowsy class decreases by about 1%. This is unlike the ann classifier, where the dr of the awake class improved to a larger extent at the expense of the dr of the drowsy class (see Table 8.4). As a result, the ann classifier was more sensitive to the imbalanced data set.

Overall, both classifiers perform almost similarly in detection of the awake class. However,
the svm outperforms the ann in the detection of the drowsy class by about 8%. This might be
due to the fact that different solutions for handling the imbalanced data were applied. In fact,
totally different training sets were fed to these classifiers. The ann classifier was trained with
the new artificially generated awake samples based on the smote, while the svm classifier
received only the same imbalanced training set. In other words, the ann classifier was
retrained by changing the input values and not the structure of the training phase. On the
contrary, the svm classifier was retrained by the fixed input samples and a new structure of
the training phase, i.e. different misclassification costs. In addition, the smote is a general
approach which is independent of the classifier type. However, the approach used for the
svm directly affects the classifier itself. Consequently, it has yielded better classification
results.

In order to know whether the newly trained svm also classifies unseen awake samples correctly, we applied the features of the real road drives to it. According to the corresponding confusion matrix shown in Table 8.10, both classes are detected completely randomly, i.e. the dr values are close to 50%. Consequently, similar to the ann (compare with Table 8.5), an svm classifier retrained with different misclassification costs is still unable to classify unseen awake samples. It even fails to classify drowsy samples correctly. In fact, the new classifier is overfitted and far from a generalized model. Therefore, we conclude that even with the svm classifier and its approach for handling imbalanced classes, the collection of both awake and drowsy data during the experiment is essential.

Table 8.10.: Confusion matrix of the binary subject-dependent svm classifier for kss input-based features of the real road experiment applied to the model trained by considering different misclassification costs
predicted
driver state awake drowsy
awake 44.4% 55.6%
given
drowsy 47.6% 52.4%

8.3.8. Classification results of the subject-independent data sets

Similar to Section 8.2.4, we also studied the svm with the subject-independent classification approach. The confusion matrix shown in Table 8.11 does not differ from that of the ann classifier listed in Table 8.6. Therefore, the type of the classifier seems not to be the issue for solving the problem of unseen data given the drive time-based features. However, Figure 8.22, which compares Table 8.11 with the corresponding confusion matrix for the subject-dependent classification by the svm (Table 8.8), indicates that classification of the awake class is less problematic (a 10% drop of the dr, from 89.0% to 79.5%). For the drowsy class, on the contrary, the dr decreases more severely (16%, from 77.6% to 61.5%).

Table 8.11.: Confusion matrix of the binary subject-independent svm classifier for drive time-based features
predicted
driver state awake drowsy
awake 79.5% 20.5%
given
drowsy 38.5% 61.5%

Figure 8.22.: Comparing confusion matrix of the binary subject-dependent svm classifier with that of
the subject-independent

8.4. k-nearest neighbors classifier

The classifiers studied in Sections 8.2 and 8.3 are both based on complex ideas and optimization solutions: the ann classifier works with neurons and hidden layers, and the svm looks for the optimal separating hyperplane. In this section, another classifier based on a very simple idea, namely nearest neighbors, is studied to show how complex a classifier needs to be in our application.

8.4.1. Background theory

The k-nearest neighbor classifier is a nonparametric classification method which does not need any information about the underlying distribution of the data set. It is also well known for its simplicity. As its name indicates, this classifier determines the class of each sample from the majority class of its k nearest adjacent samples. According to Cover and Hart (1967), k should not be set too large, otherwise outliers of other classes influence the true class. Moreover, selecting odd values of k avoids ambiguous decisions. However, the special case of k = 1 should not be chosen, as it is subject to a high amount of noise and unreliable results, because it always leads to overfitting.
This method strongly depends on the distance between the sample under investigation and the samples of the training set. Thus, different metrics are defined for calculating it, such as the general Minkowski metric L_p(x, y)

\[ L_p(\mathbf{x}, \mathbf{y}) = \Big( \sum_{i=1}^{N} |x_i - y_i|^p \Big)^{1/p} . \qquad (8.51) \]
In the case of p = 1, (8.51) is called Manhattan distance or L1-norm and for p = 2, it is the
conventional Euclidean distance or the L2-norm. As mentioned in Duda et al. (2012), a main

drawback of the Euclidean distance is its sensitivity to the features’ scaling, i.e. a large
disparity in the range of features. This fact underlines the importance of feature
normalization prior to the classification. Another alternative distance metric, which does not
suffer from the mentioned problem, is the Mahalanobis distance defined as
\[ L_{\mathrm{maha}}(\mathbf{x}, \mathbf{y}) = \sqrt{ (\mathbf{x} - \mathbf{y})^T \mathbf{S}^{-1} (\mathbf{x} - \mathbf{y}) } , \qquad (8.52) \]

where S refers to the covariance matrix. In addition, a linear transformation of x does not
affect the value of Lmaha (Yang, 2014).
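A sketch of a k-nn classifier with the Mahalanobis distance (8.52), assuming scikit-learn; the covariance matrix S is estimated from the placeholder training data:

```python
# k-NN with the Mahalanobis distance of eq. (8.52); scikit-learn sketch
# with placeholder data. VI is the inverse covariance matrix S^-1.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
S_inv = np.linalg.inv(np.cov(X, rowvar=False))   # S^-1 from training data

knn = KNeighborsClassifier(n_neighbors=5,        # k = 5 as in Section 8.4.2
                           metric='mahalanobis',
                           metric_params={'VI': S_inv},
                           algorithm='brute')
knn.fit(X, y)
print(knn.score(X, y))
```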

8.4.2. Classification results of the subject-dependent data sets

Similar to the other classifiers, in this section we study the k-nn classifier for the subject-dependent and subject-independent, 2-class and 3-class cases, and finally its sensitivity to imbalanced data.

Results of the KSS input-based features

Here, we study the performance of the k-nn classifier with regard to the kss input-based
features for different values of k, namely k = 3, 5, 7 and 9. The Mahalanobis distance was used
as the distance metric for finding the nearest neighbors. Figure 8.23 shows the calculated adr
values. The results seem very similar for all values of k with the maximum adr achieved at k =
5. The confusion matrix for k = 5 is also listed in Table 8.12. In comparison to the ann
classifier (Table 8.2), this classifier detects the awake class about 2% better (83.2% vs. 80.9%),
which makes it comparable with the svm classifier (83.2% vs. 84.2%) (Table 8.7). In the
detection of the drowsy class, however, the svm classifier is superior to both of them (svm:
83.6%, ann: 81.0% and k-nn: 79.7%). Overall, these results show that relying on the class of the
nearest neighbors is a good solution for determining the class of the sample under investigation
given the kss input-based features.
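As a hedged sketch of this experimental setup (the thesis used MATLAB; here scikit-learn serves as a stand-in, and X and y are hypothetical placeholders for the kss input-based feature matrix and labels):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 19))      # hypothetical feature matrix
    y = rng.integers(0, 2, size=400)    # 0 = awake, 1 = drowsy

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    # Inverse covariance matrix of the training set for the Mahalanobis metric
    VI = np.linalg.inv(np.cov(X_tr, rowvar=False))

    for k in (3, 5, 7, 9):
        knn = KNeighborsClassifier(n_neighbors=k, metric="mahalanobis",
                                   metric_params={"VI": VI}, algorithm="brute")
        knn.fit(X_tr, y_tr)
        print(k, knn.score(X_te, y_te))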

Figure 8.23.: adr of the test sets of the binary subject-dependent k-nn classifier for different numbers of neighbors. Feature type: kss input-based features. Bars refer to the standard deviation of permutations.

Results of the drive time-based features

The drive time-based features were also fed into the k-nn classifier for the binary and 3-class
cases. The corresponding values of the adr are shown in Figure 8.24. For both binary and 3-
class
Table 8.12.: Confusion matrix of the binary subject-dependent k-nn classifier for k = 5. Feature type: kss input-based features

                            predicted
  given driver state     awake     drowsy
  awake                  83.2%     16.8%
  drowsy                 20.3%     79.7%

cases, increasing the number of nearest neighbors k slightly improves the classification
results. The confusion matrices for k = 7 are also listed in Table 8.13. In comparison to the
results for the kss input-based features in Table 8.12, the dr value for the awake class
increases by about 3% (86.8% vs. 83.2%), while the detection of the drowsy class shows no
improvement (79.2% vs. 79.7%). Comparison of the binary k-nn classifier with the binary ann
and svm classifiers regarding drive time-based features indicates that, overall, all classifiers
perform similarly and the difference between the dr values of each class is only about 2%.
Figure 8.24.: adr of the (a) 2-class and (b) 3-class subject-dependent k-nn classifiers for different numbers of neighbors. Feature type: drive time-based features. Bars refer to the standard deviation of permutations.

Table 8.13.: Confusion matrices of the subject-dependent k-nn classifier (k = 7) for the 2-class and 3-class cases. Feature type: drive time-based features

  2-class:                  predicted
  given state            awake     drowsy
  awake                  86.8%     13.2%
  drowsy                 20.8%     79.2%

  3-class:                  predicted
  given state            awake     medium    drowsy
  awake                  82.0%     14.4%      3.6%
  medium                 36.8%     47.6%     15.6%
  drowsy                 11.0%     19.5%     69.5%

For the 3-class case, the detection of the medium class slightly improves in comparison to that
of the ann and svm classifiers (k-nn: 47.6%, ann: 42.8% and svm: 41.7%). However, this comes
at the cost of degraded dr values for the awake and drowsy classes.

Results of the imbalanced KSS input-based features

The sensitivity of the k-nn classifier to an imbalanced training set based on the kss input-based
features is studied in this section. According to Figure 8.25, increasing the number of nearest
neighbors k from 3 to 7 slightly improves the dr values. However, k = 9 yields results similar
to k = 7. The confusion matrix for k = 7 is listed in Table 8.14. Similar to the ann and svm
classifiers (Tables 8.4 and 8.9), this classifier also fails to classify the samples of the minority
class as correctly as the samples of the majority class, with about 30% difference between the
dr values of the two classes (60.2% vs. 89.2%).
Figure 8.25.: adr of the test sets of the binary subject-dependent k-nn classifier for different numbers of neighbors. Feature type: imbalanced kss input-based features of the driving simulator experiment. Bars refer to the standard deviation of permutations.

Table 8.14.: Confusion matrix of the binary subject-dependent k-nn classifier (k = 7). Feature type: imbalanced kss input-based features of the driving simulator experiment

                            predicted
  given driver state     awake     drowsy
  awake                  60.2%     39.8%
  drowsy                 10.8%     89.2%

8.4.3. Classification results of the subject-independent data sets

The results of the subject-independent classification for the binary k-nn classifier based on
drive time-based features are shown in Table 8.15 for k = 9. We showed in Sections 8.2.4
and 8.3.8 that the performance of the ann and svm classifiers degraded in the case of subject-
independent classification. Nevertheless, they still classified up to 61% of the drowsy class
correctly. The k-nn classifier, however, seems less suitable for the classification of unseen data
in comparison to the other classifiers due to its smaller dr value for the drowsy class (57.4%).
Varying k does not improve the results.

Table 8.15.: Confusion matrix of the binary subject-independent k-nn classifier for drive time-based features (k = 9)

                            predicted
  given driver state     awake     drowsy
  awake                  80.0%     20.0%
  drowsy                 42.6%     57.4%

Figure 8.26 compares the metrics of the confusion matrix for the subject-dependent and subject-
independent k-nn classifiers. It can be seen that the dr of the drowsy class drops by about 22%
from 79.2% to 57.4%. For the awake class, however, the dr decreases only by about 7% from
86.8% to 80.0%.

Figure 8.26.: Comparing the confusion matrix of the binary subject-dependent k-nn classifier with that of the subject-independent classifier

8.5. Comparison of the supervised classifiers for driver state classification

In the previous sections, three classifiers were introduced and their classification results were
discussed. In this section, we review them in terms of different aspects such as the performance,
subject-dependent versus subject-independent classification, simulation runtime, etc.
Figure 8.27 summarizes the confusion matrices of all binary classifiers for the subject-dependent
and subject-independent classifications given the drive time-based features. In this figure, all
DR and 100% − DR values as listed in a confusion matrix are provided. 100% − DR_awake and
100% − DR_drowsy correspond to the FPR and FNR values, respectively. Overall, none of the
classifiers outperforms the other ones. If a classifier detects one class with a higher dr value,
it is usually at the cost of a degraded classification result for the other class. As an example,
the binary subject-dependent k-nn classifier classifies the drowsy class slightly better than the
svm and ann classifiers, while it achieves the smallest dr value for the classification of the
awake class among the classifiers. Therefore, it seems that any of the classifiers can be applied
to the drive time-based features for the subject-dependent classification.
Figure 8.27.: Comparing confusion matrices of the binary ann, svm and k-nn classifiers for the subject-dependent and subject-independent classifications. Feature type: drive time-based features
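The relationship between these metrics can also be written out directly. The following minimal sketch assumes a row-normalized confusion matrix as in the tables above, and that the adr is the mean of the per-class detection rates:

    import numpy as np

    # Row-normalized binary confusion matrix: rows = given state,
    # columns = predicted state (values taken from Table 8.11)
    cm = np.array([[0.795, 0.205],   # given awake
                   [0.385, 0.615]])  # given drowsy

    dr_awake, dr_drowsy = cm[0, 0], cm[1, 1]  # per-class detection rates
    fpr = 1.0 - dr_awake    # awake samples falsely classified as drowsy
    fnr = 1.0 - dr_drowsy   # drowsy samples missed by the classifier
    adr = (dr_awake + dr_drowsy) / 2

    print(fpr, fnr, adr)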

Regarding subject-independent classification, the performance of all classifiers degrades as
expected. The k-nn method also seems not to be a suitable classifier for a data set with large
between-subject differences, due to its poorer performance in the classification of the drowsy
class. The ann and svm classifiers interpret the features in a similar way in this case. Nevertheless,
a more effective feature baselining method might improve the results of the subject-independent
classification.
For the 3-class classification, all confusion matrices are summarized in Figure 8.28. All in all,
the dr values of all classes and all classifiers are not as good as those of the binary cases. The
svm and ann classifiers detect the awake and drowsy classes with higher dr values in comparison
to the k-nn classifier. However, the medium class is best detected by the k-nn.

Figure 8.28.: Comparing confusion matrices of the 3-class ann, svm and k-nn classifiers for the subject-dependent classification. Feature type: drive time-based features

Since the performance of a supervised classifier is highly influenced by the preciseness of the
labels, all results provided here also fully depend on the accuracy of the collected kss values.
Apart from that, in this study, all binary classifiers perform well with an adr of 82%, which is
32% better than random guessing. Hence, we conclude that, apart from the informativeness
of the features, the subjects were also able to properly distinguish between the awake and
drowsy states, leading to accurate labels. Accordingly, in the 3-class problem, all classifiers
seldom confuse the awake and drowsy classes with each other, while the medium class is
more often misclassified by all of them, especially as the awake class. This might be due to
the fact that the transition from the awake to the medium state was not very distinguishable
for the subjects themselves and, consequently, they interpreted their states inconsistently.
Nevertheless, an average dr of 66% for the 3-class classification is still twice as effective as a
random classifier.
Figure 8.29 shows all results for the kss input-based features. The first three bars from the
top compare the dr values for the balanced case where the features of all experiments, namely
the real road and driving simulator experiments, were included in the training and test sets. All
classifiers perform almost similarly. The svm achieves the maximum dr values for both classes.
It should be mentioned that the number of samples for the kss input-based features is about 10
times smaller than that of the drive time-based ones. Hence, comparing the classification
results across the two feature aggregation methods should be done with care. Taking the large
difference between the number of samples in the training sets into consideration, the
classification results of the kss input-based features are even more satisfactory. In fact, the
smaller dr values for the kss input-based features do not necessarily imply that these features
are less informative than the drive time-based features. Nevertheless, adding a larger number
of samples to the training sets of the kss input-based features might improve the results.
Regarding imbalanced training sets, also shown in Figure 8.29, we conclude that the svm
classifier with the standard misclassification cost is more robust against imbalanced class
distributions, since it achieves the highest dr values for both classes among the classifiers.
Figure 8.29.: Comparing confusion matrices of the binary subject-dependent ann, svm and k-nn classifiers for different kss input-based feature sets: balanced, imbalanced, balanced with different misclassification costs (svm), balanced with smote (ann), and real road drives

By applying two methods, we tried to mitigate the imbalance of the training set, namely
smote and different misclassification costs. The former was combined with the ann classifier,
artificially adding new samples to the minority class. The latter took the properties of the svm
classifier into account. For both methods, the classification results improve, but the svm
classifier is superior to the ann in this case. This might be due to the fact that smote addresses
the imbalanced data problem regardless of the classifier, while the misclassification-cost
approach solves it by directly adapting the classifier itself.
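A minimal sketch of the two countermeasures, assuming the smote implementation of the imbalanced-learn package and scikit-learn's class-weight mechanism as stand-ins for the thesis' MATLAB implementation (X and y are hypothetical):

    import numpy as np
    from imblearn.over_sampling import SMOTE
    from sklearn.svm import SVC

    # Hypothetical imbalanced training set: few awake (0), many drowsy (1)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 19))
    y = np.r_[np.zeros(40, dtype=int), np.ones(260, dtype=int)]

    # Classifier-independent approach: smote synthesizes new minority-class
    # samples by interpolating between nearest neighbors of that class
    X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)

    # Classifier-dependent approach: weight the misclassification cost of each
    # class inversely proportional to its frequency in the training set
    svm = SVC(kernel="rbf", C=10.0, gamma=0.1, class_weight="balanced")
    svm.fit(X, y)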
Although we could mitigate the lack of awake samples in the training sets, the resulting
classifiers are still inefficient in classifying unseen awake samples. The ann classifier classifies
up to about 63% of these samples correctly, at the cost of misclassifying most of the unseen
drowsy samples. The svm, however, performs no better than random guessing. Therefore, it
can be inferred that both methods lead to overfitted classifiers which cannot be generalized to
other unseen data. This occurred despite the measures we applied during the training phase,
such as cross validation, to avoid overfitting. This finding might also be related to the overall
difference between driving simulator and real driving conditions.
As mentioned before, these results confirm that for driver state classification, both types of
events should always be available in the training sets, namely awake and drowsy events. In
fact, a warning system which always warns all drowsy subjects correctly is not necessarily a
reliable system if it cannot refrain from falsely warning awake subjects. On this account, in
this study, part of the data set was collected during the night as the drowsy data and the
other part during the day as the awake data, because we showed that the two cannot be
artificially substituted for each other in the case of imbalanced class distributions.
In addition to the classification performance, the classifiers can also be compared with each
other in terms of the simulation runtime needed to achieve the mentioned results. All of the
results in this work were generated in MATLAB. The k-nn classifier is the fastest, with a
runtime of less than 10 s for classifying drive time-based features. The runtime of the ann
classifier depends highly on the selected number of neurons. On average, training a network
with Nh = 2 neurons is very fast (< 1 min), while for Nh = 10, it takes about 5 min. The
highest runtime is needed by the svm with about 45 min. All of these runtimes are based on
one iteration out of 100 permutations given the drive time-based features. Considering both
the performance of the classifiers, namely all achieved results, and the computational
complexity, the ann classifier is the best solution for driver state classification based on blink
features in this work.

8.6. Features of the driving simulator versus real road driving

As mentioned before, due to safety concerns, driving simulators are required for collecting
data with stronger drowsiness-related characteristics. However, this makes the collected data
less comparable with real driving conditions. In this context, Hallvig et al. (2013) reported
longer blink durations in the driving simulator compared to real driving and attributed this to
the fact that, owing to the safety of driving simulators, a higher level of drowsiness is generally
reached in them. Philip et al. (2005), who also compared real driving with driving in
simulators, reported slower reaction times and higher kss values in the driving simulator.
Therefore, good classification results based on driving simulator data might be overly optimistic
owing to the fact that very deep phases of drowsiness are included in the data set, which
sharpens the discrimination of the classes. In general, under real driving conditions, drowsiness
should be detected at a time when the warning of the corresponding assistance system can still
be perceived by the driver for a timely corrective reaction.

To address the mentioned issues, two approaches are considered here. The first approach
generalizes the driving simulator to real driving conditions by discarding the very drowsy
parts of the drives. Unlike the first approach, the second approach uses all valuable features
collected both in the driving simulator and under real driving conditions to investigate whether
unseen drowsy data collected under real driving conditions can be classified correctly.

8.6.1. Generalization of the simulator data to real road driving

In this section, we investigate whether driver drowsiness in the driving simulator is
representative of the same effect under real driving conditions. To this end, we removed all
parts of the drives after the first KSS = 9 or after the second KSS = 8 from the data set. This
procedure is referred to as the generalization of simulator to real driving (gsrd). We applied these
conditions to the data because, as mentioned in Section 4.2.3, exactly these rules were followed
for collecting drowsy data on the real road.

The new feature matrix contains 3070 samples of all 19 drive time-based features (19 × 3070)
based on the real road and driving simulator experiments. After removing approximately 950
samples from the drowsy class, the classes are distributed as awake = 72% and drowsy = 28%.
As discussed in the previous sections, imbalanced class distributions degrade the performance
of the classifiers. Therefore, we randomly undersampled the features, i.e. repeatedly removed
random samples of the majority class to obtain a balanced class distribution (50% vs. 50%) as
explained in Section 8.1.5. The ann, svm and k-nn classifiers were again applied to the new
feature matrix including both subject-dependent and subject-independent data sets, but only
for the binary case. The kss input-based features were not studied due to the smaller number
of samples available after removing the samples of the drowsy class according to the above
procedure.
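A minimal sketch of the gsrd truncation rule and the subsequent random undersampling, on hypothetical arrays (whether the boundary sample itself is kept is an assumption; the thesis' own implementation was in MATLAB):

    import numpy as np

    def gsrd_cutoff(kss):
        """Index of the first KSS = 9 or the second KSS = 8 of a drive;
        all samples after this index are discarded (gsrd rule)."""
        nines = np.flatnonzero(kss == 9)
        eights = np.flatnonzero(kss == 8)
        cut = [len(kss)]                   # default: keep the whole drive
        if nines.size >= 1:
            cut.append(nines[0])
        if eights.size >= 2:
            cut.append(eights[1])
        return min(cut)

    def undersample(X, y, rng):
        """Randomly drop majority-class samples until a 50/50 distribution."""
        idx0, idx1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
        n = min(idx0.size, idx1.size)
        keep = np.sort(np.r_[rng.choice(idx0, n, replace=False),
                             rng.choice(idx1, n, replace=False)])
        return X[keep], y[keep]

    # Example: this hypothetical drive is cut at the second KSS = 8
    kss = np.array([3, 4, 6, 8, 7, 8, 9, 9])
    print(gsrd_cutoff(kss))  # -> 5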
Results of the ANN classifier

Figure 8.30 depicts the adr values for different numbers of neurons of the gsrd subject-
dependent ann classifier. The comparison of this figure with Figure 8.10(a) indicates that
for all values of Nh the adr of the test sets decreases by approximately 4–7% with slightly
larger standard deviation values. Moreover, increasing the number of neurons Nh from 10 to
25 only improves the adr of the training set. The corresponding values of the confusion
matrix for Nh = 10 are shown in Figure 8.31. The dr value for the awake class is more affected
in comparison to Figure 8.27 (about 10% drop from 87.9% to 78.4%).

Figure 8.30.: adr of the training and test sets of the binary subject-dependent ann classifier for different numbers of neurons based on the gsrd case. Feature type: drive time-based features. Bars refer to the standard deviation of permutations.
Figure 8.31.: Comparing confusion matrices of the binary ann (Nh = 10), svm and k-nn (k = 7) classifiers for the subject-dependent and subject-independent classifications of the gsrd case. Feature type: drive time-based features.

Figure 8.31 also shows the result of the subject-independent classification, which should be
cautiously compared with the corresponding results in Figure 8.27. In gsrd, only parts of the
drowsy phase per subject were removed, leading to smaller numbers of drowsy samples and
smaller values of FP + TN. In the worst case, if the tp, fp and fn values of the confusion matrix
remain unchanged after applying the gsrd procedure, the decreased number of tn results in a
strongly increased fpr in (8.6) (or 100% − dr), as occurred here. Surprisingly, for the subject-
independent case, it is still the correct classification of the drowsy class which is more
problematic (dr of the drowsy class < dr of the awake class). Despite this fact, applying the
gsrd procedure degrades the classification of the awake class to a larger extent, i.e. a 10% drop
of the dr value from 80.0% to 70.4%.
Results of the SVM classifier

The parameters for training the svm classifier under the gsrd case are shown in Figure 8.32.
These parameters differ slightly from those shown in Figure 8.21; in particular, the
interquartile ranges are larger for the gsrd case. The accuracy of the test sets also decreases
by about 6%. Figure 8.31 also depicts the results of the subject-dependent and subject-
independent classifications of the gsrd case for the svm. Similar to the results of the ann classifier,
these results are not as good as the previous results shown in Figure 8.27, with a 5% drop of
the adr for the subject-dependent case (from 83.3% to 78.1%), which is mostly due to an 8%
degradation in the correct classification of the awake class (from 89.0% to 80.6%). For the
subject-independent case, a similar explanation holds.
Figure 8.32.: Boxplots of C_opt, γ_opt and the training and test accuracies for the 2-class subject-dependent classification of the gsrd case with the svm for all 100 permutations. Feature type: drive time-based features

Results of the k-NN classifier

The results for different numbers of neighbors k under the gsrd procedure are shown in Figure
8.33. In comparison to Figure 8.24(a), the gsrd procedure impairs all metrics by approximately
6%. The metrics of the confusion matrices for both classes under the subject-dependent and
subject-independent cases are also shown in Figure 8.31. Compared with the corresponding dr
values of the ann and svm classifiers, the subject-dependent k-nn classifier classifies the drowsy
class with the highest dr value (k-nn: 79.4%, svm: 75.7% and ann: 75.0%), although it is not the
best classifier for the awake class. As a subject-independent classifier, k-nn shows the poorest
performance, with the lowest dr values for both classes. Therefore, the k-nn classifier is the
least suitable classifier if the unseen data differs substantially from the training set.

Conclusion

Considering the results of the gsrd approach provided by all classifiers for the subject-dependent
data division, removing samples from the very drowsy parts of the drive degrades the performance
of the classifiers. Nevertheless, it is still possible to detect both classes with over 70% accuracy.
An interesting finding is that, regardless of the data division type, namely subject-dependent
versus subject-independent, the removed drowsy samples are crucial not only for the correct
classification of the drowsy samples, but also for the correct classification of the awake samples.
The dr value of the drowsy class of the subject-independent k-nn classifier, as an example,
varies by only 2% (57.4% vs. 55.1%), while for the awake class, it drops by about 15% (80.0%
vs. 65.7%).
Figure 8.33.: adr of the 2-class subject-dependent k-nn classifier of the gsrd case for different numbers of neighbors. Feature type: drive time-based features. Bars refer to the standard deviation of permutations.

8.6.2. Classification of drop-outs under real road driving conditions

In contrast to the previous approach, where parts of the drowsy data were removed, here we
used the entire feature matrix based on all conducted experiments as the training set. This set
was then fed to the ann, svm and k-nn classifiers. For the test set, we used new unseen data
collected at night under real driving conditions as explained in Section 4.2.3. This test set
comprised solely the data of subjects who aborted the real driving experiment due to severe
drowsiness, according to their own subjective assessment or that of the investigator. Simon et
al. (2011) called such subjects “drop-outs” and considered this condition the “most objective
fatigue criterion available”. They also found a larger variation of the EEG features for these
subjects in comparison to non-drop-outs, who completed the experiment to the end. Following
the plausible idea of Simon et al. (2011), we used the data of the drop-outs who participated in
our experiment as the test set and repeated the classification task with the trained network or
model, which is, in fact, a subject-independent case.

Figure 8.34 shows the resulting confusion matrices for the ann, svm and k-nn classifiers. The
parameters of the classifiers are Nh = 10 for the ann, C = 90.5 and γ = 1.4 for the svm, and
k = 7 for the k-nn. These parameters led to the best classification results compared to other
parameter values. Overall, the ann classifier seems to outperform the other classifiers. In
classifying the awake class, all of the classifiers perform similarly, while the drowsy class is
classified more accurately by the ann classifier.

Figure 8.34.: Comparing confusion matrices of the binary subject-independent ann (Nh = 10), svm (C = 90.5, γ = 1.4) and k-nn (k = 7) classifiers for unseen real road drives of drop-outs. Feature type: drive time-based features.
For drive time-based features, all classifiers achieve an accuracy of about 70%. Therefore, we
conclude that the drowsy events collected in the driving simulator are not far from reality.
The remaining 30% of wrong classifications in each group might occur for the following reasons:

• Getting drowsy in the driving simulator is in some respects different from getting drowsy
under real driving conditions. As a result, the drowsy events collected in the driving
simulator do not represent all drowsy events of the real drives.

• Large between-subject differences in blink features lead to individual characteristics of
different subjects, which might be accounted for by including more subjects in the
training set in the future.

8.7. Feature dimension reduction

In the previous sections, we discussed the classification results with regard to the 19 features
in the feature matrix. Regardless of the performance of the classifiers, it is not clear which
single feature or which feature subset contributes most to the results, because the features
were fed to the classifiers all together in order to complement each other. In this section, we
explore whether all of the 19 features are needed or whether it is possible to reduce the
dimension of the feature matrix and achieve the same results with a smaller number of
features. This is an important issue for in-vehicle warning systems, as we explain in the
following.

Imagine that the quality of an extracted feature or a feature subset deteriorates as soon as the
EOG is replaced with a driver observation camera, due to image processing problems and a
lower frame rate. If we show that this feature or feature subset is not crucial for a reliable
driver state classification, then the degraded feature quality is no longer an issue. Apart from
that, for most in-vehicle systems, processing time and memory storage are serious concerns.
Consequently, a feature matrix with a reduced dimensionality is desired and preferred.

The application of feature dimension reduction is also motivated by the curse of dimensionality,
which requires the number of unknown parameters of a classifier to be at least 5 to 10 times
smaller than the number of samples available in the training set (Yang, 2014). In other words,
by increasing the dimension of the feature matrix, the number of training samples should
increase exponentially, otherwise overfitting is inevitable (Bishop, 2006). This point plays an
important role in applications where the number of available training samples is small in
comparison to the number of extracted features; since our training sets are large compared to
the 19 extracted features, this is not an issue in this work. Feature dimension reduction also
helps to avoid redundancy by keeping the most uncorrelated features for further analysis.

According to Yang (2014), there exist two types of approaches for reducing the dimension of
the feature matrix. The first, called feature selection, selects a subset of features with the desired
dimension D̆ out of the available D features, where D̆ < D. In contrast, the feature transform
method transforms the features either linearly or non-linearly to reduce D. Both of these
methods are guided either by a classifier-dependent metric, e.g. the accuracy of the
classification result, or by a classifier-independent criterion such as the correlation. The former,
which uses the learning algorithm itself, is a wrapper approach discussed in Sections 8.7.1 and
8.7.2. The latter is a filter approach which is based on the intrinsic properties of the data and
is discussed in Section 8.7.3.
8.7.1. Sequential floating forward selection

Sequential floating forward selection (sffs), a feature selection method introduced by Pudil
et al. (1994), is a combination of sequential forward selection (sfs) and sequential backward
selection (sbs). The former achieves the desired number of features by adding the best feature
combination to an empty feature set, while the latter achieves this goal by removing the worst
feature combination from the full feature set. According to Pudil et al. (1994), both of these
feature selection methods suffer from wrong decisions, the so-called nesting effect, when adding
or removing a feature, because no correction steps are included in their algorithms. Hence,
combining the two methods results in a more dynamic feature selection method. Based on the
criterion considered for evaluating the selected features (e.g. the classification performance),
inclusion and exclusion steps are applied in turn. In other words, after adding a new feature,
backward steps are performed as long as the new subset outperforms the previous one. If this
is not the case, the backward step is disregarded.
In the following, the algorithm is clarified by a numerical example. The mathematical notation
of the sffs algorithm is taken from Lugger (2011).
We consider the original feature set with 7 features, namely \mathcal{M} = \{y_1, y_2, y_3, y_4, y_5, y_6, y_7\},
and the desired number of features D̆ = 4. The sffs algorithm starts with an empty feature set
Y_k = \emptyset, k = 0, and selects the best single feature as follows:

    y_i = \arg\max_{y \in \mathcal{M} \setminus Y_k} J(Y_k \cup y), \qquad Y_{k+1} = Y_k \cup y_i, \qquad k = k + 1,    (8.53)

e.g. Y_1 = \{y_4\}. J denotes the performance function. Then the remaining 6 features are added
to Y_1 separately and the best pair is selected based on the criterion in (8.53) with k = 1,
e.g. Y_2 = \{y_4, y_2\} out of Y_1 \cup y = \{y_4, y_1\}, \{y_4, y_2\}, \ldots, \{y_4, y_7\}. This forward step is repeated
a third time, leading to the best 3-feature combination, e.g. Y_3 = \{y_4, y_2, y_7\}. Now, the
backward step is applied for controlling the redundancy in the new feature set (Uhlich, 2006).
If a recently added feature contains similar information to what was already taken into
consideration, it can be excluded. Consequently, the sffs algorithm analyzes the following
subsets of the current 3-element feature set: \{Y_3 \setminus y, y \in Y_3\} = \{y_2, y_7\}, \{y_4, y_7\}, \{y_4, y_2\}
and selects the best subset, e.g. y_j = y_4, Y_3 \setminus y_j = \{y_2, y_7\}, where y_j is defined as

    y_j = \arg\max_{y \in Y_k} J(Y_k \setminus y).    (8.54)
If the performance of this new subset, namely Y_3 \setminus y_j = \{y_2, y_7\}, is better than that of
Y_3 = \{y_4, y_2, y_7\}, then y_4 is excluded. This leads to the new definition of Y_2, i.e. Y_2 = \{y_2, y_7\}.
Mathematically, we have

    \text{if } J(Y_k \setminus y_j) > J(Y_{k-1}): \quad Y_{k-1} = Y_k \setminus y_j, \ \text{repeat (8.54) with } k = k - 1.    (8.55)

Otherwise, a new feature is added to the current feature set based on (8.53) as follows:

    \text{if } J(Y_k \setminus y_j) \le J(Y_{k-1}): \quad \text{repeat (8.53).}    (8.56)

By these steps, the sffs algorithm searches for the best feature combination. Hence, a feature
which does not improve the performance of the current feature combination is discarded, as
happened to y_4 in our example. This holds regardless of the good results previously provided
by that feature. Clearly, sffs is a classification-dependent feature selection method.
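A hedged sketch of this procedure: the mlxtend package provides an off-the-shelf sffs implementation (floating=True enables the conditional exclusion steps (8.54)-(8.56)); scikit-learn's MLPClassifier is used as a stand-in for the thesis' ann, and X and y are hypothetical placeholders:

    import numpy as np
    from mlxtend.feature_selection import SequentialFeatureSelector as SFS
    from sklearn.neural_network import MLPClassifier

    # Hypothetical stand-ins for the 19 drive time-based features and labels
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 19))
    y = rng.integers(0, 2, size=500)

    # Stand-in for the ann with Nh = 10 hidden neurons
    ann = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)

    # Cross-validated accuracy plays the role of the performance function J
    sffs = SFS(ann, k_features=10, forward=True, floating=True,
               scoring="accuracy", cv=5)
    sffs = sffs.fit(X, y)
    print(sffs.k_feature_idx_, sffs.k_score_)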
We applied the sffs algorithm combined with the ann classifier (Nh = 10) to select the best
10-feature combination out of the 19 drive time-based features (D = 19 and D̆ = 10). The
accuracy was used to guide the inclusion/exclusion steps of this algorithm, as shown in Figure
8.35. The features selected in each of the 10 steps are listed in Table 8.16. At first glance, it
seems that 4 features are enough for the driver state classification task, since increasing the
number of features does not improve the classification accuracy. The right part of Table 8.17
shows the confusion matrix for the best 4 features selected by the ann classifier (Nh = 10). The
corresponding confusion matrix with the full 19 features (Table 8.3) is shown on the left of
Table 8.17. A comparison of the dr values in both cases indicates that the detection of the
awake class is apparently possible with only 4 features at the same performance as with 19
features (87.3% vs. 87.9%). However, the remaining 15 features are responsible for improving
the dr value of the drowsy class by about 4% (74.3% vs. 78.2%). In other words, the 4% increase
of the dr for the drowsy class underscores that some of the underlying information in the
feature space is covered only if more than the 4 mentioned features are included.
Figure 8.35.: The ann classification accuracy of the best features selected by the sffs algorithm from the 1-feature to the 10-feature combination. Feature type: drive time-based features.

Table 8.16.: Best selected feature combination set by the sffs and ann classifier from 1 to 10 features. Feature type: drive time-based features

  D̆     selected features                                   accuracy
  1      T50                                                 72.4%
  2      T50, F                                              76.6%
  3      E, F, MCV                                           79.4%
  4      E, F, MCV, A                                        80.0%
  5      E, F, MCV, A, Tro                                   80.2%
  6      E, F, MCV, A, Tro, perclos                          80.1%
  7      E, F, MCV, A, Tro, perclos, ACV                     80.2%
  8      E, F, MCV, A, Tro, perclos, ACV, To                 80.2%
  9      E, F, MCV, A, Tro, AOV, ACV, To, MOV                80.3%
  10     E, F, MCV, A, Tro, AOV, ACV, To, MOV, Tcl,2         80.4%

Additionally, we applied the trained ann classifier with 4 features to the unseen features of
the drop-outs introduced in Section 8.6.2. This helps to investigate the generalization ability
of the classifier based on 4 features for unseen data. The right part of Table 8.18 shows the
confusion matrix
Table 8.17.: Confusion matrices of the binary subject-dependent ann classifier (Nh = 10) for drive time-based features. Left: classification with 19 features. Right: classification with 4 features

  19 features:             predicted          4 features:              predicted
  given state           awake    drowsy      given state           awake    drowsy
  awake                 87.9%    12.1%       awake                 87.3%    12.7%
  drowsy                21.8%    78.2%       drowsy                25.7%    74.3%

for the binary subject-independent ann classifier for the drive time-based features of the
drop-outs with D̆ = 4 and Nh = 10. On the left of this table, the confusion matrix of the
corresponding subjects with 19 features (shown in Figure 8.34) is listed. Interestingly, the dr of
the awake class improves by about 5% (from 71.3% to 77.0%) after removing 15 features from
the data set. For the drowsy class, however, the dr drops by about 7% (from 72.4% to 65.2%).
This emphasizes the role of the other features for the correct classification of the unseen
drowsy data.
Table 8.18.: Confusion matrices of the binary subject-independent ann classifier (Nh = 10) for drive time-based features of drop-outs. Left: classification with 19 features. Right: classification with 4 features

  19 features:             predicted          4 features:              predicted
  given state           awake    drowsy      given state           awake    drowsy
  awake                 71.3%    28.7%       awake                 77.0%    23.0%
  drowsy                27.6%    72.4%       drowsy                34.8%    65.2%

The 4 selected features are E, F, MCV and A, with Spearman's correlation coefficients between
them and the kss values, as listed in Table 7.11, of 0.11, 0.36, −0.41 and 0.23. As mentioned
before, it is the complementary property of a feature which determines its contribution to the
classification task, not its efficiency and informativeness as a single feature. Moreover,
according to Table 7.21, except for the feature pair (A, MCV), which is linearly correlated up
to 0.78, all other pairs satisfy ρp < 0.5. As a result, the low correlation of the features with
each other seems to positively impact their complementary properties.
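Both checks, the monotonic association of a feature with the kss labels and the linear inter-feature correlation, can be computed in a few lines; the following sketch uses SciPy on hypothetical stand-in arrays (the coefficients quoted above come from Tables 7.11 and 7.21):

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    rng = np.random.default_rng(0)
    kss = rng.integers(1, 10, size=200).astype(float)  # hypothetical KSS labels
    feat_a = 0.2 * kss + rng.normal(size=200)          # stand-in for feature A
    feat_mcv = -0.3 * kss + rng.normal(size=200)       # stand-in for feature MCV

    # Monotonic association between a feature and the KSS labels (cf. Table 7.11)
    rho_s, _ = spearmanr(feat_a, kss)

    # Linear correlation between two features (redundancy, cf. Table 7.21)
    rho_p, _ = pearsonr(feat_a, feat_mcv)

    print(rho_s, rho_p)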

8.7.2. Margin influence analysis

Margin influence analysis (mia) (Li et al., 2011a) is a feature selection method especially
designed for the svm classifier. As mentioned in Section 8.3, the performance of the svm
classifier depends strongly on the size of the margin between the classes. Consequently, svm
models with larger margins are expected to classify better. Based on this fact, Li et al. (2011a)
suggested the mia method, which evaluates a feature by its contribution to and influence on
the classifier's margin. In other words, a feature is considered informative if including it leads
to an svm model with a larger margin.
According to Li et al. (2011a), the mia algorithm includes the following steps:
1. Define the number of features D̆ to be sampled by Monte Carlo sampling (mcs)
(Lemieux, 2009), which is equivalent to the desired dimensionality of the feature matrix.
2. By applying N_mcs Monte Carlo samplings, select D̆ features randomly out of all existing
D features in each sampling (e.g. N_mcs = 10000 (Li et al., 2011a)). This leads to N_mcs
subsets, each containing D̆ features.
3. Train an svm for each of the N_mcs feature subsets, which results in N_mcs margins.
4. For a specific feature i, find all subsets A_i which include this feature and all subsets B_i
which exclude it.
5. Compute the mean value of the margins in the subsets A_i and B_i, namely γ_{A_i} and γ_{B_i}.
6. Define feature i as informative based on the difference between the computed mean
values, namely ∆γ_i = γ_{A_i} − γ_{B_i}, as follows:
• ∆γ_i > 0: feature i is informative and increases the margin of the svm classifier.
• ∆γ_i ≤ 0: feature i is not considered informative and consequently decreases the
margin of the svm classifier.
7. Remove all features not considered informative.
8. Determine whether the informative features lead to a significant increase of the margin
by applying the Mann-Whitney U test (Field, 2007) and analyzing the calculated p-value
as follows:
• p-value ≤ α: feature i significantly increases the margin and is informative.
• p-value > α: feature i does not increase the margin significantly.
According to Field (2007), the Mann-Whitney U test is a non-parametric test for comparing
mean values.
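A minimal sketch of the mia loop under simplifying assumptions: a linear svm is used so that the margin can be read off directly as 2/||w|| (the thesis used an rbf kernel, where the margin must be computed in the kernel-induced feature space), N_mcs is reduced for brevity, and X and y are hypothetical:

    import numpy as np
    from scipy.stats import mannwhitneyu
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 19))
    y = rng.integers(0, 2, size=300)

    D_sel, N_mcs, alpha = 4, 2000, 0.05
    margins, subsets = [], []
    for _ in range(N_mcs):
        idx = rng.choice(X.shape[1], size=D_sel, replace=False)   # step 2
        w = LinearSVC(C=1.0, dual=False).fit(X[:, idx], y).coef_  # step 3
        margins.append(2.0 / np.linalg.norm(w))
        subsets.append(set(idx))
    margins = np.asarray(margins)

    for i in range(X.shape[1]):                                   # steps 4-8
        in_i = np.array([i in s for s in subsets])
        d_gamma = margins[in_i].mean() - margins[~in_i].mean()
        _, p = mannwhitneyu(margins[in_i], margins[~in_i],
                            alternative="greater")
        if d_gamma > 0 and p <= alpha:
            print(f"feature {i}: informative "
                  f"(dgamma = {d_gamma:.4f}, p = {p:.3f})")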
Table 8.19 shows the calculated ∆γ values for each drive time-based feature with D̆ = 4,
together with the corresponding p-values of the Mann-Whitney U test. It can be seen that
except for A, MOV, A/MCV, A/MOV, ACV, AOV, F, To and Tcl,1 with ∆γ > 0, the other
features cannot be shown to be informative. According to the p-values, only A/MCV passes
the significance test at α = 0.05. Compared to the features selected by the sffs combined with
the ann classifier, only A was also selected as an informative feature by the mia and the svm
classifier.

Table 8.19.: Values of ∆γ calculated based on the mia method for drive time-based features (D̆ = 4)

  Feature    ∆γ        p-value        Feature    ∆γ        p-value
  A           0.002     0.24          Tc         -0.000     1.31
  E          -0.002     1.78          To          0.001     0.84
  MCV        -0.001     1.88          Tcl,1       0.005     0.12
  MOV         0.002     0.65          Tcl,2      -0.008     1.00
  A/MCV       0.006     < 0.001       Tro        -0.002     1.99
  A/MOV       0.003     0.55          perclos    -0.002     1.24
  ACV         0.000     0.71          T50        -0.001     1.69
  AOV         0.003     0.07          T80        -0.004     1.09
  F           0.003     0.08          T90        -0.007     1.00
  T          -0.002     1.13

Overall, this method is very time-consuming, depending on the number of features D̆ to be
selected. We showed in Section 8.5 that the svm is an expensive classifier in terms of runtime.
Moreover, even for a feature matrix with small dimensions, e.g. D̆ ≤ 5 instead of D = 19,
repeatedly constructing and training an optimized model is time-consuming. Therefore, in
this respect, the mia approach is impractical.

8.7.3. Correlation-based feature selection

Another feature selection method, introduced by Hall (1999), is the correlation-based feature
selection (cfs). In contrast to the previous approaches, this method is classifier-independent.
According to Hall (1999), the term correlation used in the name of the method does not
necessarily refer to the classical linear correlation. It should be interpreted as any measure
quantifying the amount of relationship and dependency between two features.
The cfs method first generates subsets R of features with k_cfs features in each subset out of
all available D features. As an example, for D = 19 and k_cfs = 3 in our study, this corresponds
to the 3-feature combination subsets of all 19 features, namely 969 subsets. Afterwards, the
subsets are ranked based on the following evaluation metric M_R for a subset R:

    M_R = \frac{k_{\text{cfs}}\, \bar{r}_{cf}}{\sqrt{k_{\text{cfs}} + k_{\text{cfs}} (k_{\text{cfs}} - 1)\, \bar{r}_{ff}}},    (8.57)
where \bar{r}_{cf} and \bar{r}_{ff} denote the average correlation (in its general meaning) between
classes and features, i.e. the inter-correlation, and between features, i.e. the intra-correlation,
respectively. Obviously, considering both correlations in the evaluation function (8.57) covers
both the redundancy and the relevance of the features simultaneously.
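The merit (8.57) itself is straightforward to evaluate. The following sketch ranks all 969 3-feature subsets of a hypothetical 19-feature matrix, using absolute Pearson correlations as the correlation measure (following Hall (2000); this choice is an assumption here):

    import numpy as np
    from itertools import combinations

    def cfs_merit(X, y, subset):
        """Merit M_R of a feature subset according to (8.57)."""
        k = len(subset)
        # average feature-class correlation (relevance)
        r_cf = np.mean([abs(np.corrcoef(X[:, i], y)[0, 1]) for i in subset])
        # average feature-feature correlation (redundancy)
        r_ff = np.mean([abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
                        for i, j in combinations(subset, 2)]) if k > 1 else 0.0
        return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 19))                  # hypothetical features
    y = rng.integers(0, 2, size=200).astype(float)  # hypothetical labels
    best = max(combinations(range(19), 3), key=lambda s: cfs_merit(X, y, s))
    print(best, cfs_merit(X, y, best))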
Calculating M_R for all 2^D − 1 feature combination subsets makes this method, however,
very time-consuming for large values of D. Hall (1999) therefore suggested a forward or
backward selection search instead, where a feature is added only if an improvement is observed.
Moreover, he studied the so-called relief (Kira and Rendell, 1992a,b) and the minimum description
length (mdl) (Rissanen, 1978) as correlation measures. In a later study (Hall, 2000), however,
he suggested the Pearson correlation coefficient for continuous attributes.
Table 8.20 shows the results of the cfs method for different numbers of drive time-based
features in a subset, using the Pearson correlation coefficient for calculating \bar{r}_{cf} and
\bar{r}_{ff}. For each value of k_cfs, the best feature subset based on the maximum value of M_R
is listed. Similarly, Table 8.21 shows the M_R values using Spearman's rank correlation
coefficient as a measure of the relevance and dependency of the features. The features shown
in red were selected by both the Pearson and Spearman's rank correlation coefficients.

Table 8.20.: Best selected drive time-based features based on the cfs method and the Pearson correlation coefficient (red features were also selected by the Spearman's rank correlation coefficient in Table 8.21)

  k_cfs   selected features                                        M_R
  1       A/MOV                                                    0.478
  2       F, ACV                                                   0.553
  3       F, MOV, Tro                                              0.595
  4       F, MOV, Tro, Tc                                          0.594
  5       F, MOV, Tro, perclos, T90                                0.592
  6       F, MOV, Tro, perclos, T90, A/MOV                         0.589
  7       F, MOV, Tro, perclos, T90, A/MOV, MCV                    0.584
  8       F, MOV, Tro, perclos, T90, A/MOV, MCV, T                 0.575
  9       F, MOV, Tro, perclos, T90, A/MOV, MCV, To, ACV           0.570
  10      F, MOV, Tro, perclos, T90, A/MOV, MCV, To, ACV, T80      0.565
The calculated M_R values for both correlation coefficients are shown in Figure 8.36. According
to this figure, the M_R values for k_cfs = 2 to 5 are almost the same for both correlation
coefficients. From k_cfs = 6 onwards, however, the values deviate from each other to a larger
extent. This might be due to a non-linear relationship between the features and the kss labels,
since the M_R values using Spearman's rank correlation coefficient are larger. In general, cfs
with both correlation coefficients selects almost the same feature sets, except for Tcl,2 and T90,
which were repeatedly selected by only one of the correlation coefficients.

Table 8.21.: Best selected drive time-based features based on the cfs method and Spearman's rank correlation coefficient (red features were selected by the Pearson correlation coefficient in Table 8.20)

  k_cfs   selected features                                        M_R
  1       Tro                                                      0.510
  2       F, ACV                                                   0.546
  3       F, MOV, Tro                                              0.589
  4       F, MOV, perclos, Tro                                     0.595
  5       F, MOV, perclos, A/MOV, Tcl,2                            0.602
  6       F, MOV, perclos, A/MOV, Tcl,2, Tro                       0.610
  7       F, MOV, perclos, A/MOV, Tcl,2, Tro, ACV                  0.602
  8       F, MOV, perclos, A/MOV, Tcl,2, Tro, ACV, MCV             0.599
  9       F, MOV, perclos, A/MOV, Tcl,2, Tro, T, MCV, ACV          0.589
  10      F, MOV, perclos, A/MOV, Tcl,2, Tro, T, MCV, ACV, Tc      0.583

Figure 8.36.: M_R values of the best 10-feature combinations calculated based on the Pearson and Spearman's rank correlation coefficients

In order to compare the results of the cfs with those of the sffs method, we applied a binary
subject-dependent ann classifier with Nh = 10 to the k_cfs = 4 drive time-based features
selected using both the Pearson and Spearman's rank correlation coefficients. The
corresponding confusion matrices are shown in Table 8.22. Comparing these results with
those of the sffs method (Table 8.17) indicates degraded classification results for the drowsy
class by at least 4% (70.2% and 68.6% vs. 74.3%). Overall, the feature combination selected by
the filter approach differs from the subset selected by the sffs method as a wrapper approach.
The results of the sffs approach clearly outperform those of the cfs, due to the fact that in the
sffs approach the classifier is directly involved in the selection of the best 4-feature combination.
As a result, a poorer performance of all classifier-independent methods can be expected due to
their generality or underfitting.
Regardless of the value of k_cfs, Tables 8.23 and 8.24 show the 10 best feature combinations
with respect to M_R out of all possible combinations for both correlation coefficients.
Interestingly, the cfs method based on the Pearson correlation coefficient selected on average
4 features, whereas with Spearman's rank correlation coefficient the highest scores were
achieved by selecting 6 features on average. F and MOV are the only features which were
selected in all sets of best feature combinations regarding M_R. The other selected feature
subsets differ to some extent.
Table 8.22.: Confusion matrices of the binary subject-dependent ann classifier (Nh = 10) for drive time-based features and k_cfs = 4. Left: Pearson correlation coefficient. Right: Spearman's rank correlation

  Pearson:                 predicted          Spearman:                predicted
  given state           awake    drowsy      given state           awake    drowsy
  awake                 85.2%    14.8%       awake                 86.5%    13.5%
  drowsy                29.8%    70.2%       drowsy                31.4%    68.6%

The only feature which was selected by all of the introduced methods is F. Hence, we conclude
that F is a feature which, on the one hand, is associated with the kss values and, on the other
hand, is correlated with the other features to a lesser extent. In addition, since it was also
selected by the sffs, it has a strong complementary property.
Table 8.23.: Best selected drive time-based features based on the cfs method using the Pearson correlation coefficient, regardless of the number of features

  rank    selected features                          M_R
  1       F, MOV, Tro                                0.5945
  2       F, MOV, Tro, Tc                            0.5935
  3       F, MOV, Tro, A/MOV                         0.5926
  4       F, MOV, Tro, A/MOV, Tc                     0.5926
  5       F, MOV, Tro, ACV, T90                      0.5923
  6       F, MOV, Tro, A/MOV, MCV                    0.5914
  7       F, MOV, A/MOV, perclos, T90                0.5910
  8       F, MOV, A/MOV, perclos                     0.5905
  9       F, MOV, A/MOV, perclos, T80                0.5905
  10      F, MOV, A/MOV, Tro                         0.5903

Table 8.24.: Best selected drive time-based features based on the cfs method using the Spearman's rank correlation coefficient, regardless of the number of features

  rank    selected features                                M_R
  1       F, A/MOV, Tcl,2, Tro, perclos, MOV               0.6103
  2       F, A/MOV, Tcl,2, Tro, perclos, MCV               0.6026
  3       F, A/MOV, Tcl,2, Tro, perclos, ACV               0.6023
  4       F, A/MOV, Tcl,2, MOV, perclos                    0.6022
  5       F, A/MOV, Tcl,2, Tro, perclos, MOV, ACV          0.6021
  6       F, A/MOV, Tcl,2, Tro, perclos, AOV               0.6019
  7       F, A/MOV, Tcl,2, Tro, perclos, MOV, MCV          0.6014
  8       F, A/MOV, Tcl,2, Tro, ACV, MOV                   0.6011
  9       F, A/MOV, Tcl,2, Tro, perclos, MOV, T            0.6003
  10      F, A/MOV, Tcl,2, Tro, MOV                        0.6002
9. Summary, conclusion and future work

9.1. Summary and conclusion

Timely detecting a drowsy driver and warning him of his low vigilance state plays an
important role in improving traffic safety. In this work, we addressed driver state
classification based on blink features collected by electrooculography (EOG) as a reference
measurement system.
In Chapter 1, we discussed different terminologies for defining drowsiness, along with
definitions of driver distraction and inattention. We also reviewed drowsiness countermeasures
during driving, such as conversing and rumble strips, which, however, do not have a
long-lasting effect on vigilance. This chapter also provided an overview of the drowsiness
detection systems on the market.
Chapter 2 discussed objective and subjective methods for measuring the driver state. The
objective measures include driving performance measures, which monitor the driver
indirectly based on the lane keeping behavior, the steering wheel movements or a fusion of
them. We showed that such measures suffer from external factors such as the quality of the
lane marking, the road condition, etc. In addition, their efficiency is restricted to situations in
which the driving performance is not improved by other assistance systems. We also introduced
the driver physiological measures, which result from directly monitoring the driver, such as
EEG, ECG, etc. We proposed the idea of removing the phases in which the driver is
visually distracted, to improve the association between an EEG-based measure and
drowsiness. At the end of this chapter, we also introduced different subjective measures and
the concerns about their interpretation and reliability.
In Chapter 3, the human visual system was introduced. There, we mentioned the concepts
of what and where, which describe visual attention. Further, the structure of the human
eye and the types of eye movements relevant during driving were defined. We also categorized
eye movements into two groups with regard to their velocity, namely slow and fast eye
movements, and showed that blinks can belong to both of these groups depending on the
driver's vigilance state.
The robustness and reliability of the EOG measuring system for collecting eye movements
during driving were tested in a pilot study in Chapter 4. There, we studied the relationship
between driver eye movements and different real driving scenarios, independent of the driver's
vigilance state, in a fully controlled experiment conducted on a proving ground. All in all, it
can be concluded that ground excitation and large-amplitude bumps add an extra pattern to
the EOG signals. On the other hand, monitoring driver eye movements seems to be
undisturbed by a single small-amplitude bump. Moreover, it is clear that the inevitable
sawtooth pattern due to curve negotiation is not related to the driver's inattention or
drowsiness. Therefore, we suggested the exclusion of tortuous road sections from further
investigations of driver eye movements.
Since the capability of the EOG as a robust and reliable reference measuring system for eye
movement monitoring even under real driving conditions was confirmed by the pilot study, we
conducted daytime and nighttime experiments under real road and simulated driving
conditions using EOG to collect eye movement data for the rest of this work as described in
Chapter 4.
In Chapter 5, we addressed approaches for detecting blinks and saccades in the raw EOG
signals. We showed that the median filter-based method, the most conventional blink
detection approach, was only suitable for the detection of blinks during the awake phase of
driving. As soon as the shape of the blinks changed due to drowsiness, this method either
missed an event or detected only part of it. In addition, we showed that the median filter-
based method was not suitable for detecting saccades and slow eye movements. As a result,
we proposed a method based on the derivative of the EOG signal for detecting saccades in
addition to blinks. It was shown how to detect vertical saccades and blinks simultaneously in
the vertical EOG signal. In addition, a 3-means clustering algorithm was recommended to
distinguish between saccades and blinks in those applications where data of both the awake
and drowsy phases are available. This helped to prevent confusing a driver's decreased-
amplitude blinks with saccades or other eye movements. Moreover, blinks with long eye
closures and microsleep events, whose patterns deviate from those of the awake phase, were
detected and distinguished from saccades based on the statistical distribution of the
amplitude. This method, however, was shown to perform poorly in the detection of slow eye
movements. Therefore, we introduced the wavelet transform method, which is superior to
the Fourier transform in providing time localization information. In addition to the
continuous wavelet transform for the detection of both fast and slow eye movements, we
applied the discrete wavelet transform as a suitable method for preprocessing the EOG signal,
namely drift and noise removal. Finally, a comparison of the detection methods showed that
the proposed derivative-based algorithm outperformed the method based on median filtering
in the detection of fast eye movements. Although the wavelet transform method performed
best in the correct detection of both fast and slow eye movements, it suffered from high false
detection rates. Consequently, we combined its detected events with those of the derivative-
based algorithm to balance the false detections.
In Chapter 6, we studied the blinking behavior under distracted and undistracted driving. In
the first experiment, during which the subjects performed a secondary visuomotor task in
addition to the driving task, we showed that saccades and gaze shifts induce the occurrence
of blinks. However, we observed two different behaviors among the subjects: direction-
dependent and direction-independent gaze shift-induced blinks. For the former group,
performing the secondary task (either visuomotor or auditory) did not alter the blink rate in
comparison to undistracted driving. For the latter, however, the blink rate changed to a large
extent due to the distraction. In addition, we showed that visual distraction led to a blinking
time interval synchronous with the occurrence of the gaze shift. In a second experiment,
during which the subjects were not distracted, the results showed that the amount of gaze
shift was positively correlated with the occurrence of a simultaneous blink, i.e. the higher the
amplitude of the gaze shift, the larger the probability of a blink occurrence. Based on these
results, we suggested that those who consider the blink rate an indicator of drowsiness handle
gaze shift-induced blinks differently from spontaneous ones, particularly if the driver is
visually or cognitively distracted. In fact, since such blinks are situation-dependent, they
locally change the blinking behavior, especially the blink frequency.
Based on the detected events of Chapter 5, we extracted 19 different features for each event in
Chapter 7. These features were aggregated following two strategies, namely the kss input-
based and the drive time-based approaches. Unlike the latter, the former sacrifices the
available number of samples for more reliable class labels. In addition, feature baselining was
addressed to improve the classification results of Chapter 8. Further, based on the scatter
plots and the correlation coefficients between the features and the kss values, we showed
whether the features were positively or negatively associated with the driver state.
Interestingly, for some of the features, such as the blink amplitude, different trends were
observed. Thus, we conclude that a warning system which relies only on a single feature for
its decision strategy is prone to high false alarm rates. This chapter also discussed the
variation of each feature shortly before the occurrence of the first safety-critical event, namely
a lane departure or a microsleep, in comparison to the beginning of the drive. The results
revealed an important finding regarding driving performance measures. For the lane
departure event, overall, a larger variation of the features shortly before the event was found
than for the microsleep event. This proves, for our data set, that a drowsy driver experiences
a microsleep event without necessarily departing from the lane or showing degraded driving
performance measures. In other words, this finding supports the view that lane departures
might be related to a deeper drowsiness phase than microsleeps. As a result, in this respect,
driver physiological measures are superior to driving performance measures for early driver
drowsiness detection. Finally, in this chapter we reduced the sampling frequency of the raw
EOG signal to make it similar to the raw signal provided by the driver observation cameras
on the market. The goal was to study the effect of the sampling frequency on the feature
quality. According to the results, we conclude that the velocity-based features are at high risk
of quality degradation.
Finally, in Chapter 8, we classified the driver state with three classifiers, namely the ann, svm
and k-nn classifiers, based on the features extracted in Chapter 7. The feature matrix was
divided either by a subject-dependent or a subject-independent approach. We also addressed
the issue of imbalanced data with classifier-independent and classifier-dependent approaches.
According to the results,
for the binary subject-dependent classification, all classifiers performed similarly regarding
drive time-based features. This was also valid for binary subject-independent classifiers. In
the binary subject-dependent case, we obtained at least about 80% correctly classified
samples in each class regardless of the selected classifier. The binary subject-independent
classifiers, however, performed poorly in the classification of drowsy samples. In fact, the
classification of unseen drowsy samples seems to be more challenging. In the 3-class
classification, the ann and svm classifiers performed worse than the k-nn in the detection of
the medium class. We believe that this is due to imprecise class labels and that the subjects
were not good at rating medium levels of drowsiness.
For kss input-based features, we achieved slightly better classification results by the binary
svm classifier in comparison to the ann and k-nn. The high detection rate of both classes (each
around 80%) by this aggregation approach also underlines that subjects most likely take the
time interval shortly before the kss inquiry into account when rating themselves.
For imbalanced class distributions, it was shown that all classifiers performed poorly to the
same extent in the classification of the minority class. We solved the issue of imbalanced data
with two approaches. The first one, as a classifier-independent method, was the smote which
artificially generated additional samples similar to those of the minority class. We combined
it with the ann classifier and obtained improved classification results. The retrained ann,
however, was not applicable to unseen data. For the svm classifier, we applied a classifier-
dependent approach where the misclassification cost was tuned with respect to the number
of samples in the minority and majority classes. Again, despite improved results with
imbalanced data, the constructed model was poor on unseen data. Therefore, we conclude
that imbalanced class distributions in the task of driver state classification do not lead to a
generalized classifier and should never be considered as a substitution for the minority class
data collection. In other words, the results of driver state classification are reliable only if the
features of both awake and drowsy phases of
the drive are collected under similar circumstances and are included in the feature matrix in
a balanced manner.
Chapter 8 also discussed how well the data collected in the driving simulator generalize to
real road conditions. There, we showed that after removing very drowsy parts of the drive,
which can only be collected in simulated driving, both classes were still detectable at a rate of
70% for the subject-dependent binary classification with drive time-based features. As soon
as the subject-independent data division was applied, the results degraded. The k-nn
classifier was most affected due to dominant within-subject differences. We also applied the
unseen features of the drop-out subjects, who aborted the real nighttime driving experiment
due to severe drowsiness, to all classifiers. They were classified with acceptable results given
the drive time-based features (accuracy = 70%). Therefore, we conclude that the drowsiness
behavior in the driving simulator is, to an acceptable extent, representative of the same
behavior on the real road. Overall, the between-subject differences also contribute
significantly to the degradation of the classification results.
Finally, we discussed approaches for feature dimension reduction in order to address the
constraints of an in-vehicle warning system. According to the sffs method fused with the ann
classifier, four features were determined to be sufficient. The trained ann classifier, however,
did not perform as well as a classifier trained with all 19 features in the detection of the
drowsy class. As a result, we conclude that for the correct detection of the drowsy class,
which seems overall to be more challenging, more than four features are needed.

9.2. Perspective of future work

We showed that blink features based on the EOG are a promising approach for driver state
detection. Nevertheless, in this section, we suggest possible directions for future work.
The first issue is the EOG as a reference measuring system, which should be substituted with
a driver observation camera to obtain an in-vehicle warning system based on eye movements.
It should be investigated whether similar results can be achieved by the camera even with
low frame rates. Since cameras also measure the eyelid gap, an improvement of the
classification results is expected despite the degraded quality of some features. In addition,
after replacing the EOG with a camera, new problems arise which degrade the eye tracking
process, such as varying light conditions and reflections caused by glasses.
In this work, we used 1 min of the EOG data for aggregating features. It remains to be
investigated how varying this time interval improves or degrades the relationship between
the features and the driver state.
A third issue concerns the poor detection rate of the medium class in the 3-class
classification, as shown in Chapter 8. In the future, it should be scrutinized whether imprecise
self-rating by the subjects is responsible for the poor classification results or whether the eye
movement features themselves cannot reflect the evolution of drowsiness at a lower level.
The fourth issue is the fusion of the introduced blink features with other features such as the
saccade features or features based on the driving performance measures. Moreover, features
like traffic density or monotony, time of day and time-on-task can be integrated to contribute
to the classification task. Further, the combination of saccade occurrence with the traffic
density as a new feature appears to be promising for driver state classification in terms of the
short term variation or detection of the driver distraction.
Finally, the transferability of the findings of this work and their extension to autonomous
driving need to be studied. In partially automated driving, it is assumed that the vehicle
performs steering and lane keeping activities, while the driver fully monitors it for a timely
intervention. In this case, in addition to driver drowsiness, the level of driver attention or
distraction is indeed crucial. In highly automated driving, driver distraction and attention
detection is even more essential, because the driver is allowed to be distracted by turning his
attention to other activities. In complex situations, however, the driver must still be able to
take over the driving task after receiving a warning. Therefore, on top of the blink behavior
studied in distracted driving in this work, new features such as the gaze direction and proper
gaze shifts to the road ahead should be extracted and explored. Moreover, new experiments
and analyses should be conducted to quantify the amount of workload for the investigation of
driver distraction and attention detection.
A. Derivation of sawtooth occurrence frequency during curve negotiation

Figure A.1 geometrically represents the scenario during which the vehicle (subject) moves
from position A to B while tracking tpA and tpB, successively.

[Figure A.1: two vehicle positions A and B on a curve of radius r with lateral offset d; the eyes
track the successive track points tpA and tpB with angular displacement δ while the vehicle
covers the displacement Γ.]

Figure A.1.: Geometrical representation of tracking two successive tps during a curve negotiation

According to both plots of Figure 4.8, the measured time interval ∆tm between successive
sawtooth patterns is very short (on average < 1 s). Thus, it can be assumed that the vehicle's
lateral distance d to the inner curve lane marking in Figure A.1 remains constant while
tracking two successive tps, namely tpA and tpB. We assumed d = 1.5 m. For the same reason,
we considered the radius r of the curve and the distance p between the subject and the
momentary tp to be constant.

Since the longitudinal acceleration is assumed negligible while tracking two successive tps,
∆tc can be calculated from the velocity v and the displacement Γ between A and B as follows

    \Delta t_c = \frac{\Gamma}{v} = \frac{(r + d)\,\Delta\psi}{v} ,    (A.1)
where ∆ψ is the yaw angle corresponding to the displacement Γ. r has been calculated from
the road curvature κ (κ = 1/r), which is a function of the measured v and yaw angle rate ψ̇
(κ = ψ̇/v). Since ψ̇ was not equal for all subjects, the calculated values of κ and consequently
r differ for the left curve of Figure 4.1. Therefore, r is assumed to be the mean over all
calculated radii for all subjects, which corresponds to 52 m.

Based on our assumptions and the geometrical modeling of Figure A.1, the unknown angular
displacement ∆ψ of the subject's position leads to the angular displacement δ of the eyes.
According to Figure A.1, δ is given by

    \delta = \arctan\left(\frac{h}{p - a}\right) ,    (A.2)
where

    h = r - r\cos(\Delta\psi)    (A.3)
    a = r\sin(\Delta\psi)    (A.4)

and the distance p between the driver and the momentary tp is

    p = \text{const.} = r\tan\varphi = r\tan\left(\arccos\frac{r}{r + d}\right) .    (A.5)
By substituting (A.3), (A.4) and (A.5) in (A.2), the angular displacement δ of the eyes can be
described as a function of the yaw angle ∆ψ as follows

    \delta = \arctan\left(\frac{1 - \cos(\Delta\psi)}{\tan\varphi - \sin(\Delta\psi)}\right) .    (A.6)

∆ψ can be derived from (A.6) by knowing δ. Then, ∆tc can be approximated using the
calculated ∆ψ in (A.1).
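Since (A.6) has no closed-form inverse, ∆ψ can be obtained numerically. The following is a
minimal Python sketch (not part of the original analysis); the speed v and the measured δ are
assumed example values, while r and d follow the assumptions above:

```python
# Hypothetical numeric sketch: invert (A.6) for dpsi by root finding,
# then evaluate (A.1). Only r and d are taken from the text.
import numpy as np
from scipy.optimize import brentq

r, d, v = 52.0, 1.5, 16.7          # radius [m], lateral offset [m]; v [m/s] is an assumed example
phi = np.arccos(r / (r + d))       # from (A.5)

def delta_of_dpsi(dpsi):
    """Eye displacement delta [rad] as a function of the yaw displacement, per (A.6)."""
    return np.arctan((1 - np.cos(dpsi)) / (np.tan(phi) - np.sin(dpsi)))

delta_meas = np.deg2rad(2.0)       # assumed example measurement
# Root finding on a bracket below the singularity of (A.6), where sin(dpsi) = tan(phi)
dpsi = brentq(lambda x: delta_of_dpsi(x) - delta_meas, 1e-9, 0.99 * np.tan(phi))
dtc = (r + d) * dpsi / v           # (A.1)
print(f"dpsi = {np.rad2deg(dpsi):.2f} deg, dtc = {dtc:.2f} s")
```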
B. Description of a boxplot

Figure B.1 shows the information included in the boxplot representation. In this work, all
boxplots are shown for w = 1.5.

[Figure B.1 shows an annotated boxplot: the box spans the 25th percentile (q1) to the 75th
percentile (q3), i.e. the interquartile range q3 − q1, with the median (50th percentile) inside;
the whiskers extend to q1 − w(q3 − q1) and q3 + w(q3 − q1), and samples beyond them are
marked as outliers.]

Figure B.1.: Description of a boxplot representation
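Such a representation can be reproduced with standard plotting libraries; a minimal sketch,
assuming matplotlib and synthetic data, with the same whisker factor w = 1.5:

```python
# Minimal sketch (synthetic data): whis=1.5 reproduces the convention of
# Figure B.1; the whiskers are capped at the most extreme samples still
# within q1 - 1.5(q3 - q1) and q3 + 1.5(q3 - q1), samples beyond them
# are drawn as outliers.
import numpy as np
import matplotlib.pyplot as plt

data = np.random.default_rng(0).normal(5, 1.5, size=200)
plt.boxplot(data, whis=1.5)
plt.show()
```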


C. k-means clustering

The material is taken from Bishop (2006).


The k-means clustering method aims to assign the samples of a data set to different groups
or clusters. Therefore, N samples of a matrix X with D dimensions are assigned to a given
number of clusters k. As a rule, the samples assigned to a cluster have a small inter-sample
distance in comparison to the samples outside a cluster. We define µi, where i ∈ {1, ..., k},
as the centers of the k clusters. The rule which assigns samples of X to a specific cluster is
based on the minimization of the sum of squared distances between a sample xn, where
n ∈ {1, ..., N}, and the nearest µi. For each sample xn, we define the binary value
rni ∈ {0, 1}. It shows to which cluster i the sample xn is assigned. If xn is assigned to the
i-th cluster, then we have rni = 1 and rnj = 0 for j ≠ i. The cost function, called the
distortion measure, which guides the cluster assignment is defined as follows
    J = \sum_{n=1}^{N}\sum_{i=1}^{k} r_{ni}\,\lVert x_n - \mu_i \rVert^2 .    (C.1)
The goal is to minimize J with respect to rni and µi. This goal is achieved in an iterative
manner comprising the following two successive steps:
• Step 1: minimizing J with respect to rni, while µi is held fixed.
• Step 2: minimizing J with respect to µi, while rni is held fixed.
These steps correspond to the Expectation Maximization (em) algorithm such that step 1 and
2 refer to the expectation (e) and maximization (m) step, respectively.
In step 1, where µi is held fixed, the relationship between J and rni in (C.1) is linear.
Therefore, J is minimized by assigning the sample xn to the nearest cluster i. Mathematically,
we have

    r_{ni} = \begin{cases} 1 & \text{if } i = \arg\min_j \lVert x_n - \mu_j \rVert^2 \\ 0 & \text{otherwise.} \end{cases}    (C.2)
In step 2, however, with rni held fixed, the relationship between J and µi in (C.1) is quadratic.
Therefore, for minimizing J, the derivative of J with respect to µi is set to zero as follows

    2\sum_{n=1}^{N} r_{ni}\,(x_n - \mu_i) = 0 \;\Longrightarrow\; \mu_i = \frac{\sum_{n=1}^{N} r_{ni}\,x_n}{\sum_{n=1}^{N} r_{ni}} .    (C.3)
The denominator in the previous equation denotes the number of samples assigned to the i-th
cluster. Therefore, according to (C.3), the center µi of the i-th cluster is defined as the mean
of all samples xn assigned to that cluster. On this account, the method is called k-means
clustering.
The aforementioned steps are repeated as long as the assignment of samples to clusters
changes or until a maximum number of iterations is reached. Since J decreases in each
iteration, the cost function is guaranteed to converge. A drawback of the k-means clustering,
however, is that
it might wrongly converge to a local minimum of J instead of the global one. Moreover, as
mentioned, this method uses the squared Euclidean distance for quantifying the amount of
dissimilarity between the samples and µi. Consequently, the Euclidean distances between all
N samples and all cluster centers have to be calculated, which leads to a slow convergence of
this method. For further optimization ideas with respect to the convergence rate of the
k-means clustering see Bishop (2006).
Obviously, before performing step 1, the µi have to be initialized, and the chosen initial
values have a direct impact on the convergence rate of the algorithm. Bishop (2006) suggests
setting µi randomly to a subset of k samples of X.
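A minimal NumPy sketch of the two-step iteration (C.2)-(C.3) may illustrate the procedure;
the data and the random initialization are synthetic examples:

```python
# Minimal sketch of k-means per (C.2)-(C.3); synthetic two-cluster data.
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]   # init: random subset of X
    assign = None
    for _ in range(max_iter):
        # Step 1 (E-step): assign each x_n to the nearest center, per (C.2)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        new_assign = d2.argmin(axis=1)
        if np.array_equal(new_assign, assign):          # assignments stable -> converged
            break
        assign = new_assign
        # Step 2 (M-step): move each center to the mean of its samples, per (C.3)
        for i in range(k):
            if np.any(assign == i):
                mu[i] = X[assign == i].mean(axis=0)
    return mu, assign

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 4])
centers, labels = kmeans(X, k=2)
```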
D. Statistical tests

The material provided here is taken from Gosling (1995), Montgomery and Runger (2006) and
Field (2007).

D.1. Paired-sample t-test

The paired-sample t-test or dependent t-test is a suitable test for comparing the means of two
groups if the samples are measured as pairs. In other words, the samples should be collected
under “homogeneous conditions” (Montgomery and Runger, 2006). Mathematically, the pairs
are (x1,i, x2,i) with i = 1, 2, ..., N samples. If the test is applied to samples collected in an
experiment with different participants, the same participants should have participated in
both groups. This means that the samples x1,i and x2,i belong to one participant; otherwise
the pooled t-test or independent t-test should be applied. Moreover, the results of the t-test
are reliable if its assumption, the normal distribution of the samples, is fulfilled. Since the
t-test analyzes the difference between the groups, i.e. ∆x = x1 − x2, the samples of ∆x
should be normally distributed.

The hypotheses of the test are as follows:

H0: µ1 = µ2
H1: µ1 ≠ µ2 ,

where µ1 and µ2 refer to the average of x1 and x2, respectively.

The test statistic is found as follows

    t_0 = \frac{\mu_{\Delta x}}{\sigma_{\Delta x} / \sqrt{N}} ,    (D.1)

where µ∆x and σ∆x denote the mean and the standard deviation of the difference values in ∆x.
The decision about the rejection of H0 is made based on the value of the confidence level α
and the degrees of freedom ν, where ν = N − 1. The critical value of the test, namely tα,ν, is
found in the table of the Student's t-distribution in statistics books such as Montgomery and
Runger (2006). Consequently, we have

reject H0: if |t0| > tα,ν =⇒ µ1 differs significantly from µ2.


fail to reject H0: if |t0| ≤ tα,ν .

The results can also be reported by the p-value based on the Student's t cumulative
distribution function. Accordingly, a p-value < α yields the rejection of H0.
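For illustration, a minimal Python sketch (synthetic example data) evaluates (D.1) directly and
cross-checks it against the paired t-test of scipy:

```python
# Minimal sketch: the statistic (D.1) computed directly, with scipy as a cross-check.
import numpy as np
from scipy import stats

x1 = np.array([6.1, 5.4, 7.0, 6.3, 5.8, 6.9, 7.2, 5.5])   # synthetic paired samples
x2 = np.array([5.2, 5.0, 6.1, 6.4, 5.1, 6.0, 6.8, 5.0])

dx = x1 - x2
t0 = dx.mean() / (dx.std(ddof=1) / np.sqrt(len(dx)))   # (D.1)
p = 2 * stats.t.sf(abs(t0), df=len(dx) - 1)            # two-sided p-value

t_scipy, p_scipy = stats.ttest_rel(x1, x2)             # should agree with t0, p
```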
D.2. Normal distribution test: Lilliefors test

In order to know whether the distribution of the data under investigation is normal, the
Lilliefors test, a goodness-of-fit test, is applied. This test determines the normality of the
given data by fitting a normal distribution to it and evaluating the difference between them.
Therefore, for data x with N values, the hypotheses are defined as follows

H0: x is normally distributed.


H1: x is not normally distributed.

First, the empirical cumulative distribution function F̂N(x) of x is calculated as follows

    \hat{F}_N(x) = \frac{\text{number of samples } x_i \le x}{N} .    (D.2)

The goal is to assess the agreement between F̂N(x) and the theoretical distribution function
F(x). F(x) is normally distributed with regard to the mean and standard deviation of x,
namely µ and σ. As a result, the standardized values of x are needed as follows

    z = \frac{x - \mu}{\sigma} .    (D.3)
The test statistic is defined as

    t_0 = \max\left(\lvert F(z) - \hat{F}_N(z) \rvert\right) .    (D.4)

By comparing t0 with tN,α denoting the critical value of the test, it can be decided whether to
reject the H0 or not, i.e.

reject H0: if t0 > tN,α =⇒ x is not normally distributed.


fail to reject H0: if t0 ≤ tN,α ,

where α is the confidence level. The values of tN,α are listed in Gosling (1995).
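A minimal Python sketch (synthetic data) of the statistic (D.2)-(D.4); note that the critical
values tN,α must still be taken from the cited tables:

```python
# Minimal sketch of the Lilliefors statistic on synthetic data.
import numpy as np
from scipy import stats

x = np.sort(np.random.default_rng(1).normal(0, 1, size=50))
z = (x - x.mean()) / x.std(ddof=1)                 # standardization (D.3)

F_hat = np.arange(1, len(z) + 1) / len(z)          # empirical cdf (D.2) at the sorted samples
F = stats.norm.cdf(z)                              # fitted normal cdf

# Kolmogorov-type statistic (D.4); both step edges of F_hat are checked
t0 = max(np.max(np.abs(F - F_hat)), np.max(np.abs(F - (F_hat - 1 / len(z)))))
```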

D.3. Test of significance for the Pearson correlation coefficient

Based on a hypothesis test, namely the t-test, it is possible to show whether the Pearson
product-moment correlation coefficient ρp between two variables is significantly different
from zero. If ρp = 0, then we conclude that there is no linear relationship between the
variables under investigation. Therefore, the hypotheses are defined as follows

H0: ρp = 0 =⇒ There is no linear relationship between the variables.

H1: ρp ≠ 0 .

According to Field (2007), the test statistic tρp with N − 2 degrees of freedom for N samples
of the variables is calculated as follows

    t_{\rho_p} = \rho_p \sqrt{\frac{N - 2}{1 - \rho_p^2}} .    (D.5)
By comparing tρp with tα,ν, which denotes the critical value of the test based on the Student's
t-distribution for ν = N − 2 degrees of freedom, it can be decided whether to reject H0 or
not, i.e.

reject H0: if |tρp | > tα,ν =⇒ ρp ≠ 0 and ρp is significantly different from zero.
fail to reject H0: if |tρp | ≤ tα,ν ,

where α is the confidence level. The values of tα,ν are listed in Gosling (1995).
Accordingly, based on the calculated tρp value, the p-value of the test can be reported as well.
A p-value < α also yields the rejection of H0.
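For illustration, a minimal Python sketch (synthetic data) evaluates (D.5) and cross-checks the
p-value against scipy:

```python
# Minimal sketch of the significance test (D.5) for the Pearson correlation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=40)
y = 0.5 * x + rng.normal(size=40)

rho_p = np.corrcoef(x, y)[0, 1]
t_rho = rho_p * np.sqrt((len(x) - 2) / (1 - rho_p ** 2))   # (D.5)
p = 2 * stats.t.sf(abs(t_rho), df=len(x) - 2)

r_scipy, p_scipy = stats.pearsonr(x, y)                    # should match rho_p, p
```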

D.4. Comparison of two Pearson correlation coefficients

According to Field (2007), based on the t-statistic, it is possible to assess whether two Pearson
correlation coefficients are significantly different from each other or not. Mathematically,
three variables are available, namely x, y and z, and the relationship between two pairs,
namely ρp(x, y) and ρp(z, y), are of interest. For the ease of notation, we replace ρp(x, y) with
ρxy, since just the Pearson correlation coefficient is studied here.
The hypotheses of the test are as follows

H0: ρxy is not different from ρzy. (D.6)


H1: ρxy is different from ρzy. (D.7)

The test statistic is calculated as follows

    t_{\text{difference}} = (\rho_{xy} - \rho_{zy})\,\sqrt{\frac{(N - 3)(1 + \rho_{xz})}{2\left(1 - \rho_{xy}^2 - \rho_{xz}^2 - \rho_{zy}^2 + 2\,\rho_{xy}\,\rho_{xz}\,\rho_{zy}\right)}} .    (D.8)
By comparing the value of tdifference with the critical value provided in the table of the
t-distribution for ν = N − 3 degrees of freedom and the confidence level α, namely tα,ν, the
following decisions are made, which are similar to the paired-sample t-test in Appendix D.1:

reject H0: if |tdifference| > tα,ν =⇒ ρxy differs significantly from ρzy.
fail to reject H0: if |tdifference| ≤ tα,ν .

The result can also be reported by the p-value based on the Student's t cumulative
distribution function. Accordingly, a p-value < α yields the rejection of H0.

D.5. One-way repeated measures ANOVA

The analysis of variance (anova) is a technique for comparing the means of different groups.
If the measurements of the groups are related to different participants, it is called one-way
independent anova. If, however, the same participants are available, i.e. several
measurements for each subject, then one-way repeated measures anova is used instead. This
method concentrates on the within-subject differences and is, in fact, an extension of the
paired-sample t-test explained in Appendix D.1. On the contrary, one-way independent anova
focuses on the between-group differences.
Before applying the anova for repeated measurements, the following assumptions must be
fulfilled:
• Normally distributed group differences.
• Sphericity, which is equivalent to the homogeneity of the variances: the variances of the
group differences should be almost the same. According to Field (2007), this assumption
can be checked by applying Mauchly's test.
• Independent samples.
For a data set with G groups and N subjects, which fulfills all aforementioned assumptions,
the hypotheses of the anova are as follows

H0: µ1 = µ2 = · · · = µG
H1: at least one µ is different from the other ones.

µi refers to the mean of the i-th group. It is clear that, in the case of H1, the test does not
provide information about the group or groups with different mean values. For the ease of
notation, we consider the data set listed in Table D.1.
                       Groups
  Subjects    1      2      ···    G      mean
  1           x11    x12    ···    x1G    x̄1
  2           x21    x22    ···    x2G    x̄2
  ⋮           ⋮      ⋮             ⋮      ⋮
  N           xN1    xN2    ···    xNG    x̄N
  mean        µ1     µ2     ···    µG     µ̄

Table D.1.: Typical data set of one-way anova

First, the group variability, called SSbetween, is calculated as the summed squared deviation
of the group means µi from the overall mean µ̄ of the data set

    SS_{\text{between}} = N \sum_{i=1}^{G} (\mu_i - \bar{\mu})^2 .    (D.9)

The next step is the calculation of the within-subject variation, called SSwithin

    SS_{\text{within}} = \sum_{j=1}^{G}\sum_{i=1}^{N} (x_{ij} - \mu_j)^2 .    (D.10)

In the case of repeated measurements, each subject must also be considered separately.
Therefore, the summed squared deviation of the subject means x̄i from the overall mean µ̄
is needed, i.e.

    SS_{\text{subjects}} = G \sum_{i=1}^{N} (\bar{x}_i - \bar{\mu})^2 .    (D.11)


D.6 Homogeneity of variance: Levene’s test 207

x̄i refers to the mean of the samples of the i-th subject, namely

    \bar{x}_i = \frac{1}{G}\sum_{j=1}^{G} x_{ij} .    (D.12)
SSwithin also includes SSsubjects. Hence, the error is defined as follows

    SS_{\text{error}} = SS_{\text{within}} - SS_{\text{subjects}} .    (D.13)

Now, the mean sums of squares are calculated by considering the degrees of freedom ν1 and
ν2 as follows

    MS_{\text{between}} = \frac{SS_{\text{between}}}{\nu_1}    (D.14)
    MS_{\text{error}} = \frac{SS_{\text{error}}}{\nu_2}    (D.15)
    \nu_1 = G - 1    (D.16)
    \nu_2 = (N - 1)(G - 1) .    (D.17)

Finally, the test statistic F0 is calculated as

    F_0 = \frac{MS_{\text{between}}}{MS_{\text{error}}} .    (D.18)

By comparing F0 with the critical value of the F-distribution with respect to ν1 and ν2,
namely Fα,ν1,ν2, the following decision is made:

reject H0: if F0 > Fα,ν1,ν2 =⇒ At least one µ is significantly different.


fail to reject H0: if F0 ≤ Fα,ν1,ν2 .

α refers to the confidence level.
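For illustration, the sum-of-squares decomposition (D.9)-(D.18) can be computed directly for a
data matrix of shape (N, G); a minimal NumPy sketch with synthetic data:

```python
# Minimal sketch of the one-way repeated measures anova, following (D.9)-(D.18).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
N, G = 12, 3
x = rng.normal(size=(N, G)) + np.array([0.0, 0.3, 0.6])   # synthetic small group effect

mu_j = x.mean(axis=0)          # group means
xbar_i = x.mean(axis=1)        # subject means (D.12)
mu = x.mean()                  # overall mean

ss_between = N * np.sum((mu_j - mu) ** 2)            # (D.9)
ss_within = np.sum((x - mu_j) ** 2)                  # (D.10)
ss_subjects = G * np.sum((xbar_i - mu) ** 2)         # (D.11)
ss_error = ss_within - ss_subjects                   # (D.13)

nu1, nu2 = G - 1, (N - 1) * (G - 1)                  # (D.16), (D.17)
F0 = (ss_between / nu1) / (ss_error / nu2)           # (D.14), (D.15), (D.18)
p = stats.f.sf(F0, nu1, nu2)
```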

D.6. Homogeneity of variance: Levene’s test

The homogeneity of variance is an assumption of the anova: the variance and the spread of
values in each group should be in the same range. This is checked with Levene's test. We use
the same notation as in Appendix D.5.

The hypotheses of the test are as follows:

H0: σ1² = σ2² = · · · = σG²

H1: at least one σ² is different from the other ones.

σi² denotes the variance of the i-th group.


208 Statistical test

We define the following variables

    z_{ij} = \lvert x_{ij} - \mu_j \rvert
    \mu_{z_j} = \frac{1}{N}\sum_{i=1}^{N} z_{ij}
    \mu_z = \frac{1}{NG}\sum_{i=1}^{N}\sum_{j=1}^{G} z_{ij} .

Based on these definitions, the test statistic is defined, which is equivalent to applying a
one-way anova to the zij with i going from 1 to N samples and j going from 1 to G groups.
Therefore, we have

    F_0 = \frac{G(N - 1)}{G - 1}\,\frac{\sum_{j=1}^{G} N\,(\mu_{z_j} - \mu_z)^2}{\sum_{j=1}^{G}\sum_{i=1}^{N} (z_{ij} - \mu_{z_j})^2} .    (D.19)

The critical value of the test and the conditions for rejecting/not rejecting H0 are similar to the
one-way repeated measures anova with the degrees of freedom ν1 and ν2

ν1 = G − 1 (D.20)
ν2 = G (N − 1) . (D.21)

Therefore, we have

reject H0: if F0 > Fα,ν1,ν2 =⇒ At least one σ² is significantly different.


fail to reject H0: if F0 ≤ Fα,ν1,ν2 .

α refers to the confidence level.
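A minimal sketch of (D.19) for equal group sizes (synthetic data), cross-checked against
scipy's Levene test with the mean as center, which corresponds to the variant described here:

```python
# Minimal sketch of Levene's test per (D.19) for an (N, G) data matrix.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
N, G = 20, 3
x = rng.normal(scale=[1.0, 1.0, 2.0], size=(N, G))   # third group: larger variance

z = np.abs(x - x.mean(axis=0))                       # z_ij = |x_ij - mu_j|
mu_zj = z.mean(axis=0)
mu_z = z.mean()

F0 = (G * (N - 1) / (G - 1)) * (N * np.sum((mu_zj - mu_z) ** 2)
                                / np.sum((z - mu_zj) ** 2))   # (D.19)
F_scipy, p_scipy = stats.levene(*x.T, center='mean')          # should match F0
```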

D.7. Wilcoxon signed-rank test

As a non-parametric statistical test, the Wilcoxon signed-rank test analyzes whether the
means of two paired samples are significantly different from each other. This test is mainly
applied if the assumption of the paired-sample t-test explained in Appendix D.1, namely the
normal distribution of the samples, is violated. It is important to have the same participants
in both groups, since this test studies within-subject differences.
For the paired samples (x1,i, x2,i) with i = 1, 2, ..., N and the corresponding mean values µ1
and µ2, the H0 and H1 hypotheses are as follows:

H0: µ1 = µ2
H1: µ1 ≠ µ2 .

Test procedure:
1. Rank the absolute values of the differences between the pairs |∆xi|, where
∆xi = x1,i − x2,i, in ascending order.
D.8 Pearson’s chi-square test 209

2. Give each rank the same sign as the corresponding ∆xi. Samples with equal difference
values receive the average of their ranks.
3. Find W = min(W+, W−), where W+ and W− denote the sums of the positive and
negative ranks, respectively, while considering the absolute value of each rank in the
summation.
4. Calculate the following mean and variance values, namely W̄ and VW,

    \bar{W} = \frac{N(N + 1)}{4}
    V_W = \frac{N(N + 1)(2N + 1)}{24} .

5. Calculate the z-score of W

    z_0 = \frac{W - \bar{W}}{\sqrt{V_W}} .    (D.22)
6. The test result is as follows

reject H0 : if |z0 | > wα∗ =⇒ µ1 differs significantly from µ2 .


fail to reject H0 : if |z0 | ≤ wα∗ .

wα∗, with α as the confidence level, is the critical value of the test, which is listed in statistics
books such as Montgomery and Runger (2006).
Alternatively, the p-value of the test can be calculated based on the normal cumulative
distribution function. Thus, H0 is rejected if the p-value < α.
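For illustration, a minimal Python sketch of steps 1-5 (synthetic, continuous data, so zero
differences are assumed absent); scipy's implementation serves as a cross-check:

```python
# Minimal sketch of the Wilcoxon signed-rank procedure on synthetic pairs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x1 = rng.normal(size=30)
x2 = x1 + rng.normal(0.3, 1.0, size=30)

dx = x1 - x2
ranks = stats.rankdata(np.abs(dx))                 # step 1 (average ranks for ties)
W_plus = ranks[dx > 0].sum()                       # step 3
W_minus = ranks[dx < 0].sum()
W = min(W_plus, W_minus)

n = len(dx)
W_mean = n * (n + 1) / 4                           # step 4
V_W = n * (n + 1) * (2 * n + 1) / 24
z0 = (W - W_mean) / np.sqrt(V_W)                   # step 5, (D.22)
p = 2 * stats.norm.sf(abs(z0))

W_scipy, p_scipy = stats.wilcoxon(x1, x2)          # scipy's two-sided statistic equals W
```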

D.8. Pearson’s chi-square test

The Pearson’s chi-square test is a statistical test for analyzing the relationship between observed
categorical data with respect to the chi-squared (χ2) distribution. If the categories interact
which other, then we conclude that they are dependent, because the occurrence of one event
leads to the occurrence of the other one. Categorical data can be represented by the
contingency table which summarizes the scores with respect to their membership in each
category, as shown in Table D.2 for two categories.

Table D.2.: Contingency table with two categories

                              condition A
                   yes          no           total
  condition B
    yes            x11          x12          x11 + x12
    no             x21          x22          x21 + x22
    total          x11 + x21    x12 + x22    N

The test hypotheses are as follows

H0: The observed samples are statistically independent.


H1: The observed samples are statistically dependent.

The test statistic χ0² is calculated as follows

    \chi_0^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(x_{ij} - E_{ij})^2}{E_{ij}} ,    (D.23)

where xij is the observation summarized in Table D.2 with c columns and r rows. Eij denotes
the expected frequency of each member of the contingency table under the independence
assumption and is calculated as

    E_{ij} = \frac{\left(\sum_{k=1}^{c} x_{ik}\right)\left(\sum_{q=1}^{r} x_{qj}\right)}{N} .    (D.24)

N refers to the total number of scores.
By comparing χ0² with the critical value χα,ν², with ν = (r − 1)(c − 1) degrees of freedom
and a confidence level of α, it is decided whether to reject H0 as follows

reject H0: if χ0² > χα,ν² =⇒ The observed samples are statistically dependent.

fail to reject H0: if χ0² ≤ χα,ν² .
The result can also be reported by the p-value based on the χ² cumulative distribution
function. Accordingly, a p-value < α yields the rejection of H0.
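A minimal Python sketch of (D.23)-(D.24) for a 2×2 table with assumed example counts;
scipy reproduces the plain Pearson statistic when Yates' correction is disabled:

```python
# Minimal sketch of the Pearson chi-square test per (D.23)-(D.24).
import numpy as np
from scipy import stats

x = np.array([[20, 10],
              [15, 30]])                           # observed counts as in Table D.2

N = x.sum()
E = np.outer(x.sum(axis=1), x.sum(axis=0)) / N     # expected frequencies (D.24)
chi2_0 = np.sum((x - E) ** 2 / E)                  # (D.23)

nu = (x.shape[0] - 1) * (x.shape[1] - 1)
p = stats.chi2.sf(chi2_0, df=nu)

chi2_scipy, p_scipy, dof, E_scipy = stats.chi2_contingency(x, correction=False)
```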
E. Mother wavelets

Figure E.1 shows the scaling and wavelet functions of Haar and db4.
[Figure E.1 consists of four panels: the Haar scaling function φ(t) and Haar wavelet ψ(t) on
the interval [0, 1], and the db4 scaling function φ(t) and db4 wavelet ψ(t) on their longer
support.]

Figure E.1.: Scaling and wavelet functions of two mother wavelets
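For illustration, these functions can be generated with the cascade algorithm; a minimal
sketch assuming the PyWavelets package (in whose naming 'db4' denotes the Daubechies
wavelet with four vanishing moments):

```python
# Minimal sketch: approximate the scaling and wavelet functions of Figure E.1.
import pywt
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2)
for row, name in enumerate(['haar', 'db4']):
    phi, psi, t = pywt.Wavelet(name).wavefun(level=8)   # iterated cascade algorithm
    axes[row, 0].plot(t, phi); axes[row, 0].set_title(f'{name} scaling function')
    axes[row, 1].plot(t, psi); axes[row, 1].set_title(f'{name} wavelet')
plt.tight_layout()
plt.show()
```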
F. Additional results

F.1. Analysis of statistical measures of features

Figure F.1 shows the Spearman's rank correlation coefficient ρs between the statistical
metrics of 18 drive time-based features and the kss values. Since the frequency F is not
affected by different metrics, it is not included. In addition to the mean, which was used in
this work, the following statistical metrics were calculated: standard deviation (std), median,
minimum (min), maximum (max), range, defined as max − min, and root mean square (rms),
namely

    \text{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2} ,    (F.1)

where n denotes the number of events in an extraction window. All of these metrics were
calculated for the events detected within an extraction window. Moreover, they were
baselined afterwards to filter out individual differences. The missing bars could not be
calculated. The ρs values were not significant (p-value > 0.001) for: std, max and range for A,
range for MCV and AOV, and min for T80.
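For illustration, a minimal sketch (synthetic event values) of the per-window metrics,
including the rms of (F.1):

```python
# Minimal sketch: statistical metrics over the events of one extraction window.
import numpy as np

x = np.random.default_rng(6).normal(size=25)   # synthetic event values of one window

metrics = {
    'mean': x.mean(), 'std': x.std(ddof=1), 'median': np.median(x),
    'min': x.min(), 'max': x.max(), 'range': x.max() - x.min(),
    'rms': np.sqrt(np.mean(x ** 2)),           # (F.1)
}
```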
Figure F.1.: Comparison of the Spearman's rank correlation coefficient between statistical metrics of
features and kss values. Feature type: drive time-based features

Interestingly, some of the metrics are associated with drowsiness in different directions. As an
example, perclos is positively correlated with the kss regarding std and rms, while for the
mean and min, ρs values are negative.
Overall, mean, median and rms are more consistent in terms of the trends of all features in
comparison to the other metrics.

F.2. Boxplot of drive time-based features versus KSS values

The following figures show boxplots of drive time-based features versus kss values for 42
subjects separately. For subject S2, who is not included in the following figures, all features
are shown together in Figure F.40.
• A: Figures F.2 and F.3
• E: Figures F.4 and F.5
• MCV : Figures F.6 and F.7
• MOV : Figures F.8 and F.9
• A/MCV : Figures F.10 and F.11
• A/MOV : Figures F.12 and F.13
• ACV : Figures F.14 and F.15
• AOV : Figures F.16 and F.17
• F : Figures F.18 and F.19
• T : Figures F.20 and F.21
• Tc: Figures F.22 and F.23
• To: Figures F.24 and F.25
• Tcl,1: Figures F.26 and F.27
• Tcl,2: Figures F.28 and F.29
• Tro: Figures F.30 and F.31
• perclos: Figures F.32 and F.33
• T50: Figures F.34 and F.35
• T80: Figures F.36 and F.37
• T90: Figures F.38 and F.39
[Figures F.2 to F.40 show the per-subject boxplots; only their captions are reproduced here.]

Figure F.2.: Boxplot of normalized feature A versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of A [µV] for each subject.

Figure F.3.: Boxplot of normalized feature A versus kss for subjects S23 to S43. The values on the bottom left show the maximum of A [µV] for each subject.

Figure F.4.: Boxplot of normalized feature E versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of E [(mV)²] for each subject.

Figure F.5.: Boxplot of normalized feature E versus kss for subjects S23 to S43. The values on the bottom left show the maximum of E [(mV)²] for each subject.

Figure F.6.: Boxplot of normalized feature MCV versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of MCV [mV/s] for each subject.

Figure F.7.: Boxplot of normalized feature MCV versus kss for subjects S23 to S43. The values on the bottom left show the maximum of MCV [mV/s] for each subject.

Figure F.8.: Boxplot of normalized feature MOV versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of MOV [mV/s] for each subject.

Figure F.9.: Boxplot of normalized feature MOV versus kss for subjects S23 to S43. The values on the bottom left show the maximum of MOV [mV/s] for each subject.

Figure F.10.: Boxplot of normalized feature A/MCV versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of A/MCV [s] for each subject.

Figure F.11.: Boxplot of normalized feature A/MCV versus kss for subjects S23 to S43. The values on the bottom left show the maximum of A/MCV [s] for each subject.

Figure F.12.: Boxplot of normalized feature A/MOV versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of A/MOV [s] for each subject.

Figure F.13.: Boxplot of normalized feature A/MOV versus kss for subjects S23 to S43. The values on the bottom left show the maximum of A/MOV [s] for each subject.

Figure F.14.: Boxplot of normalized feature ACV versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of ACV [mV/s] for each subject.

Figure F.15.: Boxplot of normalized feature ACV versus kss for subjects S23 to S43. The values on the bottom left show the maximum of ACV [mV/s] for each subject.

Figure F.16.: Boxplot of normalized feature AOV versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of AOV [mV/s] for each subject.

Figure F.17.: Boxplot of normalized feature AOV versus kss for subjects S23 to S43. The values on the bottom left show the maximum of AOV [mV/s] for each subject.

Figure F.18.: Boxplot of normalized feature F versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of F [1/min] for each subject.

Figure F.19.: Boxplot of normalized feature F versus kss for subjects S23 to S43. The values on the bottom left show the maximum of F [1/min] for each subject.

Figure F.20.: Boxplot of normalized feature T versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of T [s] for each subject.

Figure F.21.: Boxplot of normalized feature T versus kss for subjects S23 to S43. The values on the bottom left show the maximum of T [s] for each subject.

Figure F.22.: Boxplot of normalized feature Tc versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of Tc [ms] for each subject.

Figure F.23.: Boxplot of normalized feature Tc versus kss for subjects S23 to S43. The values on the bottom left show the maximum of Tc [ms] for each subject.

Figure F.24.: Boxplot of normalized feature To versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of To [ms] for each subject.

Figure F.25.: Boxplot of normalized feature To versus kss for subjects S23 to S43. The values on the bottom left show the maximum of To [ms] for each subject.

Figure F.26.: Boxplot of normalized feature Tcl,1 versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of Tcl,1 [ms] for each subject.

Figure F.27.: Boxplot of normalized feature Tcl,1 versus kss for subjects S23 to S43. The values on the bottom left show the maximum of Tcl,1 [ms] for each subject.

Figure F.28.: Boxplot of normalized feature Tcl,2 versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of Tcl,2 [ms] for each subject.

Figure F.29.: Boxplot of normalized feature Tcl,2 versus kss for subjects S23 to S43. The values on the bottom left show the maximum of Tcl,2 [ms] for each subject.

Figure F.30.: Boxplot of normalized feature Tro versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of Tro [ms] for each subject.

Figure F.31.: Boxplot of normalized feature Tro versus kss for subjects S23 to S43. The values on the bottom left show the maximum of Tro [ms] for each subject.

Figure F.32.: Boxplot of normalized feature perclos versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of perclos for each subject.

Figure F.33.: Boxplot of normalized feature perclos versus kss for subjects S23 to S43. The values on the bottom left show the maximum of perclos for each subject.

Figure F.34.: Boxplot of normalized feature T50 versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of T50 [ms] for each subject.

Figure F.35.: Boxplot of normalized feature T50 versus kss for subjects S23 to S43. The values on the bottom left show the maximum of T50 [ms] for each subject.

Figure F.36.: Boxplot of normalized feature T80 versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of T80 [ms] for each subject.

Figure F.37.: Boxplot of normalized feature T80 versus kss for subjects S23 to S43. The values on the bottom left show the maximum of T80 [ms] for each subject.

Figure F.38.: Boxplot of normalized feature T90 versus kss for subjects S1 to S22 (except for subject S2). The values on the bottom left show the maximum of T90 [ms] for each subject.

Figure F.39.: Boxplot of normalized feature T90 versus kss for subjects S23 to S43. The values on the bottom left show the maximum of T90 [ms] for each subject.

Figure F.40.: Boxplot of normalized features versus kss values for subject S2. The maximum value of each feature is shown on the bottom left of the plots.
F.3. Correlation between features using the Spearman's rank correlation coefficient

Figures F.41 and F.42 show the association between features for kss input-based and drive
time-based features with respect to the absolute value of the Spearman's rank correlation
coefficient |ρs|. The p-values for the feature pairs marked with a red × were all larger
than 0.05.
[Correlation matrix plot omitted; features on the axes: T80, T50, PERCLOS, Tclosed,2, T90, Tro, Tclosed,1, F, T, E, A, To, Tc, AOV, ACV, A/MCV, A/MOV, MOV, MCV; color scale |ρs| from 0 to 1; red × marks feature pairs with p-value > 0.05.]
Figure F.41.: Absolute values of the Spearman’s rank correlation coefficient |ρs| calculated between KSS input-based features

[Correlation matrix plot omitted; same features, color scale and × markers as in Figure F.41.]
Figure F.42.: Absolute values of the Spearman’s rank correlation coefficient |ρs| calculated between drive time-based features
G. Gradient descent approach for training the ANN

In the following, it is explained how gradient descent is used for training the ANN. The gradient descent approach in (8.19) must be applied to the weights of each layer separately. Therefore, for the weights of the hidden-to-output layer we have

\frac{\partial J}{\partial w_{jk}^{(2)}} = \frac{\partial J}{\partial \mathrm{net}_k} \, \frac{\partial \mathrm{net}_k}{\partial w_{jk}^{(2)}} \, . \qquad (G.1)
The last term can be calculated easily from the definition of net_k in (8.15), which gives y_j. For the first term, the chain rule is applied as follows:

\frac{\partial J}{\partial \mathrm{net}_k} = \frac{\partial J}{\partial z_k} \, \frac{\partial z_k}{\partial \mathrm{net}_k} = -(c_k - z_k) \, f'(\mathrm{net}_k) \, . \qquad (G.2)

The quantity \vartheta_k = -\partial J / \partial \mathrm{net}_k is also called the sensitivity and denotes the change of the training error J with respect to the net activation.

The final equation for updating the weights of the hidden-to-output layer is

\Delta w_{jk}^{(2)} = \eta \, (c_k - z_k) \, f'(\mathrm{net}_k) \, y_j = \eta \, \vartheta_k \, y_j \, . \qquad (G.3)
For the input-to-hidden layer, similarly, the chain rule is applied to (8.19) as follows:

\frac{\partial J}{\partial w_{ij}^{(1)}} = \frac{\partial J}{\partial y_j} \, \frac{\partial y_j}{\partial \mathrm{net}_j} \, \frac{\partial \mathrm{net}_j}{\partial w_{ij}^{(1)}} \, , \qquad (G.4)
where the first term is calculated as

\frac{\partial J}{\partial y_j} = \sum_{k=1}^{m} \frac{\partial J}{\partial z_k} \, \frac{\partial z_k}{\partial y_j} = \sum_{k=1}^{m} \frac{\partial J}{\partial z_k} \, \frac{\partial z_k}{\partial \mathrm{net}_k} \, \frac{\partial \mathrm{net}_k}{\partial y_j} = -\sum_{k=1}^{m} (c_k - z_k) \, f'(\mathrm{net}_k) \, w_{jk}^{(2)} \, . \qquad (G.5)
k

The other terms in (G.4) are calculated based on (8.14), namely \partial y_j / \partial \mathrm{net}_j = f'(\mathrm{net}_j) and \partial \mathrm{net}_j / \partial w_{ij}^{(1)} = x_i.
Similar to \vartheta_k, \vartheta_j is defined as

\vartheta_j \equiv f'(\mathrm{net}_j) \sum_{k=1}^{m} w_{jk}^{(2)} \, \vartheta_k \, , \qquad (G.6)

which links the sensitivity at the hidden layer to that of the output layer. Similar to (G.3), the learning rule of the input-to-hidden layer is

\Delta w_{ij}^{(1)} = \eta \, x_i \, f'(\mathrm{net}_j) \sum_{k=1}^{m} w_{jk}^{(2)} \, \vartheta_k = \eta \, x_i \, \vartheta_j \, . \qquad (G.7)

It is now clear why this algorithm is called back-propagation: it computes the error at the output and propagates it back as sensitivities \vartheta_k from the output layer to the hidden layer in order to learn the input-to-hidden weights. In other words, the error at a given layer can only be calculated once the error at the following layer is available.
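
As a compact illustration of (G.1) to (G.7), the following Python sketch performs one gradient-descent step for a single-hidden-layer network with sigmoid activations (for which f'(net) = f(net)(1 - f(net))) and the squared-error cost. It is a minimal sketch under these assumptions, not the thesis implementation; all names are placeholders:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backprop_step(x, c, W1, W2, eta=0.1):
    # One gradient-descent update following (G.1)-(G.7).
    # x: input vector, c: target vector,
    # W1: input-to-hidden weights (n_hidden x n_in),
    # W2: hidden-to-output weights (n_out x n_hidden).
    net_j = W1 @ x                               # hidden net activations
    y = sigmoid(net_j)
    net_k = W2 @ y                               # output net activations
    z = sigmoid(net_k)
    theta_k = (c - z) * z * (1.0 - z)            # output sensitivities, cf. (G.2)
    theta_j = y * (1.0 - y) * (W2.T @ theta_k)   # back-propagated, cf. (G.6)
    W2 = W2 + eta * np.outer(theta_k, y)         # update, cf. (G.3)
    W1 = W1 + eta * np.outer(theta_j, x)         # update, cf. (G.7)
    return W1, W2

For a full training run, this step is simply repeated over the training samples until the error J converges.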
H. On the understanding of the dual form of the optimization problem

H.1. Karush-Kuhn-Tucker theorem

For a better understanding of the Karush-Kuhn-Tucker theorem, the following definitions should be reviewed first, according to Chiang (2007) and Schmieder (2009):
• Convex set: a set C is a convex set if the line segment connecting any two points in C also lies in C (see Figure H.1), i.e.

\theta x_1 + (1 - \theta) x_2 \in C \quad \forall x_1, x_2 \in C, \ \forall \theta \in [0, 1] \, . \qquad (H.1)

[Figure omitted; panels: (a) convex, (b) non-convex.]
Figure H.1.: Examples of convex and non-convex sets

• Convex function: a function is referred to as a convex function if the value of the function between any two points lies below the linear interpolation of these points (see Figure H.2); a worked example follows this list. Mathematically, f: \mathbb{R}^n \rightarrow \mathbb{R} is a convex function if C = \mathrm{dom} f is a convex set and

f(\theta x_1 + (1 - \theta) x_2) \le \theta f(x_1) + (1 - \theta) f(x_2) \quad \forall x_1, x_2 \in C, \ \forall \theta \in [0, 1] \, . \qquad (H.2)

\mathrm{dom} f denotes the domain of the function f.

Figure H.2.: An example of a convex function

• Affine function: an affine function is both convex and concave. A function f is concave if −f is convex.
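
As a worked example of (H.2), not taken from the thesis, consider f(x) = x^2 on \mathbb{R} (a convex set). For any x_1, x_2 and \theta \in [0, 1],

f(\theta x_1 + (1 - \theta) x_2) - \big[ \theta f(x_1) + (1 - \theta) f(x_2) \big] = -\theta (1 - \theta) (x_1 - x_2)^2 \le 0 ,

so the defining inequality holds and f is convex; since -f(x) = -x^2 reverses the inequality, f is convex but not affine.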

Theorem (Karush-Kuhn-Tucker Theorem (Cristianini and Shawe-Taylor, 2000))

Given an optimization problem with convex domain \Omega \subseteq \mathbb{R}^n,

minimise f(w), \quad w \in \Omega,
subject to g_i(w) \le 0, \quad i = 1, \ldots, k,
\qquad\quad\;\; h_i(w) = 0, \quad i = 1, \ldots, m,

with f \in C^1 convex and g_i, h_i affine, the necessary and sufficient conditions for a normal point w^* to be an optimum are the existence of \alpha^*, \beta^* such that

\frac{\partial L(w^*, \alpha^*, \beta^*)}{\partial w} = 0,
\frac{\partial L(w^*, \alpha^*, \beta^*)}{\partial \beta} = 0,
\alpha_i^* g_i(w^*) = 0, \quad i = 1, \ldots, k,
g_i(w^*) \le 0, \quad i = 1, \ldots, k,
\alpha_i^* \ge 0, \quad i = 1, \ldots, k.

L denotes the Lagrangian function defined in (8.27).
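
As a minimal worked instance of the theorem (an illustrative example, not from the thesis), consider minimising f(w) = w^2 over \Omega = \mathbb{R} subject to the single affine constraint g_1(w) = 1 - w \le 0. The Lagrangian is L(w, \alpha) = w^2 + \alpha (1 - w), and the conditions give

\frac{\partial L}{\partial w} = 2 w^* - \alpha^* = 0 , \qquad \alpha^* (1 - w^*) = 0 , \qquad \alpha^* \ge 0 ,

whose solution is w^* = 1 with \alpha^* = 2: the constraint is active at the optimum, so its multiplier is strictly positive.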

H.2. Extraction of the dual problem for the soft margin classifier

Similar to the linearly separable data set, the Lagrangian function for the L1-norm case of the soft margin SVM can be defined based on the Lagrange multipliers \alpha_i and \beta_i:

L(w, b, \xi, \alpha, \beta) = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \alpha_i \left[ y_i (w^T x_i + b) - 1 + \xi_i \right] - \sum_{i=1}^{N} \beta_i \xi_i \, , \qquad (H.3)
where \alpha = [\alpha_1 \ \alpha_2 \ \cdots \ \alpha_N]^T and \beta = [\beta_1 \ \beta_2 \ \cdots \ \beta_N]^T. The KKT conditions, which must be fulfilled, are as follows:


\frac{\partial L(w, b, \xi, \alpha, \beta)}{\partial w} = 0 \;\Longrightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i x_i \, , \qquad (H.4)

\frac{\partial L(w, b, \xi, \alpha, \beta)}{\partial b} = 0 \;\Longrightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0 \, , \qquad (H.5)

\frac{\partial L(w, b, \xi, \alpha, \beta)}{\partial \xi_i} = 0 \;\Longrightarrow\; \alpha_i + \beta_i = C \, , \quad i = 1, \ldots, N \, , \qquad (H.6)

\alpha_i \left( y_i (w^T x_i + b) - 1 + \xi_i \right) = 0 \, , \quad i = 1, \ldots, N \, , \qquad (H.7)

\beta_i \xi_i = 0 \, , \quad i = 1, \ldots, N \, , \qquad (H.8)

\alpha_i \ge 0 \, , \quad \beta_i \ge 0 \, , \quad \xi_i \ge 0 \, , \quad i = 1, \ldots, N \, . \qquad (H.9)

Based on the above equations, three cases may occur (a numerical sketch follows this list):

• \alpha_i = 0 \Longrightarrow \xi_i = 0. As mentioned in Section 8.3.2, this corresponds to the correct classification of x_i.
• 0 < \alpha_i < C \Longrightarrow y_i (w^T x_i + b) - 1 + \xi_i = 0 and \xi_i = 0, which leads to y_i (w^T x_i + b) = 1; x_i is consequently a support vector.
• \alpha_i = C \Longrightarrow y_i (w^T x_i + b) - 1 + \xi_i = 0 and \xi_i \ge 0, which means that x_i is a support vector. Depending on the value of \xi_i, x_i might be misclassified (see Section 8.3.2).
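
To make the three cases concrete, the following Python sketch maximises the L1-norm soft-margin dual for a toy linear problem with a simple projected-gradient loop and then counts the training points falling into each \alpha_i case. It is an illustrative sketch with placeholder data and a deliberately naive solver (alternating projections, so the constraints are only satisfied approximately), not the LIBSVM-based setup used in Chapter 8:

import numpy as np

def svm_dual_pg(X, y, C=1.0, lr=1e-3, steps=20000):
    # Projected-gradient ascent on the dual:
    #   max  sum_i a_i - 0.5 * sum_ij a_i a_j y_i y_j x_i^T x_j
    #   s.t. 0 <= a_i <= C  and  sum_i a_i y_i = 0.
    N = len(y)
    K = (X @ X.T) * np.outer(y, y)
    a = np.zeros(N)
    for _ in range(steps):
        a += lr * (1.0 - K @ a)          # gradient of the dual objective
        a -= y * (a @ y) / N             # project onto sum_i a_i y_i = 0
        a = np.clip(a, 0.0, C)           # box constraint, cf. (H.6)/(H.9)
    w = (a * y) @ X                      # primal weights via (H.4)
    sv = (a > 1e-6) & (a < C - 1e-6)     # margin SVs: y_i (w^T x_i + b) = 1
    b = np.mean(y[sv] - X[sv] @ w) if sv.any() else 0.0
    return a, w, b

rng = np.random.default_rng(1)           # two slightly overlapping clusters
X = np.vstack([rng.normal(-1, 0.7, (20, 2)), rng.normal(+1, 0.7, (20, 2))])
y = np.array([-1.0] * 20 + [1.0] * 20)
a, w, b = svm_dual_pg(X, y)
print("alpha = 0     :", np.sum(a <= 1e-6))        # correctly classified
print("0 < alpha < C :", np.sum((a > 1e-6) & (a < 1.0 - 1e-6)))
print("alpha = C     :", np.sum(a >= 1.0 - 1e-6))  # possibly misclassified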
List of figures
1.1 The evolution of number of killed and injured persons in traffic accidents of Germany . . 1
1.2 The evolution of number of vehicle accidents due to driver drowsiness and the number of
injured persons involved in them........................................................................................................2
1.3 A typical steering event detected by the Attention Assist as a drowsiness-related steering
wheel movement.....................................................................................................................................8
1.4 Tool chain of this thesis.......................................................................................................................10

2.1 32-electrode arrangement of EEG (excluding 4 electrodes for eye movement data collection) 17
2.2 ActiCAP measurement system for EEG recording by Brain Products GmbH..............................17
2.3 EEG signals showing α-bursts with closed eyes versus open eyes...............................................18
2.4 Frequency components of the α-bursts by applying the Fourier transform to the wave of
the O2 electrode shown in Figure 2.3...............................................................................................18
2.5 Sensitivity of the asr to auditory and visuomotor secondary tasks and the corresponding
number of horizontal saccades...........................................................................................................20
2.6 asr with different control signals.................................................................................................21
2.7 asr before and recalculated after applying the control signal.......................................................22
2.8 EOG electrodes attached around the eyes for collecting horizontal and vertical eye move-
ment data...............................................................................................................................................24
2.9 An example of the drift in the collected EOG data (vertical component)..................................25

3.1 Structure of the human eye while transmitting the ray of light....................................................32
3.2 Eye muscles...........................................................................................................................................32
3.3 Different categories of eye movements based on their velocity....................................................32
3.4 Representative examples of blinks measured by the vertical (V (n)) and horizontal (H(n))
components of the EOG......................................................................................................................34
3.5 H(n) and V (n) representing different types of saccades due to horizontal, vertical and
diagonal eye movements......................................................................................................................35

4.1 Selected tracks of the Applus+ idiada proving ground.................................................................38


4.2 H(n) and V (n) of subject S2 for baseline P1 of track 1.................................................................40
4.3 H(n) and V (n) of subject S2 for baseline of track 2.......................................................................40
4.4 Spectrogram and power spectral density (psd) of 20 s of V (n) of subject S4 for track 1
and track 2.............................................................................................................................................41
4.5 Boxplot of the values listed in Table 4.1.......................................................................................42
4.6 Detected large amplitude bumps based on the ewvar of wheel speed sensor data...................42
4.7 V (n) of subject S2 and S6 for small (top plot) and large (bottom plot) amplitude bumps
of track 3................................................................................................................................................43
4.8 H(n) of subject S8 for track 4, top/bottom: right/left curve.........................................................44
4.9 Calculated sawtooth occurrence time period ∆t versus curve radius r for δc = 4◦, 6◦, 10◦ . . 45
4.10 Daytime experiment’s route, about 130 km.....................................................................................47
4.11 One block of daytime real road experiment with secondary tasks...............................................48
4.12 In-vehicle setup of the daytime experiment with secondary tasks...............................................49
4.13 Nighttime experiment’s route, about 450 km...................................................................................50

5.1 Drift removal by applying a median filter to V (n) to improve blink detection. Top: awake phase,
bottom: drowsy phase.........................................................................................................................54
5.2 Information loss of slow blinks by median filter method..............................................................54

5.3 V (n) and its derivative V ′(n) representing eye blinks during the awake phase ..........................56
5.4 Normalized histogram of all detected potential blinks and their clustering thresholds by
the k-means clustering method for 11 subjects................................................................................57
5.5 Simultaneous detection of saccades by eye blink detection algorithm........................................57
5.6 Normalized histogram of all detected potential saccades and blinks with long eye closure.
Their clustering thresholds are also shown......................................................................................58
5.7 Possible combinations of two vertical saccades in V (n).................................................................59
5.8 Flow chart of the derivative-based method for blink detection.......................................................60
5.9 The impact of window length Lwin on the efficiency of the stft...............63
5.10 Examples of typical mother wavelets.................................................................................................64
5.11 Scaling and translation of the mother wavelet with varying a and b......................................................64
5.12 Scalogram of the cwt for x(t) signal shown in Figure 5.9, top plot: 1 ≤ a ≤ 256, bottom
plot: 1 ≤ a ≤ 20....................................................................................................................................65
5.13 Scalograms of the cwt with different mother wavelets for V (n) signal of the awake phase
(left plots) and the drowsy phase (right plots) of the drive...........................................................66
5.14 cwt with different mother wavelets for V (n) signal of the awake phase (left plots) and
the drowsy phase (right plots) of the drive with a = 5, 10 and 15................................................68
5.15 Comparison of Xψ(a, b) with the Haar wavelet at a = 5, 10, 15, 30 and 100 with the negative
of the derivative of the EOG signal −V ′(n) for the awake and drowsy phases of the drive . 69
5.16 Comparison of cwt at a = 10, 30 and 100 for the detection of fast (the first 20 s) and slow
(the last 20 s) blinks..............................................................................................................................70
5.17 Detected and accepted peaks at different scales of Xψ(a, b) signals..............................................70
5.18 Flow chart of the cwt-based method for blink detection..............................................................72
5.19 Schematic of spaces spanned by scaling and wavelet functions...................................................74
5.20 J-stage decomposition tree.................................................................................................................76
5.21 Three-stage decomposition of the EOG signal during the awake phase by db4 wavelet..............77
5.22 Three-stage decomposition of the EOG signal during the drowsy phase by db4 wavelet............78
5.23 J-stage reconstruction tree..................................................................................................................79
5.24 Example 1: denoising of the EOG signal by removing different coefficients during the
reconstruction.......................................................................................................................................80
5.25 Example 2: denoising of the EOG signal by removing different coefficients during the
reconstruction.......................................................................................................................................81
5.26 Scatter plot: ε1(n) versus ε2(n) shown in Figure 5.25..................................................................82
5.27 Two examples of drift removal with the wavelet decomposition and reconstruction for
awake (top) and drowsy (bottom) phases of V (n)..........................................................................83
5.28 rc and pc of vertical saccade and blink detections for the derivative-based algorithm and
the median filter-based method during the awake and drowsy phases......................................84
5.29 Average duration (first row), amplitude (second row) and number of blinks (third row)
versus self-estimated drowsiness level for subjects S15, S16 and S18 based on the derivative-
based algorithm and the median filter-based method....................................................................85
5.30 rc and pc of blink detection for the derivative-based algorithm and the wavelet transform
method during the awake and drowsy phases................................................................................87
5.31 Setting the threshold for distinguishing between blinks and vertical saccades in an online
implementation of the detection method..........................................................................................89

6.1 Saccade rate for the variable time-on-task (four blocks) for all subjects......................................92
6.2 Percentage of saccades time-locked to blinks for all subjects and all blocks during the
visuomotor task....................................................................................................................................93
6.3 Percentage of saccades time-locked to blinks with respect to saccade direction averaged
over all blocks during the visuomotor task......................................................................................94
6.4 Scatter plot: number of saccades accompanied by blinks with respect to their direction
during the visuomotor task for all subjects. Ellipses show two clusters......................................94
6.5 Scatter plots of blink rate for visuomotor vs. driving and auditory vs. driving task.
Pearson correlation coefficient (ρp) and the corresponding p-values are provided as well. 96

6.6 Percentage of blinks time-locked to saccades for all subjects averaged over all blocks during
the visuomotor task..............................................................................................................................97
6.7 Scatter plot: blink rate versus saccade rate during the visuomotor task.....................................97
6.8 EOG signals during the visuomotor and driving task for subject S8...........................................98
6.9 EOG signals during the visuomotor and driving task for subject S1...........................................98
6.10 Algorithm for determining the threshold of horizontal saccade detection.................................99
6.11 Histogram: absolute amplitude of saccades out of H(n) signal for subject S1.........................100
6.12 The algorithm for balancing the number of small (Ns) and large-amplitude (Nl) saccades 100
6.13 Normalized histogram: amplitude of all horizontal saccades (dark bars) and those accom-
panied by blinks (light bars) for 12 subjects...................................................................................101
6.14 Scatter plot: number of saccades in percent time-locked to the blinks with respect to their
amplitude, i.e. small and large.........................................................................................................101

7.1 kss input-based feature aggregation method.................................................................................104


7.2 Relative frequency of kss values for two feature aggregation methods.....................................105
7.3 Drive time-based feature aggregation method..............................................................................106
7.4 MOV feature before and after baselining.......................................................................................107
7.5 V (n) and its derivative V ′(n) representing eye blinks in awake and drowsy phases with
the corresponding features...............................................................................................................109
7.6 Boxplot of normalized drive time-based features combined for all subjects versus kss values . . 110
7.7 Histogram of MCV and MOV with the estimated (est.) distribution (dist.) curves..............111
7.8 Histogram of ACV and AOV with estimated (est.) distribution (dist.) curves. The
outliers (values > 10 mV/s) are not shown......................................................................................112
7.9 Histogram of Tc and To with the estimated (est.) distribution (dist.) curves...........................114
7.10 Best linear fit to all baselined feature values of Tc and To............................................................................... 114
7.11 Comparison between Tc, Tcl,1 and To during the awake and drowsy phases of the drive for
all subjects...........................................................................................................................................115
7.12 Best linear fit to all baselined feature values of Tro..................................................................................... 116
7.13 Scatter plot of T versus Tx, x ∈ {50, 80, 90}. The red line shows the best linear fit with
its equation on the top of each plot.................................................................................................119
7.14 Examples of intended (takeover maneuver) and unintended (lane departure) lane change
events visible in the vehicle’s lateral distance signal.....................................................................123
7.15 The mean of the ewvar of lateral distance versus kss values for 25 subjects who drove in
the driving simulator. The standard deviations are also shown.................................................125
7.16 The mean of all baselined features over the first and the last 5 min before the first unin-
tended lane departure event for 23 subjects who drove in the driving simulator....................126
7.17 The mean of all baselined features over the first and the last 5 min of the drive for 18
subjects who drove under real conditions......................................................................................128
7.18 The mean of all baselined features over the first and the last 5 min before an unintended
microsleep for 11 subjects who drove in the driving simulator..................................................130
7.19 Boxplot of baselined kss input-based features for all subjects versus kss values.....................133
7.20 Absolute values of Pearson correlation coefficient calculated between kss input-based fea-
tures......................................................................................................................................................136
7.21 Absolute values of the Pearson correlation coefficient calculated between drive time-based
features.................................................................................................................................................136
7.22 Scatter plot: comparison of 50-Hz features with 40- and 30-Hz ones for the first 12 subjects
- part 1..................................................................................................................................................137
7.23 Scatter plot: comparison of 50-Hz features with 40- and 30-Hz ones for the first 12 subjects
- part 2..................................................................................................................................................138
7.24 Scatter plot: comparing AOV and MOV extracted based on 30, 40 and 50 Hz sampling
rate. The lines indicate the best linear fits.....................................................................................139

8.1 Examples of three classification rules..............................................................................................143


8.2 Distribution of classes for kss input-based and drive time-based features...............................143

8.3 Applying the smote to an imbalanced data set............................................................................148


8.4 Applying the smote and the Tomek link cleaning technique to an imbalanced data..............149
8.5 Architecture of a feed-forward neural network with 3 inputs, 3 neurons in one hidden layer
and 2 outputs......................................................................................................................................150
8.6 Mathematical representation of the input-to-hidden layer of a network...................................150
8.7 Sigmoid activation function..............................................................................................................151
8.8 Supervised classification by the ann...............................................................................................152
8.9 adr of the training and test sets of the binary subject-dependent ann classifier for different
numbers of neurons. Feature type: kss input-based features. Bars refer to the standard
deviation of permutations.................................................................................................................154
8.10 adr of the training and test sets of the binary and 3-class subject-dependent ann
classifier for different numbers of neurons. Feature type: drive time-based features. Bars
refer to
the standard deviation of permutations.........................................................................................155
8.11 adr of the training and test sets of the binary subject-dependent ann classifier for different
numbers of neurons. Feature type: imbalanced and balanced by smote kss input-based
features of driving simulator experiment. Bars refer to the standard deviation of permu-
tations...................................................................................................................................................156
8.12 adr of the kss input-based features for the real road experiment applied to the network
trained based on the smote. Bars refer to the standard deviation of permutations...............157
8.13 Comparing confusion matrix of the binary subject-dependent ann classifier with that of
the subject-independent....................................................................................................................159
8.14 Different separating hyperplanes....................................................................................................160
8.15 Example of a linearly inseparable data...........................................................................................161
8.16 An example of feature mapping for a linearly inseparable data set...........................................163
8.17 An example of the grid search for finding (C0, γ0) and (Copt, γopt)...............................................165
8.18 j-fold cross validation method..........................................................................................................165
8.19 An example of a 3-class classification with shaded areas as the unclassifiable regions. The
arrows show the positive sides of the hyperplanes. Decision functions for the One-Against-All
approach: H^J : w_J^T Φ(x) + b_J = 0, J = 1, 2, 3 and for the One-Against-One approach:
H^{IJ} : w_{IJ}^T Φ(x) + b_{IJ} = 0, I = 1, 2, 3, J = 1, 2, 3 and I ≠ J......................................166
8.20 Boxplot of C , γ, training and test accuracies for the balanced and imbalanced 2-class subject-
dependent classification with the svm for all 100 permutations. Feature type: kss input-based
features.................................................................................................................................................169
8.21 Boxplot of C , γ, training and test accuracies for the 2-class and 3-class subject-dependent
classification with the svm for all 100 permutations. Feature type: drive time-based
features..................................................................................................................................................170
8.22 Comparing confusion matrix of the binary subject-dependent svm classifier with that of
the subject-independent....................................................................................................................172
8.23 adr of the test sets of the binary subject-dependent k-nn classifier for different numbers
of neighbors. Feature type: kss input-based features. Bars refer to the standard
deviation
of permutations...................................................................................................................................173
8.24 adr of the 2-class and 3-class subject-dependent k-nn classifier for different numbers of
neighbors. Feature type: drive time-based features. Bars refer to the standard deviation
of permutations...................................................................................................................................174
8.25 adr of the test sets of the binary subject-dependent k-nn classifier for different numbers
of neighbors. Feature type: imbalanced kss input-based features of driving simulator
experiment. Bars refer to the standard deviation of permutations.............................................175
8.26 Comparing confusion matrix of the binary subject-dependent k-nn classifier with that of
the subject-independent....................................................................................................................176
8.27 Comparing confusion matrices of the binary ann, svm and k-nn classifiers for the subject-
dependent and subject-independent classifications. Feature type:drive time-based features 176
8.28 Comparing confusion matrices of the 3-class ann, svm and k-nn classifiers for the subject-
dependent classification. Feature type: drive time-based features.............................................177
8.29 Comparing confusion matrices of the binary subject-dependent ann, svm and k-nn clas-
sifiers for different kss input-based features..................................................................................178

8.30 adr of the training and test sets of the binary subject-dependent ann classifier for
different numbers of neurons based on gsrd case. Feature type: drive time-based
features. Bars
refer to the standard deviation of permutations............................................................................180
8.31 Comparing confusion matrices of the binary ann ( Nh = 10), svm and k-nn (k = 7)
classifiers for the subject-dependent and subject-independent classifications of the gsrd
case. Feature type: drive time-based features................................................................................180
8.32 Boxplot of C , γ, training and test accuracies for the 2-class subject-dependent classification
of the gsrd case with svm for all 100 permutations. Feature type: drive time-based features . . 181
8.33 adr of the 2-class subject-dependent k-nn classifier of the gsrd case for different
numbers of neighbors. Feature type: kss input-based features. Bars refer to the
standard deviation
of permutations...................................................................................................................................182
8.34 Comparing confusion matrices of the binary subject-independent ann (Nh = 10), svm
(C = 90.5, γ = 1.4) and k-nn (k = 7) classifiers for unseen real road drives of drop-outs.
Feature type: drive time-based features...................................................................................182
8.35 The ann classification accuracy of the best selected features by the sffs algorithm from
1 to 10-feature combination. Feature type: drive time-based features.......................................185
8.36 MR values of the best 10-feature combinations calculated based on the Pearson and Spear-
man’s rank correlation coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

A.1 Geometrical representation of tracking two successive tps during a curve negotiation . . 197

B.1 Description of a boxplot representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199

E.1 Scaling and wavelet functions of two mother wavelets . . . . . . . . . . . . . . . . . . . . 211

F.1 Comparison of Spearman’s rank correlation coefficient between statistical metrics of fea-
tures and kss values. Feature type: drive time-based features...................................................213
F.2 Boxplot of normalized feature A versus kss for subjects S1 to S22 (except for subject S2).
The values on the bottom left show the maximum of A [µV] for each subject.........................215
F.3 Boxplot of normalized feature A versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of A [µV] for each subject.........................................................216
F.4 Boxplot of normalized feature E versus kss for subjects S1 to S22 (except for subject S2).
The values on the bottom left show the maximum of E [(mV)2] for each subject...................217
F.5 Boxplot of normalized feature E versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of E [(mV)2] for each subject...................................................218
F.6 Boxplot of normalized feature MCV versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of MCV [mV/s] for each subject. 219
F.7 Boxplot of normalized feature MCV [mV/s] versus kss for subjects S23 to S43. The values
on the bottom left show the maximum of MCV for each subject..............................................220
F.8 Boxplot of normalized feature MOV versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of MOV [mV/s] for each subject. 221
F.9 Boxplot of normalized feature MOV versus kss for subjects S23 to S43. The values on
the bottom left show the maximum of MOV [mV/s] for each subject.......................................222
F.10 Boxplot of normalized feature A/MCV versus kss for subjects S1 to S22 (except for
subject S2). The values on the bottom left show the maximum of A/MCV [s] for each
subject...................................................................................................................................................223
F.11 Boxplot of normalized feature A/MCV versus kss for subjects S23 to S43. The values on
the bottom left show the maximum of A/MCV [s] for each subject..........................................224
F.12 Boxplot of normalized feature A/MOV versus kss for subjects S1 to S22 (except for
subject S2). The values on the bottom left show the maximum of A/MOV [s] for each
subject...................................................................................................................................................225
F.13 Boxplot of normalized feature A/MOV versus kss for subjects S23 to S43. The values on
the bottom left show the maximum of A/MOV [s] for each subject..........................................226
F.14 Boxplot of normalized feature ACV versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of ACV [mV/s] for each subject. 227

F.15 Boxplot of normalized feature ACV versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of ACV [mV/s] for each subject...............................................228
F.16 Boxplot of normalized feature AOV versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of AOV [mV/s] for each subject. 229
F.17 Boxplot of normalized feature AOV versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of AOV [mV/s] for each subject...............................................230
F.18 Boxplot of normalized feature F versus kss for subjects S1 to S22 (except for subject S2).
The values on the bottom left show the maximum of F [1/min] for each subject...................231
F.19 Boxplot of normalized feature F versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of F [1/min] for each subject....................................................232
F.20 Boxplot of normalized feature T versus kss for subjects S1 to S22 (except for subject S2).
The values on the bottom left show the maximum of T [s] for each subject.............................233
F.21 Boxplot of normalized feature T versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of T [s] for each subject.............................................................234
F.22 Boxplot of normalized feature Tc versus kss for subjects S1 to S22 (except for subject S2).
The values on the bottom left show the maximum of Tc [ms] for each subject.........................235
F.23 Boxplot of normalized feature Tc versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of Tc [ms] for each subject.........................................................236
F.24 Boxplot of normalized feature To versus kss for subjects S1 to S22 (except for subject S2).
The values on the bottom left show the maximum of To [ms] for each subject........................237
F.25 Boxplot of normalized feature To versus kss for subjects S23 to S43. The values on the bottom
left show the maximum of To [ms] for each subject......................................................................238
F.26 Boxplot of normalized feature Tcl,1 versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of Tcl,1 [ms] for each subject.............239
F.27 Boxplot of normalized feature Tcl,1 versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of Tcl,1 [ms] for each subject......................................................240
F.28 Boxplot of normalized feature Tcl,2 versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of Tcl,2 [ms] for each subject.............241
F.29 Boxplot of normalized feature Tcl,2 versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of Tcl,2 [ms] for each subject......................................................242
F.30 Boxplot of normalized feature Tro versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of Tro [ms] for each subject...............243
F.31 Boxplot of normalized feature Tro versus kss for subjects S23 to S43. The values on the bottom
left show the maximum of Tro [ms] for each subject.....................................................................244
F.32 Boxplot of normalized feature perclos versus kss for subjects S1 to S22 (except for
subject S2). The values on the bottom left show the maximum of perclos for each subject.245
F.33 Boxplot of normalized feature perclos versus kss for subjects S23 to S43. The values on
the bottom left show the maximum of perclos for each subject...............................................246
F.34 Boxplot of normalized feature T50 versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of T50 [ms] for each subject..............247
F.35 Boxplot of normalized feature T50 versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of T50 [ms] for each subject.......................................................248
F.36 Boxplot of normalized feature T80 versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of T80 [ms] for each subject..............249
F.37 Boxplot of normalized feature T80 versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of T80 [ms] for each subject.......................................................250
F.38 Boxplot of normalized feature T90 versus kss for subjects S1 to S22 (except for subject
S2). The values on the bottom left show the maximum of T90 [ms] for each subject..............251
F.39 Boxplot of normalized feature T90 versus kss for subjects S23 to S43. The values on the
bottom left show the maximum of T90 [ms] for each subject.......................................................252
F.40 Boxplot of normalized features versus kss values for subject S2. The maximum value of
each feature is shown on the bottom left of plots..........................................................................253
F.41 Absolute values of the Spearman’s rank correlation coefficient |ρs| calculated between kss
input-based features...........................................................................................................................254

F.42 Absolute values of the Spearman’s rank correlation coefficient |ρs| calculated between drive
time-based features............................................................................................................................255

H.1 Examples of convex and non-convex sets.......................................................................................259


H.2 An example of a convex function.....................................................................................................259
List of tables

2.1 Karolinska Sleepiness Scale (kss).......................................................................................................27


2.2 Literature review of the length of time intervals between successive kss inputs.......................28
2.3 Stanford Sleepiness Scale (sss)...........................................................................................................29

4.1 Means of moving standard deviations of H(n) for all rounds (R) and parts (P) of track 1
and track 2, for all subjects (excluding subject S3).........................................................................42
4.2 Summary of experiments studied in this work................................................................................51

5.1 Confusion matrix: events of video labeling versus those of the proposed detection methods 83

6.1 Values of anova to assess the significant difference between means of blink rates for all
tasks........................................................................................................................................................95
6.2 Contingency table: saccade amplitude versus occurrence of gaze shift-induced blinks, for
subject S1, first selection procedure.................................................................................................102

7.1 Literature review of feature aggregation and the calculated statistic measure........................104
7.2 Extracted blink features.....................................................................................................................108
7.3 Literature review of the experiment setups. n. s.: not specified.................................................120
7.4 Literature review of the features introduced in this work. Trends versus drowsiness are
either pos.: positive or neg.: negative. n. s.: the feature was studied without its trend
being specified. * reduced vigilance, ** before a driving error, *** based on another end point
for blinks............................................................................................................................................................ 121
7.5 Left table: number of occurrences of kss values at the time of first unintended lane
departure and number of occurrences for the maximum value of kss, if no lane
departure
was detected. Right table: confusion matrix..................................................................................124
7.6 Results of paired-sample t-test (t0) and Wilcoxon signed-rank test (z0) for all features
shown in Figure 7.16. Red color indicates non-significant features.............................................127
7.7 Results of paired-sample t-test (t0) and Wilcoxon signed-rank test (z0) shown in Figure
7.17. Red color indicates non-significant features.........................................................................129
7.8 Left table: number of occurrences of kss values at the time of first microsleep and the
number of occurrences for the maximum value of kss, if no microsleep was detected. Right
table: confusion matrix......................................................................................................................129
7.9 Results of paired-sample t-test (t0) and Wilcoxon signed-rank test (z0) for all features
shown in Figure 7.18. Red color indicates non-significant features.............................................131
7.10 Sorted Spearman’s rank correlation coefficient ρs and Pearson correlation coefficient ρp
between all kss input-based features and kss values (N = 391). All p-values were smaller
than 0.05 except for red features......................................................................................................134
7.11 Sorted Spearman’s rank correlation coefficient ρs and Pearson correlation coefficient ρp
between all drive time-based features and kss values (N = 4021). All p-values were smaller
than 0.05...............................................................................................................................................135

8.1 Confusion matrix of a binary classifier...........................................................................................144


8.2 Confusion matrix of the binary subject-dependent ann classifier ( Nh = 5). Feature type:
kss input-based features....................................................................................................................154
8.3 Confusion matrices of the subject-dependent ann classifiers for 2-class (Nh = 10) and
3-class (Nh = 20) cases. Feature type: drive time-based features...........................................155

8.4 Confusion matrices of the binary subject-dependent ann classifier for kss input-based
features of the driving simulator experiment. Left: imbalanced features (Nh = 2). Right:
balanced features by smote (Nh = 2)...............................................................................................156
8.5 Confusion matrices of the binary subject-dependent ann classifier for kss input-based
features of the real road experiment applied to the network trained based on the smote.
Left: Nh = 3. Right: Nh = 10.............................................................................................158
8.6 Confusion matrix of the binary subject-independent ann classifier for drive time-based
features (Nh = 2)................................................................................................................................158
8.7 Confusion matrix of the binary subject-dependent svm classifier. Feature type: kss input-
based features.....................................................................................................................................169
8.8 Confusion matrices of the subject-dependent svm classifiers for the 2-class and 3-class
cases. Feature type: drive time-based features..............................................................................170
8.9 Confusion matrices of the binary subject-dependent svm classifiers for kss input-based
features of driving simulator experiment. Left: imbalanced features, right: balanced
features by considering different misclassification costs..............................................................170
8.10 Confusion matrix of the binary subject-dependent svm classifier for kss input-based
fea- tures of the real road experiment applied to the model trained by considering
different
misclassification costs........................................................................................................................171
8.11 Confusion matrix of the binary subject-independent svm classifier for drive time-based
features.................................................................................................................................................172
8.12 Confusion matrix of the binary subject-dependent k-nn classifier for k = 5. Feature type:
kss input-based features....................................................................................................................174
8.13 Confusion matrices of the subject-dependent k-nn classifier (k = 7) for the 2-class and
3-class cases. Feature type: drive time-based features..................................................................174
8.14 Confusion matrix of the binary subject-dependent k-nn classifier (k = 7). Feature type:
imbalanced kss input-based features of driving simulator experiment......................................175
8.15 Confusion matrix of the binary subject-independent k-nn classifier for drive time-based
features (k = 9)....................................................................................................................................175
8.16 Best selected feature combination set by the sffs and ann classifier from 1 to 10 features.
Feature type: drive time-based features.........................................................................................185
8.17 Confusion matrices of the binary subject-dependent ann classifier ( Nh = 10) for
drive time-based features. Left: classification with 19 features. Right: classification
with 4
features.................................................................................................................................................186
8.18 Confusion matrices of the binary subject-dependent ann classifier (Nh = 10) for drive
time- based features of drop-outs. Left: classification with 19 features. Right:
classification with
4 features..............................................................................................................................................186
8.19 Values of ∆γ calculated based on the mia method for drive time-based features (D˘ = 4) . 187
8.20 Best selected drive time-based features based on the cfs method and Pearson
correlation coefficient (Red features were also selected by the Spearman’s rank
correlation coefficient
i n Table 8.21.).....................................................................................................................................188
8.21 Best selected drive time-based features based on the cfs method and Spearman’s rank
correlation coefficient (Red features were selected by the Pearson correlation coefficient in
Table 8.20.)..........................................................................................................................................189
8.22 Confusion matrices of the binary subject-dependent ann classifier (Nh = 10) for drive time-
based features and kcfs = 4. Left: Pearson correlation coefficient. Right: Spearman’s
rank correlation...................................................................................................................................190
8.23 Best selected drive time-based features based on the cfs method using the Pearson cor-
relation coefficient regardless of the number of features.............................................................190
8.24 Best selected drive time-based features based on the cfs method using the Spearman’s
rank correlation coefficient regardless of the number of features...............................................190

D.1 Typical data set of one-way anova....................................................................................................206


D.2 Contingency table with two categories...........................................................................................209
Bibliography
Abdellaoui, A. (2013). Driver state classification based on eye movements by support vector machine.
Master’s thesis, University of Stuttgart.
Abe, S. (2010). Support Vector Machines for Pattern Classification. Advances in Computer Vision and
Pattern Recognition. Springer.
Addison, P. S. (2010). The Illustrated Wavelet Transform Handbook: Introductory Theory and Applications
in Science, Engineering, Medicine and Finance. Taylor & Francis.
Akbani, R., Kwek, S., and Japkowicz, N. (2004). Applying support vector machines to imbalanced
datasets. In Boulicaut, J.-F., Esposito, F., Giannotti, F., and Pedreschi, D., editors, Machine
Learning: ECML 2004, volume 3201 of Lecture Notes in Computer Science, pages 39–50. Springer
Berlin Heidelberg.
Åkerstedt, T. and Gillberg, M. (1990). Subjective and objective sleepiness in the active individual. The
International journal of neuroscience, 52(1-2):29–37.
Åkerstedt, T., Peters, B., Anund, A., and Kecklund, G. (2005). Impaired alertness and performance
driving home from the night shift: a driving simulator study. Journal of sleep research, 14(1):17–20.
Anund, A. (2009). Sleepiness at the wheel. PhD thesis, Karolinska Institutet, Stockholm.
Anund, A., Kecklund, G., Vadeby, A., Hjälmdahl, M., and Åkerstedt, T. (2008). The alerting effect
of hitting a rumble strip - a simulator study with sleepy drivers. Accident Analysis & Prevention,
40(6):1970 – 1976.
Applus+ IDIADA (2014). [Online; accessed 15-October-2014] http://www.applusidiada.com/en/.
Arnedt, J. T., Geddes, M. A. C., and MacLean, A. W. (2005). Comparative sensitivity of a simulated driving task to self-report, physiological, and other performance measures during prolonged wakefulness. Journal of Psychosomatic Research, 58(1):61 – 71.
Artusi, R., Verderio, P., and Marubini, E. (2002). Bravais-pearson and spearman correlation coefficients:
meaning, test of hypothesis and confidence interval. International Journal of Biological Markers,
17(2):148–151.
Asa, B. and Weston, J. (2010). A user’s guide to support vector machines. In Carugo, O. and
Eisenhaber, F., editors, Data Mining Techniques for the Life Sciences, volume 609 of Methods in
Molecular Biology, pages 223–239. Humana Press.
Authié, C. N. and Mestre, D. R. (2011). Optokinetic nystagmus is elicited by curvilinear optic flow
during high speed curve driving. Vision Research, 51(16):1791 – 1800.
Baranski, J. V. (2007). Fatigue, sleep loss, and confidence in judgment. Journal of Experimental Psy-
chology: Applied, 13(4):182–196.
Barea, R., Boquete, L., Ortega, S., López, E., and Rodríguez-Ascariz, J. M. (2012). EOG-based eye movements codification for human computer interaction. Expert Systems with Applications, 39(3):2677 – 2683.
Barr, L., Howarth, H., Popkin, S., and Carroll, R. J. (2005). A review and evaluation of emerging
driver fatigue detection measures and technologies. In Proceedings of the International
Conference on Fatigue Management in Transportation Operations, Seattle, USA.
Bartula, M., Tigges, T., and Muehlsteff, J. (2013). Camera-based system for contactless monitoring of respiration. In Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE, pages 2672–2675.

Beideman, L. R. and Stern, J. A. (1977). Aspects of the eye blink during simulated driving as a function
of alcohol. Human Factors, 19:73–77.
Belz, S. M., Robinson, G. S., and Casali, J. G. (2004). Temporal separation and self-rating of
alertness as indicators of driver fatigue in commercial motor vehicle operators. Human
Factors: The Journal of the Human Factors and Ergonomics Society, 46(1):154–169.
Bergasa, L., Nuevo, J., Sotelo, M., Barea, R., and Lopez, M. E. (2006). Real-time system for
monitoring driver vigilance. Intelligent Transportation Systems, IEEE Transactions on, 7(1):63–77.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Information Science and Statistics.
Springer.
Bouchner, P., Piekník, R., Novotny, S., Pekny, J., Hajny, M., and Borzová, C. (2006). Fatigue of car drivers - detection and classification based on the experiments on car simulators. In 6th WSEAS International Conference on Simulation, Modeling and Optimization, pages 727–732, Lisbon, Portugal.
Brain Products GmbH (2009). Selecting a suitable EEG recording cap - tutorial. [Online; accessed 14-
August-2014] http://www.brainproducts.com/downloads.php?kid=8.
Brown, T., Lee, J., Schwarz, C., Fiorentino, D., and McDonald, A. (2014). Assessing the feasibility
of vehicle-based sensors to detect drowsy driving. Technical report, National Advanced Driving
Simulator, The University of Iowa.
Bulling, A., Ward, J., Gellersen, H., and Troster, G. (2011). Eye movement analysis for activity recognition using electrooculography. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 33(4):741–753.
Burrus, C. S., Gopinath, R. A., and Guo, H. (1998). Introduction to wavelets and wavelet transforms: a
primer. Prentice Hall.
Caffier, P. P., Erdmann, U., and Ullsperger, P. (2003). Experimental evaluation of eye-blink parameters
as a drowsiness measure. European Journal of Applied Physiology, 89(3-4):319–325.
Chang, C. C. and Lin, C. J. (2011). LIBSVM: A library for support vector machines. ACM
Transactions on Intelligent Systems and Technology, 2:27:1–27:27. Software available at
http://www.csie.ntu. edu.tw/~cjlin/libsvm.
Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2002). Smote: Synthetic minority
over-sampling technique. Journal of Artificial Intelligence Research, 16:321–357.
Chiang, C. (2007). Optimization of Communication Systems, Lecture 1B: Convex Sets and Convex Func-
tions. Electrical Engineering Department, Princeton University. [Online; accessed 07-July-
2014] https://www.princeton.edu/~chiangm/ele539l1b.pdf.
Chua, E. C., Tan, W., Yeo, S., Lau, P., Lee, I., Mien, I. H., Puvanendran, K., and Gooley, J. (2012).
Heart rate variability can be used to estimate sleepiness-related decrements in psychomotor
vigilance during total sleep deprivation. Sleep, 35(3):325–334.
Čolić, A., Marques, O., and Furht, B. (2014). Driver Drowsiness Detection. Springer.
Cortes, C. and Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3):273–297.
Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification. Information Theory, IEEE
Transactions on, 13(1):21–27.
Crawshaw, J. and Chambers, J. (2001). A Concise Course in Advanced Level Statistics: With Worked
Examples. Nelson Thornes.
Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other
Kernel-based Learning Methods. Cambridge University Press.
Daimler AG (2008). Hightech report 02. Technical report.
Daimler AG (2014a). Attention Assist. [Online; accessed 24-August-2014] http://www.daimler.com/dccom/0-5-1210218-1-1210332-1-0-0-1210228-0-0-135-0-0-0-0-0-0-0-0.html.
Daimler AG (2014b). DISTRONIC PLUS with Steering Assist. [Online; accessed 22-January-2015] http://www.daimler.com/dccom/0-5-1210218-1-1210321-1-0-0-1210228-0-0-135-0-0-0-0-0-0-0-0.html.
Damousis, I. G. and Tzovaras, D. (2008). Fuzzy fusion of eyelid activity indicators for hypovigilance-related accident prediction. Intelligent Transportation Systems, IEEE Transactions on, 9(3):491–500.
DESTATIS (2013a). Unfallentwicklung auf deutschen Straßen. [Online; accessed 20-November-2014] https://www.destatis.de/DE/Publikationen/Thematisch/TransportVerkehr/Verkehrsunfaelle/PK_Unfallentwicklung_PDF.pdf?__blob=publicationFile.
DESTATIS (2013b). Verkehrsunfälle - Zeitreihen 2012. [Online; accessed 23-March-2014] https://www.destatis.de/DE/Publikationen/Thematisch/TransportVerkehr/Verkehrsunfaelle/VerkehrsunfaelleZeitreihen.html.
Dinges, D. F. and Grace, R. (1998). PERCLOS: A valid psychophysiological measure of alertness as assessed by psychomotor vigilance. Technical Report FHWA-MCRT-98-006, Federal Highway Administration, Office of Motor Carriers.
Dong, Y., Hu, Z., Uchimura, K., and Murayama, N. (2011). Driver inattention monitoring system for intelligent vehicles: A review. Intelligent Transportation Systems, IEEE Transactions on, 12(2):596–614.
Duchowski, A. (2007). Eye Tracking Methodology: Theory and Practice. Springer.
Duda, R. O., Hart, P. E., and Stork, D. G. (2012). Pattern Classification. Wiley.
Dukas, R. (1998). Cognitive Ecology: The Evolutionary Ecology of Information Processing and Decision Making. University of Chicago Press.
Dureman, E. I. and Bodén, C. (1972). Fatigue in simulated car driving. Ergonomics, 15(3):299–308.
Ebrahim, P. (2011). Drowsiness detection using lane data - event-based and driver model approaches.
Master’s thesis, University of Stuttgart.
Ebrahim, P., Stolzmann, W., and Yang, B. (2013a). Eye movement detection for assessing driver drowsiness by electrooculography. In Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on, pages 4142–4148.
Ebrahim, P., Stolzmann, W., and Yang, B. (2013b). Road dependent driver eye movements under
real driving conditions by electrooculography. In Communications, Signal Processing, and their
Applications (ICCSPA), 2013 1st International Conference on, pages 1–6.
Ebrahim, P., Stolzmann, W., and Yang, B. (2013c). Spontaneous vs. gaze shift-induced blinks for
assessing driver drowsiness/inattention by electrooculography. In Driver Distraction and
Inattention, 2013 3rd International Conference on.
Eoh, H. J., Chung, M. K., and Kim, S. (2005). Electroencephalographic study of drowsiness in simulated driving with sleep deprivation. International Journal of Industrial Ergonomics, 35(4):307–320.
Ergoneers GmbH (2014). Eye-tracking glasses - Dikablis essential. [Online; accessed 30-September-2014] http://www.ergoneers.com/wp-content/uploads/2014/09/Dikablis-Essential-Eye-Tracking-Glasses.pdf.
Eskandarian, A., Sayed, R., Delaigue, P., Blum, J., and Mortazavi, A. (2007). Advanced driver fatigue research. Technical report, Center for Intelligent Systems Research (CISR), School of Engineering and Applied Science, The George Washington University.
Evinger, C., Manning, K., Pellegrini, J., Basso, M., Powers, A., and Sibony, P. (1994). Not looking while leaping: the linkage of blinking and saccadic gaze shifts. Experimental Brain Research, 100(2):337–344.
Fairclough, S. H. and Gilleade, K. (2014). Advances in Physiological Computing. Human–Computer Interaction Series. Springer London, Limited.
Field, A. (2007). Discovering Statistics Using SPSS. Introducing Statistical Methods Series. SAGE
Publications.
Ford Motor Company (2010). Ford technology news brief. [Online; accessed 25-August-2014] http://technology.fordmedia.eu/documents/newsletter/FordTechnologyNewsletter082010.pdf.
Friedrichs, F., Hermannstädter, P., and Yang, B. (2011). Consideration of influences on driver state
classification. In 2nd International Conference on Driver Distraction and Inattention.
Friedrichs, F., Miksch, M., and Yang, B. (2010). Estimation of lane data-based features by odometric
vehicle data for driver state monitoring. In Intelligent Transportation Systems (ITSC), 2010 13th
International IEEE Conference on, pages 611–616.
Friedrichs, F. and Yang, B. (2010a). Camera-based drowsiness reference for driver state classification under real driving conditions. In Intelligent Vehicles Symposium (IV), 2010 IEEE, pages 101–106.
Friedrichs, F. and Yang, B. (2010b). Drowsiness monitoring by steering and lane data based features under real driving conditions. In 18th European Signal Processing Conference (EUSIPCO).
Fürsich, A. (2009). Klassifikation des Fahrerzustandes mit Hidden Markov Modellen und Bayes-Netzen. Master's thesis, University of Stuttgart.
Gault, T. R. and Farag, A. A. (2013). A fully automatic method to extract the heart rate from
thermal video. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2013 IEEE
Conference on, pages 336–341.
Gershon, P., Shinar, D., Oron-Gilad, T., Parmet, Y., and Ronen, A. (2011). Usage and perceived effectiveness of fatigue countermeasures for professional and nonprofessional drivers. Accident Analysis & Prevention, 43(3):797–803.
Gosling, J. (1995). Introductory Statistics. Quicksmart University Guides Series. Pascal Press.
Hall, M. A. (1999). Correlation-based Feature Selection for Machine Learning. PhD thesis, University of Waikato, Department of Computer Science.
Hall, M. A. (2000). Correlation-based feature selection for discrete and numeric class machine learning.
In ICML, pages 359–366.
Hallvig, D., Anund, A., Fors, C., Kecklund, G., Karlsson, J. G., Wahde, M., and Åkerstedt, T. (2013). Sleepy driving on the real road and in the simulator - a comparison. Accident Analysis & Prevention, 50:44–50.
Hammoud, R. I. (2008). Passive Eye Monitoring: Algorithms, Applications and Experiments, chapter 14,
page 315. Signals and Communication Technology. Springer London, Limited.
Hargutt, V. (2003). Das Lidschlussverhalten als Indikator für Aufmerksamkeits- und Müdigkeitsprozesse
bei Arbeitshandlungen. Fortschritt-Berichte VDI.: Biotechnik, Medizintechnik. VDI-Verlag.
He, H. and Garcia, E. A. (2009). Learning from imbalanced data. Knowledge and Data Engineering,
IEEE Transactions on, 21(9):1263–1284.
He, H., Pang, C., and Li, Q. (2010). Driver fatigue monitoring method based on multi-information fusion. In Image and Signal Processing (CISP), 2010 3rd International Congress on, volume 1, pages 110–113.
Hermannstädter, P. and Yang, B. (2013). Driver distraction assessment using driver modeling. In
Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on, pages 3693–3698.
Hirshkowitz, M. (2013). Fatigue, sleepiness, and safety: Definitions, assessment, methodology. Sleep Medicine Clinics, 8(2):183–189.
Hoddes, E., Zarcone, V., Smythe, H., Phillips, R., and Dement, W. C. (1973). Quantification of
sleepiness: A new approach. Psychophysiology, 10(4):431–436.
BIBLIOGRAPHY 277

Hoel, J., Jaffard, M., and Van Elslande, P. (2010). Attentional competition between tasks and its implications. In European Conference on Human Centred Design for Intelligent Transport Systems.
Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., and van de Weijer, J. (2011).
Eye Tracking: A comprehensive guide to methods and measures. OUP Oxford.
Horak, K. (2011). Fatigue features based on eye tracking for driver inattention system. In Telecommunications and Signal Processing (TSP), 2011 34th International Conference on, pages 593–597.
Horne, J. A. and Reyner, L. A. (1996). Counteracting driver sleepiness: effects of napping, caffeine,
and placebo. Psychophysiology, 33(3):306–309.
Hsu, C. W., Chang, C. C., and Lin, C. J. (2003). A practical guide to support vector classification.
[Online; accessed 10-April-2015] http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf.
Hu, S. and Zheng, G. (2009). Driver drowsiness detection with eyelid related parameters by support vector machine. Expert Systems with Applications, 36(4):7651–7658.
Huang, R., Chang, S., Hsiao, Y., Shih, T., Lee, S., Ting, H., and Lai, C. (2012). Strong correlation
of sleep onset between EOG and EEG sleep stage 1 and 2. In Computer, Consumer and Control
(IS3C), 2012 International Symposium on, pages 614–617.
Ingre, M., Åkerstedt, T., Peters, B., Anund, A., and Kecklund, G. (2006). Subjective sleepiness, simulated driving performance and blink duration: examining individual differences. Journal of Sleep Research, 15(1):47–53.
ISO 15007 (2013). Road vehicles - Measurement of driver visual behavior with respect to transport
information and control systems - Part 1: Definitions and parameters. ISO 15007-1:2013.
Jain, A. K., Mao, J., and Mohiuddin, K. M. (1996). Artificial neural networks: a tutorial. Computer,
29(3):31–44.
James, W. (1981). The Principles of Psychology (Vol. I). Cambridge, MA: Harvard University Press.
(see: James, W., The Principles of Psychology, H. Holt and Co., New York, 1890.).
Jammes, B., Sharabty, H., and Esteve, D. (2008). Automatic EOG analysis: A first step toward automatic
drowsiness scoring during wake-sleep transitions. Somnologie - Schlafforschung und Schlafmedizin,
12(3):227–232.
Johns, M., Tucker, A., Chapman, R., Crowley, K., and Michael, N. (2007). Monitoring eye and eyelid
movements by infrared reflectance oculography to measure drowsiness in drivers. Somnologie -
Schlafforschung und Schlafmedizin, 11(4):234–242.
Johns, M. W. (1991). A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep, 14(6):540–545.
Johns, M. W. (2003). The amplitude-velocity ratio of blinks: a new method for monitoring drowsiness.
Sleep, 26:A51.
Johns, M. W. and Tucker, A. J. (2005). The amplitude-velocity ratios of eyelid movements during
blinks: changes with drowsiness. Sleep, 28:A122.
Jürgensohn, T., Neculau, M., and Willumeit, H. P. (1991). Visual scanning pattern in curve negotiation.
Vision in Vehicles III, pages 171–178.
Kadambe, S., Murray, R., and Boudreaux-Bartels, G. F. (1999). Wavelet transform-based QRS complex
detector. Biomedical Engineering, IEEE Transactions on, 46(7):838–848.
Kaida, K., Takahashi, M., Åkerstedt, T., Nakata, A., Otsuka, Y., Haratani, T., and Fukasawa, K. (2006). Validation of the Karolinska sleepiness scale against performance and EEG variables. Clinical Neurophysiology, 117(7):1574–1581.
Kandil, F. I., Rotter, A., and Lappe, M. (2010). Car drivers attend to different gaze targets when
negotiating closed vs. open bends. Journal of Vision, 10(4):24.1–11.
Kecklund, G. and Åkerstedt, T. (1993). Sleepiness in long distance truck driving: an ambulatory EEG
study of night driving. Ergonomics, 36(9):1007–1017.
Keller, W. (2004). Wavelets in Geodesy and Geodynamics. Walter de Gruyter.
Kincses, W. E., Hahn, S., Schrauf, M., and Schmidt, E. A. (2008). Measuring driver’s mental workload
using EEG. ATZ worldwide, 110(3):12–17.
Kira, K. and Rendell, L. A. (1992a). The feature selection problem: Traditional methods and a new
algorithm. In Proceedings of AAAI.
Kira, K. and Rendell, L. A. (1992b). A practical approach to feature selection. In Proceedings of the
ninth international workshop on Machine learning, pages 249–256. Morgan Kaufmann Publishers
Inc.
Kircher, A., Uddman, M., and Sandin, J. (2002). Vehicle control and drowsiness. Technical report,
Swedish National Road and Transport Research Institute.
Klauer, S. G., Dingus, T. A., Neale, V. L., Sudweeks, J. D., and Ramsey, D. J. (2006). The impact of
driver inattention on near-crash/crash risk: An analysis using the 100-car naturalistic driving
study data. Technical Report DOT HS 810 594, Virginia Tech Transportation Institute, 3500
Transportation Research Plaza (0536) Blacksburg, Virginia 24061.
Knipling, R. R. and Wierwille, W. W. (1994). Vehicle-based drowsy driver detection: Current status and future prospects. In Moving Toward Deployment: Proceedings of the IVHS America Annual Meeting, volume 1.
Kolo, B. (2011). Binary and Multiclass Classification. Weatherford Press.
Kranjec, J., Beguš, S., Geršak, G., and Drnovšek, J. (2014). Non-contact heart rate and heart rate variability measurements: A review. Biomedical Signal Processing and Control, 13:102–112.
Krupiński, R. and Mazurek, P. (2010). Convergence improving in evolution-based technique for
estimation and separation of electrooculography and blinking signals. In Piętka, E. and Kawa, J.,
editors, Information Technologies in Biomedicine, volume 69 of Advances in Intelligent and Soft
Computing, pages 293–302. Springer Berlin Heidelberg.
Kumar, D. and Poole, E. (2002). Classification of EOG for human computer interface. In Engineering in Medicine and Biology, 2002. 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society EMBS/BMES Conference, 2002. Proceedings of the Second Joint, volume 1, pages 64–67.
Lal, S. K. L. (2001). The psychophysiology of driver fatigue/drowsiness: electroencephalography, electrooculogram, electrocardiogram and psychological effects. PhD thesis, University of Technology, Sydney, Faculty of Science.
Land, M. F. and Lee, D. N. (1994). Where we look when we steer. Nature, 369:742–744.
Leigh, R. J. and Zee, D. S. (1999). The Neurology of Eye Movements, chapter 4. Contemporary Neurology
Series. Oxford University Press.
Lemieux, C. (2009). Monte Carlo and Quasi-Monte Carlo Sampling. Springer Series in Statistics. Springer.
Li, H. D., Liang, Y. Z., Xu, Q. S., Cao, D. S., Tan, B. B., Deng, B. C., and Lin, C. C. (2011a). Recipe for uncovering predictive genes using support vector machines based on model population analysis. Computational Biology and Bioinformatics, IEEE/ACM Transactions on, 8(6):1633–1641.
Li, L., Xie, M., and Dong, H. (2011b). A method of driving fatigue detection based on eye location.
In Communication Software and Networks (ICCSN), 2011 IEEE 3rd International Conference on,
pages 480–484.
Liang, Y. and Lee, J. D. (2010). Combining cognitive and visual distraction: Less than the sum of its parts. Accident Analysis & Prevention, 42(3):881–890.
Liang, Y., Reyes, M. L., and Lee, J. D. (2007). Real-time detection of driver cognitive distraction
using support vector machines. Intelligent Transportation Systems, IEEE Transactions on,
8(2):340–350.
Liu, C. C., Hosking, S. G., and Lenné, M. G. (2009). Predicting driver drowsiness using vehicle
measures: Recent insights and future challenges. Journal of Safety Research, 40(4):239–245.
Lugger, M. (2011). Mehrstufige Klassifikation paralinguistischer Eigenschaften aus Sprachsignalen mit Hilfe neuartiger Merkmale. PhD thesis, University of Stuttgart.
Magosso, E., Provini, F., Montagna, P., and Ursino, M. (2006). A wavelet based method for automatic detection of slow eye movements: A pilot study. Medical Engineering & Physics, 28(9):860–875.
Mallat, S. (2009). A wavelet tour of signal processing: the sparse way. Academic Press Elsevier, Amsterdam; Heidelberg, 3rd edition.
Martínez, M., Soria, E., Magdalena, R., Serrano, A., Martín, J., and Vila, J. (2008). Comparative study of several FIR median hybrid filters for blink noise removal in electrooculograms. WSEAS Trans. Sig. Proc., 4(3):53–59.
May, J. F. (2011). Chapter 21 - Driver fatigue. In Porter, B. E., editor, Handbook of Traffic Psychology, pages 287–297. Academic Press, San Diego.
McCulloch, W. S. and Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The
bulletin of mathematical biophysics, 5(4):115–133.
Moller, H. J., Kayumov, L., Bulmash, E. L., Nhan, J., and Shapiro, C. M. (2006). Simulator performance, microsleep episodes, and subjective sleepiness: normative data using convergent methodologies to assess driver drowsiness. Journal of Psychosomatic Research, 61(3):335–342.
Moller, M. F. (1993). A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks, 6(4):525–533.
Monk, T. H. (1989). A visual analogue scale technique to measure global vigor and affect. Psychiatry
Research, 27(1):89–99.
Montgomery, D. C. and Runger, G. C. (2006). Applied statistics and probability for engineers. Wiley.
Morris, T. L. and Miller, J. C. (1996). Electrooculographic and performance indices of fatigue during simulated flight. Biological Psychology, 42(3):343–360.
NHTSA (1998). Drowsy driving and automobile crashes. Technical report, National Highway Traffic
Safety Administration.
Niedermeyer, E. and da Silva, F. L. D. (2005). Electroencephalography: Basic Principles, Clinical Applications, and Related Fields. LWW Doody's all reviewed collection. Lippincott Williams & Wilkins.
Niemann, H. (2003). Klassifikation von Mustern.
Nunez, P. L. and Srinivasan, R. (2006). Electric Fields of the Brain: The Neurophysics of EEG. Oxford
University Press.
O’Hanlon, J. F. (1972). Heart rate variability: a new index of driver alertness/fatigue. Technical
report, SAE Technical Paper.
O’Hanlon, J. F. and Kelley, G. R. (1977). Comparison of performance and physiological changes
between drivers who perform well and poorly during prolonged vehicular operation. In Mackie,
R., editor, Vigilance, volume 3 of NATO Conference Series, pages 87–109. Springer US.
Olson, D. L. and Delen, D. (2008). Advanced Data Mining Techniques. Springer.
Otmani, S., Pebayle, T., Roge, J., and Muzet, A. (2005). Effect of driving duration and partial sleep deprivation on subsequent alertness and performance of car drivers. Physiology & Behavior, 84(5):715–724.
Oxford (2014). Oxford dictionaries. [Online; accessed 11-August-2014] http://www.oxforddictionaries.com/definition/english/distraction.
Papadelis, C., Chen, Z., Kourtidou-Papadeli, C., Bamidis, P. D., Chouvarda, I., Bekiaris, E., and Maglaveras, N. (2007). Monitoring sleepiness with on-board electrophysiological recordings for preventing sleep-deprived traffic accidents. Clinical Neurophysiology, 118(9):1906–1922.
Pettitt, M., Burnett, G., and Stevens, A. (2005). Defining driver distraction. In Proceedings of the 12th
ITS World Congress, San Francisco, USA, ITS America.
Philip, P., Sagaspe, P., Taillard, J., Valtat, C., Moore, N., Åkerstedt, T., Charles, A., and Bioulac, B. (2005). Fatigue, sleepiness, and performance in simulated versus real driving conditions. Sleep, 28(12):1511–1516.
Picot, A., Caplier, A., and Charbonnier, S. (2009). Comparison between EOG and high frame rate
camera for drowsiness detection. In Applications of Computer Vision (WACV), 2009 Workshop on,
pages 1–6.
Picot, A., Charbonnier, S., and Caplier, A. (2010). Drowsiness detection based on visual signs: blinking
analysis based on high frame rate video. In Instrumentation and Measurement Technology Conference
(I2MTC), 2010 IEEE, pages 801–804.
Pilutti, T. and Ulsoy, A. G. (1999). Identification of driver state for lane-keeping tasks. IEEE Transactions
on Systems, Man, and Cybernetics, Part A, 29(5):486–502.
Pimenta, P. A. D. M. (2011). Driver drowsiness classification based on lane and steering behavior.
Master’s thesis, University of Stuttgart.
Platho, C., Pietrek, A., and Kolrep, H. (2013). Erfassung der Fahrermüdigkeit. Berichte der
Bundesanstalt für Straßenwesen. Unterreihe Fahrzeugtechnik, F 89.
Poularikas, A. D. (2009). Transforms and Applications Handbook, Third Edition. Electrical Engineering
Handbook. Taylor & Francis.
Priddy, K. L. and Keller, P. E. (2005). Artificial Neural Networks: An Introduction. Tutorial Text Series.
SPIE Press.
Pudil, P., Ferri, F. J., Novovičová, J., and Kittler, J. (1994). Floating search methods for feature selection with nonmonotonic criterion functions. In Pattern Recognition, 1994. Vol. 2 - Conference B: Computer Vision & Image Processing, Proceedings of the 12th IAPR International Conference on, volume 2, pages 279–283. IEEE.
Rantanen, E. M. and Goldberg, J. H. (1999). The effect of mental workload on the visual field size
and shape. Ergonomics, 42(6):816–834. PMID: 10340026.
Records, R. E. (1979). Physiology of the human eye and visual system. Harper & Row.
Reddy, M. S., Narasimha, B., Suresh, E., and Rao, K. (2010). Analysis of EOG signals using wavelet
transform for detecting eye blinks. In Wireless Communications and Signal Processing (WCSP),
2010 International Conference on, pages 1–4.
Regan, M. A., Hallett, C., and Gordon, C. P. (2011). Driver distraction and driver inattention: Definition, relationship and taxonomy. Accident Analysis & Prevention, 43(5):1771–1781.
Riemersma, J. B. J., Sanders, A. F., Wildervanck, C., and Gaillard, A. W. (1977). Performance
decrement during prolonged night driving. In Mackie, R., editor, Vigilance, volume 3 of NATO
Conference Series, pages 41–58. Springer US.
Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14(5):465–471.
Rosario, H. D., Solaz, J., Rodríguez, N., and Bergasa, L. M. (2010). Controlled inducement and measurement of drowsiness in a driving simulator. IET Intelligent Transport Systems, 4(4):280–288.
Rowland, L. M., Thomas, M. L., Thorne, D. R., Sing, H. C., Krichmar, J. L., Davis, H. Q., Balwinski,
S. M., Peters, R. D., Kloeppel-Wagner, E., Redmond, D. P., Alicandri, E., and Belenky, G. (2005).
Oculomotor responses during partial and total sleep deprivation. Aviation, space, and
environmental medicine, 76(7):C104–C113.
Royal, D. (2003). National survey of distracted and drowsy driving attitudes and behavior: 2002. Volume I: Findings. Technical Report DOT HS 809 566, U.S. Department of Transportation, National Highway Traffic Safety Administration (NHTSA), Washington.
Santillán-Guzmán, A. (2014). Digital Enhancement of EEG/MEG signals. PhD thesis, University of Kiel.
Lal, S. K. L. and Craig, A. (2001). A critical review of the psychophysiology of driver fatigue. Biological Psychology, 55(3):173–194.
Sağlam, M., Lehnen, N., and Glasauer, S. (2011). Optimal control of natural eye-head movements
minimizes the impact of noise. The Journal of Neuroscience, 31(45):16185–16193.
Savitzky, A. and Golay, M. J. E. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8):1627–1639.
Schleicher, R., Galley, N., Briest, S., and Galley, L. (2008). Blinks and saccades as indicators of fatigue
in sleepiness warnings: looking tired? Ergonomics, 51(7):982–1010.
Schmidt, E. A., Schrauf, M., Simon, M., Buchner, A., and Kincses, W. E. (2011). The short-term effect of verbally assessing drivers' state on vigilance indices during monotonous daytime driving. Transportation Research Part F: Traffic Psychology and Behaviour, 14(3):251–260.
Schmidt, E. A., Schrauf, M., Simon, M., Fritzsche, M., Buchner, A., and Kincses, W. E. (2009). Drivers' misjudgment of vigilance state during prolonged monotonous daytime driving. Accident Analysis & Prevention, 41(5):1087–1093.
Schmieder, F. (2009). Support vector machines in emotion recognition. Master's thesis, University of Stuttgart.
Seekircher, J., Woltermann, B., Gern, A., Janssen, R., Mehren, D., and Lallinger, M. (2009). Das
Auto lernt sehen - kamerabasierte Assistenzsysteme. ATZextra, 14(1):64–71.
Shahid, A., Wilkinson, K., Marcu, S., and Shapiro, C. M. (2012a). Epworth sleepiness scale (ESS). In
Shahid, A., Wilkinson, K., Marcu, S., and Shapiro, C. M., editors, STOP, THAT and One Hundred
Other Sleep Scales, pages 149–151. Springer New York.
Shahid, A., Wilkinson, K., Marcu, S., and Shapiro, C. M. (2012b). Karolinska sleepiness scale (KSS). In
Shahid, A., Wilkinson, K., Marcu, S., and Shapiro, C. M., editors, STOP, THAT and One Hundred
Other Sleep Scales, pages 209–210. Springer New York.
Shahid, A., Wilkinson, K., Marcu, S., and Shapiro, C. M. (2012c). Stanford sleepiness scale (SSS). In
Shahid, A., Wilkinson, K., Marcu, S., and Shapiro, C. M., editors, STOP, THAT and One Hundred
Other Sleep Scales, pages 369–370. Springer New York.
Sigari, M. H. (2009). Driver hypo-vigilance detection based on eyelid behavior. In Advances in Pattern
Recognition, 2009. ICAPR ’09. Seventh International Conference on, pages 426–429.
Simon, M. (2013). Neurophysiologische Analyse des kognitiven Fahrerzustandes. PhD thesis, Eberhard
Karls University of Tübingen.
Simon, M., Schmidt, E. A., Kincses, W. E., Fritzsche, M., Bruns, A., Aufmuth, C., Bogdan, M., Rosenstiel, W., and Schrauf, M. (2011). EEG alpha spindle measures as indicators of driver fatigue under real traffic conditions. Clinical Neurophysiology, 122(6):1168–1178.
Sirevaag, E. J. and Stern, J. A. (2000). Ocular measures of fatigue and cognitive factors. In Backs, R. W. and Boucsein, W., editors, Engineering Psychophysiology: Issues and Applications. L. Erlbaum Associates, New Jersey.
Skipper, J. H. and Wierwille, W. W. (1986). Drowsy driver detection using discriminant analysis.
Human Factors: The Journal of the Human Factors and Ergonomics Society, 28(5):527–540.
Soman, K. P., Ramachandran, K. I., and Resmi, N. G. (2010). Insight Into Wavelets: from Theory to Practice. PHI Learning.
Sommer, D. and Golz, M. (2010). Evaluation of PERCLOS based current fatigue monitoring technologies. In Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE, pages 4456–4459.
Sonnleitner, A., Simon, M., Kincses, W. E., Buchner, A., and Schrauf, M. (2012). Alpha spindles as neurophysiological correlates indicating attentional shift in a simulated driving task. International Journal of Psychophysiology, 83(1):110–118.
Sonnleitner, A., Simon, M., Kincses, W. E., and Schrauf, M. (2011). Assessing driver state - neurophysiological correlates of attentional shift during real road. In 2nd International Conference on Driver Distraction and Inattention.
Stern, J. A., Boyer, D., and Schroeder, D. (1994). Blink rate: a possible measure of fatigue. Human
Factors: The Journal of the Human Factors and Ergonomics Society, 36:285–297.
Stern, J. A., Walrath, L. C., and Goldstein, R. (1984). The endogenous eyeblink. Psychophysiology,
21(1):22–33.
Stern, R. M., Ray, W. J., and Quigley, K. S. (2001). Psychophysiological Recording. Oxford University
Press.
Straube, A. and Büttner, U. (2007). Neuro-ophthalmology: Neuronal Control of Eye Movements. Developments in ophthalmology. Karger.
Summala, H., Häkkänen, H., Mikkola, T., and Sinkkonen, J. (1999). Task effects on fatigue
symptoms in overnight driving. Ergonomics, 42(6):798–806. PMID: 10340025.
Suzuki, M., Yamamoto, N., Yamamoto, O., Nakano, T., and Yamamoto, S. (2006). Measurement of driver's consciousness by image processing - a method for presuming driver's drowsiness by eye-blinks coping with individual differences. In Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on, volume 4, pages 2891–2896.
Svensson, U. (2004). Blink behavior based drowsiness detection method development and validation. Master's thesis, Linköping University.
Tefft, B. C. (2014). Prevalence of motor vehicle crashes involving drowsy drivers, United States, 2009–2013 (November 2014). Technical report, AAA Foundation for Traffic Safety.
Thiffault, P. and Bergeron, J. (2003). Monotony of road environment and driver fatigue: a
simulator study. Accident Analysis & Prevention, 35(3):381–391.
Thorslund, B. (2003). Electrooculogram analysis and development of a system for defining stages of drowsiness. Master's thesis, Linköping University.
Tinati, M. A. and Mozaffary, B. (2006). A wavelet packets approach to electrocardiograph baseline drift
cancellation. International Journal of Biomedical Imaging, pages 1–9.
Tomek, I. (1976). Two modifications of CNN. Systems, Man and Cybernetics, IEEE Transactions on, SMC-6(11):769–772.
Tran, Y., Wijesuriya, N., Tarvainen, M., Karjalainen, P., and Craig, A. (2009). The relationship
between spectral changes in heart rate variability and fatigue. Journal of Psychophysiology,
23(3):143–151.
Uhlich, S. (2006). Emotion recognition of speech signals. Master’s thesis, University of Stuttgart.
Veropoulos, K., Campbell, C., and Cristianini, N. (1999). Controlling the sensitivity of support vector machines. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 55–60.
Verwey, W. B. and Zaidel, D. M. (2000). Predicting drowsiness accidents from personal attributes, eye
blinks and ongoing driving behaviour. Personality and Individual Differences, 28(1):123–142.
ViaMichelin (2014). Michelin. [Online; accessed 26-August-2014] http://www.viamichelin.de/web/Karten-Stadtplan.
Volkswagen AG (2014). Fatigue Detection. [Online; accessed 20-November-2014] http://www.volkswagen.com.au/en/technology_and_service/technical-glossary/fatigue-detection.html.
Volvo Group (2014). Driver Alert Control. [Online; accessed 24-August-2014] http://www.volvocars.com/de/sales-services/service/specialsales/Pages/techniklexikon-d.aspx.
von Helmholtz, H. (1925). Treatise on Physiological Optics. Volume III: The Perceptions of Vision. The
Optical Society of America.
Wallén Warner, H., Ljung Aust, M., Sandin, J., Johansson, E., and Björklund, G. (2008). Manual for
DREAM 3.0, driving reliability and error analysis method. Technical report, Deliverable D5.6 of
the EU FP6 project SafetyNet, TREN-04-FP6TRSI2.395465/506723.
Wei, Z. and Lu, B. (2012). Online vigilance analysis based on electrooculography. In Neural Networks (IJCNN), The 2012 International Joint Conference on, pages 1–7.
Wigh, F. (2007). Detection of driver unawareness based on long- and short-term analysis of driver lane keeping. Master's thesis, Linköping University.
Williamson, A., Friswell, R., Olivier, J., and Grzebieta, R. (2014). Are drivers aware of sleepiness and increasing crash risk while driving? Accident Analysis & Prevention, 70:225–234.
Yang, B. (2014). Detection and pattern recognition. Lecture slides.
Young, L. and Sheena, D. (1975). Survey of eye movement recording methods. Behavior Research Methods,
7:397–429.
Young, R. K. (1993). Wavelet Theory and Its Applications. Kluwer international series in engineering
and computer science: VLSI, computer architecture, and digital signal processing. Springer US.
Zamora, M. E. (2001). The study of the sleep and vigilance Electroencephalogram using neural network
methods. PhD thesis, University of Oxford.
Zeeb, E. (2010). Daimler’s new full-scale, high-dynamic driving simulator – a technical overview. In
Conference Proc. Driving Simulator Conference Europe, Paris.
Zulley, J. and Popp, R. (2012). Müdigkeit im Straßenverkehr. [Online; accessed 04-April-2013] http://www.adac.de/_mmm/pdf/vm_muedigkeit_im_strassenverkehr_flyer_48789.pdf.