

A Real-time Emotion Recognition Embedded System using an Optimized Deep Learning Model

Mehdi Bazargani a, Amir Tahmasebi b, Mohammad Reza Yazdchi c, Zahra Baharlouei d

a Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran, e-mail address
a Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran, e-mail

a Department of Biomedical Engineering, Faculty of Engineering, University of Isfahan, Isfahan, Iran, e-mail

Medical Image and Signal Processing Research Center, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran, [email protected].


Recognizing emotional states can make human-computer interaction (HCI) systems more effective in practice. Correlations between electroencephalography (EEG) signals and emotions have been shown in previous research. Therefore, EEG-based methods are the most accurate and informative. In this study, a Convolutional Neural Network (CNN) model is optimized to recognize emotions from EEG signals, and a Raspberry Pi minicomputer is used to implement the optimized, lightweight model. The emotional state is recognized for every three-second epoch of the received signals on the embedded system. An average classification accuracy of 99.11% in valence and 99.19% in arousal was achieved on the DEAP dataset. A comparison with related works shows that we achieved a highly accurate model that is implementable in practice.


Electroencephalography; Emotion Recognition; Embedded system; Convolutional Neural Network.

1. Introduction

Nowadays, human-computer interaction (HCI) systems are a large part of human life. Such interactions arguably need to follow the same social and natural principles as human-to-human interactions. In many related applications, emotional information is required to make systems more effective. For example, in some diseases, understanding the patient's emotions affects the course of therapy. Some patients, for example those with autism, cannot express their emotions. Therefore, the ability to understand users' emotions is of interest (J. Zhang et al., 2020). Recent research has addressed the lack of emotional information in HCI. To improve this ability in HCI systems, machines need to understand and interpret human emotions. The aim is to develop adaptive and personalized means of emotion recognition, which requires research in different fields of science, e.g. artificial intelligence, psychology, computer science and neuroscience (Egger et al., 2019).

Humans experience different emotions such as happiness, sadness, joy and satisfaction. In the literature, different models have been proposed for emotional states (Al-Nafjan et al., 2017). One of the most popular is Russell's circumplex 2D model, which defines emotions in a two-dimensional space of valence and arousal. Valence indicates the level of pleasure and arousal indicates the level of excitation (Russell, 1980). Although emotions are treated as continuous variables in Russell's model and in some studies, e.g. (Javidan et al., 2021), most related works treat them as discrete states.

Emotions can be recognized from speech, behavior, motion, facial expression or physiological signals. The data that have been used for this purpose include Electrocardiography (ECG), Heart Rate Variability (HRV), Electroencephalography (EEG), Facial Recognition (FR), Forehead Bio-Signal (FBS), Speech Recognition (SR), Skin Temperature (SKT), Blood Volume Pulse (BVP), and Respiration (RSP) (Goshvarpour & Goshvarpour, 2020a; Ko, 2018; Lieskovská et al., 2021; Nikolova et al., 2019; Villarejo et al., 2012). In (Q. Zhang et al., 2017), respiration signals were studied to recognize emotions; the model was developed using the DEAP dataset (Russell, 1980) and the Augsburg University dataset. In some studies, Galvanic Skin Response (GSR) signals are used for emotion recognition. For example, in (Ayata et al., 2016), valence and arousal were categorized using GSR. In (Villarejo et al., 2012), GSR was used to build a stress sensor. In (Domínguez-Jiménez et al., 2020), heart rate information was considered together with GSR to recognize three target emotions. In some works, ECG signals were decoded to detect emotional states. For example, a deep neural network in (Keren et al., 2017) and a wavelet scattering algorithm in (Sepúlveda et al., 2021) were employed to detect emotion from ECG signals.

To improve the accuracy of emotion recognition, some studies use both physical signs and physiological signals. In (Tarnowski et al., 2018), an experiment was designed with 22 subjects using a movie as the stimulus, while the GSR and EEG signals of each subject were recorded and processed. Frequency-domain features were extracted and two classifiers, SVM and KNN, were implemented. In (Goshvarpour et al., 2017), ECG and GSR signals were also used to recognize emotions. An experiment was designed with 11 subjects and the stimulus was a music clip. Features were extracted using wavelet and discrete cosine transforms. After reducing the dimension of the features, a Probabilistic Neural Network (PNN) was used to detect four classes in the valence-arousal plane. The results of that paper showed that the accuracy obtained with ECG features was higher than that obtained with GSR features. Facial expression data, ECG, skin temperature and conductance, breathing signal, mouth length and pupil size were used in (Tan et al., 2020) to recognize emotions with enhanced neural networks.

Although research on emotion recognition is extensive, some methods are subject-based, and in some cases the external reaction to a stimulus depends on the personality of the subject. For example, if a subject decides to conceal his or her feelings, the performance of some methods is affected. Overall, methods based on physiological signals are more reliable. Among them, since the brain is the source of human reactions to external stimuli, EEG-based methods are the most accurate and informative. Correlations between EEG signals and emotions have been shown in previous research. The frontal scalp seems to carry more emotional activation than other regions of the brain (Wang et al., 2014). Furthermore, EEG processing has advantages over some other techniques. Its low cost, ease of use and ability to support immediate medical care for patients who cannot respond or move make EEG signals favorable for detecting some diseases as well as emotional states (Suhaimi et al., 2020).

Research on EEG-based emotion recognition is extensive. Studies differ in the categories of extracted features, the classifiers, the number of channels used, the datasets and the experiments. In (Y. Zhang et al., 2016), EEG signals from only two channels were employed, together with an EMD strategy and an SVM classifier. Two neural models, CNN and DNN, were employed in (Tripathi et al., 2017) on the DEAP dataset, and the results in 2-class and 3-class modes were compared. Some studies have considered different categories of features in order to find the EEG features that matter most for emotion recognition. In (Khateeb et al., 2021), time, frequency and wavelet domain features were extracted and nine classes of emotions were identified using SVM. In (Moon et al., 2018), the power spectrum and the correlation between two electrodes were extracted and fed to a CNN for classification. In (Gannouni et al., 2021), multi-class emotion recognition was studied on the DEAP dataset. Considering nine emotional states, the authors achieved a higher accuracy rate using QDC and RNN.
In the literature, most deep learning algorithms achieve higher accuracy than classical machine learning ones. On the other hand, such algorithms are usually too complex for practical implementation. In this paper, our aim is to develop an EEG-based emotion recognition model that is both highly accurate and implementable on an embedded system. We assess different state-of-the-art CNN models to find the most accurate one for recognizing emotional states. Each network is assessed in two settings: subject-dependent and subject-independent. Next, we optimize the selected CNN model to be lightweight and implementable on a Raspberry Pi processor. Using EEG signals from the DEAP (Russell, 1980) dataset, we evaluate the model while the processing is done on the embedded board in real time. The results show that the optimized model achieves high accuracy in recognizing emotions in real time.

The rest of this paper is organized as follows. The method is explained in Section 2. The simulation results are presented in Section 3. The hardware implementation is explained in Section 4. Section 5 concludes the paper.


2. Material and Methods

In this study, a deep learning model is used to detect emotions using the DEAP dataset. The study is performed in both subject-dependent and subject-independent settings. Preprocessing is included to remove artifacts from the EEG data. Then the baseline signal is removed, and the data are segmented and passed to a convolutional network. EEGNet, Shallow Convolutional Network (ShallowConvNet), and Deep Convolutional Neural Network (DeepConvNet) are used to recognize emotions in both settings. Finally, the accuracy and F-score of the three convolutional networks are compared. We also implement the real-time emotion recognition process on an embedded system using a Raspberry Pi board. The steps applied in this paper are shown in Fig. 1 and are described in more detail in the rest of this section.
Fig 1. The workflow diagram applied in this paper.

2.1. Dataset

The well-known DEAP dataset (Russell, 1980) is used in this study. It includes the electroencephalogram and other peripheral physiological signals of 32 subjects, aged between 19 and 37, recorded while they watched 40 one-minute music videos as stimuli. EEG signals of 32 channels are available in the DEAP dataset. After each experiment, the subjects' levels of arousal and valence were assessed using Self-Assessment Manikins (SAM), with values from 1 to 9 for each dimension. The emotional states are represented in the two-dimensional valence-arousal model, in which valence ranges from sad to joyful and arousal ranges from bored to excited (X. Li et al., 2018). We split each of the valence and arousal dimensions into two parts: values greater than 5 are labeled high valence/arousal and values below 5 are labeled low valence/arousal.
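The labeling rule above can be illustrated with a minimal sketch. The function name `binarize_ratings` is hypothetical, and the handling of a rating of exactly 5 (assigned here to the low class) is an assumption, since the text only specifies values above and below 5.

```python
import numpy as np

def binarize_ratings(ratings, threshold=5.0):
    """Map SAM ratings (1 to 9) to binary labels: 1 = high, 0 = low.

    Assumption: a rating exactly equal to the threshold is labeled low.
    """
    return (np.asarray(ratings, dtype=float) > threshold).astype(int)

# Example valence ratings for five hypothetical trials.
valence = [1.0, 4.9, 5.0, 5.1, 9.0]
labels = binarize_ratings(valence)
print(labels)
```

The same function would be applied separately to the valence and arousal ratings, giving one binary classification task per dimension.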

2.2. Preprocessing

In the DEAP dataset, EEG signals were recorded with the international 10-20 electrode system at a sampling rate of 512 Hz. In the preprocessing step, the signals were down-sampled to 128 Hz and a band-pass filter from 4.0 Hz to 45.0 Hz was applied to reduce EMG (electromyography) and ECG effects. Eye movement artifacts and interference from other sources were removed using blind source separation techniques such as Independent Component Analysis (ICA).
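The filtering and down-sampling step can be sketched as follows. This is only an illustration, not the filter used to prepare DEAP: it swaps in an ideal FFT brick-wall band-pass (which also acts as the anti-aliasing filter) and then keeps every fourth sample. The function names and the synthetic test signal are assumptions, and ICA-based artifact removal is not shown.

```python
import numpy as np

def bandpass_fft(x, fs, low_hz, high_hz):
    """Zero every FFT bin outside [low_hz, high_hz] (ideal brick-wall filter)."""
    spectrum = np.fft.rfft(x, axis=-1)
    freqs = np.fft.rfftfreq(x.shape[-1], d=1.0 / fs)
    spectrum[..., (freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=x.shape[-1], axis=-1)

def preprocess(x, fs_in=512, fs_out=128, low_hz=4.0, high_hz=45.0):
    """Band-pass at the original rate, then keep every (fs_in / fs_out)-th sample."""
    factor = fs_in // fs_out
    filtered = bandpass_fft(x, fs_in, low_hz, high_hz)
    return filtered[..., ::factor]

# Synthetic 32-channel signal: 10 Hz (kept) + 100 Hz (removed) + DC offset (removed).
fs = 512
t = np.arange(0, 4.0, 1.0 / fs)
raw = np.tile(np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 100 * t) + 0.5, (32, 1))
clean = preprocess(raw)
print(raw.shape, "->", clean.shape)
```

Because everything above 45 Hz is removed before decimation, the 128 Hz output (Nyquist frequency 64 Hz) is free of aliasing in this sketch.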

Each EEG recording in the DEAP dataset lasts 63 seconds: a 3-second pre-trial baseline followed by 60 seconds of emotional information. The 3-second pre-trial signal, recorded before the video starts playing, was repeated 20 times to obtain a 60-second signal, which was then subtracted from the 60-second trial. The pre-trial segment was then removed from the signals. Next, each 60-second signal was segmented into 3-second epochs and, finally, Z-normalization was applied.
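The baseline removal, segmentation and normalization steps described above can be sketched with NumPy. The function name `trial_to_epochs` is hypothetical, and applying Z-normalization per channel within each epoch is an assumption, since the text does not state the normalization scope.

```python
import numpy as np

FS = 128            # sampling rate after down-sampling (Hz)
BASELINE_S = 3      # pre-trial baseline length (s)
TRIAL_S = 60        # stimulus length (s)
EPOCH_S = 3         # epoch length (s)

def trial_to_epochs(trial):
    """Turn one (channels, 63 * FS) trial into (20, channels, 3 * FS) epochs."""
    n_base = BASELINE_S * FS
    baseline = trial[:, :n_base]            # first 3 seconds
    stimulus = trial[:, n_base:]            # remaining 60 seconds
    repeats = TRIAL_S // BASELINE_S         # 20 copies of the baseline
    corrected = stimulus - np.tile(baseline, (1, repeats))
    n_epoch = EPOCH_S * FS
    epochs = corrected.reshape(trial.shape[0], -1, n_epoch).transpose(1, 0, 2)
    # Assumption: z-score each channel within each epoch.
    mean = epochs.mean(axis=-1, keepdims=True)
    std = epochs.std(axis=-1, keepdims=True)
    return (epochs - mean) / (std + 1e-8)

rng = np.random.default_rng(0)
trial = rng.normal(size=(32, 63 * FS))      # synthetic stand-in for one DEAP trial
epochs = trial_to_epochs(trial)
print(epochs.shape)
```

Each trial therefore yields 20 epochs of shape (32, 384), which matches the network input size used in Section 2.3.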

2.3. Processing

After preprocessing, the signals are ready to be processed. In this work, three networks, namely EEGNet, ShallowConvNet and DeepConvNet, were used for emotion recognition. For each network, two learning approaches were followed: subject-dependent and subject-independent. In the subject-dependent approach, a separate model was trained for each subject. In this case, we had 800 samples (40 experiments × 20 three-second epochs) per subject. In the subject-independent approach, a single model was trained on all subjects, so 32×800 samples were available. We adopted 5-fold cross-validation for both approaches.
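The sample counts and the cross-validation split can be made concrete with a short sketch. It assumes the 32 subjects of Section 2.1 are pooled in the subject-independent case, and it shuffles at the epoch level before splitting; the text does not say how the folds were formed, so the helper `k_fold_indices` is only one plausible reading.

```python
import numpy as np

N_SUBJECTS, N_TRIALS, EPOCHS_PER_TRIAL = 32, 40, 20

per_subject = N_TRIALS * EPOCHS_PER_TRIAL   # samples per subject
pooled = N_SUBJECTS * per_subject           # samples when all subjects are pooled

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(per_subject, k=5))
for fold, (train_idx, test_idx) in enumerate(splits):
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```

With 800 samples per subject, each fold trains on 640 epochs and tests on 160.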


The test results of the three networks under the two approaches were compared in terms of the accuracy of emotion recognition in arousal and valence. Based on the results, the best method was determined. The details are presented in the results section.

In the following, we introduce the three used networks and the parameters that we set in this study.

2.3.1. EEGNet

EEGNet (Lawhern et al., 2018) is a compact convolutional network that can be applied in different Brain-Computer Interface (BCI) models and can be trained with limited data. The structure of this network is shown in Table 1. The input of this network has shape (C, T), where C is the number of channels (in this study C = 32) and T is the number of time samples (in this study T = 384 = 3 seconds × 128 Hz). The signals are first passed through eight 2D convolutional filters; the output of this layer is the EEG signal in 8 frequency bands. Next, the signals are fed to a DepthwiseConv2D layer, which acts as a spatial filter. To prevent over-fitting, dropout layers are used. Average pooling is applied to reduce the feature size. After the separable convolution, the last block is a softmax classifier with N units, where N is the number of classes, set to 2 in this study. The model is trained using the Adam optimizer with a batch size of 64. We train the model for 50 and 30 training iterations in the subject-dependent and subject-independent approaches, respectively.

Table 1. EEGNet network structure (Lawhern et al., 2018).

Layer               Number of filters   Kernel size   Padding   Output        Parameters
Input                                                           (1,32,384)
Conv2D              8                   1×32          same      (8,32,384)    256
BatchNorm2D                                                     (8,32,384)    16
DepthwiseConv2D     16                  32×1          valid     (16,1,384)    512
BatchNorm2D                                                     (16,1,384)    32
ELU Activation                                                  (16,1,384)
AveragePooling2D                        1×4           valid     (16,1,96)
DropOut                                                         (16,1,96)
SeparableConv2D     16                  1×16          same      (16,1,96)     512
BatchNorm2D                                                     (16,1,96)     32
ELU Activation                                                  (16,1,96)
AveragePooling2D                        1×8           valid     (16,1,12)
DropOut                                                         (16,1,12)
Flatten                                                         (16×12)
Dense                                                           (2)           384
Softmax Activation
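The parameter counts in Table 1 can be reproduced with simple arithmetic. The sketch below assumes bias-free convolution and dense layers, a depth multiplier of 2 in the depthwise layer, and that only the two trainable batch-normalization parameters (scale and shift) per channel are counted; these assumptions are inferred from the numbers in the table rather than stated in the text.

```python
C, T = 32, 384   # channels and time samples of the input

params = {
    # 1 input map -> 8 filters, each with 32 weights (1×32 kernel)
    "Conv2D": 1 * 8 * 32,
    "BatchNorm2D_1": 2 * 8,
    # 8 input maps × depth multiplier 2, each with a 32×1 kernel over channels
    "DepthwiseConv2D": 8 * 2 * C,
    "BatchNorm2D_2": 2 * 16,
    # depthwise part (16 maps × 16 weights) + pointwise part (16 × 16)
    "SeparableConv2D": 16 * 16 + 16 * 16,
    "BatchNorm2D_3": 2 * 16,
}

# Temporal length after the two average-pooling layers (1×4, then 1×8).
t_out = T // 4 // 8
params["Dense"] = 16 * t_out * 2   # flattened features × 2 classes

total = sum(params.values())
for name, count in params.items():
    print(f"{name:16s} {count}")
print("total", total)
```

Under these assumptions the network has fewer than two thousand trainable parameters, which is consistent with its description as a compact model.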

2.3.2. DeepConvNet

DeepConvNet (Schirrmeister et al., 2017) is an EEG decoding network that is compatible with any type of feature. The structure of this network is shown in Table 2. The network consists of five convolution layers and one dense softmax classifier.

Table 2. DeepConvNet network structure (Schirrmeister et al., 2017).

Block   Layer            Activation   Padding   Filters   Size
1       Convolution      Linear       valid     25        1,5
        Spatial filter   Linear       valid     25        32,1
        Max Pooling                                       1,2
2       Convolution      Linear       valid     50        1,5
        Max Pooling                                       1,2
3       Convolution      Linear       valid     100       1,5
        Max Pooling                                       1,2
4       Convolution      Linear       valid     200       1,5
        Max Pooling                                       1,2
5       Classification   Softmax                2
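To see how the temporal dimension shrinks through the blocks of Table 2, the following sketch walks a 384-sample epoch through the four convolution and pooling stages. It assumes stride-1 valid convolutions and non-overlapping pooling with the listed size (rounding down), which the table does not state explicitly; the spatial filter collapses the 32 channels and leaves the time axis unchanged.

```python
def conv_valid(length, kernel):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return length - kernel + 1

def pool(length, size):
    """Output length of non-overlapping pooling (floor division)."""
    return length // size

T = 384                       # time samples per 3-second epoch at 128 Hz
FILTERS = [25, 50, 100, 200]  # filters in blocks 1 to 4

length = T
trace = []
for n_filters in FILTERS:
    length = conv_valid(length, 5)   # temporal kernel (1,5)
    length = pool(length, 2)         # max pooling (1,2)
    trace.append((n_filters, length))

features = FILTERS[-1] * length      # inputs to the softmax classifier
print(trace, features)
```

Under these assumptions the classifier receives 200 feature maps of 20 time steps each.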

2.3.3. ShallowConvNet

ShallowConvNet (Schirrmeister et al., 2017) has a shallower architecture than DeepConvNet and is designed to decode band power features of the signals. The structure of this network is shown in Table 3. The model consists of a temporal convolutional layer followed by a spatial one, a mean pooling layer and, finally, a classification layer.

Table 3. ShallowConvNet network structure (Schirrmeister et al., 2017).

Layer            Activation   Padding   Filters   Size
Convolution      Linear       same      40        1,13
Spatial filter   Linear       valid     40        32,1
Mean Pooling                                      strides=(1, 7)
Classification   Softmax                2
3. Simulation Results

As explained in Section 2, in this study, EEG signals from the DEAP dataset were used for emotion detection on an

embedded system. After preprocessing, baseline removal, and segmentation, the signals were fed to EEGNet,

DeepConvNet and ShallowConvNet using subject-dependent and subject-independent approaches to recognize the

emotional states. For evaluation of the model, 5-fold cross-validation was used and for comparing the results,

accuracy (Acc) and F-score (Yin et al., 2021) were utilized. The Acc parameter is defined as:

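$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad (1)$$

in which TP and TN are the correctly classified cases (low arousal/negative valence is named the positive emotion and high arousal/positive valence the negative emotion), and FP and FN are the incorrectly classified ones.

The F-score, which combines the precision (Pre) and recall (Rec) rates, is defined as:

$$F\text{-score} = \frac{2 \cdot \mathrm{Rec} \cdot \mathrm{Pre}}{\mathrm{Rec} + \mathrm{Pre}}, \qquad (2)$$

in which Pre and Rec are:

$$\mathrm{Pre} = \frac{TP}{TP + FP}, \qquad \mathrm{Rec} = \frac{TP}{TP + FN}. \qquad (3)$$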
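The two metrics can be computed directly from the confusion counts. The sketch below uses the standard definitions of accuracy, precision, recall and F-score; the function name `binary_metrics` and the toy label vectors are illustrative only.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f_score = 2 * rec * pre / (rec + pre) if rec + pre else 0.0
    return {"acc": acc, "pre": pre, "rec": rec, "f_score": f_score}

# Toy example: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a 5-fold setting, these values would be computed on each test fold and then averaged.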

Table 4 compares the results of the three networks, EEGNet, DeepConvNet and ShallowConvNet, in the subject-independent mode. ShallowConvNet outperforms the other two models in this approach, achieving the best accuracy of 90.49% for valence and 90.97% for arousal.

Table 4. Classification results of subject-independent approach using EEGNet, ShallowConvNet and DeepConvNet.

                  Valence                         Arousal
Model             Accuracy       F score         Accuracy       F score
EEGNet            70.85 ± 0.85   72.12 ± 1.26    73.30 ± 0.85   75.91 ± 1.41
ShallowConvNet    90.49 ± 0.93   91.38 ± 0.82    90.97 ± 1.35   91.96 ± 1.39
DeepConvNet       86.37 ± 0.64   87.60 ± 0.61    88.62 ± 0.44   90.21 ± 0.43

Tables 5 and 6 show the accuracy and F-score results of the subject-dependent method for valence and arousal, respectively. For comparison, the valence and arousal accuracies obtained with this method are presented in Fig. 2 and Fig. 3, respectively.

Table 5. Classification results of subject-dependent approach using EEGNet, ShallowConvNet and DeepConvNet for valence.

Model             Accuracy (in average)   F-score (in average)
EEGNet            86.9 ± 8.3              87.1 ± 9.32
ShallowConvNet    99.11 ± 1.16            99.15 ± 1.18
DeepConvNet       96.82 ± 5.43            96.81 ± 6.57

Table 6. Classification results of subject-dependent approach using EEGNet, ShallowConvNet and DeepConvNet for arousal.

Model             Accuracy (in average)   F-score (in average)
EEGNet            87.06 ± 8.87            86.45 ± 12.4
ShallowConvNet    99.19 ± 1.02            99.22 ± 1.09
DeepConvNet       96.91 ± 6.56            96.87 ± 7.05

Fig 2. The accuracy of the three networks in the subject-dependent approach for the valence dimension.
Fig 3. The accuracy of the three networks in the subject-dependent approach for the arousal dimension.

As can be seen in Tables 5 and 6 and in Figs. 2 and 3, ShallowConvNet is the most accurate network in the subject-dependent approach for both the arousal and valence dimensions. Furthermore, the results in Table 4 show that ShallowConvNet is also the most accurate in the subject-independent approach. Therefore, we used this network in the embedded system.

Table 7 compares our results with those of other studies. For comparability, only studies on the DEAP dataset are selected. As the table shows, our method is more accurate in both the arousal and valence dimensions than the methods presented in the other articles.

Table 7. Comparison of the accuracy in different studies on emotion recognition.

Study                                 Year   Method                                 Accuracy
(Yin et al., 2021)                    2021   ECLGCNN                                90.45% in valence and 90.60% in arousal
(Cui et al., 2020)                    2020   RACNN                                  96.65 ± 2.65 in valence and 97.11 ± 2.01 in arousal
(Goshvarpour & Goshvarpour, 2020b)    2020   Lagged Poincare Indices, RSSF, SVM     98.97% in valence and 98.94% in arousal
(R. Li et al., 2021)                  2021   SVM, CNN                               52.50 ± 11.29% in valence and 56.00 ± 12.46% in arousal
(Nath et al., 2020)                   2020   LSTM                                   94.69% in valence and 93.13% in arousal
(Huang et al., 2021)                  2021   BiDCNN                                 94.38% in valence and 94.72% in arousal
Our method                                   EEGNet, ShallowConvNet, DeepConvNet    99.11% for valence and 99.19% for arousal (on average, using the subject-dependent method and ShallowConvNet)

4. Hardware Implementation

In this study, a Raspberry Pi 4 board was used to design the embedded system. This hardware has a quad-core, 64-bit ARM Cortex-A72 (ARMv8) processor running at 1.5 GHz, 2 GB of LPDDR4 RAM and several communication interfaces. The board and the connections are shown in Fig 4 (Ltd, n.d., p. 4).
Fig 4. Hardware of the embedded system

In the implementation step, the operating system (Armbian) was installed on the SD card. Commands were sent to the board over the SSH (Secure Shell) protocol, and socket programming was used to feed the data to the processor. The emotional state is recognized for every three-second epoch of the received signals.
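One way to feed three-second epochs to the board over a socket is sketched below. The wire format (a 4-byte little-endian length header followed by float32 samples) and the helper names `send_epoch`, `recv_exact` and `recv_epoch` are assumptions made for illustration; the text does not describe the actual protocol. A local socket pair stands in for the network link between the data source and the Raspberry Pi.

```python
import socket
import struct
import threading

import numpy as np

N_CHANNELS, N_SAMPLES = 32, 384   # one 3-second epoch at 128 Hz

def send_epoch(sock, epoch):
    """Send one epoch as a length-prefixed block of little-endian float32."""
    payload = np.ascontiguousarray(epoch, dtype="<f4").tobytes()
    sock.sendall(struct.pack("<I", len(payload)) + payload)

def recv_exact(sock, n_bytes):
    """Read exactly n_bytes, since recv() may return fewer bytes per call."""
    chunks, remaining = [], n_bytes
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before a full epoch arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_epoch(sock):
    """Receive one epoch and restore its (channels, samples) shape."""
    (length,) = struct.unpack("<I", recv_exact(sock, 4))
    data = recv_exact(sock, length)
    return np.frombuffer(data, dtype="<f4").reshape(N_CHANNELS, N_SAMPLES)

# Demonstration with a local socket pair instead of a real network connection.
sender, receiver = socket.socketpair()
epoch = np.random.default_rng(0).normal(size=(N_CHANNELS, N_SAMPLES)).astype(np.float32)
worker = threading.Thread(target=send_epoch, args=(sender, epoch))
worker.start()
received = recv_epoch(receiver)
worker.join()
sender.close()
receiver.close()
print(received.shape)
```

On the board, each received epoch would be passed through the same preprocessing as in training and then to the classifier.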

To reduce the size of the model and increase its speed, TensorFlow Lite tools were used. The TensorFlow Lite version of the model executes efficiently on devices with limited resources. In this work, to reduce the size of the model further, different quantization techniques were applied. Quantization reduces the precision of the numbers used to represent a model's parameters. Optimization and conversion reduce the size of the model and the latency, with minimal (or no) loss in accuracy.
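The effect of reduced numeric precision on storage can be illustrated with a small NumPy sketch. It is a conceptual example only: the symmetric per-tensor 8-bit scheme and the helper names are assumptions, and it does not reproduce the exact procedure that the TensorFlow Lite converter applies.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization of float32 weights to int8."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from their int8 representation."""
    return q.astype(np.float32) * scale

# Hypothetical weight tensor (40 filters with 13 taps each).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=(40, 13)).astype(np.float32)

q, scale = quantize_int8(weights)
w16 = weights.astype(np.float16)

max_err = float(np.max(np.abs(dequantize(q, scale) - weights)))
print("float32 bytes:", weights.nbytes)
print("float16 bytes:", w16.nbytes)
print("int8 bytes:   ", q.nbytes)
print("max int8 round-trip error:", max_err)
```

Storing weights in 16 bits halves the raw weight size and storing them in 8 bits quarters it, while the rounding error stays within half a quantization step. A converted model file also contains metadata and non-quantized tensors, so its overall size need not shrink by exactly these factors.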

The results of applying two quantization techniques are shown in Tables 8 and 9 for the subject-dependent and subject-independent approaches, respectively. Optimization and conversion result in a significant reduction in model size and, in most cases, faster computation without loss of accuracy. According to the results, the best model to implement is the subject-dependent ShallowConvNet reduced in size with post-training dynamic range quantization.


Table 8. Subject dependent results on board

Technique                                  Dimension   Accuracy   F score   Latency (ms)   Model Size (K bytes)
Without quantization                       Arousal     99.19      99.20     12.5629        221
                                           Valence     99.09      99.13
Post-training dynamic range quantization   Arousal     99.20      99.21     12.5554        61
                                           Valence     99.09      99.12
Post-training float16 quantization         Arousal     99.19      99.20     12.8859        114
                                           Valence     99.09      99.13

Table 9. Subject independent results on board

Technique                                  Dimension   Accuracy   F score   Latency (ms)   Model Size (K bytes)
Without quantization                       Arousal     91.70      92.84     12.7359        221
                                           Valence     91.37      92.24
Post-training dynamic range quantization   Arousal     91.74      92.88     12.5564        61
                                           Valence     91.35      92.21
Post-training float16 quantization         Arousal     91.70      92.84     12.5254        113
                                           Valence     91.37      92.24
5. Conclusion

In this study, we recognized emotional states from EEG signals by implementing three convolutional networks: EEGNet, ShallowConvNet and DeepConvNet. For every network, we used two methods: subject-dependent and subject-independent. The best average classification accuracy, 99.11% in valence and 99.19% in arousal, was achieved using ShallowConvNet with the subject-dependent method. Furthermore, since the models do not need feature extraction and selection steps, the number of processing steps is reduced. This makes it possible to implement the algorithm on embedded systems. We used a Raspberry Pi processor in our embedded system. After optimization and quantization, we obtained a lightweight model that recognizes the emotional state for every three-second epoch of the received signals. Such hardware could be used in practical devices such as emotion-detection wearable headbands.

Future studies aim to use mobile and wearable sensors to collect facial expressions and other physiological signals such as ECG and EEG, and to combine them in an appropriate framework to improve the accuracy of real-time emotion detection. In this study, as in most papers in the literature, emotions were identified in the two dimensions of valence and arousal. Expanding the model to include additional dimensions may also be considered in future work. For example, by analyzing situational information, the subject's state could be predicted in a three-dimensional model of emotions, namely arousal, valence and position.


This work is supported by Isfahan University of Medical Sciences, Grant No. 2400207.


Al-Nafjan, A., Hosny, M., Al-Ohali, Y., & Al-Wabil, A. (2017). Review and Classification of Emotion Recognition

Based on EEG Brain-Computer Interface System Research: A Systematic Review. Applied Sciences, 7(12),

1239. https://doi.org/10.3390/app7121239

Ayata, D., Yaslan, Y., & Kamaşak, M. (2016). Emotion recognition via random forest and galvanic skin response:

Comparison of time based feature sets, window sizes and wavelet approaches. 2016 Medical Technologies
National Congress (TIPTEKNO), 1–4. https://doi.org/10.1109/TIPTEKNO.2016.7863130

Cui, H., Liu, A., Zhang, X., Chen, X., Wang, K., & Chen, X. (2020). EEG-based emotion recognition using an end-

to-end regional-asymmetric convolutional neural network. Knowledge-Based Systems, 205, 106243.


Domínguez-Jiménez, J. A., Campo-Landines, K. C., Martínez-Santos, J. C., Delahoz, E. J., & Contreras-Ortiz, S. H.

(2020). A machine learning model for emotion recognition from physiological signals. Biomedical Signal

Processing and Control, 55, 101646. https://doi.org/10.1016/j.bspc.2019.101646

Egger, M., Ley, M., & Hanke, S. (2019). Emotion Recognition from Physiological Signal Analysis: A Review.

Electronic Notes in Theoretical Computer Science, 343, 35–55. https://doi.org/10.1016/j.entcs.2019.04.009

Gannouni, S., Aledaily, A., Belwafi, K., & Aboalsamh, H. (2021). Emotion detection using electroencephalography

signals and a zero-time windowing-based epoch estimation and relevant electrode identification. Scientific

Reports, 11(1), 7071. https://doi.org/10.1038/s41598-021-86345-5

Goshvarpour, A., Abbasi, A., & Goshvarpour, A. (2017). An accurate emotion recognition system using ECG and

GSR signals and matching pursuit method. Biomedical Journal, 40(6), 355–368.


Goshvarpour, A., & Goshvarpour, A. (2020a). The potential of photoplethysmogram and galvanic skin response in

emotion recognition using nonlinear features. Physical and Engineering Sciences in Medicine, 43(1), 119–

134. https://doi.org/10.1007/s13246-019-00825-7

Goshvarpour, A., & Goshvarpour, A. (2020b). A Novel Approach for EEG Electrode Selection in Automated

Emotion Recognition Based on Lagged Poincare’s Indices and sLORETA. Cognitive Computation, 12(3),

602–618. https://doi.org/10.1007/s12559-019-09699-z

Huang, D., Chen, S., Liu, C., Zheng, L., Tian, Z., & Jiang, D. (2021). Differences first in asymmetric brain: A bi-

hemisphere discrepancy convolutional neural network for EEG emotion recognition. Neurocomputing, 448,

140–151. https://doi.org/10.1016/j.neucom.2021.03.105

Javidan, M., Yazdchi, M., Baharlouei, Z., & Mahnam, A. (2021). Feature and channel selection for designing a

regression-based continuous-variable emotion recognition system with two EEG channels. Biomedical

Signal Processing and Control, 70, 102979. https://doi.org/10.1016/j.bspc.2021.102979

Keren, G., Kirschstein, T., Marchi, E., Ringeval, F., & Schuller, B. (2017). End-to-end Learning for Dimensional

Emotion Recognition from Physiological Signals. 985–990. https://hal.archives-ouvertes.fr/hal-02080895

Khateeb, M., Anwar, S. M., & Alnowami, M. (2021). Multi-Domain Feature Fusion for Emotion Classification

Using DEAP Dataset. IEEE Access, 9, 12134–12142. https://doi.org/10.1109/ACCESS.2021.3051281

Ko, B. C. (2018). A Brief Review of Facial Emotion Recognition Based on Visual Information. Sensors, 18(2), 401.


Lawhern, V. J., Solon, A. J., Waytowich, N. R., Gordon, S. M., Hung, C. P., & Lance, B. J. (2018). EEGNet: A

Compact Convolutional Network for EEG-based Brain-Computer Interfaces. ArXiv:1611.08024 [Cs, q-

Bio, Stat]. https://doi.org/10.1088/1741-2552/aace8c

Li, R., Liang, Y., Liu, X., Wang, B., Huang, W., Cai, Z., Ye, Y., Qiu, L., & Pan, J. (2021). MindLink-Eumpy: An

Open-Source Python Toolbox for Multimodal Emotion Recognition. Frontiers in Human Neuroscience, 15.


Li, X., Song, D., Zhang, P., Zhang, Y., Hou, Y., & Hu, B. (2018). Exploring EEG Features in Cross-Subject

Emotion Recognition. Frontiers in Neuroscience, 12, 162. https://doi.org/10.3389/fnins.2018.00162

Lieskovská, E., Jakubec, M., Jarina, R., & Chmulík, M. (2021). A Review on Speech Emotion Recognition Using

Deep Learning and Attention Mechanism. Electronics, 10(10), 1163.


Ltd, R. P. (n.d.). Raspberry Pi 4 Model B specifications. Raspberry Pi. Retrieved March 2, 2022, from


Moon, S.-E., Jang, S., & Lee, J.-S. (2018). Convolutional Neural Network Approach for Eeg-Based Emotion

Recognition Using Brain Connectivity and its Spatial Information. 2018 IEEE International Conference on

Acoustics, Speech and Signal Processing (ICASSP), 2556–2560.


Nath, D., Anubhav, Singh, M., Sethia, D., Kalra, D., & Indu, S. (2020). A Comparative Study of Subject-Dependent

and Subject-Independent Strategies for EEG-Based Emotion Recognition using LSTM Network.

Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis, 142–147.

Nikolova, D., Mihaylova, P., Manolova, A., & Georgieva, P. (2019). ECG-Based Human Emotion Recognition

Across Multiple Subjects. In V. Poulkov (Ed.), Future Access Enablers for Ubiquitous and Intelligent

Infrastructures (pp. 25–36). Springer International Publishing. https://doi.org/10.1007/978-3-030-23976-


Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–

1178. https://doi.org/10.1037/h0077714

Schirrmeister, R. T., Springenberg, J. T., Fiederer, L. D. J., Glasstetter, M., Eggensperger, K., Tangermann, M.,

Hutter, F., Burgard, W., & Ball, T. (2017). Deep learning with convolutional neural networks for EEG

decoding and visualization: Convolutional Neural Networks in EEG Analysis. Human Brain Mapping,

38(11), 5391–5420. https://doi.org/10.1002/hbm.23730

Sepúlveda, A., Castillo, F., Palma, C., & Rodriguez-Fernandez, M. (2021). Emotion Recognition from ECG Signals

Using Wavelet Scattering and Machine Learning. Applied Sciences, 11(11), 4945.


Suhaimi, N. S., Mountstephens, J., & Teo, J. (2020). EEG-Based Emotion Recognition: A State-of-the-Art Review

of Current Trends and Opportunities. Computational Intelligence and Neuroscience, 2020, 1–19.


Tan, C., Ceballos, G., Kasabov, N., & Puthanmadam Subramaniyam, N. (2020). FusionSense: Emotion

Classification Using Feature Fusion of Multimodal Data and Deep Learning in a Brain-Inspired Spiking

Neural Network. Sensors, 20(18), 5328. https://doi.org/10.3390/s20185328

Tarnowski, P., Kołodziej, M., Majkowski, A., & Rak, R. J. (2018). Combined analysis of GSR and EEG signals for

emotion recognition. 2018 International Interdisciplinary PhD Workshop (IIPhDW), 137–141.


Tripathi, S., Acharya, S., Sharma, R., Mittal, S., & Bhattacharya, S. (2017). Using Deep and Convolutional Neural

Networks for Accurate Emotion Classification on DEAP Data. Proceedings of the AAAI Conference on

Artificial Intelligence, 31(2), 4746–4752.

Villarejo, M. V., Zapirain, B. G., & Zorrilla, A. M. (2012). A Stress Sensor Based on Galvanic Skin Response

(GSR) Controlled by ZigBee. Sensors, 12(5), 6075–6101. https://doi.org/10.3390/s120506075

Wang, X.-W., Nie, D., & Lu, B.-L. (2014). Emotional state classification from EEG data using machine learning

approach. Neurocomputing, 129, 94–106. https://doi.org/10.1016/j.neucom.2013.06.046

Yin, Y., Zheng, X., Hu, B., Zhang, Y., & Cui, X. (2021). EEG emotion recognition using fusion model of graph

convolutional neural networks and LSTM. Applied Soft Computing, 100, 106954.


Zhang, J., Yin, Z., Chen, P., & Nichele, S. (2020). Emotion recognition using multi-modal data and machine

learning techniques: A tutorial and review. Information Fusion, 59, 103–126.


Zhang, Q., Chen, X., Zhan, Q., Yang, T., & Xia, S. (2017). Respiration-based emotion recognition with deep

learning. Computers in Industry, 92–93, 84–90. https://doi.org/10.1016/j.compind.2017.04.005

Zhang, Y., Ji, X., & Zhang, S. (2016). An approach to EEG-based emotion recognition using combined feature

extraction method. Neuroscience Letters, 633, 152–157. https://doi.org/10.1016/j.neulet.2016.09.037
