Engineering and Technology Journal e-ISSN: 2456-3358
Volume 06 Issue 03 March-2021, Page No.-822-828
DOI: 10.47191/etj/v6i3.04, I.F. – 6.39
© 2021, ETJ
Emotion Detection Using Deep Learning Algorithm
Shital S. Yadav1, Anup S. Vibhute2
1,2Department of Electronics and Telecommunication Engineering, SVERI's College of Engineering, Pandharpur, Maharashtra, India.
ABSTRACT: Automatic emotion detection is a key task in human–machine interaction, where emotion detection makes the system more natural. In this paper, we propose an emotion detection method using a deep learning algorithm. The proposed algorithm uses an end-to-end CNN. To increase the computational efficiency of the deep network, we make use of the trained weight parameters of MobileNet [1] to initialize the weight parameters of our system. To make our system independent of the input image size, we place a global average pooling layer on top of its last convolution layer. The proposed system is validated for emotion detection using two benchmark datasets, viz. Cohn–Kanade+ (CK+) and Japanese Female Facial Expression (JAFFE). The experimental results show that the proposed method outperforms the other existing methods for emotion detection.
KEYWORDS: Emotion detection, Human machine interaction, deep learning
1. INTRODUCTION
Individuals communicate their real intent and emotions through outward channels such as speech, gestures, and facial expressions. Facial features provide valuable information about an individual's emotional state, theory of mind, public persona, and psychology [2]. Fully automated Facial Expression Recognition (FER) is a non-intrusive approach to the analysis of human affective behavior. FER systems play a key role in human–computer interaction, tracking, deception or lie detection, behavioral profiling, and healthcare applications. Automated emotion analysis has received increasing attention in recent years because of the wide variety of such applications and their continued development.
In [3], the authors make use of the facial action coding system (FACS) for emotion detection. This study observed that the characterization of emotions is approximately the same across the globe. They also categorized human emotions into anger, sadness, fear, happiness, disgust, and surprise. A traditional emotion detection approach includes (1) image acquisition, (2) image pre-processing, (3) feature extraction, and (4) classification (emotion detection). The accuracy of such a traditional emotion detection system depends upon the robustness of the feature extraction and classification stages.
In this paper, we propose an emotion detection approach using a convolution neural network (CNN). The proposed end-to-end CNN is named ENet. The rest of the manuscript is organized as follows: Section 2 reviews the existing approaches for emotion detection, Section 3 presents the proposed approach, Section 4 reports the experimental analysis, and Section 5 concludes the paper.
2. LITERATURE SURVEY
A general FER pipeline follows five stages: image capturing, pre-processing, feature extraction, recognition, and post-processing. The usefulness of such a pipeline depends to a large degree on the exact mechanisms of feature extraction and classification. Even with the best classification model, insufficient feature extraction will degrade the performance. For a reliable FER system, developing a suitable feature descriptor is indeed vital.
Feature extraction techniques may be broadly grouped into two types: handcrafted features and learned features [4]. Handcrafted features are designed beforehand to capture specific facial expressions, while learned features are encoded using convolution neural networks (CNN). The CNN based methods [5–17] jointly learn the relevant attributes and weights to classify the facial expression. Handcrafted features proposed in existing methods broadly fall under appearance-based features and geometric features. The geometric features [18, 19] encode the face image with the help of geometric properties such as deformation, contour, and various other geometric cues. Zhang et al. [20] represented a face image by 34 facial points and used them as landmark points. Further, these landmark points are used to extract geometric features. Valstar et al. [21] proposed to track the facial points and detect the AUs (Action Units) in the face image. The facial expressions can then be recognized based on the detected AUs in the image. The geometric features fail to
identify the minute characteristics such as ridges and skin
texture changes and are dependent on reliable and accurate
feature detection and tracking. In addition, preprocessing
techniques are required to localize various facial components
before the extraction of facial features.
Appearance based methods have been widely used to
measure the physical appearance of a facial image.
Especially, the local feature descriptor based methods for
facial appearance analysis have gained popularity due to their
ease of implementation, pose invariance, and robustness to
illumination. These methods capture the spatial topology of
the image in the local neighborhood. Shan et al. [22] used
Local Binary Pattern (LBP) to extract the facial features for
expression recognition. Lai et al. [23] employed a two-stage feature extraction: they first retrieved thresholded LBP responses and then applied centre-symmetric LBP. In local
directional patterns (LDP) [24], the local edge responses in
eight directions were computed using eight different masks.
Extraction of the salient directional responses increases the
discriminative capability of the descriptors. Rivera et al. [25]
represented the directional information by encoding it into a more discriminative Local Directional Number (LDN) pattern.
Rivera et al. [26] proposed to extract the texture information
by identifying the principal directions and encoded the
intensity variation of the principal directions into response
numbers called Local Directional Ternary Pattern (LDTP).
Ryu et al. [27] used the ternary patterns to extract the
directional information and designed a multi-level grid based
approach to characterize the coarse and finer features
separately. The coarse grids are used for stable codes, which
are closely related to non-expressions, whereas the finer grids
are used for active codes which are closely related to
expressions. Furthermore, the facial image can be divided
into non-overlapping regions to extract features, which
increases the performance of the system [22]. The selection of salient grids and the size and location of the grids directly affect the recognition accuracy of the FER system.
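As an illustration of such region-based appearance features (not part of the proposed method), the following minimal sketch computes grid-wise LBP histograms in the spirit of [22]; scikit-image, a pre-cropped grayscale face array, and the 7x6 grid size are assumptions made for the example:

# Region-wise LBP features: divide the face into non-overlapping grids,
# compute a uniform-LBP histogram per grid, and concatenate the histograms.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_grid_features(face, grid=(7, 6), points=8, radius=1):
    lbp = local_binary_pattern(face, points, radius, method="uniform")
    bins = points + 2                          # number of uniform LBP codes
    feats = []
    for band in np.array_split(lbp, grid[0], axis=0):
        for cell in np.array_split(band, grid[1], axis=1):
            hist, _ = np.histogram(cell, bins=bins, range=(0, bins), density=True)
            feats.append(hist)
    return np.concatenate(feats)               # e.g. 7*6*10 = 420-D descriptor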
Fig. 1: End-to-end deep network for emotion detection
Lee et al. [28] proposed to use the sparse representation of
images to reduce the intra-class variations of expressions.
Mohammadzade and Hatzinakos [29] introduced the concept
of expression subspace which represents a particular
expression with one subspace and new expressions can be
synthesized from an image by applying projections into
different expression subspaces. Hybrid methods incorporate
various techniques from geometric and appearance-based
approaches to attain enhanced performance. Zhang et al. [30]
proposed to capture facial movement features based on
distance features. These distance features are computed by
extracting the salient patch-based Gabor features. Happy et
al. [31] extracted the salient patches from various active
facial patches during the emotion elicitation. They further
extracted LBP features from the salient patches to generate
the feature vector and classified the expression images
using the SVM classifier. Furthermore, many significant
works for the advancement of FER systems were done in [32–
34].
More recent works in facial expression analysis have used
deep learning approaches to solve the recognition problem.
The deep learning methods learn both the feature extraction
and the network weight parameters for accurate classification
using the training data. Burkert et al. [5] designed a deep
convolutional neural network (CNN) inspired by GoogLeNet
and introduced a parallel feature extraction (FeatEx) block to
extract features at different scales. Mollahosseini et al. [6]
applied the concept of network in network architecture and
added inception layers after two traditional CNN layers. The
inception layers are concatenated as output and connected to
the fully connected layers. Barsoum et al. [7] trained a
customized VGG13 network to verify different crowd
sourced label distribution techniques and undertook facial
expression classification as a case study. Hasani and Mahoor
[8] designed a network consisting of 3D inception ResNet
layers followed by a Long Short Term Memory (LSTM)
module. These layers extract both the spatial and temporal
relations in the face image and frame sequences in the
video, respectively. A two-step training method was proposed by Ding et al. [9]: in the first stage, the convolution layer weights are regularized, and in the second stage, fully connected layers are added to the pre-tuned convolution layers and the complete network is trained to learn the optimal classification parameters. Combining multiple CNN architectures can yield better classification accuracy. With this consideration, Pons et al. [10] proposed to improve FER accuracy through supervised learning of a committee of CNNs.
Kim et al. [11] combined the decisions from a hierarchical committee of CNNs with a hand-crafted hierarchical decision rule. Other CNN-based systems such as VGG [12], ResNet [13], DTAGN [14], DTAGN-Joint [15], spatio-temporal networks [16], and GCNet [17] have also driven rapid progress in the field of FER.
3. PROPOSED METHOD FOR EMOTION
DETECTION
In this section, the proposed approach for emotion detection is discussed. We propose an end-to-end convolution neural network named ENet for emotion detection from images. Keeping in mind the computational efficiency of the deep network, we make use of the pre-trained weight parameters of the existing MobileNet [1] to initialize the weight parameters of ENet. MobileNet is a deep network designed for object recognition. We choose the pre-trained weight parameters of MobileNet due to the following facts:
• It is trained on a large-scale dataset for an object recognition task.
• It is a compact and computationally less expensive deep network as compared to the other existing deep networks [6, 12, 13].
The proposed ENet is divided into two parts, viz. (1) feature extraction and (2) emotion detection.
3.1 Feature Extraction
The basic building blocks of CNN to extract features are
convolution, pooling and rectified linear unit (ReLU) layers.
A standard convolution layer both filters and combines input feature maps into a new set of features in one step, which increases the number of multiplications and additions required for the convolution operation. In [1], the authors showed an effective way to reduce the number of computations by splitting the standard convolution operation into two layers without much affecting the accuracy. Thus, in ENet,
instead of using standard convolution filters, we use factorized
convolution filters [1] which factorize a standard convolution
into a depth-wise convolution and a point-wise convolution.
Here, the depth-wise convolution applies a single filter to each input feature map, while the point-wise convolution applies a 1×1 convolution to combine the outputs of the depth-wise convolution. This factorization drastically reduces computation and model size. The feature maps of both layers are passed through batch normalization and ReLU nonlinearities. To reduce the number of computations further, convolution layers with stride 2 are used instead of convolution followed by a pooling layer. These building blocks are used to build the feature extraction module of ENet for emotion detection. The detailed network architecture is given in Table 1.
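For illustration, the following is a minimal sketch of one such factorized building block; PyTorch is assumed here, as the paper does not specify an implementation framework:

# Depthwise 3x3 convolution followed by pointwise 1x1 convolution,
# each with batch normalization and ReLU, as in MobileNet [1].
import torch
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input feature map (groups=in_ch)
        self.dw = nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                            groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        # Pointwise: 1x1 convolution combines the depthwise outputs
        self.pw = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.bn1(self.dw(x)))
        return self.relu(self.bn2(self.pw(x)))

# Example: the 64x64x32 -> 64x64x64 stage of Table 1
x = torch.randn(1, 32, 64, 64)
print(DepthwiseSeparableBlock(32, 64)(x).shape)    # torch.Size([1, 64, 64, 64])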
3.2 Emotion Detection
To detect the human emotion from the extracted features, we employ a global average pooling layer followed by a classification layer on top of the extracted feature maps.
3.2.1 Global Average Pooling (GAP)
GAP is an operation that calculates the average output of each feature map in the previous layer. This fairly simple operation reduces the data significantly and prepares the model for the final classification layer. Also, due to the averaging operation over the feature maps, the model becomes more robust to spatial translations in the data.
3.2.2 Classification Layer
The classification layer comprises seven neurons corresponding to the seven emotion classes, viz. anger, sadness, fear, happiness, disgust, surprise, and neutral. A SoftMax layer is employed as the activation function.
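A minimal sketch of this detection head, following the layer sizes of Table 1 (PyTorch is assumed; the paper does not specify a framework):

import torch
import torch.nn as nn

head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),       # GAP: 4x4x1024 -> 1x1x1024, independent of input size
    nn.Flatten(),
    nn.Linear(1024, 512), nn.ReLU(inplace=True),
    nn.Linear(512, 128), nn.ReLU(inplace=True),
    nn.Linear(128, 7),             # seven emotion classes
    nn.Softmax(dim=1),             # SoftMax activation of the classifier
)

features = torch.randn(1, 1024, 4, 4)   # output of the feature extraction module
print(head(features).shape)             # torch.Size([1, 7])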
Table 1: Network architecture details of the proposed ENet for emotion recognition.

Feature Extraction
Input Size | Type/Stride | Filter Shape   | Output Size
128x128x3  | Conv/2      | 3x3x3x32       | 64x64x32
64x64x32   | Conv dw/1   | 3x3x32 dw      | 64x64x32
64x64x32   | Conv pw/1   | 1x1x32x64      | 64x64x64
64x64x64   | Conv dw/2   | 3x3x64 dw      | 32x32x64
32x32x64   | Conv pw/1   | 1x1x64x128     | 32x32x128
32x32x128  | Conv dw/1   | 3x3x128 dw     | 32x32x128
32x32x128  | Conv pw/1   | 1x1x128x128    | 32x32x128
32x32x128  | Conv dw/2   | 3x3x128 dw     | 16x16x128
16x16x128  | Conv pw/1   | 1x1x128x256    | 16x16x256
16x16x256  | Conv dw/1   | 3x3x256 dw     | 16x16x256
16x16x256  | Conv pw/1   | 1x1x256x256    | 16x16x256
16x16x256  | Conv dw/2   | 3x3x256 dw     | 8x8x256
8x8x256    | Conv pw/1   | 1x1x256x512    | 8x8x512
8x8x512    | Conv dw/1   | 3x3x512 dw     | 8x8x512
8x8x512    | Conv pw/1   | 1x1x512x512    | 8x8x512
8x8x512    | Conv dw/2   | 3x3x512 dw     | 4x4x512
4x4x512    | Conv pw/1   | 1x1x512x1024   | 4x4x1024
4x4x1024   | Conv dw/1   | 3x3x1024 dw    | 4x4x1024
4x4x1024   | Conv pw/1   | 1x1x1024x1024  | 4x4x1024

Emotion Detection
4x4x1024   | GAP/1       | -              | 1x1x1024
1x1x1024   | FC          | 1024x512       | 1x1x512
1x1x512    | FC          | 512x128        | 1x1x128
1x1x128    | FC          | 128x7          | 1x1x7
1x1x7      | Classifier  | SoftMax        | 1x1x7

*GAP: Global Average Pooling Layer
*FC: Fully Connected Layer
4. EXPERIMENTAL ANALYSIS
In this section, we validate the proposed ENet for emotion detection from images. We consider two benchmark datasets for the evaluation, viz. Cohn–Kanade+ (CK+) [35] and the Japanese Female Facial Expression (JAFFE) dataset [36]. The considered evaluation measure is the emotion recognition accuracy (ERA), which is formulated as:

ERA = N1 / N      (1)

where ERA is the emotion recognition accuracy, N1 is the number of correctly detected samples, and N is the total number of samples.
In this study, the image set is randomly divided into N parts, where N−1 parts are used as the training set and the remaining part is used as the testing set. In our experiments, we randomly divide the dataset in a ratio of 80:20 to select the training and testing images, respectively. The final recognition accuracy is calculated by averaging the accuracy obtained over five such iterations.
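The evaluation protocol can be summarized by the following sketch (an illustration rather than the authors' code); NumPy, scikit-learn, and the placeholder train_and_evaluate routine standing in for ENet training are assumptions:

# Five random 80:20 splits; the reported score is the mean emotion
# recognition accuracy ERA = N1 / N over the five test sets.
import numpy as np
from sklearn.model_selection import train_test_split

def mean_recognition_accuracy(images, labels, train_and_evaluate, runs=5):
    accuracies = []
    for seed in range(runs):
        x_tr, x_te, y_tr, y_te = train_test_split(
            images, labels, test_size=0.2, stratify=labels, random_state=seed)
        y_pred = train_and_evaluate(x_tr, y_tr, x_te)   # returns predicted labels
        accuracies.append(np.mean(y_pred == y_te))      # ERA = N1 / N
    return float(np.mean(accuracies))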
Fig. 2: Sample images from each category of the CK+ and JAFFE datasets
4.1 Results on JAFFE Database
The JAFFE [36] dataset contains 213 facial images of ten Japanese females. The subjects posed for a neutral state and six basic facial expressions. Each expression set consists of almost the same number of images. The facial images were captured from the frontal view, and the hair of the female subjects was tied back to expose all the expressive zones of the facial region. Sample images from each emotion category are shown in Figure 2. The average recognition accuracy for the six-class and seven-class problems over the JAFFE dataset is given in Table 2. The proposed method achieves better recognition accuracy than state-of-the-art handcrafted approaches as well as some of the deep learning techniques. More specifically, the proposed method attains recognition rate improvements of 10.8% and 9.3% for the six-class problem and 11.1% and 9.2% for the seven-class problem over LBP [22] and LDN [25], respectively.
Table 2: Emotion recognition accuracy (%) of the proposed ENet and existing approaches on the CK+ and JAFFE datasets

Method         | CK+ 6EX | CK+ 7EX | JAFFE 6EX | JAFFE 7EX
LBP [22]       | 93.50   | 89.00   | 85.20     | 84.30
two-phase [23] | 88.20   | 79.50   | 83.30     | 80.50
LDP [24]       | 96.20   | 92.90   | 85.00     | 80.50
LDN [25]       | 94.80   | 91.70   | 86.70     | 86.20
LDTP [26]      | 95.30   | 91.90   | 83.30     | 82.90
LDTerP [27]    | 95.70   | 91.50   | 61.10     | 88.40
VGG16 [12]     | 96.70   | 95.20   | 66.70     | 91.10
VGG19 [12]     | 97.20   | 81.20   | 75.00     | 93.90
ResNet50 [13]  | 94.00   | 91.80   | 95.00     | 73.80
RADAP [37]     | 96.20   | 94.70   | 73.80     | 88.10
XRADAP [37]    | 96.60   | 93.80   | 64.30     | 90.50
ARADAP [37]    | 96.20   | 93.80   | 88.10     | –
DRADAP [37]    | 96.00   | 95.40   | 88.10     | –
ENet           | 98.44   | 97.22   | 96.00     | 95.40
4.2 Results on CK+ Database
The CK+ [35, 38] database includes 593 image sequences of
123 different subjects. The subjects are of American,
African–American, Asian and Latin origin. Each sequence
starts with the neutral state and ends at the apex of an
expression. The dataset provides frontal facial images of six
expressions: anger, disgust, fear, happy, sad, and surprise. We
selected three apex frames from each sequence to prepare the
image set for an expression class. We also collected the
neutral state images from the onset of the image sequences to
create a neutral image set. Finally, we assembled a total of 1043 images: 132 anger, 180 disgust, 75 fear, 204 happy, 87 sad, 249 surprise, and 116 neutral. Sample images from each emotion category are shown in Figure 2.
In our experiments, in order to compare the performance of the proposed ENet with other deep learning techniques, we trained the VGG16, VGG19, and ResNet50 networks on the CK+ dataset. The pre-trained weights from ImageNet were used as initial weight parameters while fine-tuning these networks. We measured the performance of the FER system in terms of average recognition accuracy. Table 2 presents a comprehensive comparison of the proposed method with state-of-the-art approaches, including recent deep learning methods. From Table 2, it is evident that the proposed method achieves superior recognition accuracy compared to the existing feature descriptors. It also outperforms ResNet50 on both the 6-class and 7-class emotion recognition problems.
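As an illustration of this comparison setup (the framework and exact calls are assumptions, since the paper does not specify them), an ImageNet pre-trained VGG16 from torchvision can be adapted to the seven emotion classes before fine-tuning on CK+ as follows:

import torch.nn as nn
from torchvision import models

def build_vgg16_for_fer(num_classes=7):
    # ImageNet weights as initialization (newer torchvision uses the weights= argument)
    model = models.vgg16(pretrained=True)
    # Replace the 1000-way ImageNet head with a 7-way emotion classifier
    model.classifier[6] = nn.Linear(4096, num_classes)
    return model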
5. CONCLUSION
In this paper, we have proposed an emotion recognition network named ENet. To make ENet computationally efficient, we make use of factorized convolution layers. A global average pooling layer is employed on top of the extracted feature maps to make the proposed ENet invariant to spatial translations in the data. Experimental analysis has been carried out on two benchmark datasets used in emotion recognition studies, viz. CK+ and JAFFE. The performance of the proposed ENet was compared with existing deep learning as well as non-deep-learning-based approaches. The comparative analysis shows that the proposed ENet outperforms the other existing approaches for emotion recognition.
REFERENCES
1. A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
2. G. Donato, M. S. Bartlett, J. C. Hager, P. Ekman, and T. J. Sejnowski, "Classifying facial actions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 10, pp. 974–989, 1999.
3. P. Ekman and W. V. Friesen, "Constants across cultures in the face and emotion," Journal of Personality and Social Psychology, vol. 17, no. 2, p. 124, 1971.
4. C. A. Corneanu, M. O. Simón, J. F. Cohn, and S. E. Guerrero, "Survey on RGB, 3D, thermal, and multimodal approaches for facial expression recognition: History, trends, and affect-related applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 8, pp. 1548–1568, 2016.
5. P. Burkert, F. Trier, M. Z. Afzal, A. Dengel, and M. Liwicki, "DeXpression: Deep convolutional neural network for expression recognition," arXiv preprint arXiv:1509.05371, 2015.
6. A. Mollahosseini, D. Chan, and M. H. Mahoor, "Going deeper in facial expression recognition using deep neural networks," in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2016, pp. 1–10.
7. E. Barsoum, C. Zhang, C. C. Ferrer, and Z. Zhang, "Training deep networks for facial expression recognition with crowd-sourced label distribution," in Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016, pp. 279–283.
8. B. Hasani and M. H. Mahoor, "Facial expression recognition using enhanced deep 3D convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 30–40.
9. H. Ding, S. K. Zhou, and R. Chellappa, "FaceNet2ExpNet: Regularizing a deep face recognition net for expression recognition," in 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017). IEEE, 2017, pp. 118–126.
10. G. Pons and D. Masip, "Supervised committee of convolutional neural networks in automated facial expression analysis," IEEE Transactions on Affective Computing, vol. 9, no. 3, pp. 343–350, 2017.
11. B.-K. Kim, J. Roh, S.-Y. Dong, and S.-Y. Lee, "Hierarchical committee of deep convolutional neural networks for robust facial expression recognition," Journal on Multimodal User Interfaces, vol. 10, no. 2, pp. 173–189, 2016.
12. K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
13. K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
14. H. Jung, S. Lee, S. Park, I. Lee, C. Ahn, and J. Kim, "Deep temporal appearance-geometry network for facial expression recognition," arXiv preprint arXiv:1503.01532, 2015.
15. H. Jung, S. Lee, J. Yim, S. Park, and J. Kim, "Joint fine-tuning in deep neural networks for facial expression recognition," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 2983–2991.
16. K. Zhang, Y. Huang, Y. Du, and L. Wang, "Facial expression recognition based on deep evolutional spatial-temporal networks," IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4193–4203, 2017.
17. Y. Kim, B. Yoo, Y. Kwak, C. Choi, and J. Kim, "Deep generative-contrastive networks for facial expression recognition," arXiv preprint arXiv:1703.07140, 2017.
18. M. Pantic and I. Patras, "Dynamics of facial expression: Recognition of facial actions and their temporal segments from face profile image sequences," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 36, no. 2, pp. 433–449, 2006.
19. N. Sebe, M. S. Lew, Y. Sun, I. Cohen, T. Gevers, and T. S. Huang, "Authentic facial expression analysis," Image and Vision Computing, vol. 25, no. 12, pp. 1856–1863, 2007.
20. Z. Zhang, M. Lyons, M. Schuster, and S. Akamatsu, "Comparison between geometry-based and Gabor-wavelets-based facial expression recognition using multi-layer perceptron," 1998, pp. 454–459.
21. M. F. Valstar, I. Patras, and M. Pantic, "Facial action unit detection using probabilistic actively learned support vector machines on tracked facial point data," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) – Workshops. IEEE, 2005, pp. 76–76.
22. C. Shan, S. Gong, and P. W. McOwan, "Facial expression recognition based on local binary patterns: A comprehensive study," Image and Vision Computing, vol. 27, no. 6, pp. 803–816, 2009.
23. C.-C. Lai and C.-H. Ko, "Facial expression recognition based on two-stage feature extraction," Optik, vol. 125, no. 22, pp. 6678–6680, 2014.
24. T. Jabid, M. H. Kabir, and O. Chae, "Robust facial expression recognition based on local directional pattern," ETRI Journal, vol. 32, no. 5, pp. 784–794, 2010.
25. A. R. Rivera, J. R. Castillo, and O. O. Chae, "Local directional number pattern for face analysis: Face and expression recognition," IEEE Transactions on Image Processing, vol. 22, no. 5, pp. 1740–1752, 2012.
26. A. R. Rivera, J. R. Castillo, and O. Chae, "Local directional texture pattern image descriptor," Pattern Recognition Letters, vol. 51, pp. 94–100, 2015.
27. B. Ryu, A. R. Rivera, J. Kim, and O. Chae, "Local directional ternary pattern for facial expression recognition," IEEE Transactions on Image Processing, vol. 26, no. 12, pp. 6006–6018, 2017.
28. S. H. Lee, K. N. K. Plataniotis, and Y. M. Ro, "Intra-class variation reduction using training expression images for sparse representation based facial expression recognition," IEEE Transactions on Affective Computing, vol. 5, no. 3, pp. 340–351, 2014.
29. H. Mohammadzade and D. Hatzinakos, "Projection into expression subspaces for face recognition from single sample per person," IEEE Transactions on Affective Computing, vol. 4, no. 1, pp. 69–82, 2012.
30. L. Zhang and D. Tjondronegoro, "Facial expression recognition using facial movement features," IEEE Transactions on Affective Computing, vol. 2, no. 4, pp. 219–229, 2011.
31. S. Happy and A. Routray, "Automatic facial expression recognition using features of salient facial patches," IEEE Transactions on Affective Computing, vol. 6, no. 1, pp. 1–12, 2014.
32. T. Zhang, W. Zheng, Z. Cui, Y. Zong, J. Yan, and K. Yan, "A deep neural network-driven feature learning method for multi-view facial expression recognition," IEEE Transactions on Multimedia, vol. 18, no. 12, pp. 2528–2536, 2016.
33. W. Zheng, Y. Zong, X. Zhou, and M. Xin, "Cross-domain color facial expression recognition using transductive transfer subspace learning," IEEE Transactions on Affective Computing, vol. 9, no. 1, pp. 21–37, 2016.
34. O. Rudovic, M. Pantic, and I. Patras, "Coupled Gaussian processes for pose-invariant facial expression recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1357–1369, 2012.
35. P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews, "The extended Cohn–Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression," in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition – Workshops, 2010, pp. 94–101.
36. M. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, "Coding facial expressions with Gabor wavelets," in Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition. IEEE, 1998, pp. 200–205.
37. M. Mandal, M. Verma, S. Mathur, S. K. Vipparthi, S. Murala, and D. K. Kumar, "Regional adaptive affinitive patterns (RADAP) with logical operators for facial expression recognition," IET Image Processing, vol. 13, no. 5, pp. 850–861, 2019.
38. T. Kanade, J. F. Cohn, and Y. Tian, "Comprehensive database for facial expression analysis," in Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580), 2000, pp. 46–53.