Final Report


Chapter 1: Introduction

1.1 Introduction
A facial expression is the visible manifestation of the affective state, cognitive activity, intention, personality and psychopathology of a person, and it plays a communicative role in interpersonal relations. Facial expressions have been studied for a long time, with considerable progress in recent decades; even so, recognizing facial expressions with high accuracy remains difficult because of their complexity and variety.
Human beings generally convey intentions and emotions through nonverbal means such as gestures, facial expressions and involuntary body language. A system that reads these cues offers a significant nonverbal channel for communication; the important thing is how reliably the system detects and extracts the facial expression from an image. Such systems are receiving growing attention because they could be widely used in fields like lie detection, medical assessment and human-computer interfaces. The Facial Action Coding System (FACS), proposed by Ekman in 1978 and refined in 2002, is a very popular facial expression analysis tool.
On a day-to-day basis, humans commonly recognize emotions from characteristic features displayed as part of a facial expression. For instance, happiness is undeniably associated with a smile, an upward movement of the corners of the lips, and other emotions are likewise characterized by deformations typical of the particular expression. Research into automatic recognition of facial expressions addresses the problems surrounding the representation and categorization of the static or dynamic characteristics of these facial deformations.
The system classifies facial expressions of the same person into the basic emotions, namely anger, disgust, fear, happiness, sadness and surprise. Its main purpose is efficient interaction between human beings and machines using eye gaze, facial expressions, cognitive modelling and similar cues; here, detection and classification of facial expressions serve as a natural way for man and machine to interact. Expression intensity varies from person to person and also with age, gender, and the size and shape of the face; further, even the expressions of the same person do not remain constant over time.

However, the inherent variability of facial images caused by factors such as variations in illumination, pose, alignment and occlusion makes expression recognition a challenging task. Several surveys on facial feature representations for face recognition and expression analysis address these challenges and possible solutions in detail.

Fig 1.1: System Block Diagram

1.2 Aim:

The research aims to evaluate the potential for emotion recognition technology to
improve the quality of human-computer interaction.

1.3 Objectives:

• To detect and extract the user's state of mind.
• To achieve at least 80% accuracy in the results.
• To establish the extent to which people will naturally express emotions when they know they are interacting with an emotion-detecting computer.
• To identify the conditions under which the application of emotion detection can lead to improvements in subjective and/or objective measures of system usability.
• To experiment with machine learning algorithms in the computer vision field.
• To detect emotion, thus facilitating intelligent human-computer interaction.


1.4 Motivation:

The motivation behind choosing this topic lies in the huge investments large corporations make in feedback and surveys, while failing to get an equitable return on those investments.

Also, in today's networked world the need to maintain the security of information or physical property is becoming both increasingly important and increasingly difficult. In countries like India the crime rate is rising day by day, and there is no automatic system that can track a person's activity. If we are able to track people's facial expressions automatically, criminals can be found more easily, since facial expressions change during different activities. We therefore decided to build a facial expression recognition system.
We became interested in this project after reading several papers in this area, which describe how such systems were designed and built to achieve accurate and reliable facial expression recognition.
As a result, we are highly motivated to develop a system that recognizes facial expressions and tracks a person's activity.

1.5 Problem Statement:

Human emotions and intentions are expressed through facial expressions, and deriving an efficient and effective feature representation is the fundamental component of a facial expression system. Emotion detection through facial gestures is a technology that aims to improve product and service performance by monitoring how customers respond to certain products or service staff. Facial expressions convey non-verbal cues, which play an important role in interpersonal relations. Automatic recognition of facial expressions can be an important component of natural human-machine interfaces; it may also be used in behavioural science and in clinical practice. An automatic facial expression recognition system needs to solve the following problems: detection and location of faces in a cluttered scene, facial feature extraction, and facial expression classification.
Chapter 2: Review of Literature
Research paper 1

A Literature Review on Emotion Recognition using Various Methods


By Reeshad Khan & Omar Sharif
American International University

Abstract- Emotion recognition is an important area of work for improving the interaction between human and machine. The complexity of emotion makes the acquisition task difficult. Earlier works proposed capturing emotion through unimodal mechanisms such as facial expressions alone or vocal input alone. More recently, the idea of multimodal emotion recognition has increased machines' detection accuracy, and deep learning with neural networks has extended their success rate further. Recent deep learning work has been performed with different kinds of input describing human behaviour, such as audio-visual signals, facial expressions, body gestures, and EEG signals and related brainwaves. Still, many aspects of this area need work to build a robust system that detects and classifies emotions more accurately. In this paper, we explore the relevant significant works, their techniques, the effectiveness of their methods, and the scope for improving their results.
Global Journal of Computer Science and Technology: Graphics & Vision (GJCST-F), Volume 17, Issue 1, Version 1.0, Year 2017. Online ISSN: 0975-4172, Print ISSN: 0975-4350. Publisher: Global Journals Inc. (USA). Type: Double Blind Peer Reviewed International Research Journal. Classification: I.4.8, I.7.5.
I. Introduction
The most common exposition of the idea of emotion is "a natural instinctive state of mind deriving from one's circumstances, mood, or relationships with others". This definition misses the driving force behind all motivation, which may be positive, negative or neutral, and that information is essential for understanding emotion in an intelligent agent. It is very complicated to detect emotions and to distinguish among them. A decade or two ago, emotions started to become a concern as an important addition to the modern technological world, raising the hope of a new dawn for intelligent apparatus. Imagine a world where machines can sense what humans need or want: with the right kind of computation, such a machine could predict further consequences, and mankind could thereby avoid serious situations and much more. Humans are far stronger and more intelligent thanks to emotion, yet less efficient than machines; if machines gained this special human faculty, it would be the strongest addition to technology yet. Making that dream come true starts with training a system to spot and recognize emotions, the first step towards an intelligent system. Intelligent systems are already becoming more efficient at predicting and classifying decisions in various aspects of practical life.

II. Recent Related Work in the Relevant Field


Machines used to predict emotion from facial expressions alone. Multimodal systems, which use more than one feature to predict emotion, have since proved more effective and give more accurate results. Research has demonstrated that deep neural networks can effectively generate discriminative features that approximate the complex non-linear dependencies between features in the original set. These deep generative models have been applied to speech and language processing, as well as to emotion recognition tasks.
Martin et al. showed that a bidirectional Long Short-Term Memory (BLSTM) network is more effective than the conventional SVM approach.
Yelin et al. showed that three-layered Deep Belief Networks (DBNs) give better performance than two-layered DBNs in an audiovisual emotion recognition process.
Samira et al. combined a recurrent neural network with a Convolutional Neural Network (CNN) in an underlying CNN-RNN architecture to predict emotion in video. Some novel methods and techniques have also enriched this particular line of research, being more accurate, stable and realistic.
Yelin Kim and Emily Mower Provost explore whether a subset of an utterance can be used for emotion inference and how that subset varies across emotion classes and modalities. They propose a windowing method that shows significantly higher emotion recognition performance while using only 40-80% of the information within each utterance; even so, the results of this method are not consistent.
A. Yao, D. Cai, P. Hu, S. Wang, L. Shan and Y. Chen used a well-designed Convolutional Neural Network (CNN) architecture for video-based emotion recognition. Their method, named HoloNet, embodies three critical considerations in network design and is more realistic than the other methods discussed here, focusing on adaptability to real-time scenarios rather than purely on accuracy and theoretical performance. Although its accuracy is also impressive, the method is applicable only to video-based emotion recognition and cannot produce results for other types of data.
Y. Fan, X. Lu, D. Li and Y. Liu proposed a method for video-based emotion recognition in the wild, using CNN-LSTM and C3D networks to simultaneously model video appearance and motion.
Zixing Zhang, Fabien Ringeval, Eduardo Coutinho, Erik Marchi and Björn Schüller proposed improvements to the semi-supervised learning (SSL) technique to address the low performance of classifiers on challenging recognition tasks, which reduces the trustworthiness of automatically labelled data, and they gave solutions to the resulting noise-accumulation problem.
Wei-Long Zheng and Bao-Liang Lu proposed EEG-based affective models that work without labelled target data, using transfer learning techniques (TCA-based Subject Transfer); these are more accurate at positive emotion recognition than previously used techniques, achieving 85.01% accuracy. Their method rests on three pillars: TCA-based Subject Transfer, KPCA-based Subject Transfer and Transductive Parameter Transfer. This achievement, however, is limited to positive emotion recognition: the method is weak on negative and neutral emotions, and considerable improvement is still needed to recognize them accurately.

Reference and year | Approach and method | Performance
Wei-Long Zheng and Bao-Liang Lu (2016) | EEG-based affective models without labeled target data, using transfer learning techniques (TCA-based Subject Transfer) | Positive emotion recognition rate (85.01%) is higher than other approaches, but neutral (25.76%) and negative (10.24%) emotions are often confused with each other.
Zixing Zhang, Fabien Ringeval, Eduardo Coutinho, Erik Marchi and Björn Schüller (2016) | Semi-Supervised Learning (SSL) technique | Delivers a strong performance in the classification of high/low emotional arousal (UAR = 76.5%) and significantly outperforms traditional SSL methods by at least 5.0% (absolute gain).
Y. Fan, X. Lu, D. Li, and Y. Liu (2016) | Video-based emotion recognition using CNN-RNN and C3D hybrid networks | Achieved 59.02% accuracy (without using any additional emotion-labeled video clips in the training set), the best so far.
A. Yao, D. Cai, P. Hu, S. Wang, L. Shan and Y. Chen (2016) | HoloNet: towards robust emotion recognition in the wild | Achieved a mean recognition rate of 57.84%.
Yelin Kim and Emily Mower Provost (2016) | Data-driven framework to explore patterns (timings and durations) of emotion evidence, specific to individual emotion classes | Achieved 65.60% UW accuracy, 1.90% higher than the baseline.

Table 2.1: Emotion recognition approaches and their successes
III. Future Scope
We are working towards a machine with emotions: a machine or system that can think like a human, feel warmth of heart, judge events, prioritize between choices, and possess many more emotional qualities. To make this dream a reality, we first need the machine or system to understand human emotions, imitate them and master them, and we have only just started to do that. Some real examples already exist: features and services such as Microsoft Cognitive Services are gaining popularity, but a lot of work is still required in terms of efficiency, accuracy and usability. Emotion recognition is therefore an area that will require great attention in the future.
IV. Conclusions
In this paper we discussed the work done on emotion recognition and the superior and novel approaches and methods used to achieve it. We have presented a glimpse of a probable solution and method for recognizing emotion. Work so far substantiates that emotion recognition using the user's EEG signal and audiovisual signals has the highest recognition rate and the highest performance.
Research paper 2

A Study on Emotion Recognition Method and Its Application using Face Image

Abstract— In this paper, we introduce methods for recognizing seven emotions and positive/negative emotion using facial images, and the development of apps based on these methods. Previous research used deep-learning technology to generate models of emotion-based facial expressions in order to recognize emotions. There are existing apps that express six emotions, but not seven emotions together with positive/negative results shown in graphs and percentages. Thus, we recognized seven emotions, namely Angry, Disgust, Fear, Happy, Sad, Surprise and Neutral, and also classified the calculated emotion-recognition scores into positive, negative and neutral. We then implemented an app that provides the user with the seven emotion scores and the positive and negative emotions.
Keywords— Emotion recognition, Positive and negative, Scores, Facial images, Deep-learning.

I. INTRODUCTION

In this paper, we introduce a method for recognizing seven emotions, namely Angry, Disgust, Fear, Happy, Sad, Surprise and Neutral, together with positive and negative emotions, using facial images, and the development of apps based on the method. The emotion-recognition Software Development Kit (SDK) [1] made by the US company "Affectiva" extracted features from facial expressions using a Histogram of Oriented Gradients (HOG) algorithm and learned from 10,000 images using a Support Vector Machine (SVM) classifier. Seven facial expressions, namely anger, disgust, fear, joy, sadness, surprise and contempt, were used in learning to recognize emotions. The generated emotion recognizer was packaged as an SDK, which provides an easy interface for other users. We also provided positive and negative emotion-recognition results using the ranking and average of the scores from the seven emotion-recognition results. We developed an emotion-recognition model using deep learning's Convolutional Neural Networks (CNNs) to develop this app and proposed a method for recognizing emotions. Thus, in this study, we classified the calculated emotion-recognition scores into positive and negative emotions and implemented an app that provides the user with the seven emotion scores and the positive/negative result.

II. RELATED WORK


A. Emotion recognition apps
The US artificial intelligence startup Affectiva [2] has developed a technique to capture the emotions of people in digital images. Affectiva uses computer vision and deep-learning techniques to analyze human facial expressions and nonverbal signals. Digital images from video calls, live broadcasts, recorded videos and moving pictures were collected and categorized, then used to match facial information, such as happy, sad, anxiety, interest and surprise, registered in the past. To do this, they collected 4.25 million images from people in 75 countries around the world and extracted 50 billion bits of data from the images. They distribute the technology to users free of charge in the form of an SDK and an API.

Fig 2.1 Running app for Affectiva

B. Emotion recognition using deep-learning


One study [3] proposed a more advanced method than one that recognized only seven emotions with a CNN [4]. Their emotion-recognition method using deep learning follows four steps: 1. Train on a public face database with a CNN. 2. Extract seven class probabilities for each face frame. 3. Aggregate the single-frame probabilities into fixed-length video descriptors for each video in the dataset. 4. Classify all video clips using a support vector machine (SVM) trained on the video descriptors of the competition training set. Unlike the previous study [4], they reduced the image size from 48 × 48 to 40 × 40 and recognized emotions with their proposed method, obtaining a final recognition rate of 47.67%.
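To make steps 3 and 4 concrete, a minimal sketch might aggregate per-frame CNN probabilities into a fixed-length descriptor and train an SVM on it. The aggregation statistics, array shapes and stand-in data below are illustrative assumptions, not the cited paper's exact procedure:

```python
# A minimal sketch of frame-probability aggregation plus an SVM classifier;
# the statistics used (mean/std/min/max) and the random data are assumptions.
import numpy as np
from sklearn.svm import SVC

def video_descriptor(frame_probs):
    """frame_probs: (n_frames, 7) per-frame class probabilities from the CNN."""
    # Mean, std, min and max over time give a fixed-length (28,) vector
    # regardless of how many frames the clip contains.
    return np.concatenate([frame_probs.mean(axis=0), frame_probs.std(axis=0),
                           frame_probs.min(axis=0), frame_probs.max(axis=0)])

# Hypothetical stand-in data: 20 clips with random "probabilities" and labels.
X = np.stack([video_descriptor(np.random.rand(40, 7)) for _ in range(20)])
y = np.random.randint(0, 7, size=20)

svm = SVC(kernel="linear").fit(X, y)   # SVM trained on the video descriptors
print(svm.predict(X[:3]))
```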

IV. PERFORMANCE EVALUATION


We conducted experiments to evaluate the performance of our apps. Face images of the six emotions other than neutral were captured five times each from the participants, three men and three women in their twenties. The recognition rate for the seven emotions was calculated from whether the presented emotion matched the first-ranked result. The positive/negative recognition rate was calculated from whether the result was positive when the presented emotion was Happy, and negative when it was Angry, Disgust, Fear, Sad or Surprise.

V. CONCLUSION
In this paper, we proposed an emotion-recognition method using facial images and implemented an app that provides seven-emotion and positive/negative emotion-recognition results to users. When we applied these recognition methods in the app, the recognition rate was 50.7% for the seven emotions and 72.3% for positive/negative. In the future, we will improve the recognition rate by adding more emotion databases and modifying parts of the deep-learning algorithm. In addition, our research will be extended to recognize the user's intention as well as the user's current emotion.
Research paper 3

Emotion Recognition with Consideration of Facial Expression and Physiological Signals

Abstract— An emotion recognition system that considers both facial expression and physiological signals is proposed in this paper. A specifically designed mood induction experiment is performed to collect facial expression images and physiological signals from subjects. We detected 14 feature points and extracted 12 facial features from the facial expression images. Two learning vector quantization (LVQ) neural networks were applied to classify four emotions: love, joy, surprise and fear. Experimental results show the proposed recognition system is able to identify the four emotions from facial expressions, from physiological signals, and from both combined.

I. INTRODUCTION
General expressions may include word choice, tone of voice and body language, such as posture and physiological responses. Among the observable emotional reactions, facial expressions usually change the distances between facial feature points when an emotion is excited. From the significant facial feature points, such as the positions of the eyebrows, eyes and lips, we can determine various facial expressions. Ekman and Friesen identified six basic human emotions, fear, surprise, sadness, anger, disgust and happiness, and their associated facial expressions.
Physiological signals can greatly help in assessing and quantifying stress, tension, anger and other emotions that influence health, since physiological reactions are, in general, involuntary responses of the autonomic nervous system. Three physiological signals were therefore adopted in this paper, and a novel approach is proposed to recognize four emotions (fear, love, joy and surprise) from facial expression and physiological signals.
The remaining parts of this paper are organized as follows. Section II describes the mood induction experiment setup for collecting facial and physiological features. Experimental results are provided in Section III to show the effectiveness of the proposed system. Finally, conclusions are drawn in Section IV.
II. FACIAL EXPRESSION AND PHYSIOLOGICAL SIGNAL ACQUISITION

Fig 2.2 The schematic diagram of the mood induction and signal acquisition environment.

A series of specifically designed mood induction events is performed to acquire facial expressions and physiological signals. The schematic diagram of the acquisition procedure is shown in Fig 2.2.

III. RESULTS
A series of specifically designed emotion induction experiments was performed to collect facial expression images and their corresponding physiological signals for four kinds of emotion (fear, love, joy and surprise) from six subjects. The duration of the physiological signal for each emotion is ten seconds, and the sampling rate of the physiological signals and the camera is 20 Hz. Samples of the facial expression images are shown in Fig. 2.3; panels (a-d) show the facial expressions of "fear", "love", "joy" and "surprise", respectively. Samples of the physiological signals are also shown.

Fig 2.3 Samples of facial expression images.


IV. CONCLUSIONS
In this paper, we propose an emotion recognition system that considers facial expression and physiological signals. We collected four general expressions from six subjects and analyzed both external factors (facial expressions) and internal factors (physiological signals) of the human response to determine the inherent emotion.
Research paper 4
A Comprehensive Study On Facial Expressions Recognition Techniques
Abstract: A facial expression is the motion of one or more muscles beneath the skin. These movements play a very important role in conveying the emotional state of an individual to an observer, and face-to-face communication is central to human interaction. In recent years, different approaches have been put forward for fully automated facial expression analysis, which is important for human-computer interaction. In a facial expression recognition system, the image is processed to extract information that can help in recognizing the six universal expressions: neutral, happy, sad, angry, disgust and surprise. This processing is done in several phases, including image acquisition, feature extraction and finally expression classification. This paper surveys some of the techniques used for facial expression recognition.

Introduction: Facial expressions are considered the most indicative, strong and natural way of knowing the psychological state of a person during communication; fifty-five percent of communication is carried by facial expressions. In a facial expression recognition system, the image is processed to extract information that can help in recognizing the six universal expressions, neutral, happy, sad, angry, disgust and surprise. This processing is done in several phases, including image acquisition, feature extraction and finally expression classification using different techniques.

AUTOMATIC FACIAL EXPRESSIONS RECOGNITION SYSTEM


A) Architecture
The method of facial expression recognition is categorized into the following stages:
1. Detection of Face
2. Extraction of Features
3. Classification of Expressions.
First, the image is taken from the test database and face detection is performed on it. Once the face is detected, important features such as the eyes, eyebrows and lips are extracted from the facial image. After extracting these features, the expression is classified by comparing the image with the images in the training dataset using some algorithm.

Fig 2.4 Flow of the facial expression recognition system


B) Level of Description

1) Using the whole frontal face:

This approach uses the frontal image of the face as a whole and processes it to classify the six prototypic emotions. It assumes that every prototypic emotion has its own characteristic expression on a person's face, and that recognizing those characteristics is what matters.
2) Using Action Units:
In this approach the image is divided into sub-sections that are then used for analysis. There are 44 Action Units in total, of which 30 are produced by the contractions of specific muscles: 12 of these 30 belong to the upper portion of the face, while the remaining 18 belong to the lower portion. It is an effective method for facial expression recognition because dividing the face into action units brings objectivity and flexibility to the analysis, and it suits applications in which fine-grained changes in expression must be identified. Some methods use neither the whole frontal face nor all 44 action units; rather, some regions are selected manually from the face and used for expression recognition.

Fig 2.5: Different Upper Face Action Units and their Combinations
Fig 2.6: Different Lower Face Action Units and their Combinations

C) Parameterization:
1) Geometric-based parameterization:
This is one of the oldest techniques, in which tracking and processing are performed on selected points of the facial image. The spatial locations and shapes of these facial points are used as feature vectors for classifying the expressions.
2) Appearance-based parameterization:
Instead of tracking the positions of spatial points, this approach uses movement parameters that vary with time and with the colours of the relevant regions of the face. Different types of features, such as Haar wavelet coefficients and Gabor wavelets, are used along with feature extraction and selection techniques like PCA and LDA.
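As an illustration of the appearance-based route, a minimal sketch with PCA over flattened face crops might look as follows; the image size and component count are assumptions, and LDA could be substituted on labelled data in the same way:

```python
# A minimal sketch of appearance-based feature reduction with PCA; the
# 48x48 crop size, the sample count and the 50 components are assumptions.
import numpy as np
from sklearn.decomposition import PCA

faces = np.random.rand(100, 48 * 48)   # 100 hypothetical face crops, flattened
pca = PCA(n_components=50)             # keep the 50 strongest appearance components
features = pca.fit_transform(faces)    # (100, 50) feature vectors for a classifier
print(features.shape)
```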
Challenges:
The very first problem currently faced is the unavailability of databases of spontaneous facial expressions, and creating such a database is one of the major challenges: if the subjects have prior knowledge that they are being captured, their expressions will not remain natural.
The next problem is capturing these spontaneous expressions under different medical and lighting conditions; for such cases the approach of using a hidden camera does not work well. Finding labelled data for testing as well as training is also a challenge in this field. Unlabelled data is easily available in huge amounts, but labelling data is a lengthy and complicated process that consumes a lot of time and carries a high chance of error.
Other factors also affect the expressions: subjects belonging to different cultures (Asians, Europeans) and different age groups will have different expressions, and head angles and rotations are also a big concern.

Future scope:
A future extension of facial expression analysis could be the analysis of micro-expressions; at present only a few training techniques are available that work for micro-expressions. To avoid the problems of labelling data, semi-supervised learning techniques could be used, because they allow labelled data to be used alongside unlabelled data.
Chapter 3: Proposed Solution
The facial expression recognition system includes the major stages of face image preprocessing, feature extraction and classification.
First, the image is taken from the test database and face detection is performed on it. Once the face is detected, important features such as the eyes, eyebrows and lips are extracted from the facial image. After extracting these features, the expression is classified by comparing the image with the images in the training dataset using some algorithm.

Fig 3.1 Block diagram for the face expression recognition system.

I. Preprocessing
Preprocessing can be used to improve the performance of the FER system and is carried out before the feature extraction process. Image preprocessing includes operations such as image clarification and scaling, contrast adjustment, and additional enhancement processes that improve the expression frames. Cropping and scaling are performed on the face image by taking the nose as the midpoint, so that the other important facial components are included. Normalization is a preprocessing method designed to reduce illumination differences and variation in the face images by means of a median filter, yielding an improved face image; it is also used for the extraction of eye positions, which makes the FER system more robust to individual differences and gives more clarity to the input images. Localization is a preprocessing method that uses the Viola-Jones algorithm to detect faces in the input image: the size and location of the face are detected using the AdaBoost learning algorithm and Haar-like features, so localization is mainly used for spotting the size and location of the face in the image. Face alignment is another preprocessing method, performed using the SIFT (Scale-Invariant Feature Transform) flow algorithm: first, a reference image is calculated for each facial expression, and then all the images are aligned to their related reference images. The histogram equalization method is used to overcome illumination variation; it mainly enhances the contrast of the face images and improves the distinction between intensities.
Among the many preprocessing methods used in FER, ROI segmentation is the most suitable because it accurately detects the face organs that are mainly used for expression recognition; histogram equalization is another important preprocessing technique for FER because it improves image contrast.
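A minimal OpenCV sketch of the median filtering, histogram equalization and scaling steps described above, assuming an already cropped grayscale face image (the file names and the 48 × 48 target size are placeholders):

```python
# A minimal preprocessing sketch; input/output paths are hypothetical.
import cv2

face = cv2.imread("face_crop.jpg", cv2.IMREAD_GRAYSCALE)
face = cv2.medianBlur(face, 3)        # median filter to reduce noise and variation
face = cv2.equalizeHist(face)         # histogram equalization against illumination
face = cv2.resize(face, (48, 48))     # scale to a fixed input size
cv2.imwrite("face_preprocessed.jpg", face)
```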
II. Feature extraction
The feature extraction process is the next stage of the FER system. Feature extraction is the finding and describing of salient features of interest within an image for further processing. It is a significant stage in image processing and computer vision, since it marks the move from a pictorial to an implicit data representation, which can then be used as input to the classification stage. Feature extraction methods can be categorized into five types: texture feature-based, edge-based, global and local feature-based, geometric feature-based and patch-based methods.

III. Classification
Classification is the final stage of the FER system, in which the classifier categorizes the expression as smile, sad, surprise, anger, fear, disgust or neutral. The CNN embodies two important concepts, namely shared weights and sparse connectivity. In FER, the CNN classifier can be used as multiple classifiers for different face regions: if a CNN is framed for the entire face image, then a CNN is first framed for the mouth area, next for the eye area, and likewise for each other area. A Deep Neural Network (DNN) contains several hidden layers and can train more difficult functions efficiently compared with other neural networks; among neural network-based classifiers, the CNN gives the best accuracy. The various FER techniques and their algorithms are analyzed with respect to the three important requirements: preprocessing, feature extraction and classification.
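The effect of shared weights and sparse connectivity can be seen by counting parameters: a small convolutional layer over a 48 × 48 input needs far fewer weights than a dense layer over the same input. A quick Keras sketch (layer sizes are illustrative assumptions):

```python
# Compare parameter counts: a 3x3 conv layer shares its 3x3 kernel across the
# whole image, while a dense layer connects every input pixel to every unit.
from tensorflow.keras import layers, models

conv = models.Sequential([layers.Input((48, 48, 1)),
                          layers.Conv2D(32, (3, 3))])   # (3*3*1 + 1) * 32 = 320
dense = models.Sequential([layers.Input((48 * 48,)),
                           layers.Dense(32)])           # (2304 + 1) * 32 = 73,760
print(conv.count_params(), dense.count_params())        # 320 vs 73760
```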

IV. Database description


This project is performed on FER using various databases: the Japanese Female Facial Expressions database (JAFFE, 2017), a real-time database, our own database, the Karolinska Directed Emotional Faces, etc. Each image in the JAFFE database has a resolution of 256 × 256 pixels. The CK database also covers seven expressions but contains 132 subjects posed with neutral and smile expressions; it contains a total of 486 image sequences of grayscale images at 640 × 490 pixel resolution.

Fig 3.2 Sample images from JAFFE database
Fig 3.3 Sample images from CK database
Flowchart of Facial Expression Recognition

Phases in Facial Expression Recognition:

The facial expression recognition system is trained using a supervised learning approach in which it takes images of different facial expressions. The system includes a training and a testing phase, each consisting of image acquisition, face detection, image preprocessing, feature extraction and classification. Face detection and feature extraction are carried out on the face images, which are then classified into six classes belonging to the six basic expressions, as outlined below:
1. Image Acquisition
The design starts by initializing the CNN model with an input image (static or dynamic). Images used for facial expression recognition are static images or image sequences, and images of the face can be captured using a camera.

2. Face Detection
Face detection locates the facial image. It is carried out on the training dataset using a Haar classifier, the Viola-Jones face detector, implemented through OpenCV. Haar-like features encode the differences in average intensity between different parts of the image; each feature consists of connected black and white rectangles, and its value is the difference between the sums of the pixel values in the black and white regions.
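A minimal detection sketch using OpenCV's bundled Viola-Jones Haar cascade (the input path is a placeholder):

```python
# Detect faces with the frontal-face Haar cascade that ships with OpenCV.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
img = cv2.imread("input.jpg")                       # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:                          # one rectangle per detected face
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detected.jpg", img)
```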

3. Image Pre-processing
Image pre-processing includes the removal of noise and normalization against variations in pixel position or brightness:
a) Color normalization
b) Histogram normalization

4. Feature Extraction
Selection of the feature vector is the most important part of a pattern classification problem. The pre-processed face image is used for extracting the important features. The inherent problems in image classification include scale, pose, translation and variations in illumination level.
Features are extracted through the max-pooling method by creating a model saved with the .h5 extension and then compiling the model with a loss function and an optimizer, as sketched below; a Haar cascade, supplied in XML format, is imported for face detection.
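A minimal sketch of such a model, assuming 48 × 48 grayscale inputs and seven emotion classes; the layer sizes are illustrative assumptions rather than the final architecture:

```python
# Build a small CNN with max pooling, compile it with a loss and optimizer,
# and store it with the .h5 extension, as described above.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input((48, 48, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),                  # max-pooling feature extraction
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(7, activation="softmax"),        # seven emotion classes
])
model.compile(loss="categorical_crossentropy",    # loss and optimizer
              optimizer="adam", metrics=["accuracy"])
model.save("emotion_model.h5")                    # model stored with .h5 extension
```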

OpenCV: The Open Source Computer Vision Library provides a common infrastructure for computer vision applications and contains more than 2,500 optimized algorithms. These algorithms are used for face detection and for the identification, training and detection of objects.

TensorFlow: TensorFlow is Google's second-generation system for the implementation and deployment of large-scale machine learning projects. It is flexible enough to be used both in research and in product development. It creates large-scale neural networks used for classification, discovery, prediction and prescription. The main applications of TensorFlow are voice-to-text and text-to-voice, recognition while processing video, audio, images and time series, and text-based applications.

Keras: Keras is an open-source neural network library in Python used for preprocessing, modeling, evaluation and optimization. It offers a high-level API, with the low-level work handled by a backend: Keras does not itself support low-level graphs and computations but delegates convolution and low-level tensor computation to a backend engine such as TensorFlow. It is designed for building a model with a loss function and an optimizer and for training it with the fit function. The Python libraries below are imported for preprocessing, modelling, optimization and testing.

Python Libraries
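A hypothetical import block consistent with the tools named above (the exact module choices are assumptions):

```python
# Imports covering preprocessing (OpenCV, NumPy), modelling and optimization
# (Keras on TensorFlow), and data feeding for training and testing.
import cv2                                        # face detection, preprocessing
import numpy as np                                # array handling
from tensorflow.keras import layers, models      # CNN construction
from tensorflow.keras.optimizers import Adam     # optimizer
from tensorflow.keras.preprocessing.image import ImageDataGenerator  # data feeding
```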

5. Classification
The data obtained from the feature extraction stage is high-dimensional, so it is reduced through classification; features should take different values for objects belonging to different classes so that classification can be performed. Here the emotions are classified as happy, sad, angry, surprise, neutral, disgust and fear, with 34,488 images in the training dataset and 1,250 for testing. Each emotion is expressed with different facial features, such as the eyebrows, opening of the mouth, raised cheeks, wrinkles around the nose, wide-open eyelids and many others. Based on these features the network performs convolution and max pooling, and training on the large dataset gives better accuracy; the result is the object class for an input image.
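A minimal training sketch, assuming the dataset is laid out as one folder per emotion class and reusing the model sketched earlier; the paths, image size and epoch count are assumptions:

```python
# Feed images from class-named folders into the CNN defined above and train.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(rescale=1.0 / 255)       # scale pixel values to [0, 1]
train = gen.flow_from_directory("data/train", target_size=(48, 48),
                                color_mode="grayscale", class_mode="categorical")
test = gen.flow_from_directory("data/test", target_size=(48, 48),
                               color_mode="grayscale", class_mode="categorical")
model.fit(train, validation_data=test, epochs=30)  # train the CNN sketched above
print(model.evaluate(test))                        # loss and accuracy on test data
```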

Fig 3.4 Different Human Facial Emotions

6. System Evaluation
Finally, the system is evaluated on the testing dataset.
ACKNOWLEDGEMENT
We would like to express our special thanks and gratitude to our mentor, Professor Srindi Gindi, who guided us and gave us the opportunity to work on this wonderful project on the topic "Emotion Recognition with Consideration of Facial Expression", which also helped us do a great deal of research through which we came to know many new things. We would also like to thank our HOD, Er. Zainab Mirza, for providing the opportunity to implement our project. We are very thankful to them. Finally, we would like to thank our parents and friends, who helped us a lot in finalizing this project within the limited time frame.
REFERENCES

[1] Franc¸ois Chollet. Xception: Deep learning with depthwise separable convolutions.
CoRR, abs/1610.02357, 2016.
[2] Andrew G. Howard et al. Mobilenets: Efficient convolutional neural networks for
mobile vision applications. CoRR, abs/1704.04861, 2017.
[3] Dario Amodei et al. Deep speech 2: End-to-end speech recognition in english and
mandarin. CoRR, abs/1512.02595, 2015.
[4] Ian Goodfellow et al. Challenges in Representation Learning: A report on three
machine learning contests, 2013.
[5] Xavier Glorot, Antoine Bordes, and Yoshua Bengio. Deep sparse rectifier neural
networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence
and Statistics, pages 315–323, 2011.
[6] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning
for image recognition. In Proceedings of the IEEE conference on computer vision and pattern
recognition, pages 770–778, 2016.
[7] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network
training by reducing internal covariate shift. In International Conference on Machine Learning,
pages 448–456, 2015.
[8] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv
preprint arXiv:1412.6980, 2014.
[9] Rasmus Rothe, Radu Timofte, and Luc Van Gool. Deep expectation of real and apparent age
from a single image without facial landmarks. International Journal of Computer Vision (IJCV),
July 2016.
[10] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for
large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
