
Affective Computing: A Fuzzy Approach

2016, IEEE


Madhusudan
Department of Computer Science, Himachal Pradesh University, Shimla
[email protected]

Dr. Aman Kumar Sharma
Department of Computer Science, Himachal Pradesh University, Shimla
[email protected]

Abstract: Recent developments in the field of Human-Computer Interaction have shifted the focus of design towards a user-centered rather than a computer-centered approach. This has led to the design of interfaces that are highly effective, intelligent, and adaptive, and that can adjust themselves to the user's behavioral changes. Such designs are called intelligent and affective interfaces. The proposed framework is designed for affective computing machines, which can sense and adapt to the moods and emotions of the user, using fuzzy and image processing techniques.

Keywords: HCI, MMHCI, affective interfaces, intelligent interfaces, emotions, features, fuzzy, ubiquitous, feature sets, global processing, local processing, UbiComp.

I. INTRODUCTION

Affective computing is a field of Human-Computer Interaction (HCI) that focuses primarily on the research and design of user-centered machines which provide better experiences and a greater level of satisfaction to users while they interact with computing machines. The term "affective computing" was coined by Rosalind W. Picard and reflects the research developments in emotion-aware interface design for computing machines [1]. The ability of machines to recognize, interpret, adapt, and respond to the emotions of the user is called affective computing, and the interfaces which implement such emotion-sensing techniques are called affective interfaces. Since the end users of computers and computer-like machines are mostly human, interfaces need to be developed that take human factors into account and provide a greater level of satisfaction and usability. HCI has played an important role in the development of user-centered machines by incorporating factors of user satisfaction that were earlier usually neglected. Initially the creation of such computing machines was thought to be impossible and far-fetched, but with rapid changes and developments in computing speed, architectures, and software, the vision of affect-oriented intelligent interfaces and ubiquitous computing can be realized [2]. The omnipresence of machines, unnoticed by users yet supported by networking to fulfil their information needs at all times, is called "ubiquitous computing", a term coined by Mark Weiser.

Earlier there were limited modes of interaction, such as the keyboard, mouse, and joystick, and they forced the user to move towards the interface in order to interact. One of the main aims of ubiquitous computing is the design of machines which are capable of moving towards the user for interaction, rather than requiring the user to move to the machine [1]. The idea is to support the main character in the computing environment, i.e. the user. This led to the development of interfaces which are intelligent and can accept input from the user through multiple channels (called modes), e.g. audio, gestures, text, gaze, and facial expressions. Intelligent machines tend to provide greater functionality and usability under different circumstances, for example for persons with disabilities. The information from the different modes is fused at some point depending upon certain criteria, much as the information from different senses is processed in humans [3].
Once machines supporting multimodal interfaces were designed, the concept of affective computing came to light, where the machine can sense and interpret the emotions of the user and adapt itself so that the user has a comparatively better experience. The development of affective or emotion-sensing interfaces requires an understanding of human emotions: how they are generated, how they can be transformed, and which features can help to judge them [4]. The main motivation behind these developments is to make human-computer interaction resemble human-human interaction and to design machines that communicate with users the way humans interact with each other, thereby making communication between humans and machines more natural.

Emotion sensing is a confluence of different fields such as digital image processing (DIP), biology, physiology, neural networks, audio processing, and psychology. When a particular instance of a user is captured by a camera or sensor, it needs to be processed digitally, e.g. image enhancement for noise reduction and better interpretation, image compression for size reduction, segmentation for capturing the objects of interest, classification and recognition for categorizing the extracted features, and finally processing of the classified features [5][6]. This field of computer science has grown manyfold in the last few decades, and there has consequently been a lot of development in intelligent and emotion-sensing HCIs. Emotion sensing or emotion recognition can be defined as the process of acquiring, analyzing, and interpreting the emotions or moods of the user. The output of this process is feedback that makes the interface adapt to the emotion of the user [7].

According to the principles of interface design, interfaces should be simple, easy to use, user-friendly and, most importantly, should support the user's point of view rather than the designer's point of view. With advances in HCI, more attention has been paid to the design of interfaces, and as a result intelligent and affective interfaces have emerged which employ a user-centered approach when designing software or a hardware interface. Intelligent HCIs are defined with respect to the mode of information gathering. Interaction with a machine may take place at three different levels, namely the physical, cognitive, and affective levels [8]. The physical level includes devices such as the keyboard and mouse, the cognitive level concerns the way the user understands and interprets the system, and the affective level concerns the quality of the user's experience while interacting with the system. The input devices used for communication fall into three classes: vision-based input, audio-based input such as speech analyzers, and touch-based haptic devices such as touchscreens [9]. The various modes and channels that can be used for communication with a machine determine the intelligent behavior of the machine. If there is more than one way of interaction, such systems are called multimodal human-computer interaction (MMHCI) systems [10]. The term intelligent is used for an interface if it is able to capture information from the user with minimal physical input; such devices require some degree of intelligence in perceiving the user's response [12]. Affective HCIs are those which are able to interpret human emotions and adapt themselves to a particular emotion.
This aspect of HCI deals entirely with user experiences and with mechanisms to improve the level of satisfaction during interaction, whereas intelligent HCIs deal with the ways of gathering information [13]. Recent developments have turned the research paradigm towards the creation of emotion-sensing systems that can provide pleasant experiences to users with the help of digital images [11][14]. We propose an affective computing framework that applies fuzzy logic to digital images, using different segmentation and pattern recognition algorithms to extract emotions. The paper is organized as follows: Section II reviews recent developments in affective computing, Section III describes the proposed framework, and Section IV discusses the underlying complexities and architecture. The paper is concluded in Section V along with future scope.

II. RECENT DEVELOPMENTS AND CHALLENGES

Affective computing attempts to give machines human-like capabilities in order to create a better interaction environment, making them capable of sensing, interpreting, and generating affect features [15]. Research on affective computing can be traced back to the 19th century, when psychology and physiology were applied to study emotions and emotion-like features. In the field of emotion sensing and processing, a lot of development has taken place in the last three decades. These research developments span various fields and factors, including emotional speech processing, facial expressions, body gestures and movements, gaze detection, and MMHCIs.

Emotional speech differs with respect to acoustic features, and a number of variables have been used to establish a relationship between these acoustic features and the conveyed emotion [16]. These acoustic features were used in pattern recognition to identify the features that reflect emotions and hence could be used to design a speech recognizer capable of judging the emotion of the user [17][18]. In speech synthesis, emotion control parameters were used, which resulted in higher performance, and emotional keywords helped in generating emotional text-to-speech (TTS) systems [19][20]. The final emotion state was determined on the basis of the emotion outputs from the textual content module. The use of cognitive sciences and perception models also yielded higher accuracy and consistency [21].

The most researched and investigated field in emotion processing is facial expressions. Since facial expressions, as captured in digital images, carry the most information, the study of facial expressions and of their physiological changes with changes in the behaviour and mood of the user can serve as one medium for extracting emotions [22]. Most of the work related to facial expressions has been carried out with the help of digital image processing and pattern recognition algorithms [23][24][25][26]. Mesh models of human faces were also used to study physiological changes and their resemblance to emotions [27]. As advances were made in artificial intelligence and machine learning, supervised algorithms were used to implement automatic detection of changes in facial features and of the corresponding emotions [28]. The quality of facial expression processing has improved remarkably, which can be credited to recent developments in computing power, performance, speed, and storage.
As a result, larger software systems could be designed for facial expression processing, implementing Hidden Markov Models (HMM), Point Distribution Models (PDM), Geometric Tracking Methods (GTM), EGM, and Gabor representations [29][30][31][32]. Some approaches combining audio and video were used for better accuracy and quality [33]. Large body movements and their relationship to emotions have always been an area of study, and there has been a considerable amount of research in this field, as in facial expression and speech processing; most of the research in this context, however, has focused on hand movements only. Various designs and frameworks have been created for judging emotions and their coordination with hand and body gestures [34][35][36]. To provide a human-like communication or interaction environment, various modes of information were fused together for better user experience and better decision making in machines [37][38][39][40]. Multimodal systems provided more realistic and satisfying user experiences and were implemented with Bayesian networks and machine learning for affect sensing [41][42][43]. Digital image processing has played a significant role in affective computing and in the design of intelligent MMHCIs. Algorithms for image segmentation, recognition, enhancement, classification, and compression are widely used in processing the images acquired during interaction, and analysis of these digital images containing facial features and hand movements helps in sensing the emotion of the user [2][44]. Various projects related to affective computing have been conducted in the recent decade, such as the Oz project [45], the Affective-Computing Framework for Learning and Decision Making [46], HUMAINE [47], and BlueEyes [48]. There remain many challenges in affective computing, and a lot of development will be required to make the dream of humanoid robots come true. Some of these challenges are affective information acquisition, affect identification, affect understanding, differences in emotions across customs and countries, the non-standardization of emotion states and the effect of different factors on emotions, affect generation, and the creation of affective databases, which are quite different from and more complex than traditional databases [49].

III. PROPOSED FRAMEWORK

The general framework for intelligent and affective machines comprises four steps: image acquisition, followed by segmentation for extracting regions of interest (ROI), feature extraction, and finally feature analysis. In the proposed framework, fuzzy logic and its techniques are used for better classification of features and of their effect on overall emotion sensing. Initially, some features and their values are stored in the affective database for operation against the input features collected during interaction with the computing machine. The algorithm starts with image acquisition through a sensor or camera, which captures the facial features and expressions of the user along with hand movements. The proposed framework can be implemented in two ways: first, with calibration, and second, with self-learning but no calibration. In the calibration case, some images and features of the user, together with their values, are pre-stored in the affective database so that arithmetic operations can be applied between them and the acquired features. This helps the learning module in the initial stages with fast processing and learning.
In the second scenario, no images or features are kept in the database and all operations are performed by comparing the acquired features with the best features collected so far. The second case is implemented fully with machine learning, whereas there is little use of machine learning in the first case.

Once the features or facial expressions are acquired, they are processed digitally for better perception and analysis. This includes image operations such as histogram equalization, contrast and brightness adjustment, smoothing, sharpening, or taking the negative of the image, as demanded by the application in which the module is used. In the next step, the enhanced image is segmented with the help of region growing and merging algorithms coupled with thresholding, to generate the regions of interest, i.e. the features of the facial expression. A set of segmentation techniques is used to extract the segments from the image efficiently.
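The paper does not prescribe a particular implementation of these preprocessing and segmentation steps. The sketch below is a minimal Python illustration assuming OpenCV is available, in which the region growing and merging stage is approximated by Otsu thresholding and contour extraction; the function names and the input file are hypothetical, not part of the original framework.

```python
# Minimal sketch of the enhancement + segmentation stages described above.
# Assumes OpenCV (cv2); region growing/merging is approximated here by
# Otsu thresholding + contour extraction, since the paper does not
# prescribe a specific algorithm.
import cv2

def preprocess(frame_bgr):
    """Enhance an acquired frame: grayscale, histogram equalization, smoothing."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    equalized = cv2.equalizeHist(gray)                   # contrast enhancement
    smoothed = cv2.GaussianBlur(equalized, (5, 5), 0)    # noise reduction
    return smoothed

def extract_rois(enhanced, min_area=100):
    """Segment the enhanced image and return bounding boxes of candidate ROIs."""
    _, binary = cv2.threshold(enhanced, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    rois = [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]           # drop tiny regions
    return rois  # list of (x, y, w, h) candidate feature regions

if __name__ == "__main__":
    frame = cv2.imread("face.png")                       # hypothetical input image
    if frame is not None:
        print("candidate regions of interest:", extract_rois(preprocess(frame)))
```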
Once the features or segments of the image are available, the fuzzy techniques are applied to them. The fuzzy techniques operate on the acquired segments, and memberships of features are computed on an individual basis. A single feature may be tagged with different emotion values, but we keep all the tags and membership values, since they are used again when calculating the overall emotion value in the final step. Bayes' theorem is also applied, mostly in cases of ambiguity; it uses conditional probability to sense the emotion values more accurately and precisely by comparing them with a set of features. The probability value, or likelihood, is calculated for each possible matching tag, and the one with the maximum value is selected. These membership values are passed to the next module, which classifies the emotions into different emotion categories. The classifier uses machine learning and a self-learning module to operate on the membership values and feature tags and to grow the affective database, since over time the framework is supposed to grow monotonically, making the classification process more accurate and precise.

Fuzzy membership calculations: In this framework we use the concept of fuzzy logic, in which membership is not restricted to true or false; fractional membership values can be assigned to objects to describe their degree of resemblance to a certain class. In the third step of the framework, fuzzy membership values are used to assign a value, or degree of membership, to individual features on the basis of their resemblance to a certain set of features. This is called local fuzzy processing. For example, if we consider a feature, namely the lips in a happy mood, then we can assign a value between 0 and 1 to the lips based on their deviation from the mean position. Mean positions and values are calculated and extracted from the user's facial expressions under a neutral emotional condition.

Fig. 1. The proposed framework using fuzzy techniques. The pipeline consists of the following stages:
1. Image acquisition (via a high-resolution sensor or camera).
2. Image enhancement (image processing techniques applied for better analysis).
3. Image segmentation (segmentation algorithms applied to divide the image into regions of interest).
4. Fuzzy application on segments at local scale (the proposed fuzzy technique applied to the segments obtained in the previous step).
5. Classification of regions of interest (machine learning techniques for self-categorization of features for future access and analysis).
6. Fuzzy implementation at global scale (the proposed fuzzy logic applied to the image as a whole to sense the overall emotion value).

Let us consider the shapes of the lips when a person is in a happy mood: a slight smile, a smile, laughing, and exhaustive laughing. Assume that in a full exhaustive laugh, in which the lip shape has the maximum deviation from the mean position, we assign a membership value of 1. Similarly, in the laughing position we assign a value of 0.75, in the case of a smile 0.5, and for a slight smile 0.25. Other features such as the eyes can similarly be assigned membership values for the happy mood; for example, in an exhaustive laugh the eyes tend to be smaller than under the neutral condition and hence are assigned a membership value of 1. In the same way, all the features or ROIs are assigned memberships for each mood. Here machine learning plays an important role by replacing, over time, the stored features of the best affective state as new features with higher values are acquired. The features accumulated in the database can later be analysed to generalize the framework to a class of people with certain similarities.

The other way to assign membership values is as follows: we take the acquired features and the pre-stored images, arithmetically subtract them, and calculate the deviation (difference) of each feature from the stored sets of features. The deviation values are then mapped to membership values, and a small deviation implies greater similarity to that class. The same features can also have the same membership values yet belong to more than one class of emotions at the same time; for example, small eyes can appear while laughing, weeping, thinking hard, or in an angry mood. We therefore create a different feature set for each class, such as happy, sad, angry, and frustrated. Each of these sets in turn contains the values of all the features and their possible combinations with other features. In natural language processing a word may carry more than one part-of-speech tag; similarly, a single facial feature may carry different class tags and hence lead to ambiguity. This ambiguity is resolved by applying Bayes' theorem to calculate, by conditional comparison, the probability of that feature belonging to a class; the class that yields the maximum value is assigned as the tag of that feature. In some cases the ambiguity of all the features might not be resolved, since it may not be possible to resolve ambiguity using an individual feature alone. This is because the overall emotion value that appears in a facial expression is the sum total of all the features in certain proportions; there is interdependence among the features, which needs to be incorporated when determining the probability values of the other features. For example, when determining the membership or class of the eyes when they are diminished, we need to compare them with different classes such as laughing, weeping, and thinking, so it is not possible to assign an unambiguous tag to the eyes alone; but we can use another feature, such as the shape of the lips, to find the correct class for the eyes.
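As a rough illustration of this local fuzzy processing, the sketch below maps the measured deviation of a feature from its stored neutral (mean) value to a membership grade in [0, 1]. The paper only states that larger deviations correspond to higher memberships and that small deviations from a stored class template imply greater similarity; the baseline values, the normalization constants, and the linear mapping are illustrative assumptions.

```python
# Sketch of local fuzzy membership assignment, assuming each facial feature
# is summarized by a scalar measurement (e.g., lip-corner displacement).
# The neutral means and the max_deviation scales are illustrative values.

NEUTRAL_MEANS = {"lips": 0.0, "eyes": 1.0}   # hypothetical calibrated baselines
MAX_DEVIATION = {"lips": 4.0, "eyes": 0.5}   # deviation treated as full membership

def local_membership(feature, measured_value):
    """Map a feature's deviation from its neutral mean to a grade in [0, 1]."""
    deviation = abs(measured_value - NEUTRAL_MEANS[feature])
    grade = deviation / MAX_DEVIATION[feature]
    return min(1.0, grade)

# Example: a lip measurement 3.0 units from neutral gives membership 0.75,
# matching the "laughing" level described in the text.
print(local_membership("lips", 3.0))   # -> 0.75
print(local_membership("eyes", 0.4))   # -> 1.0 (capped)
```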
Let F[i][j] be the feature set, where i represents the class of emotion (happy, sad, etc.) and j represents a particular feature. The value of F[i][j] gives the membership value of the j-th feature to the i-th class of emotion. The index i may take values such as happy, sad, angry, and frustrated, enumerated as 1, 2, 3, 4, and so on; similarly, j may take values such as eyes, lips, cheeks, ears, and forehead, which can also be enumerated. Hence the feature sets can be represented as a two-dimensional array and easily accessed. Let i = 1 represent Emotion = happy and j = 1 represent Feature = lips; then the possible values of F[1][1] are:

F[1][1] = 0.00  (no smile / not happy)
F[1][1] = 0.25  (slight smile / mildly happy)
F[1][1] = 0.50  (healthy smile / happy)
F[1][1] = 0.75  (laughing / quite happy)
F[1][1] = 1.00  (exhaustive laugh / extremely happy)

In the same way, we calculate the feature sets for all the emotion values and store them in the affective database. These values are calculated using the law of averages and may differ across groups of people, so the database can be supplied with fresh values upon calibration. These values are used for comparison with newly acquired sets of images.

Now suppose some feature sets are equal, e.g. F[1][1] = F[2][1] = F[3][1], where i = 1, 2, 3 represent the emotions happy, sad, and angry, and j = 1 represents the feature 'eyes'. Then Bayes' theorem is used to calculate the likelihood of the feature in order to assign the correct tag to it. Let the membership of 'eyes' be 0.5 for i = 1, 2, 3. We then calculate P(E | F_i), the likelihood of the feature belonging to some class, which depends on the membership value of the feature and on the maximum number of favourable outcomes observed in the past. When we compare the conditional probability of the ambiguous feature against the other extracted features and feature sets, Bayes' theorem gives the actual tag of the feature. For example, if there is ambiguity in tagging the lips, we take the membership of the lips among the extracted features and compare it first with the other extracted features and then with the classes of emotions. In the beginning a predefined probability value can be used, which is then refined iteratively via machine learning for accuracy and highest probability. If the ambiguity is still not resolved, it is left to be resolved at the global fuzzy processing step.

In the final step of the algorithm, the fuzzy techniques are again applied to the segments, this time treating the image as a whole; this is called global fuzzy processing. Its purpose is either to resolve the ambiguities remaining from the third step or to refine the emotion value on the larger scale. In this step we calculate, for each emotion class, the sum of the membership values belonging to the different features. All the possible feature sets were generated in the third step, and in global fuzzy processing these smaller feature sets are combined. The feature set which yields the maximum likelihood or membership value is selected as the final emotion value. For example, we compare

F[1][1] + F[1][2] + F[1][3] + F[1][4] + F[1][5] + ...
F[2][1] + F[2][2] + F[2][3] + F[2][4] + F[2][5] + ...

and so on for i, j = 1, 2, 3, 4, ...
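One plausible reading of this global step is sketched below: the local memberships F[i][j] are summed per emotion class and the class with the maximum total is selected, with a prior based on past favourable outcomes standing in for the Bayesian tie-break. The feature names, membership grades, and prior counts are illustrative assumptions, not values from the paper.

```python
# Sketch of global fuzzy processing: sum local memberships F[i][j] per
# emotion class and pick the class with the maximum total. A simple prior
# (past favourable outcomes) breaks ties, standing in for the Bayesian
# disambiguation described in the text. All numbers here are illustrative.

FEATURES = ["lips", "eyes", "cheeks", "forehead"]

# F[emotion][feature] = local membership grade from the previous step.
# Note the eyes grade is identical across classes, mirroring the ambiguous
# example in the text.
F = {
    "happy": {"lips": 0.75, "eyes": 0.50, "cheeks": 0.50, "forehead": 0.25},
    "sad":   {"lips": 0.10, "eyes": 0.50, "cheeks": 0.20, "forehead": 0.40},
    "angry": {"lips": 0.20, "eyes": 0.50, "cheeks": 0.30, "forehead": 0.60},
}

# Hypothetical counts of past favourable outcomes, used as priors.
PRIOR_COUNTS = {"happy": 40, "sad": 25, "angry": 15}

def global_emotion(memberships, priors):
    """Return the emotion whose summed membership is largest (priors break ties)."""
    totals = {emotion: sum(grades[f] for f in FEATURES)
              for emotion, grades in memberships.items()}
    best = max(totals.values())
    # Classes within a small tolerance of the best total are still ambiguous.
    candidates = [e for e, t in totals.items() if abs(t - best) < 1e-9]
    if len(candidates) == 1:
        return candidates[0], totals
    return max(candidates, key=lambda e: priors[e]), totals

emotion, totals = global_emotion(F, PRIOR_COUNTS)
print(totals)    # -> {'happy': 2.0, 'sad': 1.2, 'angry': 1.6}
print(emotion)   # -> 'happy'
```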
The steps of the fuzzy technique differ at the two levels of this algorithm only in their scale of application. At the minor scale, with individual values, it is called local processing, while on the final set of features as a whole it is called global processing. The final feature sets are represented as f_i, where i denotes the final feature as an emotion value rather than its constituents. Bayesian probability is used in this step as well, but only when there are ambiguities in the final emotion value.

Self-learning module: A machine learning framework is also proposed in this algorithm, supported by a pattern analysis module, to help in deciding the pattern of emotions and their reflection in particular feature sets. Physiological studies of the contours of the human face are used to locate the patterns of the features and their resemblance to emotion values. The machine learning module also contributes to the monotonic growth of the affective database, and as a result more test cases become available to operate with, which can improve accuracy over time.

IV. ADVANTAGES, ARCHITECTURE AND UNDERLYING COMPLEXITIES

In the proposed model, n sensors process the behavioural features of the user. These multiple sensors collect data which is processed in real time, so all the acquired data is fed to the algorithm for feature classification. Multiple processing units are used to make the processing fast enough to design a system that can supply real-time decisions. The results of the processed data from the individual processing units are fed to the fuzzy inference system. A fuzzy system is used because, although the multiple inputs are crisp in nature, a particular value may belong to multiple features; for example, a person's facial expression may indicate happiness while the hand gestures suggest anger. In other words, a value may belong to multiple sets of behaviour. To handle such data, we define IF-THEN rules and associated membership functions; for simplicity, Gaussian membership functions are preferred for each of them. On the basis of the rule base the resultant is obtained, and this result goes through defuzzification so that the system produces the desired output. Figure 2 gives a detailed description of the complete system. The sensors generally refer to video cameras or other affective sensors such as wearable devices. The outputs of the different cameras are fed to individual processing units, each processing the body component it focuses on. The results from the feature extraction algorithm are provided to the fuzzy inference system, which uses the rule base to produce the results accordingly.

A massively parallelized and distributed architecture is required for real-time results. The processors should be fast, storage must be fast and large, the quality of the sensors needs to be high, and the classification algorithms should be robust. Use of the fuzzy technique results in a large number of feature sets and hence increases the amount of data produced, which requires a lot of storage space. A compression algorithm such as run-length coding is used to reduce the required storage space. On the other hand, this large configuration of feature sets helps in better analysis and comparison and hence results in better accuracy and less ambiguity.
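The fuzzy inference stage above names Gaussian membership functions, IF-THEN rules, and defuzzification but gives no concrete rule base. The following is a minimal Mamdani-style sketch in plain NumPy under that reading; the two inputs (a face-happiness score and a hand-agitation score), the two rules, and the output scale are illustrative assumptions, not the paper's specification.

```python
# Minimal Mamdani-style fuzzy inference sketch with Gaussian membership
# functions and centroid defuzzification, mirroring the FIS stage described
# above. Inputs, rules, and output universe are illustrative assumptions.
import numpy as np

def gauss(x, mean, sigma):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Crisp inputs from the feature-extraction units (hypothetical scores in [0, 1]).
face_happiness = 0.8
hand_agitation = 0.3

# Rule 1: IF face is happy AND hands are calm THEN affect is positive.
firing_1 = min(gauss(face_happiness, 1.0, 0.3),   # "face is happy"
               gauss(hand_agitation, 0.0, 0.3))   # "hands are calm"
# Rule 2: IF face is sad OR hands are agitated THEN affect is negative.
firing_2 = max(gauss(face_happiness, 0.0, 0.3),   # "face is sad"
               gauss(hand_agitation, 1.0, 0.3))   # "hands are agitated"

# Output universe: affect score from -1 (negative) to +1 (positive).
y = np.linspace(-1.0, 1.0, 201)
positive = gauss(y, 1.0, 0.4)
negative = gauss(y, -1.0, 0.4)

# Mamdani implication (min) and aggregation (max) of the clipped consequents.
aggregated = np.maximum(np.minimum(firing_1, positive),
                        np.minimum(firing_2, negative))

# Centroid defuzzification yields a single crisp affect value.
affect = float(np.sum(aggregated * y) / np.sum(aggregated))
print(round(affect, 3))   # positive for these inputs (happy face, calm hands)
```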
A second complexity is that a study and standardization of emotion values and their respective features is required for calibration and for the scalability or generalization of the framework. Psychology is also involved in this framework, since facial expressions are one medium of expressing emotion, which in turn is a function of the psychological state of the human mind. Very fine-grained observations and large data sets may result in slow processing and raise the requirement for fast processing machines supported by large caches. Creation and maintenance of the affective database is also a complex and time-consuming task. There is no commonly accepted theory of emotions, which also presents an issue for scalability.

Fig. 2. Architecture of the proposed framework using a Fuzzy Inference System.

Real-time processing of multimodal information from different sensors needs a high-performance computing environment. Since information from different channels such as facial features, hand movements, body movements, gaze, skin conductance, and speech is fused to judge the affective level of the user, processing of this varied and voluminous data requires distributed and parallel processing. Input signals received by the sensors are fed into the processing machine to analyse the features via feature extraction and pattern recognition algorithms. Machines with limited computing and storage capacity are simply not sufficient for this. Keeping user satisfaction in mind, we need to acquire and process information at very high rates with great accuracy and precision; these processing requirements can only be met in massively parallelized and tightly coupled distributed environments. Finally, the variation of emotions with culture, customs, nationality, and creed presents a challenge to the generalization of the framework.

V. CONCLUSION AND FUTURE SCOPE

The advancement of MMHCI through the creation of intelligent and affective interfaces has helped in achieving the dream of ubiquitous computing, though only on a small scale. The proposed framework can sense physiological changes in the facial features and hand movements of the user and can adapt the affective interface in accordance with the sensed emotion value. It provides better results in MMHCIs which are user-centered and driven by the emotional behavior of the user. Since a modular approach is used for the implementation, this framework can be used with different sets of features and emotion values and hence achieves increased scalability. Such a framework can be implemented in UbiComp machines for affect sensing and adaptive interfaces. Furthermore, there is scope for improving the algorithms for a better and more efficient implementation. Other feature sets and emotions can be included, and new ranges of values may be tested with different feature sets.

REFERENCES:

[1] Rosalind W. Picard, "Affective Computing", MIT Press, 1997.
[2] Alan Dix, Janet Finlay, Gregory D. Abowd, Russell Beale, "Human Computer Interaction", Third Edition, Pearson.
[3] Fakhreddine Karray et al., "Human-Computer Interaction: Overview on State of the Art", International Journal on Smart Sensing and Intelligent Systems, Vol. 1, No. 1, March 2008.
[4] Liam J. Bannon, "From Human Factors to Human Actors: The Role of Psychology and Human-Computer Interaction Studies in Systems Design", book chapter in Greenbaum, J. & Kyng, M. (Eds.), 1991.
[5] Maja Pantic, "Towards an Affect-Sensitive Multimodal Human-Computer Interaction", Proceedings of the IEEE, Vol. 91, No. 9, September 2003.
[6] Nicu Sebe, "Multimodal Interfaces: Challenges and Perspectives", Journal of Ambient Intelligence and Smart Environments, Vol. 1, pp. 19-26, 2009.
[7] Alejandro Jaimes et al., "Multimodal Human-Computer Interaction: A Survey", Computer Vision and Image Understanding, pp. 116-134, Elsevier, 2007.
[8] Zoran Duric et al., "Integrating Perceptual and Cognitive Modeling for Adaptive and Intelligent Human-Computer Interaction", invited paper in Proceedings of the IEEE, Vol. 90, No. 7, July 2002.
[9] Rafael C. Gonzalez, Richard E. Woods, "Digital Image Processing", Third Edition, Pearson Prentice Hall, 2013.
[10] S. Jayaraman, S. Esakkirajan, T. Veerakumar, "Digital Image Processing", McGraw Hill Education Private Limited, Eleventh Reprint, 2013.
[11] Minakshi Kumar, "Digital Image Processing", Indian Institute of Remote Sensing, Dehra Dun, 2010.
[12] Rafael C. Gonzalez, Richard E. Woods, "Digital Image Processing", Third Edition, Pearson Prentice Hall, 2013.
[13] Terry Winograd, "Shifting Viewpoints: Artificial Intelligence and Human-Computer Interaction", Elsevier, 1 November 2006.
[14] Rafael C. Gonzalez, Richard E. Woods, "Digital Image Processing", Third Edition, Pearson Prentice Hall, 2013.
[15] Jianhua Tao and Tieniu Tan, "Affective Computing: A Review", ACII 2005, LNCS 3784, pp. 981-995, 2005.
[16] K. R. Scherer, "Vocal Affect Expression: A Review and a Model for Future Research", Psychological Bulletin, Vol. 99, pp. 143-165, 1986.
[17] F. Dellaert, T. Polzin, and A. Waibel, "Recognizing Emotion in Speech", in Proc. of ICSLP 1996, Philadelphia, PA, pp. 1970-1973, 1996.
[18] A. Petrushin, "Emotion Recognition in Speech Signal: Experimental Study, Development and Application", ICSLP 2000, Beijing, 2000.
[19] Mozziconacci J. Sylvie and Hermes Dik, "Expression of Emotion and Attitude through Temporal Speech Variations", ICSLP 2000, Beijing, 2000.
[20] Chuang Ze-Jing and Wu Chung-Hsien, "Emotion Recognition from Textual Input using an Emotional Semantic Network", in Proceedings of the International Conference on Spoken Language Processing, ICSLP 2002, Denver, 2002.
[21] F. Yu et al., "Emotion Detection from Speech to Enrich Multimedia Content", in the Second IEEE Pacific-Rim Conference on Multimedia, October 24-26, 2001, Beijing, China.
[22] Alan Dix, Janet Finlay, Gregory D. Abowd, Russell Beale, "Human Computer Interaction", Third Edition, Pearson.
[23] Massaro W. Dominic et al., "Picture My Voice: Audio to Visual Speech Synthesis using Artificial Neural Networks", Proceedings of AVSP'99, pp. 133-138, Santa Cruz, CA, August 1999.
[24] E. Cosatto et al., "Audio-Visual Unit Selection for the Synthesis of Photo-Realistic Talking-Heads", IEEE International Conference on Multimedia and Expo, ICME 2000.
[25] N. L. Etcoff and J. J. Magee, "Categorical Perception of Facial Expressions", Cognition, 44, pp. 227-240, 1992.
[26] C. Bregler et al., "Video Rewrite: Driving Visual Speech with Audio", ACM SIGGRAPH, 1997.
[27] A. Murat Tekalp, "Face and 2-D Mesh Animation in MPEG-4", Signal Processing: Image Communication, 15 (2000), pp. 387-421.
[28] H. Kobayashi and F. Hara, "Recognition of Six Basic Facial Expressions and Their Strength by Neural Network", in Proc. International Workshop on Robot and Human Communication, pp. 381-386, 1992.
[29] E. Yamamoto, S. Nakamura, and K. Shikano, "Lip Movement Synthesis from Speech Based on Hidden Markov Models", Speech Communication, 26, 1998.
[30] Michael J. Lyons et al., "Coding Facial Expressions with Gabor Wavelets", in Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, April 14-16, 1998, Nara, Japan, IEEE Computer Society, pp. 200-205.
[31] http://www.cs.bham.ac.uk/%7Eaxs/cogaff.html
[32] Ashish Verma, L. Venkata Subramaniam, Nitendra Rajput, Chalapathy Neti, Tanveer A. Faruquie, "Animating Expressive Faces Across Languages", IEEE Transactions on Multimedia, Vol. 6, No. 6, December 2004.
[33] A. Hunt, A. Black, "Unit Selection in a Concatenative Speech Synthesis System using a Large Speech Database", ICASSP, Vol. 1, pp. 373-376, 1996.
[34] J. K. Aggarwal, Q. Cai, "Human Motion Analysis: A Review", Computer Vision and Image Understanding, Vol. 73, No. 3, 1999.
[35] Vladimir et al., "Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review", IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997.
[36] D. M. Gavrila, "The Visual Analysis of Human Movement: A Survey", Computer Vision and Image Understanding, Vol. 73, No. 1, January 1999.
[37] H. Schlossberg, "Three Dimensions of Emotion", Psychological Review, 61, pp. 81-88, 1954.
[38] Mozziconacci J. Sylvie and Hermes Dik, "Expression of Emotion and Attitude through Temporal Speech Variations", ICSLP 2000, Beijing, 2000.
[39] Rosalind W. Picard, "Affective Computing: From Laughter to IEEE", IEEE Transactions on Affective Computing, Vol. 1, No. 1, Jan-Jun 2010.
[40] Rosalind W. Picard, "Affective Computing", MIT Press, 1997.
[41] R. Brunelli and D. Falavigna, "Person Identification Using Multiple Cues", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, No. 10, pp. 955-966, October 1995.
[42] A. Camurri, G. De Poli, M. Leman, G. Volpe, "A Multi-layered Conceptual Framework for Expressive Gesture Applications", in Proc. Intl. MOSART Workshop, Barcelona, November 2001.
[43] R. Cowie et al., "Emotion Recognition in Human-Computer Interaction", IEEE Signal Processing Magazine, 18(1), pp. 32-80, 2001.
[44] P. Romero et al., "A Novel Real Time Facial Expression Recognition System Based on Candide-3 Reconstruction Model", in Proc. of the XIV Workshop on Physical Agents, September 2013.
[45] http://www.cs.cmu.edu/afs/cs.cmu.edu/project/oz/web/oz.html
[46] http://affect.media.mit.edu/
[47] http://emotion-research.net/
[48] http://www.almaden.ibm.com/cs/BlueEyes/index.html
[49] Rosalind W. Picard, "Affective Computing: Challenges", International Journal of Human-Computer Studies, Elsevier, 2003.