LINDASALWA BINTI MUDA

Speech Recognition for Connected Word Using Cepstral and Dynamic Time Warping Algorithms

Many people have contributed to this thesis. First and foremost, everything in me that I count as good comes only from Allah s.w.t., who led and guided me in completing this thesis. All honor and glory be to Him. I deeply thank my supervisor, AP. Dr. Irraivan Elamvazuthi, for his guidance and for sharing his invaluable advice and experience. His attitude and enthusiasm in doing research, and his consistent vision of making high-technology research grounded in reality, applicable and even commercialized, have always inspired me. Secondly, very special thanks to my co-supervisor, Dr. Mumtaj Begam, for her endless support in building up my foundation knowledge of the Speech Recognition field. I am also indebted to all my friends for their generosity in sharing resources and knowledge with me during this research. Special thanks to the UTP lecturers, staff and other talented individuals for providing me with resources and a conducive environment to make my research a success. Lastly, I dedicate this thesis to my beloved husband, daughter and son. I also extend my sincere gratitude and thanks to all my family members for their encouragement and support.


Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques

CoRR, Mar 22, 2010

Digital processing of speech signals and voice recognition algorithms are very important for fast and accurate automatic voice recognition technology. The voice is a signal of infinite information. Direct analysis and synthesis of the complex voice signal is difficult because of the amount of information contained in the signal. Therefore, digital signal processes such as Feature Extraction and Feature Matching are introduced to represent the voice signal. Several methods, such as Linear Predictive Coding (LPC), Hidden Markov Model (HMM) and Artificial Neural Network (ANN), are evaluated with a view to identifying a straightforward and effective method for voice signals. The extraction and matching process is implemented right after the pre-processing, or filtering, of the signal is performed. Mel Frequency Cepstral Coefficients (MFCCs), a non-parametric method for modelling the human auditory perception system, are utilized as the extraction technique. The non-linear sequence alignment known as Dynamic Time Warping (DTW), introduced by Sakoe and Chiba, is used as the feature matching technique. Since voice signals tend to have different temporal rates, the alignment is important for better performance. This paper presents the viability of MFCC to extract features and DTW to compare the test patterns.
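The alignment step described in the abstract can be illustrated with a minimal sketch of the basic DTW recurrence. This is not the paper's implementation: the function name `dtw_distance`, the use of scalar sequences in place of MFCC vectors, and the absolute-difference local distance are all assumptions made for illustration, and the slope and window constraints from Sakoe and Chiba's formulation are omitted.

```python
from math import inf


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Return the accumulated cost of the best warping path between a and b.

    Illustrative only: `a` and `b` are plain sequences of numbers here.
    For MFCC frames, `dist` would be a vector distance (an assumption,
    e.g. Euclidean distance between coefficient vectors).
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The same pattern spoken at two different rates aligns with zero cost.
slow = [1, 2, 2, 3]
fast = [1, 2, 3]
print(dtw_distance(fast, slow))
```

Because a sample in one sequence may be matched to several consecutive samples in the other, two utterances of the same word at different speaking rates can still produce a small accumulated cost, which is why the alignment matters for template comparison.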


