Mehran University Research Journal of Engineering and Technology, 2012
This paper proposes a text-independent automatic speaker verification system using IMFCC (Inverse/Reverse Mel Frequency Coefficients) and ITEM (Information Theoretic Expectation Maximization). For speaker verification, feature extraction based on the Mel scale has been widely applied and has produced good results. The IMFCC is based on the inverse Mel scale and effectively captures information in the high-frequency formants, which the MFCC ignores. In this paper, fusion of MFCC and IMFCC at the input level is proposed. GMMs (Gaussian Mixture Models) trained with EM (Expectation Maximization) have been widely used as classifiers for text-independent verification. However, EM suffers from convergence issues. In this paper we use our proposed ITEM, which converges faster, to train the speaker models. ITEM uses information-theoretic principles such as PDE (Parzen Density Estimation) and the KL (Kullback-Leibler) divergence measure. Like EM, ITEM adapts the weights, means and covariances. However, the ITEM process is performed not on the feature vector sets but on a set of centroids obtained using an IT (Information Theoretic) metric. The ITEM process simultaneously reduces the divergence between the PDE estimate of the feature distribution within a given class and the centroid distribution within the same class. The feature-level fusion and ITEM are tested on the speaker verification task using NIST2001 and NIST2004. The experimental evaluation shows that MFCC/IMFCC gives better results than the conventional delta/MFCC feature set. The MFCC/IMFCC feature vector is also much smaller than the delta MFCC vector, which reduces the computational burden as well. The ITEM method also converged faster than the conventional EM method, and thus leads to higher speaker recognition scores.
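To make the relation between the Mel and inverse Mel scales concrete, the sketch below computes filter centre frequencies for both and concatenates two feature matrices. It is only an illustration under stated assumptions, not the paper's implementation: the abstract does not give the inverse-scale formula, so the code assumes one common construction in which the Mel filter bank is mirrored about the midpoint of the band, and it assumes that input-level fusion means frame-wise concatenation. The function names, the filter count and the band limits are all hypothetical.

```python
import numpy as np

def hz_to_mel(f_hz):
    # Widely used Mel-scale formula (2595 * log10(1 + f / 700)).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_centers(n_filters, f_min, f_max):
    # Points equally spaced on the Mel axis; the two band edges are dropped,
    # leaving centres that are dense at low frequencies.
    mel_points = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
    return mel_to_hz(mel_points)[1:-1]

def inverted_mel_centers(n_filters, f_min, f_max):
    # ASSUMPTION: the inverse scale is modelled by mirroring the Mel centres
    # about the band midpoint, so the centres become dense at high frequencies.
    mirrored = f_min + f_max - mel_centers(n_filters, f_min, f_max)
    return mirrored[::-1]  # reverse to keep ascending order

def fuse_features(mfcc, imfcc):
    # ASSUMPTION: input-level fusion is frame-wise concatenation of two
    # (frames x coefficients) matrices.
    return np.concatenate([mfcc, imfcc], axis=1)

if __name__ == "__main__":
    print(np.round(mel_centers(5, 0.0, 4000.0)))
    print(np.round(inverted_mel_centers(5, 0.0, 4000.0)))
```

With these assumptions, the Mel centres are spaced increasingly far apart as frequency rises, while the mirrored centres are spaced increasingly close together, which is the sense in which an inverse-scale filter bank gives finer resolution to the high-frequency region.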
Papers by Sheeraz Memon