MFCC CZT


SEMINAR ON

Acoustic Feature Comparison of MFCC and CZT-based Cepstrum for Speech Recognition

Guided by: Prof. R. V. Pawar

Presented by: Neehal B. Jiwane

Introduction
Mel-Frequency Cepstral Coefficients (MFCC) are the most widely used features in the field of speech recognition. They form the feature extraction stage of automatic speech recognition (ASR) systems, where MFCC parameters give better recognition accuracy than other features.

MFCC

Fig. 1. MFCC Block Diagram
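As an illustration of the mel filter bank stage of an MFCC front end, the sketch below spaces filter centre frequencies uniformly on the mel scale over the 300-3000 Hz band used in the experiment. The conversion formula is one common variant of the mel scale, and the function names and the choice of 20 filters are assumptions made for this example, not values taken from the slides.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(f_low, f_high, n_filters):
    """Centre frequencies (Hz) of n_filters filters spaced uniformly in mels."""
    mel_low, mel_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (mel_high - mel_low) / (n_filters + 1)
    return [mel_to_hz(mel_low + (i + 1) * step) for i in range(n_filters)]


# Assumed example: 20 filters between fl = 300 Hz and fh = 3000 Hz.
centers = mel_center_frequencies(300.0, 3000.0, 20)
```

The centres are packed more densely at low frequencies and spread out towards 3000 Hz, which is the property the mel scale is meant to capture.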

Chirp Z-Transform

Fig. 2. Operation in CZT
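A minimal sketch of the idea behind the chirp z-transform: the z-transform of a frame is evaluated at M equally spaced points on an arc of the unit circle, so the frequency samples can be concentrated in a band of interest instead of covering 0 to fs. The code below uses a direct O(N*M) sum rather than the fast convolution-based algorithm, and the function name, the 8000 Hz sampling rate and the test tone are assumptions made for this example.

```python
import cmath
import math


def czt_zoom(x, fs, f_lo, f_hi, m):
    """Evaluate the z-transform of x at m unit-circle points from f_lo (Hz).

    Point k corresponds to frequency f_lo + k * (f_hi - f_lo) / m.
    Direct evaluation, intended only to illustrate the definition.
    """
    step = (f_hi - f_lo) / m
    out = []
    for k in range(m):
        f_k = f_lo + k * step
        # X_k = sum_n x[n] * z_k^(-n), with z_k = exp(j*2*pi*f_k/fs)
        out.append(sum(x_n * cmath.exp(-2j * math.pi * f_k * n / fs)
                       for n, x_n in enumerate(x)))
    return out


# Assumed example: 64 samples of a 1000 Hz tone at fs = 8000 Hz,
# analysed at 27 points (100 Hz apart) starting at 300 Hz.
fs = 8000.0
tone = [math.cos(2 * math.pi * 1000.0 * n / fs) for n in range(64)]
spectrum = czt_zoom(tone, fs, 300.0, 3000.0, 27)
peak_index = max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))
peak_hz = 300.0 + peak_index * 100.0
```

Increasing the number of points over the same band makes the frequency samples denser without lengthening the frame.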

Dynamic Time Warping


The DTW algorithm is based on dynamic programming techniques.
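The dynamic programming recurrence behind DTW can be sketched as follows. This is a minimal illustration on scalar sequences with an absolute-difference local distance; in a recogniser each element would be a feature vector per frame and the local distance would be chosen accordingly. The function name and the example sequences are assumptions made for this sketch.

```python
def dtw_distance(a, b):
    """Return the minimal accumulated cost of warping sequence a onto b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # step both
    return cost[n][m]


# Assumed example: the second sequence repeats one sample of the first,
# so a warping path with zero total cost exists.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
```

An unknown utterance would be compared against each stored template in this way and assigned to the template with the smallest accumulated cost.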

Fig. 3. A Warping between two time series

Experiment condition
Process                      Description
1) Speaker                   3 Female, 3 Male
2) Tools                     Cool Edit Pro 2.0 tool
3) Environment               Laboratory
4) Sampling Frequency, fs    300-3000 Hz
5) Utterance                 Noisy area

RECOGNITION
Table 1. Recognition Rate of the MFCC

Testing Set    Testing Number    Correct Number    Percentage (%)
0              8                 6                 75
1              8                 7                 87.5
2              8                 8                 100
3              8                 3                 37.5
4              8                 4                 50
5              8                 8                 100
6              8                 6                 75
7              8                 8                 100
8              8                 8                 100
9              8                 8                 100

Table 2. Recognition Rate of the MFCC+CZT-based

Testing Set    Testing Number    Correct Number    Percentage (%)
0              8                 6                 75
1              8                 7                 87.5
2              8                 8                 100
3              8                 4                 50
4              8                 5                 62.5
5              8                 8                 100
6              8                 7                 87.5
7              8                 8                 100
8              8                 8                 100
9              8                 8                 100

Conditions: fl = 300, fh = 3000, M = 256
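The percentages in Tables 1 and 2 follow directly from the correct counts, and the same arithmetic gives the overall rate across all ten digits. The short sketch below recomputes both; the function and variable names are assumptions made for this example, while the counts are the ones listed in the two tables.

```python
def recognition_rate(correct, tests_per_digit=8):
    """Return (per-digit percentages, overall percentage) for correct counts."""
    per_digit = [100.0 * c / tests_per_digit for c in correct]
    overall = 100.0 * sum(correct) / (tests_per_digit * len(correct))
    return per_digit, overall


# Correct counts for digits 0-9, copied from Table 1 and Table 2.
mfcc_correct = [6, 7, 8, 3, 4, 8, 6, 8, 8, 8]
czt_correct = [6, 7, 8, 4, 5, 8, 7, 8, 8, 8]

mfcc_per_digit, mfcc_overall = recognition_rate(mfcc_correct)  # 66 of 80 -> 82.5
czt_per_digit, czt_overall = recognition_rate(czt_correct)     # 69 of 80 -> 86.25
```

With these counts, the gain from adding the CZT-based cepstrum comes from digits 3, 4 and 6, each of which gains one correct recognition.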

Table 3. Recognition Rate of the MFCC+CZT-based

Testing Set    Testing Number    Correct Number    Percentage (%)
0              8                 6                 75
1              8                 7                 87.5
2              8                 8                 100
3              8                 6                 50
4              8                 6                 62.5
5              8                 8                 100
6              8                 7                 87.5
7              8                 8                 100
8              8                 8                 100
9              8                 8                 100

Conditions: fl = 300, fh = 3000, M = 512

Table 4. Different Cepstral Coefficients

Cepstral Coefficients    Testing Number    Correct Number    Percentage (%)
MFCC                     80                66                82.5
MFCC & CZT-based         80                69                86.25

Conditions: fl = 300, fh = 3000, M = 256

Conclusion
From the design and implementation of the experiment, we come to the following conclusions. A new approach, called the CZT-based algorithm, was developed to extract features from speech signals that are highly transient in nature. Combining the CZT-based method with MFCC has demonstrated its superiority over the previously reported MFCC method, in that the frequency resolution of highly transient speech signals is much enhanced. With better accuracy, widespread integration of speech recognition technology into end-user applications is ahead.

REFERENCES
[1] L. R. Rabiner, B. Gold, Theory and Application of Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1975, p. 393.
[2] J. P. Openshaw, Z. P. Sun, J. S. Mason, "A comparison of composite features under degraded speech in speaker recognition", Proceedings of the International Conference on Acoustics, Speech, and Signal Processing.
[3] R. Vergin, D. O'Shaughnessy, V. Gupta, "Compensated mel frequency cepstrum coefficients", Proceedings of the International Conference on Acoustics, Speech, and Signal Processing.
[4] J. W. Picone, "Signal modeling techniques in speech recognition", Proceedings of the IEEE, 1993, 81(9): 1215-1247.
[5] L. Muda, M. Begam, I. Elamvazuthi, "Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques".
