Ronald Pascual
Dr. Ronald M. Pascual is an associate professor at the Department of Computer Technology, College of Computer Studies, De La Salle University. He received his Ph.D. degree in Electrical and Electronics Engineering from the University of the Philippines Diliman (UPD), his M.S. degree in Electronics and Communications Engineering from De La Salle University (DLSU) Manila, and his B.S. degree in Electronics and Communications Engineering from Pamantasan ng Lungsod ng Maynila (PLM). His research interests include audio and speech processing, children’s speech technology, computer-aided language learning systems, computational linguistics, and digital signal processing.
Papers by Ronald Pascual
in children’s Filipino speech for application in automated oral reading fluency
assessment. Automatic syllabication was optimised in the context of children’s
Filipino read speech. Prosodic features were automatically extracted from the
Children Filipino Speech Corpus and then classified according to human raters'
assessment of fluency. Analysis of variance showed that speech
and articulation rates, pauses, syllable duration, and pitch can be used to
classify children’s oral reading fluency in Filipino into three levels, namely,
independent, instructional and frustration. Using machine learning
classification methods, fivefold cross-validation showed that speech rate,
articulation rate and number of pauses can be used to predict oral reading
fluency at 92%, 85% and 76% accuracy for 2, 3 and 4 levels of fluency
classification, respectively. Pitch and syllable duration patterns were also
characterised for the assessment of phrasing and expression between fluent and
non-fluent readers.
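
For readers unfamiliar with the features named in this abstract, the following is a minimal sketch of how speech rate, articulation rate and number of pauses might be derived from syllable boundaries. It assumes the common textbook definitions (speech rate counts pause time in the denominator, articulation rate excludes it); the abstract does not state the authors' exact definitions. The `Syllable` type, the `prosodic_rates` function and the 0.25 s pause threshold are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    """One syllable interval in seconds (hypothetical input format)."""
    start: float
    end: float


def prosodic_rates(syllables, min_pause=0.25):
    """Return speech rate, articulation rate (syllables/s) and pause count.

    min_pause is an assumed threshold: a silent gap between consecutive
    syllables counts as a pause only if it lasts at least this long.
    """
    if not syllables:
        raise ValueError("at least one syllable is required")
    syls = sorted(syllables, key=lambda s: s.start)
    total = syls[-1].end - syls[0].start
    if total <= 0:
        raise ValueError("utterance duration must be positive")

    # Silent gaps between the end of one syllable and the start of the next.
    gaps = [b.start - a.end for a, b in zip(syls, syls[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    pause_time = sum(pauses)

    return {
        # Syllables per second over the whole utterance, pauses included.
        "speech_rate": len(syls) / total,
        # Syllables per second with pause time removed.
        "articulation_rate": len(syls) / (total - pause_time),
        "num_pauses": len(pauses),
    }


if __name__ == "__main__":
    # Made-up boundaries: four syllables with one 0.5 s pause in the middle.
    demo = [Syllable(0.0, 0.2), Syllable(0.2, 0.5),
            Syllable(1.0, 1.3), Syllable(1.3, 1.5)]
    print(prosodic_rates(demo))
```

In a pipeline like the one described, values of this kind would be computed per reading and passed to a classifier; the classification step itself is not sketched here.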
recent years have shown the feasibility of developing an automatic
speech recognition (ASR) system for Filipino-speaking children.
However, most of these studies are solely based on the Hidden
Markov Model (HMM) with Gaussian Mixture Model (GMM). In
this paper, we present the development of a hybrid ASR system
using both HMM and Time Delay Neural Network (TDNN). The
Filipino Children’s Speech Corpus (FCSC), which is purely composed
of read speech, was used to train and test all the models.
We performed several sets of experiments on various phoneme
sets, various numbers of HMM states, and various enhanced
models that employed vocal tract length normalization (VTLN),
linear discriminant analysis (LDA), and speaker adaptive training
(SAT). Our experiments show that a basic TDNN-HMM model
could consistently outperform an HMM-GMM model regardless
of how many HMM states are present. We also show that
VTLN slightly enhances the performance of the model. The
best performing model is the 4-state TDNN-HMM hybrid that
obtained the lowest word error rate (WER) of 0.97%.
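
As a reminder of what the reported metric measures, WER is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The sketch below computes it with a standard word-level edit distance. It is an illustration of the usual definition, not the authors' scoring tool; the function name `word_error_rate` and the example sentences are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up transcripts: one of four reference words is dropped.
    print(word_error_rate("ang bata ay nagbabasa", "ang bata nagbabasa"))
```

Under this definition, a WER of 0.97% corresponds to roughly one word error per hundred reference words.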