Mounya Elhilali

Johns Hopkins University, Electrical and Computer Engineering, Faculty Member

Papers by Mounya Elhilali

Discriminant spectrotemporal features for phoneme recognition

In this paper we investigate the modulation domain as an alternative to the acoustic domain for speech enhancement. More specifically, we wish to determine how competitive the modulation domain is for spectral subtraction as compared to the acoustic domain. For this purpose, we extend the traditional analysis-modification-synthesis framework to include modulation domain processing. We then compensate the noisy modulation spectrum for additive noise distortion by applying the spectral subtraction algorithm in the modulation domain. Using subjective listening tests and objective speech quality evaluation we show that the proposed method results in improved speech quality. Furthermore, applying spectral subtraction in the modulation domain does not introduce the musical noise artifacts that are typically present after acoustic domain spectral subtraction. The proposed method also achieves better background noise reduction than the MMSE method.

Recognizing the message and the messenger: biomimetic spectral analysis for robust speech and speaker recognition

International Journal of Speech Technology

Humans are quite adept at communicating in the presence of noise. However, most speech processing systems, like automatic speech and speaker recognition systems, suffer from a significant drop in performance when speech signals are corrupted with unseen background distortions. The proposed work explores the use of a biologically-motivated multi-resolution spectral analysis for speech representation. This approach focuses on the information-rich spectral attributes of speech and presents an intricate yet computationally-efficient analysis of the speech signal by careful choice of model parameters. Further, the approach takes advantage of an information-theoretic analysis of the message and speaker dominant regions in the speech signal, and defines feature representations to address two diverse tasks such as speech and speaker recognition. The proposed analysis surpasses the standard Mel-Frequency Cepstral Coefficients (MFCC), and its enhanced variants (via mean subtraction, variance norma...

Multistream Bandpass Modulation Features for Robust Speech Recognition

Biomimetic spectro-temporal features for music instrument recognition in isolated notes and solo phrases

EURASIP Journal on Audio, Speech, and Music Processing, 2015

Behavioral/Systems/Cognitive

this paper, we first summarize our basic findings concerning the accuracy and extent of precise s...

Modeling goal-directed attention in tone sequences using a weighted Kalman filter

2015 49th Annual Conference on Information Sciences and Systems (CISS), 2015

Models of Timbre Using Spectro-Temporal Receptive Fields: Investigation of Coding Strategies

Timbre designates all of the perceptual characteristics of sounds that cannot be described as pitch, loudness or duration. Behavioral experiments combined with multidimensional scaling techniques have proposed that a few main acoustic dimensions subserve the perception of timbre for homogeneous ensembles of sounds (e.g., Western musical instrument sounds). It is unclear, however, whether these dimensions can describe all aspects of timbre,

Adaptive noise suppression of pediatric lung auscultations with real applications to noisy clinical settings in developing countries

by Dimitra Emmanouilidou, Mounya Elhilali, and Eric McCollum

IEEE Transactions on Biomedical Engineering, Jan 13, 2015

Chest auscultation constitutes a portable, low-cost tool widely used for respiratory disease detection. Though it offers a powerful means of pulmonary examination, it remains riddled with a number of issues that limit its diagnostic capability. Particularly, patient agitation (especially in children), background chatter and other environmental noises often contaminate the auscultation, hence affecting the clarity of the lung sound itself. This work proposes an automated multiband denoising scheme for improving the quality of auscultation signals against heavy background contaminations. The algorithm works on a simple two-microphone setup, dynamically adapts to the background noise and suppresses contaminations while successfully preserving the lung sound content. The proposed scheme is refined to offset maximal noise suppression against maintaining the integrity of the lung signal, particularly its unknown adventitious components that provide the most informative diagnostic value dur...

Perceptual susceptibility to acoustic manipulations in speaker discrimination

The Journal of the Acoustical Society of America, 2015

Listeners' ability to discriminate unfamiliar voices is often susceptible to the effects of manipulations of acoustic characteristics of the utterances. This vulnerability was quantified within a task in which participants determined if two utterances were spoken by the same or different speakers. Results of this task were analyzed in relation to a set of historical and novel parameters in order to hypothesize the role of those parameters in the decision process. Listener performance was first measured in a baseline task with unmodified stimuli, and then compared to responses with resynthesized stimuli under three conditions: (1) normalized mean-pitch; (2) normalized duration; and (3) normalized linear predictive coefficients (LPCs). The results of these experiments suggest that perceptual speaker discrimination is robust to acoustic changes, though mean-pitch and LPC modifications are more detrimental to a listener's ability to successfully identify same or different speake...

A Biologically-Inspired Approach to the Cocktail Party Problem

2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, 2006

Though seemingly effortless, our auditory system engages in complex processes and transformations which enable us to segregate speech and other sounds in cocktail party settings. This paper presents a computational approach to modelling monaural auditory scene analysis, where we attempt to account for perceptual and neuronal findings of receptive field selectivity and adaptation in the auditory cortex. The model introduces