Master Thesis

Electrical Engineering
October 2012

An Acoustic Echo
Cancellation System based on
Adaptive Algorithms

Veera Tej Garre


Sailesh Kumar Mannem

This thesis is presented as part of the Degree of
Master of Science in Electrical Engineering

Blekinge Institute of Technology




School of Engineering
Department of Applied Signal Processing
Supervisor 1: Dr. Nedelko Grbic
Supervisor 2: Mr. Magnus Berggren
Examiner: Dr. Sven Johansson
This thesis is submitted to the School of Engineering at Blekinge Institute of Technology in
partial fulfillment of the requirements for the degree of Master of Science in Electrical
Engineering with Emphasis on Signal Processing.

Contact Information:

Authors:
Veeratej Garre
E-mail: [email protected]
[email protected]

Sailesh Kumar Mannem
E-mail: [email protected]

Supervisor 1:
Dr. Nedelko Grbic
School of Engineering (ING)
E-mail: [email protected]
Phone no: +46 455 38 57 27

Supervisor 2:
Mr. Magnus Berggren
School of Engineering (ING)
Email: [email protected]
Phone no.: +46 455 38 57 40

Examiner:
Dr. Sven Johansson
School of Engineering (ING)
Email: sven.johansson@bth.se
Phone no.: +46 455 38 57 10

School of Engineering
Blekinge Institute of Technology
371 79 Karlskrona
Sweden
Internet: www.bth.se/ing
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

Abstract

Adaptive filtering is one of the core technologies in digital signal processing and finds
numerous applications in science as well as in industry, including echo cancellation,
adaptive noise cancellation, adaptive beamforming and adaptive equalization.
Acoustic echo is a common occurrence in today’s telecommunication systems. The
distraction caused by acoustic echo reduces the speech quality of the communication. An
acoustic echo canceller (AEC) works as follows: the far-end signal delivered to the system is
reproduced by the loudspeaker in the room. A microphone in the room picks up the resulting
direct-path sound and the consequent reverberant sound as the near-end signal. The far-end
signal is filtered and delayed to resemble the near-end signal, and the filtered far-end signal
is subtracted from the near-end signal. The resulting signal represents the sounds present in
the room, excluding any direct or reverberated sound produced by the loudspeaker. AEC with
adaptive filtering enhances the speech quality in hands-free and teleconferencing
communication systems. The focus of this thesis is on enhancing speech corrupted by
reverberation in hands-free speech communication using AEC with adaptive filtering. Many
adaptive algorithms for echo cancellation are available in the literature, and every algorithm
has its own properties, but the aim of any echo cancellation algorithm is to achieve a high
ERLE (the amount of echo cancelled, in dB) at a high rate of convergence with low complexity.
The adaptive algorithms NLMS, APA and RLS for echo cancellation were successfully
implemented in MATLAB. The three algorithms were tested in simulation in three different
echo environments, obtained by changing the microphone position, the source position and the
room dimensions. The performance of the NLMS, APA and RLS algorithms is measured with the
ERLE parameter. The results show that the RLS algorithm performs well, with a high rate of
convergence, but its computational complexity is high, which makes it impractical in real-time
applications. The amount of echo cancellation with APA is higher than with NLMS; APA has
lower computational complexity than RLS and is easy to implement in real time. The amount of
echo cancellation with NLMS is low compared to RLS and APA, but NLMS is easy to implement in
real time with low computational complexity. A detailed comparison of the three algorithms in
the three environments is given in section 6.

Keywords: AEC, Reverberation, Adaptive algorithms, Adaptive filters

Acknowledgement
We would like to express our sincere gratitude to our thesis supervisors, Dr. Nedelko Grbic
and Mr. Magnus Berggren, for giving us the chance to do our thesis research under their
supervision, and to Dr. Sven Johansson, our examiner in the field of speech processing. We
thank them for their persistent help throughout the thesis work. Their deep knowledge of this
field helped us to learn new things and to complete the master thesis successfully. Their
continuous feedback and encouragement helped us throughout this work.
We extend our appreciation and thanks to our fellow students A.B.N Suresh kumar and
Harish Midathala for their suggestions and for the discussions on solving the different
problems encountered in this research.
We would like to thank BTH for providing a good educational environment where we
could gain knowledge and learn about new technologies that helped us to move forward with the
thesis work.
Finally, we would like to extend our immense gratitude and wholehearted thanks to our
parents for their moral and financial support throughout our education. They motivated and
helped us towards the successful completion of the thesis work. We also thank our friends for
their support and encouragement during the thesis work, and we take this opportunity to
thank all the signal processing staff at BTH.
Lastly, we thank everyone who supported and helped us in any way towards the
successful completion of the thesis work.

List of figures

Figure 1: Typical hands-free speech communication environment
Figure 2: Illustration of mobile to landline system
Figure 3: Illustration of a direct sound, an early sound, an early reverberation and late
reverberation from source to the microphone
Figure 4: Illustration of a desired source, a microphone and interfering source
Figure 5: Illustration of a direct path and a single reflection from the desired source to the
microphone
Figure 6: First reflection path of an image source
Figure 7: Reverberated environment with reflected source images
Figure 8: Illustration of a direct sound (red color) and a reverberated sound (blue color) in a
closed room environment
Figure 9: System identification model
Figure 10: Noise cancellation model
Figure 11: Predicting future values of a periodic signal
Figure 12: Interference cancellation model
Figure 13: A Baseband Communication System
Figure 14: Adaptive equalizer
Figure 15: Hands-free communication system with echo paths in a conference room
Figure 16: Implementation of acoustic echo cancellation using the adaptive filter
Figure 17: Implementation of echo-cancellation using adaptive algorithms
Figure 18: Room impulse response of environment1
Figure 19: Room impulse response of environment2
Figure 20: Room impulse response of environment3
Figure 21: Desired signal of APA at environment1
Figure 22: Estimation error signal ‘e’ of APA at environment1
Figure 23: ERLE of APA at environment1
Figure 24: Desired signal of APA at environment2
Figure 25: Estimation error signal ‘e’ of APA at environment2
Figure 26: ERLE of APA at environment2
Figure 27: Desired signal of APA at environment3
Figure 28: Estimation error signal ‘e’ of APA at environment3
Figure 29: ERLE of APA at environment3
Figure 30: Desired signal of NLMS at environment1
Figure 31: Estimation error signal ‘e’ of NLMS at environment1
Figure 32: ERLE of NLMS at environment1
Figure 33: Desired signal of NLMS at environment2
Figure 34: Estimation error signal ‘e’ of NLMS at environment2
Figure 35: ERLE of NLMS at environment2
Figure 36: Desired signal of NLMS at environment3
Figure 37: Estimation error signal ‘e’ of NLMS at environment3
Figure 38: ERLE of NLMS at environment3
Figure 39: Desired signal of RLS at environment1
Figure 40: Estimation error signal ‘e’ of RLS at environment1
Figure 41: ERLE of RLS at environment1
Figure 42: Desired signal of RLS at environment2
Figure 43: Estimation error signal ‘e’ of RLS at environment2
Figure 44: ERLE of RLS at environment2
Figure 45: Desired signal of RLS at environment3
Figure 46: Estimation error signal ‘e’ of RLS at environment3
Figure 47: ERLE of RLS at environment3
Figure 48: ERLE comparison of NLMS, APA and RLS at environment 1 in graph
Figure 49: ERLE comparison of NLMS, APA and RLS at environment 1 in chart
Figure 50: ERLE comparison of NLMS, APA and RLS at environment 2 in graph
Figure 51: ERLE comparison of NLMS, APA and RLS at environment 2 in chart
Figure 52: ERLE comparison of NLMS, APA and RLS at environment 3 in graph
Figure 53: ERLE comparison of NLMS, APA and RLS at environment 3 in chart

List of tables

Table 1: The details of clean speech signal used for evaluation
Table 2: ERLE comparison of NLMS, APA and RLS values at environment 1
Table 3: ERLE comparison of NLMS, APA and RLS values at environment 2
Table 4: ERLE comparison of NLMS, APA and RLS values at environment 3

List of abbreviations
NLMS Normalized Least Mean Square
ASR Automatic Speech Recognition
SNR Signal-to-Noise Ratio
LMS Least Mean Square
RLS Recursive Least Squares
APA Affine Projection Algorithm
FIR Finite Impulse Response
IIR Infinite Impulse Response
FD Fractional Delay
RIR Room Impulse Response
ISM Image Source Model
ERLE Echo Return Loss Enhancement
RTF Room Transfer Function
GSC Generalized Side-lobe Canceller
LCMV Linearly Constrained Minimum Variance
SD Speech Distortion
AEC Acoustic Echo Cancellation

Contents

Abstract
Acknowledgement
List of figures
List of tables
List of abbreviations
1 Introduction
1.1 Hands-free speech enhancement
1.1.1 Applications
1.2 Hands-free speech communication problem
1.2.1 Background noise
1.2.2 Reverberation
1.2.3 Acoustic coupling
1.3 Fractional delay
2 Room reverberation
2.1 Introduction
2.2 Room image model
3 Adaptive filtering
3.1 Introduction
3.2 Adaptive filtering
3.3 Applications of adaptive filters
4 Acoustic echo cancellation
4.1 Introduction
4.2 Adaptive filter algorithm for echo cancellation
4.3 NLMS algorithm
4.4 RLS algorithm
4.5 APA algorithm
4.6 Echo return loss enhancement
5 Evaluation setup
5.1 Introduction
5.2 Evaluation setup for echo cancellation with adaptive algorithm
6 Results
6.1 Simulation results for echo cancellation using APA algorithm
6.1.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]
6.1.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]
6.1.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]
6.2 Echo cancellation using the NLMS algorithm
6.2.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]
6.2.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]
6.2.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]
6.3 Echo cancellation using the RLS algorithm
6.3.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]
6.3.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]
6.3.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]
6.4 Comparing ERLE of APA, NLMS and RLS in three environments
6.4.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]
6.4.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]
6.4.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]
7 Conclusion and future work
7.1 Summary
7.2 Conclusion
7.3 Future work
8 Bibliography

1. Introduction
Hands-free communication is an area that has undergone tremendous advancement
in the recent past. It covers mobile telephony, hearing aids, automatic information
systems (i.e. voice-controlled systems), video conferencing systems and many multimedia
applications. More and more people use personal communication devices, personal
computers and wireless mobile telephones, which in turn are transforming into advanced
personal communication systems. The advancements in interpersonal communication systems
are driven by a continuous effort to improve and extend the interaction between
individuals in a way that provides not only user safety and quality but also user
friendliness. The combination of telephone technologies and computers is making way for
convenient hands-free communication.
Advances in wireless communication technology have made voice connectivity easy to
use in cellular communication and personal computer devices, enabling natural
communication in different environments such as cars, restaurants and offices. In
automobile applications, hand-controlled functions are operated by voice; the signal
degradations in this field are the same as those in distant-talker speech recognition
applications. Audio conferencing plays a key role in communication for both small and
large firms, as it is cost effective and comfortable for users. Today, the demand for
voice-controlled systems is high, as hand-controlled functions are being replaced with
voice controls that are efficient and robust. Speech processing techniques are also
important for preventing hearing damage in high-noise environments and for improving
speech intelligibility in noise for hearing-impaired listeners.
Hands-free speech acquisition plays a vital role in all the above-mentioned applications.
In automated speech system design the microphone is placed far away from the user (the
speech transmitter and receiver are installed at a certain distance from each other), due to
which problems such as poor sound quality and acoustic echo arise at the far-end side. The
poor sound quality arises because the microphone, placed near the loudspeaker, suffers from
unwanted disturbances: environmental noise, interfering sounds and reverberation of the
loudspeaker signal corrupt the actual speech signal. In full-duplex hands-free
communication, the acoustic echo generated at the microphone on the near-end side disturbs
the talker at the far-end side, who hears his or her own voice with a delay of 100-200 ms.
This reduces the intelligibility of the received speech in noisy conditions and also degrades
the performance of speech recognition systems.
The degradation of the received speech signals makes conversation between the users
difficult. To improve the quality of hands-free mobile telephones, the major tasks to be
considered are background noise suppression, interference reduction and acoustic echo
cancellation. To improve the speech quality and reduce unwanted disturbances, several
speech enhancement methods have been implemented for robust speech communication
systems. Microphone arrays are a widely used technology for speech enhancement in
communication systems where speech quality and speech intelligibility are degraded by a
noisy environment and room reverberation.
The perception of a speech signal is measured in terms of quality and intelligibility.
“Quality” is a subjective measure that reflects the individual preferences of listeners [1].
“Intelligibility” is an objective measure that predicts the percentage of words that
listeners can correctly identify [1]. Speech enhancement is required when the transmitted or
received speech signal is degraded. The purpose of speech enhancement is to improve noisy
speech signals.
The received speech signals in automated speech systems are mainly corrupted by
background noise. In general, the background noise can be non-stationary, and the
signal-to-noise ratio (SNR) decreases as the noise level increases. For several decades,
speech enhancement methods for acoustically disturbed signals have been researched widely,
and digital hearing aids have contributed significantly to research in hands-free
communication systems.
Acoustic echo cancellation plays a key role in acoustically coupled environments.
Acoustic echo is a major cause of degraded speech intelligibility in speech communication
systems such as hearing aids and telecommunication systems. In this thesis, the adaptive
algorithms APA, NLMS and RLS are used to cancel the acoustic echo.

1.1 Hands-free speech enhancement
Speech enhancement is necessary in hands-free communication devices such as
cellular phones, teleconferencing systems and automatic information systems. For example,
speech signals produced in a room generate reverberation, which is noticeable when a
hands-free single-channel telephone system is used and binaural listening is not possible [2].
Enhancement of normal speech is also required for hearing-impaired listeners, to fit their
individual hearing capabilities.
Speech enhancement in hands-free mobile communication is possible by spectral
subtraction [2], by temporal filtering such as Wiener filtering and noise cancellation, or by
multi-microphone methods using different array techniques [2]. Array techniques are used
to handle room reverberation. Hands-free speech communication is generally characterized
by reduced speech naturalness and intelligibility, resulting from corruption of the
speech sound field during capture by the microphones as well as speech distortion
generated by data transmission and reproduction [3].
Hands-free speech enhancement is defined as the ability to improve the
discrimination between speech and background noise, reverberation and other types of
interference impinging on the microphones [3]. In hands-free communication systems, the
perceptual aspects quality and intelligibility are both important for speech enhancement.
Quality and intelligibility are largely uncorrelated, so improving one does not guarantee
an improvement in the other. Intelligibility can be improved by emphasizing the
high-frequency content of the noisy speech signal, but this typically comes at the expense
of quality. In other words, in a noisy speech signal a gain in intelligibility is often
traded against quality. The human hearing system is capable of discriminating speech in
noisy, reverberant environments.

1.1.1 Applications
Based on frequency selectivity, focused hearing and spatial sound localization, many
speech enhancement systems try to analyze signals in a way that mimics the human
hearing mechanism. There are numerous applications of hands-free speech enhancement. A
few important applications are explained briefly below.

a) Hearing aids
Hearing aids are concerned with remedying hearing problems caused by unwanted
disturbances. Nearly 25 percent of the present human population suffers from hearing
impairment caused by damage to the hair cells of the inner ear through exposure to loud
noise. Exposure to loud noise occurs mainly in industrial environments, near cooling
systems, automobiles and engines, and when listening to loud music through headsets.
Exposing the human hearing system to such environments may lead to temporary or permanent
hearing loss. A hearing aid amplifies the received signal. If the signal contains noise, the
noise is amplified along with the speech, and hearing-impaired people have difficulty
distinguishing speech from noise. The main problem for hearing aids is acoustic echo,
caused by the small distance between the microphone and the loudspeaker. To overcome these
problems, microphone arrays for speech enhancement and acoustic echo cancellation are used.
In this thesis, hearing aids are considered as one application, with the aim of making
the hearing-impaired person more comfortable when listening to the received speech signal
by reducing the noise and echo caused by various environments. During communication,
the speech signal is reverberated in the room by reflections from the walls. As a result, the
speech signal that reaches the far-end user is corrupted by reverberation and ambient noise.
b) Voice control and speech recognition systems
Advances in electrical technology have created a huge demand for consumer
products, telephones and personal devices, and these are rapidly being adapted to allow
voice control. To provide convenience and ease of use, a large number of systems are
controlled by voice; a few applications are lights and heating systems, powering,
opening windows and curtains, and adjusting home entertainment systems [3].
The main aim of voice control and speech recognition systems is to replace
hand-controlled functions with voice controls, improving efficiency and optimizing
speech-automated methods. Speech enhancement for ASR (Automatic Speech Recognition)
counteracts the degradation of speech quality caused by ambient noise and room
reverberation. ASR is based on statistical pattern recognition and benefits from a higher
quality of the received speech signal. The degradation of the signal is determined by the
degree of similarity between the clean speech used by the recognizer and the noisy speech
signal. Microphone array techniques can be used to improve the SNR of the received noisy
speech signal and to increase the speech intelligibility.
c) Audio-conferencing
The spread of broadband internet connections gave rise to advancements in
telecommunication and video communication systems for personal computers based on
internet protocols. Advances in wireless communication technology have been developed to
increase speech intelligibility in desktop and mobile environments. Wireless communication
is frequently used in airports, offices, companies and restaurants. In these environments,
the ambient noise is composed of human babble noise, fan noise and noise from moving
objects such as chairs and colliding items [3]. Normally a microphone is placed at the top of
the monitor, close to the speaker’s eye level. The speaker and the microphone unit are
placed at an operating distance of 45-60 cm. Spectral subtraction algorithms and
beamforming are used as solutions for this kind of system.
Audio conferencing plays an important role in many large and small companies for
meetings and online study courses, as it is cost effective and saves time compared to
travel. Nowadays, conducting teleconferences with sophisticated and reliable technology
has become essential for many firms and individuals. Conference rooms are characterized
by ambient noise, since all conference participants are gathered around the speech
acquisition system. As the speaker and microphone are placed at varying distances, room
reverberation occurs in conference rooms. The distance between the user and the microphone
is large compared with other applications. This problem is best addressed with microphone
arrays and echo cancellation, which can detect the speech and reduce the echo. In video
technology, there are systems that steer and aim the camera at the speaker [3].

1.2 Hands-free speech communication problem
Different hands-free communication applications and their surrounding environments
were described above. The major problems encountered in each application are background
noise, room reverberation and acoustic echo. A typical hands-free communication environment
is shown in Figure 1.

Figure 1: Typical hands-free speech communication environment
1.2.1 Background noise
Noise is present in any type of environment. Background noise is mostly due to
automobile traffic, engines, fans, background sound in public places, vibration from
heavy industry, and aircraft. In hands-free speech communication, background noise
degrades the performance of speech recognition systems, is a severe problem for hearing aid
users, and reduces the intelligibility of speech. Acoustic disturbances arrive from
different directions, and background noise typically contains higher levels of
low-frequency components than the speech signal; spectral methods are therefore used to
extract the speech signal. In general, speech is characterized by a Laplacian distribution
whereas background noise is characterized by a Gaussian distribution; by assuming a
certain class of distribution, techniques can be developed for extracting speech or
background noise.
1.2.2 Reverberation
A speech signal in a closed environment is reflected by the walls, ceiling and objects
in the room, as illustrated in Figure 1. These reflections disturb the speech travelling
from the loudspeaker to the microphone. The reverberation time is the time it takes for the
room impulse response to decay by 60 dB from its largest peak. The energy of the
reverberation depends on the locations of the acoustic sensors and the source in the room
and on the distances between them.
The reverberation effect can be reduced by keeping the microphone close to the
source of interest. Reflections affect the direct speech of the user on its way to the
receiver and blur its temporal and spatial characteristics. This is not acceptable in
hands-free communication such as telephone systems, where it adds unwanted disturbance and
reverberation for the listener in real time and reduces the quality of the speech signal.
In speech recognition and verification applications, performance is reduced in highly
reverberant environments. De-reverberation also benefits hearing-impaired listeners, as it
increases speech intelligibility [4].
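As an illustration of the reverberation time definition above, the following is a minimal Python sketch that reads the definition literally: it locates the largest peak of an impulse response and reports how long the magnitude stays within 60 dB of that peak. The function name, the parameters and the synthetic exponential response are assumptions made for this example; they are not taken from the thesis, and practical measurements usually work on a smoothed energy decay curve rather than on raw samples.

```python
def reverberation_time(h, fs, decay_db=60.0):
    """Seconds from the largest peak of h until |h| last reaches peak minus decay_db.

    h is a sampled room impulse response and fs the sampling rate in Hz.
    Hypothetical helper: a literal reading of the 60 dB decay definition.
    """
    magnitudes = [abs(v) for v in h]
    peak = max(magnitudes)
    if peak == 0.0:
        raise ValueError("impulse response is all zeros")
    peak_index = magnitudes.index(peak)
    # A drop of decay_db in level is a factor 10^(-decay_db/20) in amplitude.
    threshold = peak * 10.0 ** (-decay_db / 20.0)
    last_above = peak_index
    for n in range(peak_index, len(magnitudes)):
        if magnitudes[n] >= threshold:
            last_above = n
    return (last_above - peak_index) / fs


# Synthetic check (assumed data): an envelope designed to drop 60 dB in 0.25 s.
fs = 8000
h = [10.0 ** (-3.0 * n / (0.25 * fs)) for n in range(fs // 2)]
print(round(reverberation_time(h, fs), 3))
```

If the recorded response is too short to fall 60 dB below its peak, this sketch simply returns the time to the end of the record, so the result should then be read as a lower bound.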
1.2.3 Acoustic coupling
In hands-free duplex communication, the reflected transmission path between the
loudspeaker and the microphone is the echo path. In full-duplex communication, the far-end
signal emitted by the loudspeaker propagates in the environment and is picked up by the
microphones in the same way as other interfering signals [3]. The acoustic echo that occurs
during full-duplex hands-free communication degrades speech intelligibility and disturbs
the user, who hears his or her own speech after some delay. In hands-free communication
systems the SNR is also reduced by the large distance between the microphone and the
talker, as the signal is disturbed by ambient noise.

Figure 2: Illustration of mobile to landline system


Echoes can severely affect the quality and intelligibility of speech, disturbing the
users of a telephone system. An echo is characterized by its delay and amplitude. In
hands-free and teleconference systems, acoustic echo arises when there is acoustic
coupling between the loudspeaker and the microphone, as shown in Figure 2. The acoustic
echo can be cancelled using adaptive algorithms such as NLMS, RLS and APA.
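To make the cancellation idea concrete, the following is a minimal Python sketch of an NLMS echo canceller together with an ERLE measurement. It is an illustration under stated assumptions, not the MATLAB implementation used in this thesis: the function names, the step size mu, the regularization eps, the filter length and the synthetic four-tap echo path are hypothetical choices, the microphone signal contains echo only (no near-end speech or noise), and the update and ERLE expressions are the standard textbook forms.

```python
import math
import random


def nlms_echo_canceller(far_end, mic, num_taps, mu=0.5, eps=1e-6):
    """Adapt an FIR echo-path estimate and return (error signal, final weights)."""
    w = [0.0] * num_taps          # current estimate of the echo path
    x_buf = [0.0] * num_taps      # most recent far-end samples, newest first
    errors = []
    for x_n, d_n in zip(far_end, mic):
        x_buf = [x_n] + x_buf[:-1]
        y_n = sum(w_k * x_k for w_k, x_k in zip(w, x_buf))   # estimated echo
        e_n = d_n - y_n                                      # echo-cancelled output
        # Normalized step: scale the update by the far-end energy in the buffer.
        step = mu * e_n / (eps + sum(x_k * x_k for x_k in x_buf))
        w = [w_k + step * x_k for w_k, x_k in zip(w, x_buf)]
        errors.append(e_n)
    return errors, w


def erle_db(mic, errors):
    """ERLE in dB: microphone (echo) power over residual error power."""
    p_d = sum(v * v for v in mic)
    p_e = sum(v * v for v in errors)
    if p_d == 0.0:
        raise ValueError("microphone segment has zero power")
    if p_e == 0.0:
        return math.inf
    return 10.0 * math.log10(p_d / p_e)


# Synthetic demonstration (assumed data): white far-end signal, short echo path.
random.seed(1)
far_end = [random.gauss(0.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.0, -0.3, 0.1]
mic = [sum(echo_path[k] * far_end[n - k] for k in range(len(echo_path)) if n - k >= 0)
       for n in range(len(far_end))]
errors, weights = nlms_echo_canceller(far_end, mic, num_taps=8)
print("ERLE, first 200 samples (dB):", erle_db(mic[:200], errors[:200]))
print("ERLE, last 200 samples (dB):", erle_db(mic[-200:], errors[-200:]))
```

Comparing the ERLE over an early and a late segment gives a rough picture of convergence; with real speech, background noise and a long room impulse response, the achievable ERLE would be far lower than in this noise-free toy case.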
1.3 Fractional delay
In digital signal processing, fractional delay filters are used for band-limited
interpolation. Band-limited interpolation is a technique for evaluating a sampled signal at
an arbitrary point in time, even when that point lies between two sample instants. For the
interpolated value to be exact, the signal must be band limited to half the sampling rate
(Fs/2), which implies that the continuous-time signal can be exactly reconstructed from the
sampled data. The signal value can then be evaluated at any arbitrary time, even if the
signal is delayed by a fraction of the sampling interval. The fractional delay is measured
from the last integer multiple of the sampling interval. Fractional delay filters can be
realized as FIR or IIR filters.
Fractional delay filters are used in various applications, such as speech coding and
synthesis, sample rate conversion, beam steering, and the design of digital
differentiators and integrators. In these fields, the fixed sampling period is a
limitation. Fractional delay filters are filters with a flat phase delay over a wide
frequency band, where the phase delay approximates the desired fractional delay; they are
normally used to model non-integer delays. Therefore, these filters are used in many
real-time applications where the exact sampling instants matter. A fractional delay is a
non-integer multiple of the sampling interval, where uniform sampling is assumed. These
filters make it possible to observe signal values at an arbitrary location within the
sampling interval [5].
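The band-limited interpolation idea can be sketched with a windowed-sinc FIR filter, which is one common way to approximate a fractional delay; it is shown here as an assumption for illustration and is not necessarily the design used in this thesis. The function names, the Hamming window, the 21-tap length and the test sinusoid are all hypothetical choices.

```python
import math


def fractional_delay_fir(frac, num_taps=21):
    """Windowed-sinc FIR whose total delay is (num_taps - 1) / 2 + frac samples."""
    total_delay = (num_taps - 1) / 2.0 + frac
    taps = []
    for n in range(num_taps):
        t = n - total_delay
        # Ideal band-limited interpolator: a sinc centered on the desired delay.
        ideal = 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)
        # Hamming window shifted by frac so that it stays centered on the delay.
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * (n - frac) / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)  # normalize to unit gain at zero frequency
    return [tap / gain for tap in taps], total_delay


def fir_filter(taps, x):
    """Direct-form convolution, y[n] = sum_k taps[k] * x[n - k]."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]


# Delay a slow sinusoid by 10.5 samples and compare with the analytic result.
freq = 0.05  # cycles per sample, well below the Nyquist limit of 0.5
x = [math.sin(2.0 * math.pi * freq * n) for n in range(200)]
taps, delay = fractional_delay_fir(0.5)
y = fir_filter(taps, x)
worst = max(abs(y[n] - math.sin(2.0 * math.pi * freq * (n - delay)))
            for n in range(len(taps), len(x)))
print("total delay:", delay, "max error after start-up:", worst)
```

The filter necessarily adds an integer bulk delay of (num_taps - 1) / 2 samples on top of the requested fraction, and its accuracy degrades for signal components close to Fs/2, where a finite-length approximation of the ideal interpolator is weakest.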

2. Room reverberation
2.1 Introduction
In speech communication systems such as hands-free mobile telephones, hearing aids,
teleconference systems and voice-controlled systems, the received microphone signals are
degraded by background noise, reverberation and other interference. The performance of
automatic speech recognition systems decreases due to this degradation.
In this study of reverberation, the multi-path propagation of an acoustic sound from its
source to the microphone is analyzed. The reverberant signal can be described as an audio
signal with coloration and noticeable echo. The received microphone signal is
characterized by three components, as shown in Figure 3:
1. Direct sound
2. Early reverberation
3. Late reverberation

Figure 3: Illustration of the direct sound, early reverberation and late reverberation from the source to the microphone.
The direct sound is the first signal received by the microphone, the early reverberation is the signal that arrives after the direct sound, and the late reverberation is the signal that arrives after the early reverberation. The detrimental perceptual effects of reverberation are primarily caused by late reverberation and generally increase with increasing distance between the source and the microphone. Conversely, early reverberation tends to improve the intelligibility of speech. In combination with the direct sound it is sometimes referred to as the early speech component [6].
An acoustic echo canceller is used to eliminate the far-end echo signal. A post-processor is then applied to reduce the background noise and the residual echo that the echo canceller does not eliminate. Hands-free systems are often used in noisy and reverberant environments, so the received microphone signal contains not only the desired signal but also interference, such as room reverberation caused by the desired source and a far-end echo signal that results from the sound produced by the loudspeaker [6].

Figure 4: Illustration of a desired source, a microphone and an interfering source.

Figure 5: Illustration of a direct path and a single reflection of the desired source to the microphone.

The signal received at the microphone is degraded by the reverberation introduced by the multi-path propagation of the desired speech signal to the microphone, as shown in Figure 5.

2.2 Room image model


In this study the room impulse response is computed using the image source model (ISM). The room image model is specific to a given room and depends on the position of the microphone in that room. Allen and Berkley, who were the prominent researchers to design and implement the ISM, describe the method in [7]. By using fractional delay filters, each image source is represented with its exact non-integer time delay; obtaining the room transfer function in the frequency domain and applying the inverse Fourier transform to return to the time domain gives the same result [8]. Figure 6 shows the path involving the first reflection. The source 'S' is located near a wall, and the destination 'D' receives two signals: one along the direct path (SD) and one along the reflected path (SRD).

Figure 6: First reflection path of an image source.


The direct path length is calculated directly. For the reflected path, a virtual image of the source (S') is generated on the other side of the wall. From the triangle geometry, SR = S'R and therefore SRD = S'D [20].
Figure 7 shows a sound source (green circle) located at a 3-D position in a room. The red plus (+) symbol is the reference point of the room, with coordinates (0,0,0), and every position is measured relative to it. Xm is the distance between the microphone and the reference point, Xs is the distance between the source and the reference point, and Xr is the distance of the reflecting wall from the origin. Source image 1 and source image 2 are the first reflected image sources generated by the image model [20].

Figure 7: Reverberated environment with reflected source images
The red plus symbol marks the origin. The x-coordinate of the virtual sources can be expressed using the sequence below.

( 2.1 )

Here xs is the x-coordinate of the sound source and xr is the length of the room in the x-dimension. The expression gives the location of the ith virtual source for any integer i. If i is negative, the virtual source is located on the negative x-axis. If i = 0, the virtual source is the real source. The distance along the x-axis between the ith virtual source and the microphone is found by subtracting the microphone's x-coordinate, xm, from xi. This is shown below.

( 2.2 )

The relative positions of the virtual sources along the y and z axes can be found in a similar fashion using equations 2.3 and 2.4.

( 2.3 )

( 2.4 )

( 2.5 )

( 2.6 )

where 'c' is the velocity of sound in meters per second. The delay ts is estimated for each of the multiple reflections that make up the reverberation. Every reflection loses some energy, which is modeled by the reflection coefficient α. The calculation of the reflection coefficient and its effect are explained in [9].
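The geometry described above can be sketched in a few lines of Python. The closed-form image coordinate used here is the usual image-source expression that satisfies the properties stated in the text (i = 0 gives the real source, negative i lies on the negative axis); it is an assumption about the form of the missing equations, not a transcription of them. The speed of sound, the gain model `alpha**reflections / distance` and all function names are also illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def image_coordinate(i, source, room_length):
    """Coordinate of the i-th virtual source along one axis.

    i = 0 gives the real source; negative i gives images on the negative axis.
    """
    sign = -1 if i % 2 else 1
    return sign * source + (i + (1 - sign) / 2) * room_length


def image_paths(source, mic, room, max_index=1, alpha=0.8):
    """Return sorted (delay in seconds, gain) pairs for all images with
    |i|, |j|, |k| <= max_index. Image index i is assumed to correspond to
    |i| wall reflections along that axis."""
    paths = []
    indices = range(-max_index, max_index + 1)
    for i in indices:
        for j in indices:
            for k in indices:
                offsets = [image_coordinate(n, s, length) - m
                           for n, s, m, length
                           in zip((i, j, k), source, mic, room)]
                distance = math.sqrt(sum(o * o for o in offsets))
                reflections = abs(i) + abs(j) + abs(k)
                gain = alpha ** reflections / max(distance, 1e-9)
                paths.append((distance / SPEED_OF_SOUND, gain))
    return sorted(paths)


# Geometry of environment 1 from the evaluation chapter.
paths = image_paths(source=(1, 1, 1), mic=(1, 2, 1), room=(3, 4, 2.5))
print(f"direct path: {paths[0][0] * 1000:.2f} ms, {len(paths)} paths in total")
```

Each (delay, gain) pair corresponds to one impulse in the room impulse response; because the delays are generally non-integer multiples of the sampling interval, a fractional delay filter is needed to place them exactly.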

Figure 8: Illustration of a direct sound (red) and a reverberated sound (blue) in a closed room environment.
The effect of reverberation on a signal is shown in Figure 8. The red signal in the figure is the original speech signal and the blue signal is the reverberant signal, which is amplified by the reflected energy added at each sample.

3. Adaptive filtering
3.1 Introduction
Signal processing is used in electrical engineering, systems engineering and applied mathematics. It is a tool for the representation, manipulation and transformation of signals and the information they contain. In the past, the dominant technology for signal processing was analog signal processing, which involved both linear and nonlinear circuits. Rapid advances in digital computer technology and integrated circuit fabrication gave rise to the area of science and engineering called digital signal processing (DSP). DSP techniques are widely applied because of their programmability, low cost, small size and low power consumption [10]. One widely used specialized branch of digital signal processing is adaptive signal processing, which is mainly concerned with adaptive filters and their applications.

3.2 Adaptive filters


Adaptive filtering is one of the main technologies in digital signal processing and is used in many application areas in industry as well as in science. Applications of adaptive filtering include adaptive noise cancellation, adaptive equalization, echo cancellation and adaptive beamforming. In all of these applications the characteristics of the signals involved are unknown. When the signal characteristics are unknown, an adaptive filter is more effective than a fixed filter. An adaptive filter is a self-adjusting filter whose coefficients are set by a recursive algorithm, called the adaptation algorithm. The algorithm starts from an initial guess, chosen based on the a priori knowledge available to the system, then refines the guess in successive iterations, and eventually converges to the optimal Wiener solution in some statistical sense [11]. In many practical applications adaptive filters are used to estimate an unknown system response as accurately and quickly as possible.

The one basic feature common to all adaptive filters is the following: an input vector and a desired response are used to compute an estimation error, which in turn is used to control the values of a set of adjustable filter coefficients through a feedback loop and an algorithm [11].

3.3 Applications of adaptive filters


a) System identification
System identification deals with the capability of an adaptive system to find the FIR filter that best reproduces the response of another system, whose frequency response is unknown. The setup is shown in Figure 9.

Figure 9: System identification model


When the adaptive system reaches its optimum and the error output is close to zero, the adaptation process has produced an FIR filter whose weights give the same output as the 'unknown system' for the same input. In other words, the FIR filter reproduces the behavior of the 'unknown system' [18]. This design works well when the frequency response of the system to be identified matches that of some FIR filter. If the unknown system is an all-pole filter, the FIR filter can only approximate it. The error output will never be zero, but it is reduced by converging to an optimum weight vector. The frequency response of the FIR filter is then the best available approximation to, but not exactly equal to, that of the 'unknown system'.
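The system identification setup can be illustrated with a short Python sketch. It assumes the basic LMS update (new weights = old weights + step size × error × input vector) and a hypothetical three-tap 'unknown system'; the function name, the step size and the test signal are illustrative choices only.

```python
import random


def lms_identify(x, d, num_taps, mu):
    """Adapt an FIR filter so that its output tracks d(n) for the input x(n)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps            # buf[k] holds x(n - k)
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wk * xk for wk, xk in zip(w, buf))
        e = dn - y                    # error between unknown system and model
        w = [wk + mu * e * xk for wk, xk in zip(w, buf)]
    return w


# Hypothetical 'unknown system': a short FIR filter chosen for illustration.
unknown = [0.5, -0.3, 0.2]
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
d = [sum(h * x[n - k] for k, h in enumerate(unknown) if n - k >= 0)
     for n in range(len(x))]

w = lms_identify(x, d, num_taps=3, mu=0.05)
print([round(wk, 4) for wk in w])
```

Because the unknown system here is itself an FIR filter of the same length, the adapted weights can match it closely; with an all-pole unknown system the same loop would only reach an approximation.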
b) Noise cancellation in speech signals.
Adaptive filtering can be extremely useful when a speech signal is submerged in a very noisy environment with many periodic components lying in the same bandwidth as the speech [18]. An adaptive noise canceller for speech signals has two inputs. The desired input is the voice corrupted by noise, and the reference input contains noise that is related in some way to the noise in the desired input. The system filter shapes the reference noise so that it resembles the noise in the desired input, and the filtered version is subtracted from the desired input. Removing the noise from the desired input in this way yields the noise-free signal. The setup is shown in Figure 10. In a practical system the noise is not completely removed, but its level is reduced considerably.

Figure 10: Noise Cancellation Model


c) Signal prediction

Predicting signals may seem to be an impossible task without some limiting assumptions. Assume that the signal is either steady or slowly varying over time, and periodic over time as well. Under these assumptions, the function of the adaptive filter is to provide the best prediction (in some sense) of the present value of a random signal based on its past values. When s(k) is a periodic signal and the filter is long enough to remember previous values, this structure, with the delay in the input signal, can perform the prediction. The structure can also be used to remove a periodic signal from stochastic noise signals. The present value of the signal serves as the desired response for the adaptive filter, and past values of the signal supply the input applied to the adaptive filter. Depending on the application of interest, either the adaptive filter output or the estimation (prediction) error may serve as the system output. In the first case the system operates as a predictor; in the latter case it operates as a prediction error filter. The setup is shown in Figure 11.
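A minimal sketch of the prediction structure, assuming an LMS update and a sinusoid with a period of eight samples as the periodic signal s(k). The function name, the two-tap filter length and the step size are hypothetical choices made for illustration.

```python
import math


def lms_predict(s, num_taps=2, delay=1, mu=0.2):
    """Predict s(k) from s(k - delay), ..., s(k - delay - num_taps + 1)."""
    w = [0.0] * num_taps
    errors = []
    for k in range(delay + num_taps - 1, len(s)):
        past = [s[k - delay - m] for m in range(num_taps)]     # filter input
        prediction = sum(wm * pm for wm, pm in zip(w, past))   # filter output
        e = s[k] - prediction                                  # prediction error
        w = [wm + mu * e * pm for wm, pm in zip(w, past)]
        errors.append(e)
    return errors, w


# Periodic test signal: a sinusoid with a period of 8 samples.
s = [math.sin(2.0 * math.pi * k / 8.0) for k in range(2000)]
errors, w = lms_predict(s)
print(f"final prediction error: {abs(errors[-1]):.2e}")
```

Taking `prediction` as the output gives the predictor; taking `e` as the output gives the prediction error filter described above.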

Figure 11: Predicting future values of a periodic signal

d) Interference cancellation
In this application, the adaptive filter is used to cancel unknown interference contained alongside an information signal component in a primary signal, with the cancellation being optimized in some sense (Figure 12). The primary signal serves as the desired response for the adaptive filter. A reference (auxiliary) signal is employed as the input to the adaptive filter. The reference signal is derived from a sensor or set of sensors located in relation to the sensors supplying the primary signal in such a way that the information signal component is weak or essentially undetectable [18].

Figure 12: Interference cancellation model


e) Channel equalization

Communication channels such as wireless, telephone and optical channels are affected by inter-symbol interference (ISI). Without channel equalization the channel bandwidth is used inefficiently. Channel equalization is the process of compensating for the effects caused by a band-limited channel, hence enabling higher data rates [12]. These effects are due to the limitations of the transmission medium and the multipath effects in the radio channel. A typical communication system is depicted in Figure 13.

Figure 13: A baseband communication system (blocks: transmitter filter, channel medium, additive noise, receiver filter, equalizer)


An equalizer is incorporated in the receiver to counteract the inter-symbol interference introduced by the channel. The equalizer transfer function is the inverse of the estimated channel transfer function.
Figure 14: Adaptive equalizer (blocks: channel with input x(n) and output v(n), equalizer with adaptive weights driven by the error e(n), supervised and unsupervised training)
The equalizer is designed to adapt to channel variations during the transmission of high-speed data over a band-limited channel. It is recursively updated by an adaptive algorithm, based on the observed channel output, in order to reconstruct the transmitted signal. The configuration of an adaptive equalizer is depicted in Figure 14.

f) Acoustic echo cancellation
An acoustic echo canceller overcomes the acoustic echo that interferes with teleconferencing and hands-free telecommunication. It adaptively identifies the transfer function between a loudspeaker and a microphone, and then produces an echo replica that is subtracted from the real echo [13]. Echo occurs when an audio source and sink operate in full duplex mode. In this situation the received signal is output through the telephone loudspeaker (audio source); this audio signal is then reverberated through the physical environment and picked up by the system's microphone (audio sink). As a result, time-delayed and attenuated images of the original speech are returned to the distant user [18].
The present study deals with cancelling these echo signals to improve communication quality, using various adaptive filtering algorithms and comparing their performance when applied to echo cancellation. Echo cancellation is critical to achieving high-quality voice transmission over packet networks, which typically face transmission delays above 30 to 40 ms. These long delays make echo readily apparent to listeners, and the echo must be eliminated in order to provide viable telephony service [14].

4. Acoustic echo cancellation
4.1 Introduction
In hands-free speech communication the main aim of the system is to provide good voice quality and good speech intelligibility when two or more people communicate with each other from different locations. During such communication, acoustic echo between talkers and listeners degrades the voice quality and can reduce the intelligibility of the signal.

Figure 15: Hands-free communication system with echo paths in a conference room
Echo is the phenomenon in which a delayed and distorted version of the original speech signal or electrical signal is reflected back to the speech source. Acoustic echo is a type of noise that occurs due to reflections of the speech signal from the walls, ceiling or objects in a room; it is also defined as the acoustic coupling between the loudspeaker and the microphone. The main aim in hands-free communication is to cancel the acoustic echo in order to provide an echo-free environment for the talkers during the communication. In this thesis the main focus is on simulating acoustic echo cancellation using APA.
Figure 15 shows a hands-free communication system with echo paths in a conference room. The far-end speech played by the loudspeaker reaches the near-end microphone along various paths, i.e. the direct path and the paths reflected from the walls, ceiling and objects in the room, forming an echo that is sent back to the far end. This disturbs the speech quality and is a major problem in communication systems.
To overcome the acoustic echo problem in hands-free communication systems such as hearing aids and teleconferencing, several methods using directional microphones have been designed. To reduce echo in hands-free communication, acoustic echo cancellation (AEC) has been implemented. AEC eliminates echo and enhances the quality of speech in communication systems. It provides a clear, smooth and comfortable way of communicating for the participants in the conference room. Echo cancellation is achieved using adaptive algorithms such as LMS, NLMS, RLS and APA, all of which follow the same procedure to cancel echo in any communication application. In this thesis the main focus is on the APA, NLMS and RLS adaptive filter algorithms. Figure 16 shows how AEC is implemented using adaptive filters in three basic steps.


Figure 16: Implementation of acoustic echo cancellation using the adaptive Filter
The three basic steps of AEC using adaptive algorithms are [16]:
1. Estimate the characteristics of the echo path of the room.
2. Create a replica of the echo signal.
3. Subtract the echo replica from the microphone signal in order to obtain a clean speech signal.
AEC therefore plays a major role in communication systems by counteracting the acoustic coupling between the microphone and the loudspeaker. Without it, this coupling produces acoustic echo that degrades the quality of the sound and the intelligibility of the speech.

4.2. Adaptive filter algorithm for echo cancellation


The repetition of a sound by reflection of sound waves from a surface is known as echo. There are many ways of approaching the acoustic echo cancellation problem.


Figure.17: Implementation of echo-cancellation using adaptive algorithms.


Adaptive filters are dynamic filters which iteratively alter their characteristics in order to achieve an optimal desired output. An adaptive filter algorithmically alters its parameters in order to minimize a function of the difference between the desired output d(n) and its actual output ŷ(n). This function is known as the cost function of the adaptive algorithm. Figure 17 shows a block diagram of the adaptive echo cancellation system. Here the filter h(n) represents the impulse response of the acoustic environment, and w(n) represents the adaptive filter used to cancel the echo signal. The adaptive filter aims to make its output ŷ(n) equal to the desired output d(n) (the signal reverberated within the acoustic environment). At each iteration the error signal e(n) = d(n) – ŷ(n), the difference between the desired signal and the adaptive filter output, is fed back into the adaptive filter, and the filter coefficients are changed algorithmically in order to minimize the cost function. In the case of acoustic echo cancellation, the optimal output of the adaptive filter is equal in value to the unwanted echoed signal. When the adaptive filter output is equal to the desired signal, the error signal goes to zero. In this situation the echoed signal is completely cancelled and the far-end user does not hear any of their original speech returned to them.

4.3 NLMS algorithm:


The LMS algorithm was first developed by Widrow and Hoff in 1959 through their studies of pattern recognition. It has since become one of the most widely used algorithms in adaptive filtering. The LMS algorithm belongs to the class of stochastic gradient-based algorithms, as it uses the gradient vector of the filter tap weights to converge on the optimal Wiener solution. Its update equation is
$w(n) = w(n-1) + \mu\, e(n)\, x(n)$ ( 4.1 )
where, in equation 4.1, e(n) is the error signal, x(n) is the input signal vector and the updated coefficient vector w(n) is calculated from the previous coefficient vector w(n-1). N is the length of the coefficient vector. The fixed step size µ determines the rate of convergence, and the gradient (-E{e(n) x(n)}) gives the direction of convergence.
One of the difficulties in the design and implementation of the LMS adaptive filter is the selection of the step size µ. If the input signal to the adaptive filter is non-stationary, the upper bound on the step size is difficult to determine. A convenient way to incorporate this bound into the LMS adaptive filter is to use a time-varying step size of the form
$\mu(n) = \dfrac{\beta}{\|x(n)\|^2}$ ( 4.2 )

where ║x(n)║² is the squared Euclidean norm of the input vector and β is the normalized step size with 0 < β < 2.
Replacing µ in the LMS weight vector update equation with µ(n) leads to the NLMS algorithm, which is given by

$w(n) = w(n-1) + \dfrac{\beta}{\|x(n)\|^2}\, e(n)\, x(n)$ ( 4.3 )

$e(n) = d(n) - w^T(n-1)\, x(n)$ ( 4.4 )
where d(n) is the desired signal.

Advantages and disadvantages:
The NLMS algorithm has a good convergence speed, which makes it useful for echo cancellation. It shows greater stability than LMS with unknown input signals, and noise amplification is smaller when the normalized step size is used. It has a low steady-state error and fast convergence. Compared with the LMS algorithm, the NLMS algorithm requires additional computations to evaluate the normalization term ║x(n)║². The NLMS algorithm requires 3N+1 multiplications, which is N more than the LMS algorithm.
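The NLMS recursion can be sketched in Python as follows. This is an illustrative sketch rather than the thesis code: the small constant `eps` added to the norm (to avoid division by zero), the function name and the four-tap echo path are assumptions introduced here.

```python
import random


def nlms(x, d, order, beta=1.0, eps=1e-8):
    """Run NLMS over the signals; return (error signal, final weights).

    x is the far-end input, d the desired (microphone) signal, and beta the
    normalised step size with 0 < beta < 2.
    """
    w = [0.0] * order
    buf = [0.0] * order                       # buf[k] holds x(n - k)
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wk * xk for wk, xk in zip(w, buf))
        step = beta / (sum(xk * xk for xk in buf) + eps)   # time-varying step
        w = [wk + step * e * xk for wk, xk in zip(w, buf)]
        errors.append(e)
    return errors, w


# Hypothetical echo path used only to exercise the routine.
echo_path = [0.6, 0.0, -0.3, 0.1]
rng = random.Random(1)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(h * x[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
     for n in range(len(x))]

errors, w = nlms(x, d, order=4)
print([round(wk, 4) for wk in w])
```

The extra work relative to LMS is the evaluation of the squared norm of the input vector at every sample; a practical implementation would update this norm recursively instead of recomputing the sum.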

4.4 RLS algorithm:


The memory of the RLS algorithm is confined to a finite number of values, corresponding to the order of the filter tap weight vector. Although matrix inversion is essential to the derivation of the RLS algorithm, it is not required in the implementation, which reduces the computational complexity of the algorithm. Unlike the LMS-based algorithms, current variables are updated within the iteration in which they are used, using values from the previous iteration. The RLS algorithm is implemented using the filter tap weights from the previous iteration and the current input vector [25].
The quantities used by the RLS algorithm are as follows:
λ = exponential weighting factor
δ = value used to initialize the inverse of the autocorrelation matrix at n=0, i.e., P(0), a scaled identity matrix
P(n) = inverse of the autocorrelation matrix, where

= ( 4.5 )

g(n)= gain vector= ( 4.6 )


( 4.7 )

( 4.8 )

The estimation error value is calculated using the equation
$e(n) = d(n) - w^T(n)\, x(n)$ ( 4.9 )
The adaptive filter coefficients, and in turn the coefficients of the inverse autocorrelation matrix, are then updated as
( 4.10 )

( 4.11 )

Advantages and disadvantages:

RLS converges faster than LMS in a stationary environment, but in a non-stationary environment the LMS algorithm performs better than RLS. RLS is sensitive to computer round-off errors, which can lead to instability, and it has a higher computational complexity. Two numerically robust variants are the square-root RLS and the inverse QR-RLS algorithms. The computational complexity of RLS is proportional to (M+1)^2, where M is the filter order. For stationary processes its convergence is less sensitive to eigenvalue disparities in the autocorrelation matrix of x(n). RLS does not perform very well in tracking non-stationary processes.
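For illustration, a textbook form of the RLS recursion is sketched below in Python. It assumes the common convention P(0) = I/δ and uses the a priori error (computed with the weights from the previous iteration); both are assumptions of this sketch and may differ in detail from the equations of this section. The echo path and signal are hypothetical test data.

```python
import random


def rls(x, d, order, lam=1.0, delta=0.1):
    """Textbook RLS; returns (a priori error signal, final weights)."""
    w = [0.0] * order
    # Assumed initialisation: P(0) = I / delta.
    P = [[(1.0 / delta if r == c else 0.0) for c in range(order)]
         for r in range(order)]
    buf = [0.0] * order                       # buf[k] holds x(n - k)
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        Px = [sum(P[r][c] * buf[c] for c in range(order)) for r in range(order)]
        denom = lam + sum(buf[r] * Px[r] for r in range(order))
        g = [v / denom for v in Px]           # gain vector
        e = dn - sum(wk * xk for wk, xk in zip(w, buf))
        w = [wk + gk * e for wk, gk in zip(w, g)]
        # P(n) = (P(n-1) - g(n) x^T(n) P(n-1)) / lam, using the symmetry of P.
        P = [[(P[r][c] - g[r] * Px[c]) / lam for c in range(order)]
             for r in range(order)]
        errors.append(e)
    return errors, w


echo_path = [0.6, 0.0, -0.3, 0.1]             # hypothetical echo path
rng = random.Random(3)
x = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
d = [sum(h * x[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
     for n in range(len(x))]

errors, w = rls(x, d, order=4)
print([round(wk, 3) for wk in w])
```

No matrix is inverted inside the loop; the inverse autocorrelation matrix is propagated directly, at a cost that grows with the square of the filter order.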

4.5 APA algorithm:


The affine projection algorithm (APA) is an 'intermediate' algorithm between the well-known NLMS and RLS algorithms, since both its performance and its complexity lie between those of NLMS and RLS [26]. In APA the projections are made in multiple dimensions. As the projection dimension increases, both the convergence speed of the tap weight vector and the computational complexity of the algorithm increase.
In APA, a high projection order leads to a fast convergence rate but a large estimation error, whereas a low projection order gives a slow convergence rate but a small estimation error. The projection order must therefore be chosen to balance a fast convergence rate against a small steady-state estimation error [27].
The adaptive filter output is xT(n)w(n) and the desired response is d(n), as shown in equations 4.6 and 4.7.
The APA recursion is given as follows.
Assume that L+1 input signal vectors are kept in a matrix as follows:

$X(n) = [\, x(n) \;\; x(n-1) \;\; \ldots \;\; x(n-L) \,]$ ( 4.12 )


where γ is a small constant
( 4.13 )
( 4.14 )
where L is the projection order of the APA

( 4.15 )
The objective of the affine projection algorithm is to minimize
$\| w(n) - w(n-1) \|^2$ ( 4.16 )

subject to
$d(n) - X^T(n)\, w(n) = 0$ ( 4.17 )
which leads to the update
$w(n) = w(n-1) + \mu\, X(n)\, \left( X^T(n)\, X(n) + \gamma I \right)^{-1} e(n)$ ( 4.18 )
with µ chosen in the range 0 < µ ≤ 2.
The affine projection algorithm keeps the next coefficient vector w(n) as close as possible to the current w(n-1), while forcing the a posteriori error to be zero [28].
Using techniques similar to those which led from RLS to fast RLS (FRLS), a fast version of APA, the fast affine projection (FAP) algorithm, may be derived. FAP has LMS-like complexity, and a further feature of the affine projection algorithm is that it causes no delay in the input or output signals. These features make APA an excellent candidate for the adaptive filter in the acoustic echo cancellation problem. APA can be viewed as a modification of NLMS in which the update uses several recent input vectors rather than only the current one, which gives faster convergence.
Advantages and disadvantages:
APA has faster tracking capabilities than NLMS. APA can achieve a better steady-state MSE or transient response than the other algorithms, depending on the projection order. Its performance and complexity lie between those of NLMS and RLS.
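The update of equation 4.18 can be sketched in Python as below. The error vector is assumed here to be e(n) = d(n) − Xᵀ(n) w(n−1), with d(n) holding the L+1 most recent desired samples; that definition, the small Gaussian-elimination solver, the projection order of 2 and the test signal are assumptions of this sketch (the thesis experiments use a projection order of 20).

```python
import random


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z


def apa(x, d, order, proj=2, mu=1.0, gamma=1e-6):
    """Affine projection with projection order `proj` (L in the text)."""
    cols = proj + 1                            # L + 1 input vectors are kept
    w = [0.0] * order
    xpad = [0.0] * (order + proj) + list(x)    # zero history before x(0)
    dpad = [0.0] * proj + list(d)
    errors = []
    for n in range(len(x)):
        # X[j] is the input vector x(n - j) = [x(n-j), ..., x(n-j-order+1)].
        X = [[xpad[order + proj + n - j - k] for k in range(order)]
             for j in range(cols)]
        e = [dpad[proj + n - j] - sum(wk * xk for wk, xk in zip(w, X[j]))
             for j in range(cols)]
        G = [[sum(a * b for a, b in zip(X[r], X[c])) + (gamma if r == c else 0.0)
              for c in range(cols)] for r in range(cols)]
        z = solve(G, e)                        # (X^T X + gamma I)^-1 e
        w = [wk + mu * sum(X[j][k] * z[j] for j in range(cols))
             for k, wk in enumerate(w)]
        errors.append(e[0])
    return errors, w


echo_path = [0.6, 0.0, -0.3, 0.1]              # hypothetical echo path
rng = random.Random(4)
x = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
d = [sum(h * x[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
     for n in range(len(x))]
errors, w = apa(x, d, order=4)
print([round(wk, 4) for wk in w])
```

With `proj=0` the matrix to be inverted reduces to a scalar and the update coincides with NLMS, which is one way to see APA as a multi-dimensional extension of it.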

4.6 Echo return loss enhancement


Echo return loss enhancement (ERLE) [15] is the ratio of the power of the input desired signal to the power of the residual error signal immediately after echo cancellation. It is measured in dB. ERLE measures the amount of loss introduced by the adaptive filter alone, and it depends on the size of the adaptive filter and the algorithm design. The higher the ERLE, the better the echo canceller. ERLE is a measure of the echo suppression achieved and is given by

$\mathrm{ERLE} = 10 \log_{10} \dfrac{P_d}{P_e}$ ( 4.19 )

where $P_d$ is the input desired signal power and $P_e$ is the power of the residual error signal after echo cancellation.
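As a small worked example, ERLE can be computed from sample sequences as follows. Estimating the powers as mean squares over the whole sequence is an assumption of this sketch; the function name is illustrative.

```python
import math


def erle_db(desired, error):
    """ERLE in dB: desired-signal power over residual-error power."""
    p_d = sum(v * v for v in desired) / len(desired)
    p_e = sum(v * v for v in error) / len(error)
    return 10.0 * math.log10(p_d / p_e)


# A residual with one tenth of the amplitude has one hundredth of the power.
desired = [1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.1, -0.1]
print(round(erle_db(desired, residual), 2))
```

A tenfold reduction in amplitude corresponds to a hundredfold reduction in power, that is, an ERLE of about 20 dB.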

5. Evaluation setup

5.1 Introduction

This thesis deals with the elimination of the disturbances caused by echo during hands-free speech communication. These disturbances were explained in the previous chapters. Echo cancellation using the adaptive algorithms APA, NLMS and RLS is implemented in MATLAB, and the implementation is explained in this chapter. The aim is to implement and evaluate an adaptive echo canceller using the APA, NLMS and RLS algorithms.
This chapter deals with the implementation and analysis of the adaptive echo canceller, one of the most effective speech enhancement systems for hands-free speech communication, which was discussed in detail in the previous chapter. The implementation and experimental setup of the system are discussed in detail in the next section, together with the parameters chosen to achieve the best performance. Finally, the results of the adaptive echo canceller and the evaluation of its performance in different environments are plotted in the results section.

5.2 Evaluation setup for echo cancellation with adaptive algorithm

The performance of the acoustic echo canceller depends on properties of the speech such as its spectrum, background noise level and pitch variability, and on the gender, language and age of the talker. The filter converges more easily for strongly pitched voices than for softly pitched voices. The intensity of sound is defined as the sound power per unit area, and the perception of loudness is related to both the sound pressure level and the duration of a sound.
The NLMS, APA and RLS algorithms are implemented to suppress the echo and noise in the acoustic echo cancellation system. The test signal (Speech_all.wav) contains four sentences spoken alternately by a female and a male voice. The sampling frequency of the speech signal is 16000 Hz and its duration is 11 seconds. The four sentences are listed in Table 1. The input to the algorithm is the clean speech signal of the far-end user, x(n), and the desired signal is the reverberated signal received at the near-end microphone. The reverberated signal is generated for three closed-room environments with different room dimensions, microphone positions and source positions, using the room impulse response (RIR) described in section 2, with the reflection coefficient α=-0.8 in MATLAB. The three environments are
Environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]
Environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]
Environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]
The room impulse responses of the three environments are shown below.

Figure.18 Room impulse response of environment 1

Figure.19 Room impulse response of environment 2

Figure.20 Room impulse response of environment 3
Table 1: The details of the clean speech signal used for evaluation
File name        Duration in sec   Type of voice   Sentence
Speech_all.wav   3                 Female          “It’s easy to tell the depth of the well.”
Speech_all.wav   2                 Male            “Kick the ball straight and follow through.”
Speech_all.wav   3                 Female          “Glue the sheet to the dark blue background.”
Speech_all.wav   3                 Male            “A pot of tea helps to pass the evening.”

The filter order is taken as 500, 1000, 1500, 2000 and 2500 for AEC with NLMS, APA and RLS. The parameters of each algorithm are tuned by trial and error within a limited range, and the values that give the highest amount of echo cancellation (ERLE) are selected. For the NLMS implementation described in section 4.3, the step size is β=1 and the reverberated signal is taken as the desired signal. For the RLS implementation described in section 4.4, the exponential weighting factor is λ=1 and the value used to initialize P(0) is δ=0.1. For the APA implementation described in section 4.5, the step size is µ=1 and the projection order is 20, chosen to balance a fast convergence rate against a small steady-state estimation error. The microphone signal contains the reverberated speech signal of the far-end user; no noise signal is added in this experiment. The acoustic echo cancellation system using adaptive algorithms is explained in chapter 4. The estimation error is plotted for filter order 2500. The ERLE is plotted against the order of the filter (number of coefficients) for every system in the three environments, and the performance of the three systems is compared in each environment. The ERLE is the ratio of the power of the input desired signal to the power of the residual error signal immediately after echo cancellation, and it measures the echo loss achieved by the adaptive filter.
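The shape of this experiment, ERLE as a function of filter order, can be reproduced on a small scale with the sketch below. It is not the thesis setup: a hypothetical 32-tap exponentially decaying echo path and uniform random noise stand in for the room impulse response and the speech file, the filter orders are much smaller, β is set to 0.5, and ERLE is measured over the last 1000 samples only.

```python
import math
import random


def nlms_error(x, d, order, beta=0.5, eps=1e-8):
    """Return the NLMS error signal for the given filter order."""
    w = [0.0] * order
    buf = [0.0] * order
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wk * xk for wk, xk in zip(w, buf))
        step = beta / (sum(xk * xk for xk in buf) + eps)
        w = [wk + step * e * xk for wk, xk in zip(w, buf)]
        errors.append(e)
    return errors


def erle_db(desired, error):
    """ERLE in dB from equal-length sample sequences."""
    return 10.0 * math.log10(sum(v * v for v in desired) /
                             sum(v * v for v in error))


# Hypothetical echo path: 32 taps with exponentially decaying amplitude.
echo_path = [(-1) ** k * math.exp(-k / 8.0) for k in range(32)]
rng = random.Random(2)
x = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
d = [sum(h * x[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
     for n in range(len(x))]

results = {}
for order in (4, 8, 16):
    e = nlms_error(x, d, order)
    results[order] = erle_db(d[-1000:], e[-1000:])   # steady-state ERLE
    print(order, round(results[order], 1))
```

A filter shorter than the echo path cannot model the tail of the impulse response, so the ERLE grows as the order approaches the length of the path; this is the trend the ERLE-versus-order plots are meant to expose.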

6. Results

6.1 Simulation results for echo cancellation using APA algorithm
APA achieves good convergence behavior throughout the adaptation and has a lower implementation cost than the RLS method because of its lower computational complexity. The algorithm is tested in three different environments by changing the room dimensions, microphone position and source position.
6.1.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]

The desired signal received by the microphone in environment 1 is shown in Figure 21. The error signal estimated by adaptive filtering with the APA algorithm of order 2500 is shown in Figure 22, and the amount of echo cancellation after adaptive filtering with APA, plotted against the order of the filter, is shown in Figure 23.

Figure.21: Desired signal of APA at environment1

Figure.22: Estimation error signal ‘e’ of APA at environment1

Figure.23: ERLE of APA at environment1

6.1.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]

The desired signal received by the microphone in environment 2 is shown in Figure 24. The error signal estimated by adaptive filtering with the APA algorithm of order 2500 is shown in Figure 25, and the amount of echo cancellation after adaptive filtering with APA, plotted against the order of the filter, is shown in Figure 26.

Figure.24: Desired signal of APA at environment2

Figure.25: Estimation error signal ‘e’ of APA at environment2

Figure.26: ERLE of APA at environment2

6.1.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]

The desired signal received by the microphone in environment 3 is shown in Figure 27. The
error signal estimated by the adaptive filter with the APA algorithm of order 2500 is shown in
Figure 28, and the amount of echo cancellation (ERLE) achieved with APA, plotted against
the order of the filter, is shown in Figure 29.

Figure.27: Desired signal of APA at environment3

Figure.28: Estimation error signal ‘e’ of APA at environment3

Figure.29: ERLE of APA at environment3

6.2 Echo cancellation using the NLMS algorithm

The fast convergence speed of the NLMS algorithm makes it a favored choice in echo
cancellation systems. The algorithm is tested in three different environments obtained by
changing the room dimensions, the microphone position and the source position.
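As a hedged illustration of the algorithm tested in this section, here is a minimal NLMS sketch that identifies a short synthetic echo path. It is not the thesis's MATLAB code. The step size `mu`, the regularisation `eps`, the echo path `h_true` and the white-noise far-end signal are hypothetical stand-ins for the speech signals and room responses used in the simulations.

```python
import math
import random

def echo(x, h):
    """Microphone signal modelled as the far-end signal filtered by echo path h."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]

def nlms(x, d, order, mu=0.5, eps=1e-8):
    """Return (weights, error signal) of an NLMS filter with `order` coefficients."""
    w = [0.0] * order
    buf = [0.0] * order                                   # newest input sample first
    err = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wi * bi for wi, bi in zip(w, buf))   # a priori error e(n)
        step = mu * e / (sum(b * b for b in buf) + eps)   # normalised by input energy
        w = [wi + step * bi for wi, bi in zip(w, buf)]
        err.append(e)
    return w, err

rng = random.Random(0)
h_true = [0.8, -0.4, 0.2, 0.1, -0.05, 0.02]               # hypothetical short echo path
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = echo(x, h_true)
w, err = nlms(x, d, order=len(h_true))

misalignment = max(abs(a - b) for a, b in zip(w, h_true))
erle = 10.0 * math.log10(sum(v * v for v in d) / sum(v * v for v in err))
print(f"largest coefficient error: {misalignment:.2e}, ERLE over the run: {erle:.1f} dB")
```

The ERLE printed here is averaged over the whole run, including the initial convergence phase, and it comes from a noise-free toy setup, so it is not comparable to the values reported in the tables of this chapter.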
6.2.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]

The desired signal received by the microphone in environment 1 is shown in Figure 30. The
error signal estimated by the adaptive filter with the NLMS algorithm of order 2500 is shown
in Figure 31, and the amount of echo cancellation (ERLE) achieved with NLMS, plotted
against the order of the filter, is shown in Figure 32.

Figure.30: Desired signal of NLMS at environment1

Figure.31: Estimation error signal ‘e’ of NLMS at environment1

Figure.32: ERLE of NLMS at environment1

6.2.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]

The desired signal received by the microphone in environment 2 is shown in Figure 33. The
error signal estimated by the adaptive filter with the NLMS algorithm of order 2500 is shown
in Figure 34, and the amount of echo cancellation (ERLE) achieved with NLMS, plotted
against the order of the filter, is shown in Figure 35.

Figure.33: Desired signal of NLMS at environment2

Figure.34: Estimation error signal ‘e’ of NLMS at environment2

Figure.35: ERLE of NLMS at environment2

6.2.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]

The desired signal received by the microphone in environment 3 is shown in Figure 36. The
error signal estimated by the adaptive filter with the NLMS algorithm of order 2500 is shown
in Figure 37, and the amount of echo cancellation (ERLE) achieved with NLMS, plotted
against the order of the filter, is shown in Figure 38.

Figure.36: Desired signal of NLMS at environment3

Figure.37: Estimation error signal ‘e’ of NLMS at environment3

Figure.38: ERLE of NLMS at environment3

6.3 Echo cancellation using the RLS algorithm
The results of the RLS algorithm indicate that the estimation error is very small, even smaller
than with NLMS and APA. Although RLS performs better than NLMS and APA, it is not
preferred, because the number of multiplications required in each iteration grows with the
square of the filter order. In echo cancellation systems the FIR filter order is usually in the
thousands, which gives a very large number of multiplications and makes the implementation
too costly.
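The source of this cost is visible in a minimal sketch of the standard exponentially weighted RLS recursion. This is a generic textbook form, not the thesis's MATLAB code. The forgetting factor `lam`, the initialisation `delta` and the echo path `h_true` are hypothetical. The matrix `P` has `order` x `order` entries and is touched in full at every sample.

```python
import random

def echo(x, h):
    """Microphone signal modelled as the far-end signal filtered by echo path h."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]

def rls(x, d, order, lam=0.999, delta=1000.0):
    """Return (weights, a priori error signal) of an exponentially weighted RLS filter."""
    w = [0.0] * order
    # Estimate of the inverse input correlation matrix, order x order.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    buf = [0.0] * order                                   # newest input sample first
    err = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        # Matrix-vector product P*x: order^2 multiplications per sample.
        Px = [sum(P[i][j] * buf[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(b * p for b, p in zip(buf, Px))
        k = [p / denom for p in Px]                       # gain vector
        e = dn - sum(wi * bi for wi, bi in zip(w, buf))   # a priori error e(n)
        w = [wi + ki * e for wi, ki in zip(w, k)]
        # P <- (P - k * x^T P) / lam, using x^T P = (P x)^T because P is symmetric.
        P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(order)]
             for i in range(order)]
        err.append(e)
    return w, err

rng = random.Random(0)
h_true = [0.8, -0.4, 0.2, 0.1, -0.05, 0.02]               # hypothetical short echo path
x = [rng.uniform(-1.0, 1.0) for _ in range(300)]
d = echo(x, h_true)
w, err = rls(x, d, order=len(h_true))
print(max(abs(a - b) for a, b in zip(w, h_true)))         # largest coefficient error
```

With six coefficients the matrix work is negligible, but both the product `P*x` and the update of `P` scale with the square of the filter order.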

6.3.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]

The desired signal received by the microphone in environment 1 is shown in Figure 39. The
error signal estimated by the adaptive filter with the RLS algorithm of order 2500 is shown in
Figure 40, and the amount of echo cancellation (ERLE) achieved with RLS, plotted against
the order of the filter, is shown in Figure 41.

Figure.39: Desired signal of RLS at environment1

Figure.40: Estimation error signal ‘e’ of RLS at environment1

Figure.41: ERLE of RLS at environment1

6.3.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]

The desired signal received by the microphone in environment 2 is shown in Figure 42. The
error signal estimated by the adaptive filter with the RLS algorithm of order 2500 is shown in
Figure 43, and the amount of echo cancellation (ERLE) achieved with RLS, plotted against
the order of the filter, is shown in Figure 44.

Figure.42: Desired signal of RLS at environment2

Figure.43: Estimation error signal ‘e’ of RLS at environment2

Figure.44: ERLE of RLS at environment2
6.3.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]

The desired signal received by the microphone in environment 3 is shown in Figure 45. The
error signal estimated by the adaptive filter with the RLS algorithm of order 2500 is shown in
Figure 46, and the amount of echo cancellation (ERLE) achieved with RLS, plotted against
the order of the filter, is shown in Figure 47.

Figure.45: Desired signal of RLS at environment3

Figure.46: Estimation error signal ‘e’ of RLS at environment3

Figure.47: ERLE of RLS at environment3

6.4 Comparing ERLE of APA, NLMS and RLS in three environments
The amount of echo cancelled is measured as ERLE. The ERLE is the ratio of the power of
the desired input signal to the power of the estimated error signal immediately after echo
cancellation. It is expressed in dB and quantifies the echo loss achieved by the adaptive
algorithm. A large value of ERLE indicates better echo cancellation.
ERLE is calculated for APA, NLMS and RLS in three different environments by varying the
room dimensions, microphone position and source position, and the results of the three
algorithms are presented in three ways: as a graph, as a chart and as a table.
The results in the three environments show that RLS has a greater ERLE than APA and
NLMS, which indicates that echo cancellation using RLS is better than with APA and
NLMS. However, because of its higher computational complexity, it takes longer to process
than the APA and NLMS algorithms. The ERLE of APA, RLS and NLMS is plotted and
compared as a graph in Figures 48, 50 and 52, and as a chart in Figures 49, 51 and 53. The
ERLE values with respect to the order of the filter are also listed in Tables 2, 3 and 4. As the
ERLE of APA and RLS does not differ much, APA is the preferable adaptive algorithm in
these environments because of the computational complexity and costly implementation of
RLS.

6.4.1 At environment 1: room [3,4,2.5], microphone [1,2,1], source [1,1,1]

Figure.48: ERLE comparison of NLMS, APA and RLS at environment 1 in graph

Figure.49: ERLE comparison of NLMS, APA and RLS at environment 1 in chart

ORDER    NLMS (dB)    APA (dB)    RLS (dB)
 500        4.25         6.93        9.87
1000       10.35        14.79       18.57
1500       20.44        23.44       27.89
2000       27.67        32.89       36.98
2500       28.45        47.17       56.15

Table.2: ERLE comparison of NLMS, APA and RLS values at environment 1

6.4.2 At environment 2: room [2,3,2.5], microphone [1,1,1], source [1,2,1]

Figure.50: ERLE comparison of NLMS, APA and RLS at environment 2 in graph

Figure.51: ERLE comparison of NLMS, APA and RLS at environment 2 in chart

ORDER    NLMS (dB)    APA (dB)    RLS (dB)
 500        8.28        10.43       13.21
1000       18.78        21.52       25.42
1500       25.33        30.53       35.03
2000       33.02        41.41       46.68
2500       31.53        48.25       56.76

Table.3: ERLE comparison of NLMS, APA and RLS values at environment 2

6.4.3 At environment 3: room [4,2,2], microphone [2,1,2], source [2,2,2]

Figure.52: ERLE comparison of NLMS, APA and RLS at environment 3 in graph

Figure.53: ERLE comparison of NLMS, APA and RLS at environment 3 in chart

ORDER    NLMS (dB)    APA (dB)    RLS (dB)
 500       11.36        13.21       16.74
1000       21.71        24.43       27.97
1500       29.36        33.76       38.64
2000       33.11        40.50       45.86
2500       29.20        47.99       58.53

Table.4: ERLE comparison of NLMS, APA and RLS at environment 3
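To make the comparison across the three tables concrete, the short sketch below computes the ERLE gaps between the algorithms at filter order 2500. The numbers are copied from Tables 2, 3 and 4. The dictionary layout and the helper name `gaps` are illustrative choices only.

```python
# ERLE in dB at filter order 2500, copied from Tables 2, 3 and 4
ERLE_2500 = {
    1: {"NLMS": 28.45, "APA": 47.17, "RLS": 56.15},
    2: {"NLMS": 31.53, "APA": 48.25, "RLS": 56.76},
    3: {"NLMS": 29.20, "APA": 47.99, "RLS": 58.53},
}

def gaps(row):
    """Return (RLS minus APA, APA minus NLMS) in dB, rounded to two decimals."""
    return round(row["RLS"] - row["APA"], 2), round(row["APA"] - row["NLMS"], 2)

for env, row in ERLE_2500.items():
    rls_apa, apa_nlms = gaps(row)
    print(f"environment {env}: RLS-APA = {rls_apa} dB, APA-NLMS = {apa_nlms} dB")
```

At this order the gap between APA and NLMS is larger than the gap between RLS and APA in every environment.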

7. Conclusion and future work

7.1 Summary
Advances in technology have brought acoustic echo cancellation into a wide range of
applications, such as hands-free communication, mobile phones, Bluetooth headsets, Skype
calls and teleconferencing systems. In mobile technology and wireless systems, speech
enhancement and echo cancellation play an important role. Many methods are used to
suppress the echo, as mentioned in section 1, and one of the most advanced is the adaptive
filtering technique. Many adaptive algorithms are available to suppress the echo in hands-free
communication; to choose the best among them, ERLE is the parameter used to calculate and
compare the performance of the adaptive algorithms in AEC.

7.2 Conclusion
This thesis is a collaborative work by a group of two. Its focus is on improving echo
cancellation in hands-free speech communication using adaptive filtering. Many adaptive
algorithms for echo cancellation are available in the literature and each has its own
properties, but the common aim is to achieve a high ERLE and a high rate of convergence
with low complexity.
The adaptive algorithms NLMS, APA and RLS for echo cancellation were successfully
implemented in MATLAB. The three algorithms were tested in three different echo
environments obtained by changing the microphone position, the source position and the
room dimensions. The performance of the NLMS, APA and RLS algorithms is measured
with the ERLE parameter. The results show that the RLS algorithm gives the best echo
cancellation and the fastest convergence of the three algorithms. Its highest ERLE values in
the three environments are 56.15 dB, 56.76 dB and 58.53 dB, but its computational
complexity is higher than that of the other algorithms. The echo cancellation achieved with
the APA algorithm is close to that of RLS, with lower computational complexity and an
easier real-time implementation. Its highest ERLE values in the three environments are
47.17 dB, 48.25 dB and 47.99 dB. The NLMS algorithm has low computational complexity
and is very easy to implement in real time, but it gave the worst echo cancellation
performance of the three. Its highest ERLE values in the three environments are 28.45 dB,
33.02 dB and 33.11 dB. A detailed comparison of the three algorithms in the three
environments is given in the tables, plots and charts of section 6.4. Although the RLS
algorithm gives the best results of the three, it is not used in practice because the number of
multiplications it requires per iteration grows with the square of the filter order, and in
real-time echo cancellation systems the order is usually in the thousands. The number of
multiplications required is therefore very large, which makes the RLS algorithm too costly to
implement.
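A rough back-of-the-envelope count illustrates the cost argument. The expressions below are leading-order estimates for straightforward implementations, not figures from the thesis; exact constants depend on the implementation. The APA projection order of 4 is a hypothetical value.

```python
def mults_per_sample(order, proj=4):
    """Leading-order multiplication counts per input sample (rough estimates)."""
    return {
        "NLMS": 2 * order,                        # filter output + weight update
        "APA": (proj * proj + 2 * proj) * order,  # Gram matrix, error vector, weight update
        "RLS": 2 * order * order,                 # P*x product + update of the N x N matrix P
    }

counts = mults_per_sample(2500)
for name, count in counts.items():
    print(f"{name}: about {count:,} multiplications per sample")
```

Under these assumptions, at order 2500 RLS needs on the order of ten million multiplications per sample, against a few thousand for NLMS and tens of thousands for APA.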

7.3 Future work


The future works are:
• Using more than one microphone in the system for better echo cancellation.
• Comparing the performance of other algorithms with this system.
• Improving existing algorithms to reduce their computational complexity.
• Implementing the algorithms in the frequency domain to further improve noise and
echo cancellation in speech communication.
• Implementing the system on digital signal processors for real-time applications such as
mobile phones, laptops and Bluetooth headsets.
