CHAPTER 7
TARGET CLASSIFICATION
In addition to localization and tracking, the proposed system also
performs target classification. The acoustic signals picked up by the buoy
system may be of natural origin, such as ice cracking, biological sources,
thermal agitation and hydrodynamic sources, or of man-made origin, such as
ships, submarines, military operations and sonar. The target-specific
features of an unknown target are compared with a set of archetypal
features derived from pre-recorded data of known targets, previously
generated and stored in a knowledge base, leading to the identification of
the target. This chapter discusses the digital signal processing techniques
used to extract the signature features that lead to the classification of
the noise source. A digital signal processing module for identifying noise
sources in the ocean has also been implemented, with acceptable success
rates. The system has been realized using a proprietary version of the C
language developed for the TMS320C6713 digital signal processor
development board. The feature vectors extracted for different sources
were computed both in MATLAB and on the DSP development board, and the
results were compared.
7.1 Introduction
Noise in the ocean is of utmost importance for ocean exploration,
oceanographic and fisheries studies, sonar operations, etc. The wide
range of systems used in ocean research makes it necessary to characterize
the noise sources in the ocean. The ambient noise in the ocean is
Underwater Target Localization, Tracking and Classification
composite in nature, comprising components emanating from a variety
of noise sources. Studies of noise in the ocean reveal that its spectrum
extends over the frequency range from a few Hz to about 100 kHz.
Ships, other surface craft, submarines, torpedoes, etc. are prominent
sources of underwater noise. They carry complex machinery, since they
require numerous rotating and reciprocating components for propulsion,
control and habitability. This machinery also generates vibrations, which
contribute to the underwater noise. Thus, the noise heard by a hydrophone
operator contains a wide range of frequencies. With the usual signal
extraction methods, these noises are very difficult to interpret as a
pattern that indicates the type of the vessel or its class. Moreover, some
noises generated by a ship are intermittent in nature, typically the noise
from the steering engine. In such situations, the tonal components present
in noise signals collected over a short period will differ from those
collected over a long period. For over five decades, scientists have worked
to develop electronic systems to understand and identify the noise sources
in the ocean.
Spectral estimation techniques have for many years been one of the vital
signal processing tools for extracting information about signals as well
as noise. As the power spectrum is the Fourier transform of the
autocorrelation, it carries no information about statistics of order higher
than two, whereas the higher order spectra contain information that is not
present in the power spectrum. Hence, power spectral estimation has limited
application in situations where non-Gaussian noise has to be detected.
Power spectral estimation techniques can be used for estimating the
magnitude response of a system. Estimates of the bispectrum, which is the
third order spectrum, and of the bicoherence have been found useful in
detecting non-Gaussianity and nonlinearity in system identification, as
well as in detecting transient signals.
To interpret effectively and efficiently the vast amount of data
furnished by the signal processor, especially when the detection range of
the system is very large, it is essential to have a fully automated and
intelligent classifier, since most of the target information will, in all
probability, be of little interest to the user. An operator-controlled
classifier is inappropriate and highly inefficient in such situations. An
automatic detection and classification algorithm attempts to alleviate this
operability problem by taking over the operator's role of picking out
targets from a background of noise and interference.
[Figure 7.1 (block diagram). Block labels: Processed Data; Detection/Estimator
Statistics Processor; Target Feature Record; Knowledge Base; Decision System;
Classifier Output; Classification Clues]
Fig. 7.1 Block Schematic of the Target Classifier
The generalized structure of the prototype target classifier is shown
in Fig. 7.1. The output of the power spectral estimator is compared with the
earlier detection/estimation decisions stored in the target feature record,
and the relevant target features are updated. If a target feature is not
updated over a significant period, it is dropped from the target feature
record. In many situations, the system may have to backtrack or re-track
through the stored feature record to establish links with the most recent
features. As and when the required classification clues are available in
the target feature record, the best-matching signature pattern is identified
from the known target signatures in the knowledge base, subject to the
allowable percentage of mismatch chosen by the user.
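The matching step described above can be illustrated by a minimal sketch. The mismatch measure (mean relative deviation in percent), the function names and the knowledge base entries are assumptions made for illustration only; the text does not specify how the percentage of mismatch is computed.

```python
def percent_mismatch(features, reference):
    """Mean relative deviation (in percent) between two feature records."""
    total = 0.0
    for name, ref_value in reference.items():
        scale = abs(ref_value) if ref_value != 0 else 1.0
        total += abs(features[name] - ref_value) / scale
    return 100.0 * total / len(reference)


def classify(features, knowledge_base, allowed_mismatch=10.0):
    """Return (label, mismatch) of the closest stored signature, or
    (None, mismatch) when even the best match exceeds the tolerance."""
    best_label, best_score = None, float("inf")
    for label, reference in knowledge_base.items():
        score = percent_mismatch(features, reference)
        if score < best_score:
            best_label, best_score = label, score
    if best_score > allowed_mismatch:
        return None, best_score
    return best_label, best_score


# Hypothetical knowledge base; the stored numbers are illustrative only.
knowledge_base = {
    "engine": {"brightness": 2800.0, "bandwidth": 1300.0, "rolloff": 4700.0},
    "whale": {"brightness": 900.0, "bandwidth": 400.0, "rolloff": 1500.0},
}
unknown = {"brightness": 2824.5, "bandwidth": 1334.0, "rolloff": 4694.0}
label, mismatch = classify(unknown, knowledge_base)
```

Tightening `allowed_mismatch` makes the classifier reject more targets as unknown, which is the trade-off the user-selectable tolerance is meant to expose.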
In the signal processing module of the developed buoy system, the
acoustic signals generated by the target are captured using a 20-element
hydrophone array. The unique signature features of the target are
extracted from the captured acoustic signals using a 32-bit floating point
DSP, the TMS320C6713 [113, 114]. The extracted features are then compared
with those available in the knowledge base.
7.2 Knowledge Base

For the realisation of the target classifier, it is essential to have a
powerful knowledge base comprising the relevant parameters of
different classes and types of targets. The raw data collected has to be
processed to gather the relevant parameters for creating the knowledge
base.
7.2.1 Noise Data
The noise data used for creating the knowledge base mainly
comprises man-made noises and noises of biological origin. Some of the
data sets used in developing the knowledge base were collected during
scheduled cruises off Cochin and Mangalore.
7.2.1.1 Man Made Noises
Surfaced and submerged vessels create noise from their propellers,
motors and gears. The noise generated by the motor is continuous and is
caused by the mini-explosions that occur as the fuel burns rapidly inside
the engine cylinders, and by the rotating gears and shafts. Sound is also
generated by the formation of bubbles during the rotation of the propellers
and, to a lesser extent, by the wake produced by the movement of the
vessel. As the vessel moves and the propellers rotate, bubbles are formed
in the water; the formation of these bubbles is known as cavitation. The
breaking of these bubbles creates a loud acoustic noise, termed cavitation
noise, which is directly related to the speed of the vessel. The faster the
propeller rotates, the greater the cavitation noise. The breaking bubbles
produce noise over a range of frequencies, and at high speeds these
frequencies can be as high as 20,000 Hz. At the other extreme, a large ship
with slowly turning propellers can generate very low frequencies, down to
10 Hz or even less. The rotation of the propellers creates bands of noise
at more or less constant frequencies that are proportional to the rate of
rotation of the propeller. These bands, called blade-rate lines, can help
to distinguish between different sizes of ships and, in certain cases, even
to identify a particular ship. Low frequency noise generated by ships
contributes significantly to the low frequency ambient noise in the ocean,
particularly in regions with heavy ship traffic. In fact, because of the
increase in propeller-driven vessels, low frequency ambient noise has
increased by 10-15 dB during the past 50 years.
7.2.1.2 Biological Noise Data
A variety of biological noise data has been used for creating the
knowledge base. The beluga, a medium-sized toothed whale, is amongst the
loudest animals in the sea. It exhibits a wide range of vocalizations
including clicks, squeaks, whistles and a bell-like clang. The sounds
recorded are mostly in the range of 0.1 to 12 kHz. The humpback whale is
best known for its vocalizations, which are arranged in complex, repeating
sequences with the characteristics of song and contain both tonal and
pulsed sounds. The different types of harbour seal calls include trill,
chirp, multiple whistle, single whistle, growl, whoop, chug and grunt.
Sea robins are very noisy fishes and make grunting, growling and grumbling
sounds.
7.2.2 Data Analysis
For creating the knowledge base, the noise data waveforms of
various targets are analysed following the procedures for extracting the
spectral and bispectral features. The performance of the classifier depends
on how extensive the knowledge base is. In the prototype system, all
the available noise data waveforms were analyzed and a representative
knowledge base has been developed. The knowledge base for the prototype
classifier comprises the spectral and bispectral features of different
classes such as ships, boats, marine mammals, environmental conditions, etc.
7.2.3 Updating of Knowledge Base
The knowledge base that has been developed for realizing the
prototype target classifier is only representative and not complete in all
respects. For making the system efficient, the knowledge base has to be
updated with the signature patterns and the target dynamics for all the
classes and types of targets.
7.3 Extraction of Features

The pre-processed noise data waveforms are analyzed in different
ways in the Estimation Statistics Processor. The techniques used for
extracting the various signatures of the targets, namely spectral analysis
and bispectral estimation, are illustrated in Fig. 7.2.
The various features generated from this analysis are stored in a
Target Feature Record (TFR), which is used for mapping the target
signatures onto the signatures available in the knowledge base. The
generation of the target feature record, as depicted in Fig. 7.3, plays an
important role in the efficiency and success rate of the classifier.
[Figure 7.2: flowchart. Start -> Noise Data Waveforms -> Preprocessor ->
Spectral Analysis and HOS Analysis -> Spectral Features and Bispectral
Features -> Target Feature Record]
Fig. 7.2 Methods for extracting the target features
When the noise data waveforms are made available to the classifier, it
generates the target feature record by performing spectral estimation and
bispectral estimation. Target feature records are generated for the various
data records. If a TFR is not updated over a considerable period of time,
it is dropped. The system then takes the average of all the TFRs that
closely resemble one another and thus generates the final feature vector.
[Figure 7.3: flowchart. From HA -> Preprocessor -> Slice Noise Data
Waveforms into Fixed Size Records -> Generate TFR from the Records ->
decision "TFR for all Records Generated?" (No / Yes) -> Take the Average of
all TFRs which have close Resemblances and Generate the Feature Vector ->
Feature Vector]
Fig. 7.3 Flowchart for generation of TFR
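The flow of Fig. 7.3 can be sketched as follows. This is only an illustration: the per-record features used here (mean power and zero-crossing rate) are stand-ins for the spectral and bispectral features of the actual TFR, and the resemblance test (relative tolerance against an element-wise median) is an assumption, since the text does not define how close resemblance is measured.

```python
import math


def slice_records(samples, record_size):
    """Cut the waveform into consecutive fixed-size records (remainder dropped)."""
    count = len(samples) // record_size
    return [samples[i * record_size:(i + 1) * record_size] for i in range(count)]


def record_tfr(record):
    """Stand-in TFR for one record: [mean power, zero-crossing rate]."""
    power = sum(x * x for x in record) / len(record)
    crossings = sum(1 for a, b in zip(record, record[1:]) if (a < 0) != (b < 0))
    return [power, crossings / (len(record) - 1)]


def resembles(tfr, reference, tolerance):
    """True when every feature lies within a relative tolerance of the reference."""
    return all(abs(a - b) <= tolerance * max(abs(a), abs(b), 1e-12)
               for a, b in zip(tfr, reference))


def feature_vector(samples, record_size, tolerance=0.2):
    tfrs = [record_tfr(r) for r in slice_records(samples, record_size)]
    # Reference TFR: element-wise median, so a few odd records cannot dominate.
    reference = [sorted(column)[len(column) // 2] for column in zip(*tfrs)]
    similar = [t for t in tfrs if resembles(t, reference, tolerance)]
    if not similar:
        return reference
    return [sum(column) / len(similar) for column in zip(*similar)]


# Four records of a steady tone; the third record carries a loud burst.
record_size = 100
samples = []
for r in range(4):
    gain = 5.0 if r == 2 else 1.0
    samples += [gain * math.sin(math.pi * n / 10 + 0.3) for n in range(record_size)]
vector = feature_vector(samples, record_size)
```

The burst record is excluded from the average, so the resulting feature vector reflects the steady part of the signal.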
7.4 Spectral Analysis

The analysis of a signal in the frequency domain is known as spectral
analysis. Spectral analysis methods can be classified as parametric and
non-parametric, based on the analysis of the signal in the time domain.
Non-parametric analysis requires multiple periods for a particular spectral
peak to appear, whereas parametric analysis requires the data segment to
contain only a single period to produce a pronounced peak. Non-parametric
statistics have the advantage of being distribution independent as well as
insensitive to extreme values or outliers. Their disadvantages are
complexity, low power and the time required for computation. Some of the
non-parametric methods are the Daniell periodogram, the Bartlett
periodogram and the Welch periodogram. In contrast, parametric statistics
are simple and easy to compute but rely upon the assumption of a Gaussian
distribution. They are, however, known to be generally robust even when
the assumption of a Gaussian distribution is violated. Widely used
parametric approaches are the autoregressive (AR) process, the moving
average (MA) process and the ARMA process.
The major noise sources of the targets are the propellers, the
propulsion system and other machinery, which can produce significant noise
at low frequencies but little noise at frequencies above 5 kHz, where wind
and wave-generated noise dominate the spectrum of oceanic noise. In
addition, higher frequency noise is strongly attenuated in seawater.
Moreover, as the noise signals from the target are non-linear, intermittent
and of short duration, parametric methods of spectral analysis are more
appropriate and precise.
Though frequency analysis using Fourier methods, that is, the DFT or the
computationally efficient FFT with periodogram methods, is commonly used,
these non-parametric methods have numerous disadvantages compared with
parametric methods. Parametric methods give smoother spectra than
non-parametric methods. Digital windowing techniques can smooth the
non-parametric spectrum to some extent, but they distort the true spectrum
because of side lobe leakage. Parametric methods give better frequency
resolution while avoiding the picket fencing and scalloping loss faced by
non-parametric methods. A non-parametric spectrum consists of harmonic
amplitude and phase components regularly spaced in frequency. The spacing
of the spectral lines depends on the number of data samples and decreases
as the number of samples increases. Therefore, such a method is unable to
estimate accurately a frequency component of the signal that lies between
two adjacent harmonic frequency components. This problem is known as the
picket fence effect. It results in scalloping loss, which attenuates a
signal lying mid-way between the harmonically related frequency components.
Thus, a more accurate power spectral density of the noise signal
emanating from the target can be estimated using parametric spectral
methods. Though several parametric spectral estimation techniques have
evolved in an attempt to improve spectral fidelity and resolution, power
spectral density estimation using the autoregressive (AR) approach has
been adopted here for analyzing the spectral components in the noise
waveforms, leading to the extraction of certain classification clues.
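As an illustration of AR spectral estimation, the sketch below fits the AR coefficients by the Yule-Walker (autocorrelation) method with the Levinson-Durbin recursion and evaluates the model spectrum. The text does not state which AR fitting method was used in the prototype, so the choice of Yule-Walker, the model order and all function names are assumptions.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Biased sample autocorrelation r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations; returns (a, error) with a[0] = 1."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / error
        previous = a[:]
        for k in range(1, m):
            a[k] = previous[k] + reflection * previous[m - k]
        a[m] = reflection
        error *= 1.0 - reflection * reflection
    return a, error


def ar_psd(a, error, freqs_hz, fs):
    """AR model spectrum: error / |A(exp(j 2 pi f / fs))|^2 at each frequency."""
    psd = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f / fs
        denom = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
        psd.append(error / abs(denom) ** 2)
    return psd


# A 100 Hz tone sampled at 1 kHz, modelled with an AR(2) process.
fs = 1000.0
x = [math.sin(2.0 * math.pi * 100.0 * n / fs) for n in range(200)]
r = autocorrelation(x, 2)
a, error = levinson_durbin(r, 2)
freqs = [5.0 * i for i in range(101)]  # 0 to 500 Hz in 5 Hz steps
psd = ar_psd(a, error, freqs, fs)
peak_hz = freqs[psd.index(max(psd))]
```

Because the spectrum is evaluated from the model coefficients, the frequency grid can be made as fine as desired, which is how the AR approach sidesteps the fixed bin spacing of the DFT.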
7.4.1 Spectral Features
Spectral features of an acoustic signal are unique in their
characteristics and can therefore be called spectral signatures, as they
establish the identity of the signal. Therefore, the nature or class of
the noise emanating from the target can be identified by extracting its
spectral features. The features are calculated after estimating the power
spectral density of the signal using parametric spectral methods. Spectral
features are extracted using power spectral statistics and higher order
statistics [115-122].
7.4.2 Power Spectral Statistics
The power spectrum is the primary tool of signal processing, and
algorithms for estimating it have found applications in areas such as
radar, sonar, seismic, biomedical, communications and speech signal
processing. The usefulness of the power spectrum arises from an important
theorem known as Wold's decomposition theorem, which states that any
discrete-time stationary random process can be expressed in the form
$$x(n) = y(n) + z(n)$$
such that:
• the processes y(n) and z(n) are uncorrelated with one another;
• the process y(n) has a causal linear process representation,
$$y(n) = \sum_{k=0}^{\infty} h(k)\,u(n-k)$$
  where h(0) = 1, $\sum_{k=0}^{\infty} h^2(k) < \infty$ and u(n) is a white-noise process; and
• z(n) is singular, that is, it can be predicted perfectly (with zero
  variance) from its past.
According to the Wiener-Khintchine theorem, the power spectrum
$P_{xx}(f)$ of a stationary process x(n) is defined as the Fourier
transform of the autocorrelation sequence of the process,
$$P_{xx}(f) = \sum_{m=-\infty}^{\infty} R_{xx}(m)\,\exp(-j2\pi f m)$$
where $R_{xx}(m) = E[x^{*}(n)\,x(n+m)]$ is the autocorrelation sequence of x(n).
A sufficient, but not necessary, condition for the existence of the
power spectrum is that the autocorrelation be absolutely summable. The
power spectrum is real valued and non-negative, i.e., $P_{xx}(f) \ge 0$; if
x(n) is real valued, then the power spectrum is also symmetric, i.e.,
$P_{xx}(f) = P_{xx}(-f)$.
7.4.3 Features using Power Spectral Statistics
Spectral features extracted using power spectral statistics are as follows:
• Spectral Centroid (Brightness)
• Spectral Range (Bandwidth)
• Spectral Roll off
• Spectral Slope
• Characteristic Frequencies
• Audio spectrum centroid
• Audio spectrum spread
• Spectral Flatness
7.4.3.1 Brightness
Brightness, or spectral centroid, is the amplitude-weighted average,
or centroid, of the frequency spectrum, which can be related to the human
perception of brightness. It is calculated by multiplying each frequency
by its magnitude in the spectrum and summing the products. The result is
then normalised by dividing it by the sum of all the magnitudes [89, 115]:
$$\text{brightness} = \frac{\sum_{i} \text{mag}[i] \times \text{freq}[i]}{\sum_{i} \text{mag}[i]} \tag{7.1}$$
with i = 0, ..., frame size/2, where
mag = the magnitude spectrum,
freq = the frequency corresponding to each magnitude element.
e.g. Brightness of engine.wav = 2824.5 Hz
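A direct transcription of Eq. (7.1) is sketched below. The function name and the four-bin spectrum are illustrative assumptions; in practice the magnitudes would come from the estimated spectrum of a frame.

```python
def brightness(freqs, mags):
    """Amplitude-weighted mean frequency of a magnitude spectrum, Eq. (7.1)."""
    total = sum(mags)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(m * f for m, f in zip(mags, freqs)) / total


freqs = [0.0, 100.0, 200.0, 300.0]  # Hz, hypothetical bin centres
mags = [0.0, 1.0, 2.0, 1.0]
centroid_hz = brightness(freqs, mags)  # (100 + 400 + 300) / 4 = 200 Hz
```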
7.4.3.2 Bandwidth
Bandwidth, or spectral range, is an amplitude-weighted average of
the differences between each frequency and the brightness, i.e. a
representation of the range of frequencies that are present in a certain
frame. It is computed by taking the absolute deviation of each frequency
from the mean value (in this case the brightness) and weighting it by the
corresponding magnitude:
$$\text{bandwidth} = \frac{\sum_{i} \text{mag}[i] \times \left|\text{freq}[i] - \text{brightness}\right|}{\sum_{i} \text{mag}[i]} \tag{7.2}$$
with i = 0, ..., frame size/2, where
mag = the magnitude spectrum,
freq = the frequency corresponding to each magnitude element.
e.g. Bandwidth of engine.wav = 1334 Hz
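The bandwidth can be computed in the same manner. The sketch assumes that the deviation in Eq. (7.2) is taken as an absolute value, since the signed magnitude-weighted deviation from the centroid would always sum to zero; the names and test spectrum are illustrative.

```python
def bandwidth(freqs, mags):
    """Amplitude-weighted mean absolute deviation from the spectral centroid."""
    total = sum(mags)
    if total == 0:
        raise ValueError("spectrum has no energy")
    centroid = sum(m * f for m, f in zip(mags, freqs)) / total
    return sum(m * abs(f - centroid) for m, f in zip(mags, freqs)) / total


freqs = [0.0, 100.0, 200.0, 300.0]  # Hz, hypothetical bin centres
mags = [0.0, 1.0, 2.0, 1.0]
spread_hz = bandwidth(freqs, mags)  # centroid 200 Hz, so (100 + 0 + 100) / 4
```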
7.4.3.3 Spectral Roll off
Spectral roll off is a measure of spectral shape, which could be
used instead of bandwidth. It is defined as the frequency below which 85%
of the magnitude distribution is concentrated, i.e. the frequency
corresponding to the smallest index R such that
$$\sum_{i=1}^{R} \text{mag}[i] \ge 0.85 \times \sum_{i=1}^{N} \text{mag}[i] \tag{7.3}$$
where N is the length of the signal.
e.g. Spectral Roll off of engine.wav = 4694 Hz
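The search for the smallest R in Eq. (7.3) amounts to a running sum over the magnitude spectrum. The following sketch uses illustrative names and a flat ten-bin spectrum.

```python
def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest frequency at which the cumulative magnitude reaches `fraction`
    of the total magnitude."""
    threshold = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]


freqs = [100.0 * (i + 1) for i in range(10)]  # 100 Hz ... 1000 Hz
mags = [1.0] * 10                             # flat spectrum
rolloff_hz = spectral_rolloff(freqs, mags)    # 9 of 10 bins are needed
```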
7.4.3.4 Audio Spectrum Centroid (ASC)
Audio spectrum centroid features a logarithmic frequency scaling
centered at 1 kHz,
$$ASC_r = \frac{\sum_{k=1}^{N/2} \log_2\!\left(f[k]/1000\right) P_r[k]}{\sum_{k=1}^{N/2} P_r[k]} \tag{7.4}$$
where $P_r$ is the power spectrum of the frame r and f[k] is the frequency
(in Hz) of the k-th coefficient.
e.g. Audio spectrum centroid of engine.wav = -5.02
7.4.3.5 Audio Spectrum Spread (ASS)
Audio spectrum spread describes the concentration of the spectrum around
the centroid and is defined as
$$ASS_r = \frac{\sum_{k=1}^{N/2} \left[\log_2\!\left(f[k]/1000\right) - ASC_r\right]^2 P_r[k]}{\sum_{k=1}^{N/2} P_r[k]} \tag{7.5}$$
Lower spread values mean that the spectrum is highly concentrated near
the centroid, and higher values mean that it is distributed across a wider
range on both sides of the centroid.
e.g. Audio spectrum spread of engine.wav = 0.737
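The centroid and spread on the logarithmic (octave) scale can be sketched as below, following Eqs. (7.4) and (7.5) as printed. The function names and the three-bin spectrum are illustrative; the frequencies must be positive, which is why the sums start at k = 1 and exclude the DC term.

```python
import math


def audio_spectrum_centroid(freqs_hz, power):
    """Power-weighted mean of log2(f / 1 kHz), Eq. (7.4)."""
    total = sum(power)
    return sum(math.log2(f / 1000.0) * p for f, p in zip(freqs_hz, power)) / total


def audio_spectrum_spread(freqs_hz, power):
    """Power-weighted squared deviation from the centroid, Eq. (7.5)."""
    centroid = audio_spectrum_centroid(freqs_hz, power)
    total = sum(power)
    return sum((math.log2(f / 1000.0) - centroid) ** 2 * p
               for f, p in zip(freqs_hz, power)) / total


freqs_hz = [500.0, 1000.0, 2000.0]  # one octave below, at, and above 1 kHz
power = [1.0, 2.0, 1.0]
asc = audio_spectrum_centroid(freqs_hz, power)  # symmetric about 1 kHz
ass = audio_spectrum_spread(freqs_hz, power)
```

The square root of the spread value gives the RMS deviation from the centroid in octaves, should that form be preferred.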
7.4.3.6 Audio Spectrum Flatness (ASF)
Audio spectrum flatness can be defined as the deviation of the spectral
form from that of a flat spectrum. Flat spectra correspond to noise or
impulse-like signals; hence high flatness values indicate noisiness. Low
flatness values generally indicate the presence of harmonic components.
The flatness of a band is defined as the ratio of the geometric mean to
the arithmetic mean of the power spectrum coefficients within that band:
$$\text{Flatness} = \frac{\sqrt[N]{\prod_{n=0}^{N-1} x(n)}}{\dfrac{1}{N}\sum_{n=0}^{N-1} x(n)} \tag{7.6}$$
where x(n) are the N power spectrum coefficients of the band.
e.g. Audio spectrum flatness of engine.wav = 0.21
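Eq. (7.6) can be evaluated numerically as sketched below. Computing the geometric mean through logarithms is an implementation choice made here to avoid underflow of the product for long bands; the names and test spectra are illustrative.

```python
import math


def spectral_flatness(power):
    """Ratio of the geometric mean to the arithmetic mean of the coefficients."""
    if any(p <= 0.0 for p in power):
        return 0.0  # geometric mean is zero (or undefined) for non-positive input
    log_mean = sum(math.log(p) for p in power) / len(power)
    arithmetic_mean = sum(power) / len(power)
    return math.exp(log_mean) / arithmetic_mean


flat = spectral_flatness([2.0, 2.0, 2.0, 2.0])       # noise-like band
tonal = spectral_flatness([1e-6, 1e-6, 1.0, 1e-6])   # one dominant line
```

A flat band gives a value close to 1, while a band dominated by a single line gives a value close to 0.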
7.4.3.7 Spectral Slope
It refers to the average slope of the power spectral density variation.

e.g. Spectral slope of engine.wav = -41.39.
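One common way of obtaining an average slope is a least-squares straight-line fit of the power spectral density against frequency. The text does not state how the slope was computed or in which units, so the fitting method, the names and the data below are assumptions for illustration.

```python
def spectral_slope(freqs, psd):
    """Least-squares slope of the PSD values against frequency."""
    n = len(freqs)
    mean_f = sum(freqs) / n
    mean_p = sum(psd) / n
    num = sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs, psd))
    den = sum((f - mean_f) ** 2 for f in freqs)
    return num / den


freqs = [0.0, 1.0, 2.0, 3.0]      # hypothetical frequency axis
psd_db = [3.0, 1.0, -1.0, -3.0]   # PSD falling by 2 units per frequency step
slope = spectral_slope(freqs, psd_db)
```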
7.4.3.8 Peaking Frequencies
Peaking frequencies will help in identifying the tonal as well as

continuous frequency components.
e.g. One of the prominent peaking frequencies of engine.wav = 5117.3 Hz.
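A simple way of locating peaking frequencies is to pick the largest local maxima of the magnitude spectrum, as in the sketch below. The peak-picking rule and the names are assumptions; the text does not describe the procedure actually used.

```python
def peaking_frequencies(freqs, mags, count=3):
    """Frequencies of the `count` largest local maxima of a magnitude spectrum."""
    peaks = [(mags[i], freqs[i]) for i in range(1, len(mags) - 1)
             if mags[i] > mags[i - 1] and mags[i] >= mags[i + 1]]
    peaks.sort(reverse=True)
    return [f for _, f in peaks[:count]]


freqs = [100.0 * i for i in range(7)]       # 0 Hz ... 600 Hz
mags = [0.0, 1.0, 0.0, 5.0, 0.0, 2.0, 0.0]  # lines at 100, 300 and 500 Hz
top_two = peaking_frequencies(freqs, mags, count=2)
```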
7.5 Bispectral Statistics

There is much more information in a stochastic non-Gaussian or
deterministic signal than is conveyed by its autocorrelation or power
spectrum. Higher order spectra which are defined in terms of higher order
moments or cumulants of a signal contain this additional information
[123-126].
In power spectrum estimation, the process under consideration is
treated as a superposition of statistically uncorrelated harmonic
components and the distribution of power among these frequency
components is then estimated. As such, only linear mechanisms governing
the process are investigated since phase relations between frequency
components are suppressed. The information contained in the power
spectrum is essentially that which is present in the autocorrelation
sequence; this would suffice for the complete statistical description of a
Gaussian process of known mean. However, there are practical situations
where we would have to look beyond the power spectrum (autocorrelation)
to obtain information regarding deviations from Gaussianness and presence
of nonlinearities. Higher order spectra (also known as polyspectra),
defined in terms of higher order cumulants of the process, do contain such
information. Particular cases of higher order spectra are the third order
spectrum also called the bispectrum which is, by definition, the Fourier
transform of the third order cumulant sequence and the trispectrum (fourth
order spectrum) which is the Fourier transform of the fourth order
cumulant sequence of a stationary random process. The power spectrum is,
in fact, a member of the class of higher order spectra, i.e., it is a second
order spectrum.
The general motivation behind the use of higher order spectra in
signal processing is threefold:
(i) To extract information due to deviations from Gaussianness
(normality),
(ii) To estimate the phase of non-Gaussian parametric signals, and
(iii) To detect and characterize the nonlinear properties of
mechanisms which generate time series via phase relations of
their harmonic components.
The first motivation is based on the property that for Gaussian
processes, all polyspectra of order greater than two are identically zero.
Thus, a non-zero higher order spectrum indicates deviation from normality.
For a given zero-mean stationary real random process $\{X_n\}$, non-zero
skewness, $E\{X_n^3\} \neq 0$, indicates the existence of its bispectrum,
where $E\{\cdot\}$ is the expectation operation. Hence, in those signal
processing settings where the signal is a non-Gaussian stationary process
and the additive noise process is stationary Gaussian, there might be
certain advantages in estimating signal parameters in higher order spectrum
domains. The non-Gaussianness condition is satisfied in many practical
applications, since any periodic or quasi-periodic signal can be regarded
as a non-Gaussian signal, and self-emitting signals from complicated
mechanical systems can also be considered non-Gaussian signals.
The second motivation is based on the fact that higher order spectra
preserve the phase information of non-Gaussian parametric signals. For
modelling time series data in signal processing, least squares estimation is
almost exclusively used because it yields maximum-likelihood estimates of
the parameters of Gaussian processes and also because the equations
obtained are usually in a linear form involving autocorrelation samples or
their estimates. However, the autocorrelation domain suppresses phase
information and therefore least squares techniques (or modelling
autocorrelation methods) are incapable of representing non-minimum
phase parametric processes. An accurate phase reconstruction in the
autocorrelation (or power spectrum) domain can only be achieved if the
parametric process is indeed minimum phase. Non-minimum phase
estimation is of primary importance in deconvolution problems that arise in
geophysics, telecommunications, etc., in which the wavelet shape must
have the correct phase character.
One approach to the deconvolution problem that has emerged
recently explores the use of higher order spectra to estimate the phase of
the wavelet due to the ability of polyspectra to preserve non-minimum
phase information. Assuming that the reflectivity series (input sequence) is
non-Gaussian white with zero-mean, a mixed-phase wavelet can be
reconstructed in the bispectrum domain if the input sequence has non-zero
skewness, or in the trispectrum domain if the fourth-order cumulant
sequence is different from zero. Moreover, if the reflectivity series is
Gaussian, no procedure can recover the actual shape of a non-minimum
phase wavelet.
Finally, the introduction of higher order spectra (HOS) is quite natural
when we try to analyze the nonlinearity of a system operating under
random input. General relations for arbitrary stationary random data
passing through arbitrary linear systems have been studied quite
extensively for many years. In principle, most of these relations are based
on power spectrum (or autocorrelation) matching criteria. On the other
hand, general relations are not available for arbitrary stationary random
data passing through arbitrary nonlinear systems. Instead, each type of
nonlinearity has to be investigated as a special case. HOS could play a key
role in detecting and characterizing the type of nonlinearity in a system
from its output data. Consider a linear time invariant (LTI) system as
shown in Fig. 7.4 with input
$$X_k = \sum_{m} A_m \exp j(\omega_m k + \phi_m),$$
where $\phi_m$ are independent, identically distributed random variables.
[Figure 7.4: the input $X_k = \sum_m A_m e^{j(\omega_m k + \phi_m)}$ applied to an
LTI system, giving $Y_k^{(1)} = \sum_m B_m e^{j(\omega_m k + \theta_m)}$, and to a
nonlinear system, giving $Z_k = \sum_{\lambda=1}^{N} Y_k^{(\lambda)}$]
Fig. 7.4 Output of a Linear Time Invariant system and a nonlinear system
to a sinusoidal input
Then the output of the LTI system, $Y_k^{(1)}$, is given by
$$Y_k^{(1)} = \sum_{m} B_m \exp j(\omega_m k + \theta_m) \tag{7.7}$$
It can easily be verified that all cumulants of $\{Y_k^{(1)}\}$ of
order greater than two are identically zero. Therefore, zero HOS of
$\{Y_k^{(1)}\}$ will indicate that only linear mechanisms generate the
output time series. The output of the nonlinear system is given by
$$Z_k = \sum_{\lambda=1}^{N} Y_k^{(\lambda)} \tag{7.8}$$
where $Y_k^{(1)}$ is given by (7.7) and
$$Y_k^{(2)} = \sum_{m}\sum_{n} C_m C_n \exp j\left[(\omega_m + \omega_n)k + (\theta_m + \theta_n)\right] \tag{7.9}$$
A nonzero bispectrum will indicate the existence of the term $Y_k^{(2)}$,
given by (7.9), and therefore the presence of a quadratic nonlinearity in
the system.
7.5.1 Cumulants and Higher Order Spectra
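Higher order spectra are defined in terms of cumulants and
are therefore called cumulant spectra. Given a set of n real random
variables {x1, x2, ..., xn}, their joint cumulants of order r = k1 + k2 + ... + kn
are defined as
$$c_{k_1 \ldots k_n} \triangleq (-j)^r \left. \frac{\partial^r \ln \Phi(\omega_1, \omega_2, \ldots, \omega_n)}{\partial\omega_1^{k_1}\,\partial\omega_2^{k_2} \cdots \partial\omega_n^{k_n}} \right|_{\omega_1 = \omega_2 = \cdots = \omega_n = 0} \tag{7.10}$$
where $\Phi(\omega_1, \omega_2, \ldots, \omega_n) = E[\exp j(\omega_1 x_1 + \cdots + \omega_n x_n)]$
is their joint characteristic function. The joint moments of order r of
the same set of random variables are given by
$$m_{k_1 \ldots k_n} \triangleq E\left[x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}\right] = (-j)^r \left. \frac{\partial^r \Phi(\omega_1, \omega_2, \ldots, \omega_n)}{\partial\omega_1^{k_1} \cdots \partial\omega_n^{k_n}} \right|_{\omega_1 = \cdots = \omega_n = 0} \tag{7.11}$$
Hence, the joint cumulants can be expressed in terms of the joint moments
of the random variables. For example, if $m_{1 \ldots 0} = E[X_1] = 0$, then
$$\begin{aligned}
c_{1 \ldots 0} &= 0 \\
c_{2 \ldots 0} &= m_{2 \ldots 0} = E[X_1^2] \\
c_{3 \ldots 0} &= m_{3 \ldots 0} = E[X_1^3] \\
c_{4 \ldots 0} &= m_{4 \ldots 0} - 3c_{2 \ldots 0}^2 = E[X_1^4] - 3m_{2 \ldots 0}^2
\end{aligned} \tag{7.12}$$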
If {X(k)}, k = 0, ±1, ±2, ..., is taken to be a real stationary random
process with zero mean, E[X(k)] = 0, then the moment sequences of the
process are related to its cumulants as follows:
$$\begin{aligned}
E[X(k)X(k+\tau_1)] &= m_2(\tau_1) = c_2(\tau_1) \quad \text{(autocorrelation sequence)} \\
E[X(k)X(k+\tau_1)X(k+\tau_2)] &= m_3(\tau_1, \tau_2) = c_3(\tau_1, \tau_2) \quad \text{(third order moment or cumulant sequence)} \\
E[X(k)X(k+\tau_1)X(k+\tau_2)X(k+\tau_3)] &= m_4(\tau_1, \tau_2, \tau_3) \\
&= c_4(\tau_1, \tau_2, \tau_3) + c_2(\tau_1)\,c_2(\tau_3 - \tau_2) + c_2(\tau_2)\,c_2(\tau_3 - \tau_1) \\
&\quad + c_2(\tau_3)\,c_2(\tau_2 - \tau_1) \quad \text{(fourth order moment sequence)}
\end{aligned} \tag{7.13}$$
While the third order moments and third order cumulants are
identical, this is not true for the fourth order statistics. In order to generate
the fourth order cumulant sequence, we need knowledge of the fourth order
moment and autocorrelation sequences.
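The zero-lag relations of Eq. (7.12) can be checked numerically with sample moments, as in the sketch below. The sample mean is removed first so that the zero-mean assumption holds; the function names and the two test sequences are illustrative.

```python
def central_moment(x, order):
    """Sample moment of the given order about the sample mean."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)


def cumulants_up_to_four(x):
    """Sample c2, c3 and c4 at zero lag, following Eq. (7.12)."""
    m2 = central_moment(x, 2)
    m3 = central_moment(x, 3)
    m4 = central_moment(x, 4)
    return m2, m3, m4 - 3.0 * m2 ** 2


# A symmetric two-level sequence has no skewness ...
symmetric = [-1.0, 1.0] * 50
c2, c3, c4 = cumulants_up_to_four(symmetric)

# ... whereas an asymmetric one has a non-zero third order cumulant.
skewed = [-1.0, -1.0, 2.0] * 50
s2, s3, s4 = cumulants_up_to_four(skewed)
```

The non-zero third order cumulant of the second sequence is what gives rise to a non-zero bispectrum.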
7.5.2 Properties of Bispectrum
Let {X(k)} be a real, discrete, zero-mean stationary process with
power spectrum $P(\omega)$, defined as
$$P(\omega) = \sum_{\tau=-\infty}^{+\infty} r(\tau) \exp\{-j\omega\tau\}, \qquad |\omega| < \pi \tag{7.14}$$
where
$$r(\tau) = E[X(k)X(k+\tau)] \tag{7.15}$$
is its autocorrelation sequence. If R(m, n) denotes the third moment
sequence of {X(k)}, i.e.,
$$R(m, n) = E[X(k)X(k+m)X(k+n)] \tag{7.16}$$
then its bispectrum is defined as
$$B(\omega_1, \omega_2) = \sum_{m=-\infty}^{+\infty} \sum_{n=-\infty}^{+\infty} R(m, n) \exp\{-j(\omega_1 m + \omega_2 n)\} \tag{7.17}$$
Since the third order moments and cumulants are identical, the bispectrum
is a third order cumulant spectrum. From (7.16), it follows that the third
order moments obey the symmetry properties
R(m,n) = R(n,m) = R(-n,m-n) = R(n-m,-m) = R(m-n,-n) = R(-m,n-m)
As a consequence, knowing the third moments in any one of the six
sectors, shown in Fig. 7.5 and Fig. 7.6, would enable us to find the entire
third moment sequence.
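The six-fold symmetry can be verified on a sample estimate of the third moment sequence. The sketch below uses an arbitrary integer test sequence so that the comparison is exact; the estimator (a sum over all valid indices, divided by the record length) and the names are assumptions for illustration.

```python
def third_moment(x, m, n):
    """Sample estimate of R(m, n) = E[X(k) X(k+m) X(k+n)]."""
    N = len(x)
    start = max(0, -m, -n)          # keep k, k+m and k+n inside the record
    stop = min(N, N - m, N - n)
    return sum(x[k] * x[k + m] * x[k + n] for k in range(start, stop)) / N


# Arbitrary deterministic integer sequence, used only to exercise the identity.
x = [(7 * i * i + 3 * i) % 11 - 5 for i in range(50)]
m, n = 2, 5
values = [
    third_moment(x, m, n),
    third_moment(x, n, m),
    third_moment(x, -n, m - n),
    third_moment(x, n - m, -m),
    third_moment(x, m - n, -n),
    third_moment(x, -m, n - m),
]
```

All six entries of `values` coincide, so only one sector of the (m, n) plane needs to be computed and stored.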
• Gaussian Processes: If {X(k)} is a stationary zero-mean Gaussian
process, its third-moment sequence R(m, n) = 0 for all (m, n) and therefore
its bispectrum $B(\omega_1, \omega_2)$ is identically zero.
• Linear Phase Shifts: Given {X(k)} with power spectrum $P_x(\omega)$ and
bispectrum $B_x(\omega_1, \omega_2)$, the process Y(k) = X(k - N), where N is a
constant integer, has power spectrum $P_y(\omega) = P_x(\omega)$ and bispectrum
$B_y(\omega_1, \omega_2) = B_x(\omega_1, \omega_2)$, i.e., the second and third order
moments suppress linear phase information. However, while the power
spectrum (autocorrelation) suppresses all phase information, the bispectrum
(third-moment sequence) does not.
• Non-Gaussian White Noise: If {W(k)} is a stationary non-Gaussian
process with E[W(k)] = 0, $E[W(k)W(k+\tau)] = Q\,\delta(\tau)$ and
$E[W(k)W(k+\tau)W(k+\rho)] = \beta\,\delta(\tau, \rho)$, its power spectrum and
bispectrum are both flat, i.e., $P(\omega) = Q$ and $B(\omega_1, \omega_2) = \beta$.
• Quadratic Phase Coupling: There are situations in practice where,
because of interaction between two harmonic components of a process,
there is a contribution to the power at their sum and/or difference
frequencies. Such a phenomenon, which could be due to quadratic
nonlinearities, gives rise to certain phase relations called quadratic phase
coupling. In certain applications, it is necessary to find out whether peaks
at harmonically related positions in the power spectrum are, in fact,
coupled. The power spectrum suppresses all phase relations. The bispectrum,
however, is capable of detecting and quantifying phase coupling.
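Quadratic phase coupling can be demonstrated with a direct (FFT-type) bispectrum estimate evaluated at a single frequency pair. In the sketch below, two synthetic signals contain the same three tones; only in the first is the phase of the third tone the sum of the other two. The segment length, tone bins, number of segments and all names are assumptions chosen for illustration.

```python
import cmath
import math
import random


def dft_bin(x, k):
    """Single DFT coefficient X(k) of a real segment."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))


def bispectrum_at(segments, k1, k2):
    """Direct estimate: average of X(k1) X(k2) X*(k1 + k2) over the segments."""
    total = 0j
    for seg in segments:
        total += dft_bin(seg, k1) * dft_bin(seg, k2) * dft_bin(seg, k1 + k2).conjugate()
    return total / len(segments)


def make_segments(coupled, count=64, N=64, k1=5, k2=9, seed=1):
    """Segments with tones at bins k1, k2 and k1 + k2 and random phases."""
    rng = random.Random(seed)
    segments = []
    for _ in range(count):
        p1 = rng.uniform(0.0, 2.0 * math.pi)
        p2 = rng.uniform(0.0, 2.0 * math.pi)
        # Coupled case: the third phase is tied to the other two.
        p3 = p1 + p2 if coupled else rng.uniform(0.0, 2.0 * math.pi)
        segments.append([math.cos(2 * math.pi * k1 * n / N + p1)
                         + math.cos(2 * math.pi * k2 * n / N + p2)
                         + math.cos(2 * math.pi * (k1 + k2) * n / N + p3)
                         for n in range(N)])
    return segments


coupled = abs(bispectrum_at(make_segments(True), 5, 9))
uncoupled = abs(bispectrum_at(make_segments(False), 5, 9))
```

Both signals have identical power spectra, yet the bispectrum magnitude at the pair (k1, k2) stays large only for the phase-coupled signal; for the uncoupled one the random phases average it towards zero.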
[Figure 7.5: the (m, n) plane divided into six symmetry sectors, I to VI]
Fig. 7.5 Symmetry regions of third order moments
[Figure 7.6]
Fig. 7.6 Symmetry regions of the Bispectrum
Bispectral analysis has shown that ship-generated noise contains strong
nonlinear components arising from its noise generating mechanisms, whereas
ambient noise does not. Bispectral analysis has the advantage that it is
capable of distinguishing these nonlinear components. This leads to
procedures for differentiating between shipping noise and ambient noise.
Moreover, the same procedures can be used for differentiating various
classes or types of ships.
Hence, bispectral analysis can be used to extract information for
analyzing ship radiated noise, in particular on the existence of noise
sources that would normally remain hidden in the ambient noise when only
spectral estimation approaches are used.
7.6 DSP based Feature Extraction

The omni-directional hydrophone element keeps monitoring for acoustic
disturbances near the deployed buoy system until the power of the captured
acoustic signal rises above a predefined threshold level. Once the
threshold level is reached, the signal processing unit of the buoy
electronics is triggered to extract the features that can be used for
identification of targets.
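The triggering logic can be sketched as follows. This is a minimal Python illustration, assuming that the power is estimated as the mean square of each frame of samples; the function names, frame contents and threshold value are hypothetical and are not taken from the buoy firmware.

```python
def frame_power(samples):
    """Mean-square power of one frame of samples."""
    return sum(s * s for s in samples) / len(samples)


def first_trigger(frames, threshold):
    """Return the index of the first frame whose power exceeds the
    threshold, or None if no frame does."""
    for index, frame in enumerate(frames):
        if frame_power(frame) > threshold:
            return index
    return None


# Hypothetical data: a quiet frame followed by a louder one.
frames = [
    [0.01, -0.02, 0.01, 0.0],
    [0.5, -0.6, 0.4, -0.5],
]
trigger_index = first_trigger(frames, threshold=0.05)
print(trigger_index)
```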
The signature features are extracted from the acoustic signals captured by
the hydrophone array. Signal conditioning and feature extraction are
performed on a 32-bit DSP development board. The analog acoustic signals
from the hydrophone array are fed into an AIC23 stereo CODEC for sampling
and A/D conversion. The sampled digital output is stored as a wave file for
further signal analysis and processing. After preprocessing, the acoustic
signals are analyzed in the Feature Extraction Unit, which is implemented
on a DSP development board manufactured by Texas Instruments. The board
consists of a TMS320C6713 DSP, which is adequate for precision audio
applications, an audio codec, SDRAM and Flash memory. The board is compact,
inexpensive and fast in operation, and the DSP I/O operates from a 3.3 V
supply.
The acoustic signal captured by the hydrophone array, after preliminary
processing, is sampled and digitized by the audio codec. The digital signal
is then read by the processor using one of the Multichannel Buffered Serial
Ports (McBSPs). The data is then transferred to the internal L2 memory
through the Enhanced Direct Memory Access (EDMA) channel. Double buffering
is used for the incoming data. When one of the buffers is full, an
interrupt is generated to process the data received. At the same time, the
CODEC keeps sampling and saves data into the other buffer. Data sampling
and processing can therefore be done simultaneously, and no incoming
samples are missed even while the DSP is processing previously received
data. Signal conditioning and feature extraction are done on the received
signal. The result is then sent through the serial port to the RF link for
transmission. The flowchart depicting the algorithm for feature extraction
is shown in Fig. 7.7.
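The double-buffering scheme described above can be illustrated with a short Python simulation. It is only a sketch of the idea: on the DSP the filling is done by the EDMA and the hand-over is signalled by an interrupt, whereas here a return value plays that role. The class name, buffer size and sample values are hypothetical.

```python
class PingPongBuffer:
    """Two fixed-size buffers: one is filled while the other is processed."""

    def __init__(self, size):
        self.size = size
        self.buffers = [[0] * size, [0] * size]
        self.fill_index = 0  # buffer currently receiving samples
        self.position = 0    # next free slot in that buffer

    def write(self, sample):
        """Store one sample. Return the buffer that has just become full
        (the simulated 'interrupt'), otherwise return None."""
        self.buffers[self.fill_index][self.position] = sample
        self.position += 1
        if self.position < self.size:
            return None
        full = self.buffers[self.fill_index]
        self.fill_index ^= 1  # switch filling to the other buffer
        self.position = 0
        return full


pingpong = PingPongBuffer(size=4)
processed_blocks = []
for sample in range(8):
    full = pingpong.write(sample)
    if full is not None:
        # Processing happens here while new samples go to the other buffer.
        processed_blocks.append(list(full))
print(processed_blocks)
```

Each full buffer is copied before it is stored because the same buffer is reused two blocks later.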
7.6.1 Architecture of TMS320C6713
The TMS320C6713 is a 32-bit floating point DSP developed by Texas
Instruments. The key features of the processor and of the DSK board built
around it are:
• Operating frequency of 225 MHz
• An AIC23 stereo codec for coding and decoding the audio signals
• 16 Mbytes of synchronous DRAM
• 512 Kbytes of non-volatile Flash memory (256 Kbytes usable in default
configuration)
• 4 user accessible LEDs and DIP switches
• Software board configuration through registers implemented in CPLD
• Configurable boot options
• Standard expansion connectors for daughter card use
• JTAG emulation through onboard JTAG emulator with USB host interface or
external emulator
• Single voltage power supply (+5V)
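[Flowchart steps: Start; initialize the hydrophone array (HA) to 0°; enable
the motor for clockwise mode; check whether the maximum limit has been
reached; if so, go to "Report Routine End" and end; otherwise capture the
signal from the HA and extract the feature, send the feature and angular
position, advance the HA by rotating in the CW direction, and check the
limit again.]
Fig. 7.7 Flowchart depicting the algorithm for feature extraction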
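The scan loop of Fig. 7.7 can be sketched in Python as follows. This is an illustration only: the angular step, the maximum limit and the capture and feature extraction callables are hypothetical placeholders, since the flowchart does not state their values.

```python
def scan(capture, extract_feature, step_deg=30, max_deg=360):
    """Rotate the hydrophone array clockwise from 0 degrees and collect
    (angular position, feature) pairs until the limit is reached."""
    angle = 0
    reports = []
    while angle < max_deg:                   # "Maximum limit?" decision
        signal = capture(angle)              # capture signal from the HA
        feature = extract_feature(signal)    # extract the feature
        reports.append((angle, feature))     # send feature and position
        angle += step_deg                    # advance the HA clockwise
    return reports


# Hypothetical stand-ins for the hardware capture and the feature extractor.
reports = scan(
    capture=lambda angle: [angle, angle + 1],
    extract_feature=lambda signal: sum(signal),
    step_deg=90,
    max_deg=360,
)
print(reports)
```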
The platform used for programming and communicating with the processor is
Code Composer Studio (CCS), which connects either through the embedded JTAG
emulator via the USB interface or through an external emulator via the JTAG
connector. The block diagram of the DSK is shown in Fig. 7.8.
Fig. 7.8 Block Diagram of TI TMS320C6713 DSK
7.6.1.1 Memory Map
The C67xx family of DSPs has a large byte addressable address space.
Program code and data can be placed anywhere in the unified address space.
Addresses are always 32 bits wide. The memory map shows the address space
of a generic 6713 processor on the left, with specific details of how each
region is used on the right. By default, the internal memory is located at
the beginning of the address space. Portions of the internal memory can be
reconfigured in software as L2 cache rather than fixed RAM. The EMIF has 4
separate addressable regions called chip enable spaces (CE0-CE3). The SDRAM
occupies CE0, while the Flash and CPLD share CE1. CE2 and CE3 are generally
reserved for daughter cards. The memory mapping is shown in Fig. 7.9.
7.6.1.2 CPLD (Programmable Logic)
The C6713 DSK uses an Altera EPM3128TC100-10 Complex Programmable Logic
Device (CPLD) to implement:
• 4 memory-mapped control/status registers that allow software control of
various board features.
• Control of the daughter card interface and signals.
• Assorted "glue" logic that ties the board components together.
Fig. 7.9 Memory Mapping in TMS320C6713 DSK
7.6.1.3 AIC23 Codec
The DSK uses a Texas Instruments AIC23 (part #TLV320AIC23) stereo codec for
input and output of audio signals. The codec samples analog signals on the
microphone or line inputs and converts them into digital data so that they
can be processed by the DSP. When the DSP has finished with the data, the
codec converts the samples back into analog signals on the line and
headphone outputs so that the user can hear the output. The codec
communicates using two serial channels, one to control the codec's internal
configuration registers and one to send and receive digital audio samples.
The schematic of the AIC23 codec is shown in Fig. 7.10.
Fig. 7.10 Schematic of AIC23 CODEC
McBSP0 is used as the unidirectional control channel. It should be
programmed to send a 16-bit control word to the AIC23 in SPI format. The
top 7 bits of the control word specify the register to be modified and the
lower 9 bits contain the register value. The control channel is used only
when configuring the codec and is generally idle while audio data is being
transmitted. McBSP1 is used as the bi-directional data channel, and all
audio data flows through it. Many data formats are supported, based on
three variables: sample width, clock signal source and serial data format.
The DSK examples generally use a 16-bit sample width with the codec in
master mode, so that it generates the frame sync and bit clocks at the
correct sample rate without effort on the DSP side. The preferred serial
format is DSP mode, which is designed specifically to operate with the
McBSP ports on TI DSPs. The codec has a 12 MHz system clock, which
corresponds to the USB sample rate mode, so the same clock can be used for
both the codec and the USB controller.
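The layout of the 16-bit control word (7-bit register address in the top bits, 9-bit value in the lower bits) can be made concrete with a small Python sketch. The function name and the example register address and value are hypothetical and are used only to show the bit packing; they are not taken from the codec configuration used in this work.

```python
def aic23_control_word(register, value):
    """Pack a 7-bit register address and a 9-bit value into 16 bits."""
    if not 0 <= register < (1 << 7):
        raise ValueError("register address must fit in 7 bits")
    if not 0 <= value < (1 << 9):
        raise ValueError("register value must fit in 9 bits")
    return (register << 9) | value


# Hypothetical example: register address 0x04, value 0x011.
word = aic23_control_word(0x04, 0x011)
print(hex(word))
```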
7.6.1.4 Synchronous DRAM
The DSK uses a 128 megabit synchronous DRAM (SDRAM) on the 32-bit EMIF. The
SDRAM is mapped at the beginning of CE0 (address 0x80000000) with a total
available memory of 16 megabytes. The integrated SDRAM controller is a part
of the EMIF and must be configured in software for proper operation. The
EMIF clock is derived from the PLL settings and should be configured in
software at 90 MHz. This number is based on an internal PLL clock of
450 MHz, which gives 225 MHz CPU operation with a divisor of 2 and a 90 MHz
EMIF clock with a divisor of 5. When using SDRAM, the controller must be
set up to refresh one row of the memory array every 15.6 microseconds to
maintain data integrity. With a 90 MHz EMIF clock, this period is
approximately 1400 bus cycles.
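The clock and refresh figures quoted above follow from simple arithmetic, reproduced here as a short Python check. The variable names are hypothetical; the numbers are those given in the text.

```python
PLL_HZ = 450e6               # internal PLL clock
cpu_hz = PLL_HZ / 2          # CPU clock with a divisor of 2
emif_hz = PLL_HZ / 5         # EMIF clock with a divisor of 5
refresh_interval_s = 15.6e-6

# Number of EMIF bus cycles between two row refreshes.
refresh_cycles = refresh_interval_s * emif_hz

print(cpu_hz / 1e6, emif_hz / 1e6, round(refresh_cycles))
```

The exact product is 1404 cycles, which the text rounds to 1400.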
7.6.1.5 Flash Memory
Flash is a type of memory which does not lose its contents when the
power is turned off. When read, it looks like a simple asynchronous read
only memory (ROM). Flash can be erased in large blocks commonly
referred to as sectors or pages. Once a block has been erased each word
can be programmed once through a special command sequence. After that,
the entire block must be erased again to change the contents. The DSK
uses a 512Kbyte external Flash as a boot option. It is visible at the
beginning of CE1 (address 0x90000000). The Flash is wired as a 256K by
16 bit device to support the DSK's 16-bit boot option. However, the
software that ships with the DSK treats the Flash as an 8-bit device
(ignoring the top 8 bits) to match the 6713's default 8-bit boot mode. In
this configuration, only 256Kbytes are readily usable without software
changes.
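The two capacities mentioned above can be checked with a few lines of Python. The variable names are hypothetical; the sketch simply shows why ignoring the top 8 bits of each 16-bit word halves the usable space.

```python
FLASH_WORDS = 256 * 1024     # the device is wired as 256K words

# Full 16-bit words: 2 bytes per word.
total_bytes = FLASH_WORDS * 16 // 8
# 8-bit mode: only the lower byte of each word is used.
usable_bytes_8bit = FLASH_WORDS * 8 // 8

print(total_bytes // 1024, usable_bytes_8bit // 1024)
```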
7.6.1.6 LEDs and DIP Switches
The DSK includes 4 software accessible LEDs (D7-D10) and DIP switches (SW1)
that provide the user with a simple form of input/output. Both are accessed
through the CPLD USER_REG register.
7.7 Prototype Target Classifier

The omni-directional hydrophone element keeps monitoring the acoustic
disturbances near the deployed buoy system until the power of the captured
noise signal rises above a predefined threshold level. Once the threshold
level is reached, the signal processing unit of the buoy electronics is
triggered and generates the ALERT signal. The analog signals captured by
the hydrophone array are coded to the required file format using the AIC23
CODEC of the TMS320C6713 DSK. This file can be used for further signal
analysis and processing.
The classification function operates in a multidimensional space formed by
the various components of the feature vector. For the purpose of target
classification, one has to identify the characteristic features from the
representation of an object. Once the various features have been generated,
those that can indeed aid the process of classification are selected.
Though such a selection will generally lead to some loss of information, it
reduces the noise introduced by irrelevant features as well as the risk of
overfitting the training data, and it makes the classifier computationally
more efficient.
The signature features are used to generate the required classification
clues towards the identification of the noise sources in the ocean.
Classification is carried out by a template matching process, in which the
components of the generated feature vector are compared with the
corresponding components of the feature vectors available in the knowledge
base.
7.7.1 Feature Vector based classifier
7.7.1.1 Euclidean Distance Model
The Euclidean distance model is one of the simple yet efficient classifier
algorithms, and a properly weighted model making use of the feature vector
can be used to find the nearest match [127]. The weights for the various
components of the feature vector have been selected based on heuristics,
the knowledge gained from the training examples, and trial and error. For
feature vector based classification, the Euclidean distance between the
feature vector of the unknown target and that of each target in the
knowledge base is computed. The vector components are normalized by the
standard deviation or the range of the features across the whole knowledge
base. After normalization, each feature is weighted in proportion to its
significance in the similarity estimation.
The Euclidean distance $D_E$ is computed as

$$D_E = \sqrt{\sum_{i=1}^{l} \left( \frac{x_i - y_i}{v_i} \right)^2 w_i} \qquad (7.18)$$

where $x_i$ and $y_i$ refer to the $i$th feature component of the unknown
target and that of the various targets in the knowledge base respectively,
$w_i$ is the weight assigned to the $i$th feature component such that
$\sum_{i=1}^{l} w_i = 1$, $v_i$ represents the normalization vector and $l$
is the total number of features.
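A minimal Python sketch of nearest-match classification with Eq. (7.18) is given below. It rests on several assumptions that are not stated in the text: the knowledge base is filled with the MATLAB values of bandwidth, brightness and spectral roll off from Tables 7.1 to 7.3, the unknown vector is the TMS320C6713 output for the engine noise, the normalization uses the range of each feature, and the weights are equal. The function names are hypothetical, and the weights actually used in this work are not reproduced here.

```python
import math


def weighted_distance(x, y, v, w):
    """Weighted, normalized Euclidean distance of Eq. (7.18)."""
    return math.sqrt(sum(
        wi * ((xi - yi) / vi) ** 2
        for xi, yi, vi, wi in zip(x, y, v, w)
    ))


def feature_ranges(knowledge_base):
    """Range (max - min) of each feature across the knowledge base."""
    columns = list(zip(*knowledge_base.values()))
    return [max(column) - min(column) for column in columns]


def nearest_target(x, knowledge_base, v, w):
    """Name of the knowledge base entry closest to the unknown vector x."""
    return min(
        knowledge_base,
        key=lambda name: weighted_distance(x, knowledge_base[name], v, w),
    )


# Assumed knowledge base: (bandwidth, brightness, spectral roll off).
knowledge_base = {
    "Engine": (1334.0, 2824.5, 4694.0),
    "Ship": (2343.0, 7273.0, 10196.0),
    "Boat": (1103.0, 3693.0, 5027.2),
}
unknown = (1355.0, 2830.0, 4700.0)   # DSP output for the engine noise
weights = (1 / 3, 1 / 3, 1 / 3)      # assumed equal weights, summing to 1
ranges = feature_ranges(knowledge_base)

print(nearest_target(unknown, knowledge_base, ranges, weights))
```

Under these assumptions the unknown vector is matched to the engine entry, since its normalized distance to that entry is far smaller than to the ship or boat entries.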
7.8 Results and Discussions

A feature extraction unit has been developed on the TMS320C6713 DSP
development board for extracting the characteristic features of the target.
This hardware has been designed in such a way that it can hand over the
extracted features to the communications controller through the DSP
interface provided in the Hydrophone Array Controller. The flowchart of the
code for spectral feature extraction is shown in Fig. 7.11.
The signature features of the signals captured by the hydrophone arrays are
extracted using both MATLAB and the TMS320C6713 DSK. The DSP code was
written in C and was compiled and run on the TMS320C6713 DSP processor
using Code Composer Studio. The algorithms used on both platforms were
similar, and the performance of the module for extracting the features and
identifying the targets has been validated to a satisfactory level of
repeatability and reproducibility.
The spectral features generated by the DSP module and by MATLAB are
compared with those available in the knowledge base. Features extracted for
three targets using MATLAB and the DSP processor are given in Table 7.1,
Table 7.2 and Table 7.3.
Fig. 7.11 Flowchart for extracting spectral features
Table 7.1 Spectral Features extracted using MATLAB and TMS320C6713 DSP
Board for Engine noise
Input (Wave File): Engine.wav, Data Length = 110080

Feature                          MATLAB      TMS320C6713
Sampling Frequency               11008       11008
Spectral Centroid                -5.02       -5.9
Spectral Spread                  0.737       1.9
Bandwidth                        1334        1355
Brightness                       2824.5      2830
Spectral Slope                   -41.39      -45
Spectral Roll off                4694        4700
Spectral Flatness                0.21        0.5
Prominent Peaking Frequencies    2913.1      2920.7
                                 3197.6      3200.3
                                 3448.1      3452.5
                                 3735.7      3738.4
                                 4035.1      4040.2
                                 4269.7      4270
                                 4549.2      4552
                                 4813.8      4815.7
                                 5117.3      5119
                                 5385.8      5385.8
Table 7.2 Spectral Features extracted using MATLAB and TMS320C6713 DSP
Board for Ship noise
Input (Wave File): Ship.wav, Data Length = 151806

Feature                          MATLAB      TMS320C6713
Bandwidth                        2343        2343
Brightness                       7273        7252
Spectral Roll off                10196       10200
Prominent Peaking Frequencies    2537.6      2542.5
                                 3339.5      3344.4
                                 4719.2      4722
                                 5401.1      5409
                                 6728.9      6732.8
                                 7502.7      7510
                                 8224.6      8234.6
                                 8784.5      8790
                                 9466.4      9472
                                 10234       10244
Table 7.3 Spectral Features extracted using MATLAB and TMS320C6713 DSP
Board for Boat noise
Input (Wave File): Boat.wav, Data Length = 89856

Feature                          MATLAB      TMS320C6713
Bandwidth                        1103        1113
Brightness                       3693        3670
Spectral Roll off                5027.2      5035
Prominent Peaking Frequencies    224         230
                                 605         610.5
                                 1121        1127
                                 1709        1715.2
                                 3602        3610
                                 3963        3957
                                 4285        4277
                                 4269        4265
                                 4946        4950
                                 5237        5242
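The agreement between the two platforms can be quantified with a short Python sketch that computes the relative deviation of the TMS320C6713 value from the MATLAB value. It covers only the three features that appear in all three tables (bandwidth, brightness and spectral roll off); the variable and function names are hypothetical, and the choice of the MATLAB value as the reference is an assumption.

```python
# (MATLAB value, TMS320C6713 value) taken from Tables 7.1 to 7.3.
features = {
    ("Engine", "Bandwidth"): (1334.0, 1355.0),
    ("Engine", "Brightness"): (2824.5, 2830.0),
    ("Engine", "Spectral Roll off"): (4694.0, 4700.0),
    ("Ship", "Bandwidth"): (2343.0, 2343.0),
    ("Ship", "Brightness"): (7273.0, 7252.0),
    ("Ship", "Spectral Roll off"): (10196.0, 10200.0),
    ("Boat", "Bandwidth"): (1103.0, 1113.0),
    ("Boat", "Brightness"): (3693.0, 3670.0),
    ("Boat", "Spectral Roll off"): (5027.2, 5035.0),
}


def relative_deviation(reference, measured):
    """Absolute deviation of measured from reference, as a fraction."""
    return abs(measured - reference) / abs(reference)


deviations = {
    key: relative_deviation(matlab, dsp)
    for key, (matlab, dsp) in features.items()
}
worst = max(deviations, key=deviations.get)

for (target, feature), deviation in deviations.items():
    print(f"{target:6s} {feature:18s} {100 * deviation:.2f} %")
print("largest deviation:", worst)
```

For these three features the largest deviation is that of the engine bandwidth, at about 1.6 %.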
7.9 Summary
A prototype underwater target classifier system based on digital signal
processor hardware has been developed for classifying the noise sources in
the ocean using signature features extracted from the noise emanations. The
various steps involved in the generation of feature vectors have been
described in this chapter. The TMS320C6713 has been used for extracting the
features and handing them over to the communications controller through the
DSP interface provided in the Hydrophone Array Controller. The performance
of the module for extracting the features and identifying the targets has
been validated to a satisfactory level of repeatability and
reproducibility. Signal capturing, processing, feature extraction and
target identification are subject to a real time constraint, which the DSP
module is able to meet. The knowledge base requires frequent updating to
improve the success rates of the classifier.