ECE 410 DIGITAL SIGNAL PROCESSING
D. Munson, University of Illinois
Chapter 15
Application 1: Computer Tomography (CT)
Used extensively for medical imaging and nondestructive testing.
Objective is to reconstruct a cross-sectional view of a 3-D object:
[Figure: a 3-D object and the cross-section f(x,y) to be reconstructed]
Accomplish this by shining x-rays sideways through the object and collecting projections at
various angles. The projection data is then processed digitally to produce the image of f(x,y).
The oldest CT machines used narrow, parallel x-ray beams. Modern-day machines use a fan-
beam geometry. Since the digital processing in both systems is similar, we will consider the
parallel-beam case, which is a bit simpler mathematically.
Parallel-beam geometry
[Figure: cross-section with density f(x,y), illuminated by parallel X-rays at angle θ; v is the coordinate along the detector array and p_θ(v) is the measured projection]
p_θ(v₀) is the integral of f(x,y) along the path of the x-ray at angle θ impinging at v = v₀.
Typically, projections p_θ(v) are collected through a full 360° by rotating the x-ray source(s) and detectors around the object being imaged.
How do we recover f(x,y) from the projections p_θ(v)?
Define the 2-D Fourier transform of f(x,y) as
$$F(\Omega_1, \Omega_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\, e^{-j(\Omega_1 x + \Omega_2 y)}\, dx\, dy$$
The inverse 2-D Fourier transform is
$$f(x,y) = \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(\Omega_1, \Omega_2)\, e^{\,j(\Omega_1 x + \Omega_2 y)}\, d\Omega_1\, d\Omega_2$$
Notation:
Let F(Ω₁, Ω₂) in polar coordinates be written as
$$F_{\mathrm{pol}}(r, \theta) = F(r\cos\theta,\ r\sin\theta)$$
Then the following famous theorem forms the basis for reconstructing f(x,y) from its projections.
Projection-Slice Theorem
Let P_θ(Ω) be the 1-D Fourier transform of p_θ(v). Then
$$P_\theta(\Omega) = F_{\mathrm{pol}}(\Omega, \theta)$$
So, the Fourier transform of a projection is a radial slice of the 2-D Fourier transform of f(x,y) at angle θ:
[Figure: the radial slice at angle θ in the (Ω₁, Ω₂) plane]
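As a quick numerical check, the discrete analogue of the theorem for the simplest slice (θ = 0, with the integral replaced by a sum along one axis) can be verified in a few lines; the 64 × 64 random array below is only a stand-in for samples of f(x,y).

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((64, 64))     # stand-in for samples of the cross-section f(x,y)

proj = f.sum(axis=0)                  # "projection": sum (integrate) along one axis
P = np.fft.fft(proj)                  # 1-D DFT of the projection
slice_of_F = np.fft.fft2(f)[0, :]     # the corresponding slice of the 2-D DFT

print(np.allclose(P, slice_of_F))     # True
```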
Collecting sampled projections at many angles (usually hundreds) and taking the DFT (via FFT) of each projection gives samples of F(Ω₁, Ω₂) on a polar grid:
[Figure: polar sampling grid in the (Ω₁, Ω₂) plane]
To reconstruct samples of f(x,y) we might try discretization of the inverse 2-D Fourier transform.
We had:
$$f(x,y) = \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(\Omega_1, \Omega_2)\, e^{\,j(\Omega_1 x + \Omega_2 y)}\, d\Omega_1\, d\Omega_2$$
Writing this integral in polar coordinates, and then discretizing, would give an approximate formula for f(nT, mT) in terms of the available polar samples of F(Ω₁, Ω₂). Computing N² samples of f(x,y) from N² samples of F would require N² · N² = N⁴ multiply-adds (MAs), which is excessive. For example, if N = 512, this approach would require about 64 × 10⁹ MAs.
A faster alternative would be to
1) Interpolate the polar Fourier data to a Cartesian grid.
2) Compute a 2-D FFT⁻¹ (requires ~ 2N² log₂N MAs).
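A minimal sketch of steps 1) and 2), assuming the polar Fourier samples arrive as flat arrays of radii, angles, and complex values; the grid size N, the frequency extent omega_max, the choice of linear interpolation, and all names are illustrative assumptions rather than part of the notes.

```python
import numpy as np
from scipy.interpolate import griddata

def reconstruct_from_polar(r, theta, F_pol, N, omega_max):
    # Step 1) interpolate the scattered polar samples onto an N x N Cartesian
    # (Omega_1, Omega_2) grid (real and imaginary parts handled separately).
    pts = np.column_stack((r * np.cos(theta), r * np.sin(theta)))
    w = np.linspace(-omega_max, omega_max, N)
    W1, W2 = np.meshgrid(w, w)
    F_re = griddata(pts, F_pol.real, (W1, W2), method='linear', fill_value=0.0)
    F_im = griddata(pts, F_pol.imag, (W1, W2), method='linear', fill_value=0.0)
    F_cart = F_re + 1j * F_im

    # Step 2) inverse 2-D FFT; the grid above is centered at zero frequency,
    # so shift it to FFT ordering first and shift the image back afterwards.
    return np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(F_cart)))
```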
A 2-D DFT is implemented by a series of row FFTs, followed by a series of column FFTs:
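The row/column decomposition is easy to confirm numerically (the array contents here are arbitrary):

```python
import numpy as np

A = np.random.default_rng(1).standard_normal((8, 8))
row_then_col = np.fft.fft(np.fft.fft(A, axis=1), axis=0)   # row FFTs, then column FFTs
print(np.allclose(row_then_col, np.fft.fft2(A)))           # True
```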
In practice, an accurate implementation of Step 1) requires more computation than the 2-D FFT⁻¹. The most popular image reconstruction algorithm for computer tomography is convolution-back-projection (also called filtered back-projection), which is essentially an accurate and efficient way to accomplish 1) and 2) in O(N³) MAs. Researchers are now working on convolution-back-projection algorithms that require only O(N² log₂N) MAs.
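The notes do not spell the algorithm out, but a minimal sketch of convolution-back-projection, under common simplifying assumptions (frequency-domain ramp filter, nearest-neighbor interpolation during back-projection, square output image), might look like the following; the sinogram layout and all names are assumptions.

```python
import numpy as np

def fbp(sinogram, thetas):
    """Convolution-back-projection sketch.
    sinogram: (num_angles, num_detectors) array, one projection p_theta(v) per row.
    thetas:   viewing angles in radians, one per row of the sinogram."""
    num_angles, nd = sinogram.shape

    # "Convolution" step: weight each projection by the ramp |Omega| in the
    # frequency domain (this weighting comes from the polar-coordinate Jacobian
    # in the inverse Fourier integral).
    ramp = np.abs(np.fft.fftfreq(nd))
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))

    # Back-projection step: smear each filtered projection back across the
    # image along its viewing angle and accumulate the results.
    c = np.arange(nd) - nd / 2.0
    X, Y = np.meshgrid(c, c)
    recon = np.zeros((nd, nd))
    for p, th in zip(filtered, thetas):
        v = X * np.cos(th) + Y * np.sin(th)              # detector coordinate of each pixel
        idx = np.clip(np.rint(v + nd / 2.0).astype(int), 0, nd - 1)
        recon += p[idx]                                  # nearest-neighbor back-projection
    return recon * np.pi / num_angles
```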
Application 2: Synthetic Aperture Radar (SAR)
SAR is a high-resolution microwave imaging system. It is used widely in applications such as
earth resources monitoring, military reconnaissance, planetary imaging, etc.
It has the usual advantages of microwave imaging over optical imaging: it can penetrate fog, cloud cover, the atmosphere of Venus, etc., and it does not rely on illumination by the sun.
Disadvantage: It is hard to achieve optical resolution. Why? Because, although we can get high
resolution in range via delay measurements, the cross-range resolution is seemingly limited by
the antenna beamwidth, which can be very wide at microwave frequencies:
[Figure: antenna of diameter D at range R from the scene, illuminating a footprint of width W]
If D is the antenna diameter, R is the range to the scene, and λ is the wavelength of the radiation, then the width of the antenna footprint is
$$W = \frac{\lambda R}{D}$$
For high cross-range resolution, we want W to be small, but this is hard to achieve: R may be large and dictated by the imaging scenario, and we hope to use an antenna of practical size (D small). For microwaves, λ is much larger than for visible light. So, with microwaves, D may need to be impractically large to achieve a desired cross-range resolution.
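As a hypothetical numerical illustration (these values are assumptions, not numbers from the notes):

```python
wavelength, R, D = 0.03, 10e3, 1.0      # assumed: 3 cm wavelength, 10 km range, 1 m antenna
W = wavelength * R / D                  # footprint width from W = lambda * R / D
print(W)                                # 300.0, i.e., a 300 m footprint: very coarse cross-range resolution
```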
Solution:
Use small D with large W, but collect and process data from many angles. This is called
spotlight-mode SAR.
Imaging geometry
[Figure: radar moving along a flight path, viewing a ground patch with microwave reflectivity f(x,y) from many angles]
If a linear FM waveform cos(ω₀t + αt²) is transmitted, it can be shown that demodulated, sampled returns provide Fourier data on a polar grid:
[Figure: annular polar region of Fourier data in the (Ω₁, Ω₂) plane]
The return collected at angle θ in the spatial domain gives Fourier data on the radial trace at the same angle in the Fourier domain. (Proof of this fact uses the projection-slice theorem.) The inner and outer radii of the Fourier data region are proportional to the lowest and highest frequencies, respectively, of the transmitted linear FM signal.
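For reference, a sampled version of such a chirp is easy to generate; every parameter below is an assumed illustrative value, not a system specification from the notes.

```python
import numpy as np

w0 = 2 * np.pi * 1.0e6          # carrier, rad/s (assumed)
alpha = 2 * np.pi * 2.0e9       # chirp rate, rad/s^2 (assumed)
T, fs = 1.0e-3, 20.0e6          # pulse length (s) and sampling rate (Hz) (assumed)

t = np.arange(0.0, T, 1.0 / fs)
chirp = np.cos(w0 * t + alpha * t ** 2)

# The instantaneous frequency (w0 + 2*alpha*t)/(2*pi) sweeps linearly from
# 1 MHz to 5 MHz over the pulse; the lowest and highest frequencies set the
# inner and outer radii of the polar Fourier data region described above.
```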
Reconstruction algorithm:
1) Polar-to-Cartesian interpolation
2) 2-D FFT⁻¹
3) Display magnitude of the result.
Typical resolution: from 1 ft or less up to 20 m, depending on the application.
These resolutions are achievable at exceedingly long ranges (e.g., space-based monitoring of the
earth).
Application 3: Speech Analysis/Synthesis
Consider an approach to speech coding called LPC (Linear Predictive Coding).
This scheme is used in many speech communication systems, automated answering systems, and electronic games.
[Figure: analog speech s_a(t) is sampled to give the sequence s_n]
Speech samples are highly correlated, so s_n can often be fairly well predicted from its past values.
Suppose we wish to predict s_N from s_{N-1}, s_{N-2}, …, s_{N-K}. Try linear prediction.
Estimate s_N by
$$\hat{s}_N = \sum_{k=1}^{K} a_k\, s_{N-k}$$
The {a_k} are called LPC coefficients.
Choose the {a_k} to minimize E{(s_N − ŝ_N)²}.
So, do this:
$$\min_{\{a_i\}_{i=1}^{K}}\ E\left\{\left(s_N - \sum_{k=1}^{K} a_k s_{N-k}\right)^{2}\right\}$$
Setting the partial derivatives to zero,
$$\frac{\partial}{\partial a_i}\, E\left\{\left(s_N - \sum_{k=1}^{K} a_k s_{N-k}\right)^{2}\right\} = 0 \qquad i = 1, \ldots, K$$
$$E\left\{2\left(s_N - \sum_{k=1}^{K} a_k s_{N-k}\right)\left(-\,s_{N-i}\right)\right\} = 0 \qquad i = 1, \ldots, K$$
$$\sum_{k=1}^{K} a_k\, E\{s_{N-k}\, s_{N-i}\} = E\{s_N\, s_{N-i}\} \qquad i = 1, 2, \ldots, K \qquad (1)$$
Suppose {s_n} is short-term wide-sense stationary. Then E{s_m s_n} depends only on the separation between m and n, i.e., on |m − n|, not on m and n individually.
In this case we can write E{s_n s_m} as some function R_s(n − m) = R_s(m − n), where R_s is called the autocorrelation.
Substituting R_s into (1) gives
$$\sum_{k=1}^{K} a_k\, R_s(i-k) = R_s(i) \qquad i = 1, 2, \ldots, K$$
This set of K equations can be expressed in matrix form as
$$\begin{bmatrix}
R_s(0) & R_s(1) & \cdots & R_s(K-1) \\
R_s(1) & R_s(0) & \ddots & \vdots \\
\vdots & \ddots & \ddots & R_s(1) \\
R_s(K-1) & \cdots & R_s(1) & R_s(0)
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_K \end{bmatrix}
=
\begin{bmatrix} R_s(1) \\ R_s(2) \\ \vdots \\ R_s(K) \end{bmatrix} \qquad (2)$$
Given the R_s(i), this set of equations can be solved for the optimal {a_k}, k = 1, …, K.
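Since the matrix in (2) is symmetric Toeplitz, its first column determines it completely, and a library Toeplitz solver can be applied directly. One possible sketch using SciPy (the array Rs and the helper name are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def lpc_from_autocorr(Rs):
    """Rs[i] holds R_s(i) for i = 0, ..., K; returns the LPC coefficients a_1, ..., a_K."""
    Rs = np.asarray(Rs, dtype=float)
    K = len(Rs) - 1
    # First column of the Toeplitz matrix in (2) is R_s(0), ..., R_s(K-1);
    # the right-hand side is R_s(1), ..., R_s(K).
    return solve_toeplitz(Rs[:K], Rs[1:K + 1])
```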
We might approximate R_s(i) as
$$R_s(i) = \frac{1}{L_i} \sum_{n} s_n\, s_{n+i}$$
where L_i is the number of terms in the sum.
The solution of (2) would ordinarily require O(K³) operations.
But the matrix has a special Toeplitz structure, so faster algorithms exist. The Levinson-Durbin algorithm requires O(K²) operations.
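A minimal sketch of the Levinson-Durbin recursion, paired with the autocorrelation estimate above; the function names are illustrative, and this is an unoptimized rendering of the O(K²) idea rather than production code.

```python
import numpy as np

def autocorr(s, K):
    # R_s(i) ~ (1/L_i) * sum_n s_n s_{n+i}, for i = 0, ..., K
    L = len(s)
    return np.array([np.dot(s[:L - i], s[i:]) / (L - i) for i in range(K + 1)])

def levinson_durbin(R):
    # Solve the Toeplitz normal equations (2) in O(K^2) operations.
    K = len(R) - 1
    a = np.zeros(K + 1)            # a[0] is unused padding; a[1..K] are the LPC coefficients
    E = R[0]                       # prediction-error power
    for i in range(1, K + 1):
        k = (R[i] - np.dot(a[1:i], R[i - 1:0:-1])) / E     # reflection coefficient
        a_new = a.copy()
        a_new[i] = k
        a_new[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a, E = a_new, (1.0 - k * k) * E
    return a[1:], E                # LPC coefficients a_1..a_K and final error power

# e.g., for a speech frame s (1-D numpy array) and predictor order K:
#   a_k, err = levinson_durbin(autocorr(s, K))
```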
Now, look at speech!
Classification of speech segments:
[Figure: a voiced segment (a, e, etc.) of s_n is nearly periodic, with a visible pitch period; an unvoiced segment (s, f, etc.) of s_n looks like noise]
Now, suppose we have the optimal {a_k} and we look at the prediction error e_n = s_n − ŝ_n.
It turns out that for voiced sounds e_n is well approximated by a pulse train:
[Figure: e_n ≈ G × (train of unit pulses spaced one pitch period apart)]
where G is a slowly varying gain.
For unvoiced sounds, e_n looks like noise:
[Figure: e_n for an unvoiced segment resembles a noise sequence]
Using our definition of e_n and the LPC model, we have
$$s_n = \hat{s}_n + e_n = \sum_{k=1}^{K} a_k\, s_{n-k} + e_n$$
Thus, we can obtain s_n from e_n via
[Block diagram: e_n → H(z) → s_n]
where
$$H(z) = \frac{1}{1 - \sum_{k=1}^{K} a_k\, z^{-k}}$$
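Since H(z) is an all-pole filter, this step is a single IIR filtering operation. A minimal sketch, assuming the error sequence e and the coefficients a_1, …, a_K are available as arrays (names are illustrative):

```python
import numpy as np
from scipy.signal import lfilter

def lpc_synthesize(e, a_k):
    # Denominator of H(z) is 1 - a_1 z^{-1} - ... - a_K z^{-K}
    den = np.concatenate(([1.0], -np.asarray(a_k, dtype=float)))
    return lfilter([1.0], den, e)   # implements s_n = sum_k a_k s_{n-k} + e_n
```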
Standard Speech Model for Analysis/Synthesis:
[Block diagram: a V/UV decision selects between a unit-pulse generator (driven by the pitch period) and a noise generator; the chosen excitation is scaled by the gain G and passed through the time-varying filter H(z) to produce s_n]
This model provides the basis for a speech analysis/synthesis scheme:
1) Analyze each 20 msec segment of the speech waveform to get:
a) V/UV decision
b) Pitch period (if voiced)
c) {a_k}, k = 1, …, K (by solving the equations in (2))
d) Gain G
Transmit a) through d) every 20 msec. At the receiver, reconstruct an approximation to the original speech waveform by using the above model.
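A minimal sketch of the receiver side for one frame, following the model above; the frame length, pitch period, gain, coefficients, and all names are assumed illustrative values, not specifications from the notes.

```python
import numpy as np
from scipy.signal import lfilter

def synthesize_frame(voiced, pitch_period, G, a_k, frame_len, rng=np.random.default_rng(0)):
    # Excitation: a unit-pulse train for voiced frames, white noise for unvoiced frames.
    if voiced:
        e = np.zeros(frame_len)
        e[::pitch_period] = 1.0
    else:
        e = rng.standard_normal(frame_len)
    # Apply the gain, then the all-pole filter H(z) = 1 / (1 - sum_k a_k z^{-k}).
    den = np.concatenate(([1.0], -np.asarray(a_k, dtype=float)))
    return lfilter([1.0], den, G * e)

# e.g., one voiced 20 msec frame at an assumed 8 kHz rate (160 samples) with a
# 100 Hz pitch (period of 80 samples) and a stable 2nd-order filter:
frame = synthesize_frame(True, pitch_period=80, G=0.1, a_k=[1.2, -0.5], frame_len=160)
```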
Comparison with PCM
PCM: Sample speech at ~8 kHz with 7 bits/sample ⇒ 56 kbits/sec.
Analysis/Synthesis (assuming a fancier version of LPC than we just covered):
8 kbits/sec: very close to regular telephone quality
2 kbits/sec: very understandable, somewhat machine-like
600 bits/sec: understandable, quite machine-like
Thus, we see that the LPC scheme can greatly reduce the bit rate for both transmission and
storage of speech.