Convolution: 8.1. Linear Translation Invariant (LTI or LSI) Operators

Download as pdf or txt
Download as pdf or txt
You are on page 1of 9

8.

Convolution

8.1. Linear Translation Invariant (LTI or LSI) Operators

An extremely important class of operators is the class of linear translation invariant or linear
shift invariant operators. These are operators T that satisfy the following three conditions:

(i) T is “bounded” or “continuous” on some space somehow.

(ii) T is linear:
T (af + bg) = aT (f ) + bT (g) (8.1)
(iii) T is translation invariant:

T (τa (f )) = τa (T (f )) (8.2)

The model that we want to think of is that T is a black box – a processor – into which we feed
a function, or signal, f. The output is the processed signal T (f ). We’ll be very loose about what we
mean by condition (i); intuitively it means that a “small” change in the input signal should result
in a “small” change in the output. We are being deliberately vague for now about what we mean
by “small”. Another physical name for (ii) is that our operator or processor obeys a principle of
superposition. The intuitive meaning of property (iii) is this: the system respond in the same way
to the same input, regardless of what the clock on the wall says. If we feed a certain song into the
amplifier, we get a certain response from the speakers. If we feed the same song into the amplifier,
only three hours later, we get the same response from the speakers – three hours later.
Since the concept is so basic, let us recognize for the purpose of this course the standard
abbreviation “LTI operator”.
Some examples of LTI operators: the zero operator, the identity operator, any translation
operator τa ; multiplication by a constant, differentiation; any linear constant coefficient differential
operator; the “inverse” of a linear constant coefficient differential operator, the partial sum operators
SM defined in (5.12). We’ll have some others to consider, called filters, that do various things to
various frequencies of the input signal; the partial sum operators are examples of low-pass filters.
Does the Fourier transform interact with LTI operators in a predictable way? The answer is a
resounding yes. What follows is going to be an extremely formal argument. To put even the barest
hint of rigor into it requires either the language of generalized functions or a whole lot of functional
analysis.
Suppose that we have an LTI operator T, and suppose that the set of functions T is capable
of acting on includes all of the simple harmonic motions. Just for this argument, let us adopt a
temporary notation: for each frequency s, the function es is defined as es (x) = e2πisx . An important
identity is the one we obtain by letting a translation act on es .

τa es (x) = es (x + a) = e2πis(x+a) = e2πisa e2πisx = e2πisa es (x) (8.3)

How does our LTI operator act on a simple harmonic motion? We test it by applying transla-
tions.
T es (x + a) = τa T es (x) = T τa es (x) = T e2πisa es (x) = e2πisa T es (x)

(8.4)
Let x = 0 in equation (8.4):
T es (a) = e2πisa T es (0) (8.5)

57
Now replace a in (8.5) by x :

T es (x) = e2πisx T es (0) = T es (0)es (x) (8.6)

We have discovered that an LTI operator sends a simple harmonic motion to a constant multiple
of itself. What about arbitrary functions f (x)? We use the synthesis equation:
Z ∞
f (x) = fb(s)es (x) ds (8.7)
−∞

Hit both sides of (8.7) with the operator T. Use the analogy of sums to integrals and the linearity
of the operator to bring T inside the integral. (Yes, this is a very formal derivation.)
Z ∞ Z ∞
T f (x) = f (s)T es (x) ds =
b fb(s)T es (0)es (x) ds (8.8)
−∞ −∞

T es (0) is a number that depends only on s. Let’s give it a new name: T es (0) = mT (s).
Z ∞
T f (x) = fb(s)mT (s)e2πisx ds (8.9)
−∞

If we read (8.9) as being another synthesis equation, then we can extract the following from
equation (8.9).
Tcf (s) = mT (s)fb(s) (8.10)
A paraphrase of equation (8.10) goes like this: every LTI operator is given by point-by-point
multiplication on the Fourier transform side. The function mT (s) is called the multiplier associated
with the operator T. (Engineers might call mT (s) the transfer function.)
We wrote (8.10) as if we were talking about R; however, the same identity is true on each
of the other three domains. An LTI opererator acting on functions performs the same action as
multiplying the Fourier transforms by some fixed function.
For a different perspective on what this means, consider LTI operators on functions on PN .
A function on PN is an N -vector and the set of all such functions is an N -dimensional vector
space. Any linear operator (linear transformation) from this vector space into itself is given by the
multiplication of the N -vector by an N ×N matrix. The set of LTI operators is then some (possibly
fairly small) proper subset of the set of all N × N matrices. (In fact, these matrices have the special
property that they are constant on all “wrapped-around” diagonals in the same direction as the
main diagonal, and they have a special name: circulant matrices.) Let T be some LTI operator.
Then there is a function mT [k] on PN such that for all x ∈ {functions on PN } :

Tcx[k] = mT [k]b
x[k] (8.11)

b to be N × 1 column
Now let’s look at this from a matrix point of view. Consider both x and x
vectors, and let the N × N matrix of the operator T also called T. Let F be the matrix whose
(j, k)th element is ζ jk = e2πijk/N . Let F∗ be the conjugate transpose of F, which is to say the
matrix whose (j, k)th element is ζ −jk = e−2πijk/N . (This is the same matrix defined in equation
(4.11).) The work in chapter 4 shows that
1 ∗
F F = I, where I is the identity matrix. (8.12)
N

58
1
The analysis equation is xb = F∗ x, and the synthesis equation is x = Fbx. Equation (8.11) can
N
1
be rewritten as F∗ Tx equals a vector obtained by multiplying the kth component of the vector
N
1 ∗
F x by mT [k]. But the act of each component of a vector by a different number can be written
N
as multiplication of that vector by a diagonal matrix. Let D = DT be the diagonal matrix obtained
by making the kth entry on the diagonal mT [k] and making all other entries of that matrix zero.
Then we have the following further rewrite of (8.11):
 
1 ∗ 1 ∗
F Tx = D F x
N N
Multiply both sides of this on the left by F.
   
1 ∗ 1 ∗ 1
Tx = FD F x or T = FD F or F∗ TF = D (8.13)
N N N
To put a name to what we have here, the Fourier transform diagonalizes all LTI operators. The
numbers mT [k] are the eigenvalues of T.

8.2. Introduction to convolution

From looking at LTI operators, we develop an interest in the following: Suppose we have
functions f (x) and g(x). We calculate their Fourier transforms fb(s) and gb(s), multiply these two
transforms together and assume that the result is the Fourier transform b h(s) of some function h(x),
then use Fourier inversion (the synthesis integral) to recover h. That is,

h(s) = fb(s)b
b g (s);
Z ∞ Z ∞
2πisx
Then, h(x) = h(s)e
b ds = g (s)e2πisx ds
fb(s)b
−∞ −∞
Z ∞ Z ∞ 
−2πisu
= fb(s) g(u)e du e2πisx ds
−∞
Z ∞ Z−∞∞ 
2πis(x−u)
= g(u) f (s)e
b ds du
−∞ −∞
Z ∞
= f (x − u)g(u)du (8.14)
−∞
The (binary) operation that combines two functions given in (8.14) is called the convolution of
the two functions. Let’s define what we mean by convolution in each of our four domains:

Convolution on R :
Z ∞
Convolution: f ∗ g(x) = f (x − u)g(u) du (8.15)
−∞

Fourier transform: ∗ g (s) = fb(s)b


f[ g (s) (8.16)
Convolution on Z :

X
Convolution: f ∗ g[n] = f [n − m]g[m] (8.17)
m=−∞

59
Fourier transform: ∗ g [k] = q fb[k] gb [k]
f[ (8.18)
Convolution on TP :
Z P
1
Convolution: f ∗ g(x) = f (mod(x − u, p))g(u) du (8.19)
P 0

Fourier transform: ∗ g [k] = fb[k] gb [k]


f[ (8.20)
1
Warning! We had a notational choice: should we include the in equation (8.19) or not?
P
Leaving it out would be consistent with all of the other equations in this section, and indeed that
is the choice that Kammler would make, but I have another agenda here which argues in favor of
the notation I chose. We’ll just have to watch our step.

Convolution on PN :
N
X −1
Convolution: f ∗ g[n] = f [mod(n − m, N )]g[m] (8.21)
m=0

Fourier transform: ∗ g[k] = N fb[k]b


f[ g [k] (8.22)
Convolution on one side yields multiplication on the other side. Many things in this subject
are symmetrical, so multiplication on one side should yield convolution on the other side. Without
further ado, let’s write down the formulas:

On R : h(s) = (fb ∗ gb)(s)


If h(x) = f (x)g(x), then b (8.23)
On TP : If h(x) = f (x)g(x), then bh[k] = (fb ∗ gb)[k] (8.24)
Z q
On Z : If h[n] = f [n]g[n], then h(s) = q(f ∗ gb)(s) =
b b fb(s − u)b
g (u) du (8.25)
0

On PN : h[k] = (fb ∗ gb)[k]


If h[n] = f [n]g[n], then b (8.26)
Our funny notational choice in (8.19) came back to haunt us on (8.25), introducing what looks
like an extraneous multiplication by q into the formula. We’ll just have to live with it.

8.3. Convolution as a multiplication

Convolution is a binary operation on functions. It takes as input two functions and deliv-
ers as output another function. As a binary operation, it has many of the characteristics of a
multiplication. Let’s look at what some of those properties are.

• Convolution is commutative.

To see this, write out the integral in (8.15), then make the change of variable y = x − u, from
which u = x − y. By Calculus I notation, du = −dy, but the domain of integration gets flipped
around, balancing out the minus sign. If we follow the discussion before equation (7.30), we will
just write du = dy, with the same result.
Z ∞ Z ∞
f ∗ g(x) = f (x − u)g(u) du = f (y)g(x − y) dy = g ∗ f (x) (8.27)
−∞ −∞

60
Very similar arguments could be made in each of the other three domains.

• Convolution is associative.

This means that f ∗ (g ∗ h) = (f ∗ g) ∗ h. The proof of this identity is left to the reader. It’s a
lot of work, but the work is straightforward. You do what you have to do.

• Convolution is bilinear.

This means that f ∗ (ag + bh) = af ∗ g + bf ∗ h and (af + bg) ∗ h = af ∗ h + bg ∗ h.

Eventually, some of this is going to lead to the topic of “commutative Banach algebras”. There
is one multiplication-related point I’m still hiding from you: I haven’t told you whether there is or
isn’t an identity element for convolution.

8.4. A practical guide for convolving piecewise-defined functions

The trickiest part of dealing with convolution is figuring out what to do when one or both of
the functions being convolved is piecewise defined: e.g., if x is less than a certain number, f (x) is
defined by a certain formula, if x is greater or equal to that number but less than another number,
f (x) is defined by another formula, and so forth. The following steps might help.

0. Don’t forget that convolution is commutative - that is, (f ∗ g)(x) = (g ∗ f )(x). Try writing
the integral in (8.15) both ways, so that you can pick the way that looks easier.

1. Inspect the definitions of both f (x) and g(x) to determine for each function the best interval
that describes where the function
 is anything but zero: outside this interval the function is

 1 − x if 0 ≤ x < 2
zero. For instance, if f (x) = x if 3 ≤ x < 4 then that interval is [0, 4]. (We include the

0 otherwise

endpoints as a matter of routine and convention: it’s frequently not particularly important
to us exactly what the definition of the function is at any one point.) The interval could
easily be of the form [a, ∞) or (−∞, b]. You have two such intervals for the two functions;
call them [a, b] and [c, d]. Then the convolution of the two functions will “live” on the interval
[a + c, b + d] in the sense that (f ∗ g)(x) will be 0 for x < a + c and for x > b + d, and we
don’t need to do anything else about computing (f ∗ g)(x) “out there”.

2. For each of f (x) and g(x) write out the list of the “change points” – all of the points at which
the formula that defines the function changes. For instance, the function f (x) listed in step
1 has as its change points {0, 2, 3, 4}. If either function has any absolute values symbols in
its formulas, list all of the places at which the thing inside the absolute value equals zero as
change points.
( For instance, if g(x) = |x2 − 4|, then the change points of g are {−2, 2} and if
1 − |x| if |x| < 1
/\(x) = , then the change points of /\ are {−1, 0, 1}.
0 otherwise

3. You now have two lists – the change points of f and the change points of g. Find the list of
all of the numbers that you can obtain by adding together any element of the first list with
any element of the second list. The resulting list is the list of the change points of (f ∗ g)(x).
For example, suppose the change points of f (x) are {0, 2, 3, 4} and the change points of g(x)
are {−1, 0, 1}. Then the change points of (f ∗ g)(x) are {−1, 0, 1, 2, 3, 4, 5}. With 3 points on

61
one list and 4 points on the other, there could have been as many as 3 · 4 = 12 points on the
list of sums, but in this case some of these sums were duplicated. (Actually, a function with 7
change points is one that could take 8 lines to describe – we hope that most of our problems
aren’t quite that complicated!)

4. Proceed on a case-by-case basis. For each interval between adjacent change points of (f ∗g)(x)
(or outside, if outside matters ), assume that x belongs to that interval. Write out the integral:
Z ∞
(f ∗ g)(x) = f (x − u)g(u) du
−∞

Break this integral down into as many integrals over separate intervals as there are non-zero
pieces of the definition of g(x). For each such separate integral, determine which parts of the
definition of f (x) apply within that interval. For that purpose, you need to know the answers
to questions of this form: for what values of u is x − u in the interval [a, b]? The answer is
for x − b ≤ u ≤ x − a. This allows you to subdivide your integral even further into integrals
whose lower limit of integration may be of the form x − b or whose upper limit of integration
may be of the form x − a. Add up these pieces of integrals to get the value of (f ∗ g)(x) on
the particular interval for x that you are working on; repeat for each of the other intervals.

5. Sketch the graphs of f (x), g(x), and (f ∗ g)(x). Expect that (f ∗ g)(x) will be the smoothest
of the bunch; in particular, expect that under ordinary circumstances, (f ∗ g)(x) will be
continuous even if f (x) and g(x) are not.

Let’s illustrate this with two examples:


( (
1 if 1 ≤ x ≤ 3 1 if 1 ≤ x ≤ 2
Example 8.4a: Let f (x) = and g(x) = .
0 otherwise 0 otherwise
By step 1, we see that f (x) is non-zero on [1, 3] and g(x) is non-zero on [1, 2] so that (f ∗ g)(x)
should be nonzero on (at most) [1, 3] + [1, 2] = [2, 5]. By step 2, we see that our lists of change
points are {1, 3} and {1, 2} so that by step 3 the list of change points of (f ∗ g)(x) is {2, 3, 4, 5}. We
have already decided that for values of x that are “outside” – which is to say less than 2 or greater
than 5 – then (f ∗ g)(x) is zero. This leaves us three cases for values of x.

First case: Assume 2 ≤ x ≤ 3. (We’re going to be sloppy about whether these should be “<” or
“≤”; it really won’t matter to the integral calculations that follow.) We must compute
Z ∞ Z 2
(f ∗ g)(x) = f (x − u)g(u) du = f (x − u) du
−∞ 1

But f (x − u) is equal to 1 if x − u in the interval [1, 3], which is to say when x − 3 ≤ u ≤ x − 1, and
is zero otherwise. Since x is between 2 and 3, x − 3 is between −1 and 0 and thus doesn’t matter
when we are integrating from 1 to 2, but x − 1 is between 1 and 2 and matters very much. f is 1
to the left of that point and 0 otherwise, so we have
Z x−1
(f ∗ g)(x) = 1 du = (x − 1) − 1 = x − 2.
1

Second case: Assume 3 < x < 4. Now x − 3 is between 0 and 1 and x − 1 is between 2 and 3:
neither one of these change points matters when we are integrating from 1 to 2. In fact, f (x − u)

62
is equal to 1 on the whole of that interval and we have
Z 2
(f ∗ g)(x) = 1 du = 2 − 1 = 1.
1

Third case: Assume 4 < x < 5. Now x − 3 is between 1 and 2 and matters: f is equal to 1 for
values of u to the right of x − 3. x − 1 is between 3 and 4 and does not matter.
Z 2
(f ∗ g)(x) = 1 du = 2 − (x − 3) = 5 − x.
x−3

Overall, we have found that





x − 2 if 2 ≤ x ≤ 3

1 if 3 ≤ x ≤ 4
(f ∗ g)(x) =


5 − x if 4 ≤ x ≤ 5

0 otherwise

The graph looks like this (note that (f ∗ g)(x) is continuous, even though neither f (x) nor g(x) is):

f ∗ g(x)

( (
ex if x ≤ 0 e−x if x ≥ 0
Example 8.4b: Let f (x) = and g(x) = .
0 if x > 0 0 if x < 0

The interval on which (f ∗ g)(x) can be non-zero is (−∞, 0] + [0, ∞) = (−∞, ∞), so we have to
consider all possible values of x. The only change point of f is 0 and the only change point of g is
also 0, so the only change point of (f ∗ g)(x) is 0 and we have two cases to consider. In any case,
we must compute this integral:
Z ∞ Z ∞
(f ∗ g)(x) = f (x − u)g(u) du = f (x − u)e−u du
−∞ 0

First case: Assume that x < 0. Then f (x − u) = ex−u if x − ule0, which is to say if u ≥ x. Since
x < 0, this holds for all of the interval [0, ∞) upon which we are integrating.
∞ |inf ty
1 x −2u ∞ 1 x
Z Z
x−u −u x −2u
(f ∗ g)(x) = e e du = e e du = − e e = e
0 0 2 0 2

63
Second case: Assume that x ≥ 0. Then f (x − u) = ex−u if u ≥ x, but is zero otherwise. Since x
lies inside the interval [0, ∞), this matters.

1 x −2u ∞ 1 x −2x 1 −x
Z ∞ Z ∞
x−u −u x −2u
(f ∗ g)(x) = e e du = e e du = − e e = e e = e
x x 2 x 2 2

1
In this case, we can combine the two formulas into one line as (f ∗ g)(x) = e−|x|. Once again, this
2
is a continuous function.

8.5. The Wave Equation on the Line

We consider the one-dimensional wave equation for the infinite line:

∂2u 2
2∂ u ∂u
2
= c 2
for − ∞ < x < ∞, 0 < t < ∞; u(x, 0) = f (x), (x, 0) = g(x). (8.28)
∂t ∂x ∂t
Here c is a physical constant that has the dimensions of a velocity. We’ll see that s
c is the speed
T
of wave propagation. For a string under tension T with mass per unit length ρ, c = .
ρ
We take the Fourier transform of the equation with respect to x, the space variablel arriving at

∂2 ∂
u) = −4π 2 c2 s2 u
(b b; u
b (s, 0) = fb(s); u
b(s, 0) = gb(s) (8.29)
∂t2 ∂t
This is now an ordinary differential equation in t. In particular, it is the ODE of simple harmonic
motion of angular frequency 2πcs, that is of frequency cs. The solution is

u
b(s, t) = A(s) cos(2πcst) + B(s) sin(2πcst) (8.30)

Considering the initial conditions in (8.29) leads us to saying that A(s) = fb(s) and that B(s) =
gb(s)
. We now write our answer as u(x, t) = u1 (x, t) + u2 (x, t) where
2πcs
sin(2πcst)
u
c1 (s, t) = fb(s) cos(2πcst) and u
c2 (s, t) = gb(s) (8.31)
2πcs
We deal with these terms one at a time.
1 1
c1 (s, t) = fb(s) cos(2πcst) = fb(s)e2πicst + fb(s)e−2πicst
u
2 2
By the translation formula (7.3), we get that
1 1
u1 (x, t) = f (x + ct) + f (x − ct) (8.32)
2 2
Formula (8.32) is quite striking. It is the classic solution due to DAlambert in which the
(stationary) initial position is divided in two with half of it sent sliding to the right at speed c
without changing shape and the other half sliding to the left at the same speed.
Now we deal with the part that depends on the initial velocity. By the second part of (8.31), u c2
is the product of gb and another function of s, which by (8.16) makes u2 the result of the convolution
sin(2πcst)
of g with the function whose Fourier transform is = t sinc(2cts). We know that sinc(s)
2πcs

64
is the Fourier transform of u(x). The dilation rule (7.31) tells us that the function whose Fourier
t  x  1  x 
transform is t sinc(2cts), is u = u . Explicitly writing out the convolution, we
2ct 2ct 2c 2ct
have
1 x+ct
Z
u2 (x, t) = g(y) dy (8.33)
2c x−ct
Our final answer for u is simply the sum of the functions in (8.32) and (8.33).

8.6. Exercises

8.1. On R, define the operator T f (x) = 1 + x2 f (x).




(a) Show that T is a linear operator.


(b) Show by appropriate example that T is not translation invariant.

∗ g(s). Verify that


8.2. Using the functions defined in Example 8.4a, compute fb(s), gb(s) and f[
identity (8.16) is true in this case.

∗ g(s). Verify that


8.3. Using the functions defined in Example 8.4b, compute fb(s), gb(s) and f[
identity (8.16) is true in this case.
∂2u ∂2u
8.4. Consider the wave equation on the whole line with c = 1 : = , with the initial
∂t2 ∂x2
conditions being that u(x, 0) = 0 (zero intial position) and
(
∂u 1 for − 1 ≤ x ≤ 1
(x, 0) = .
∂t 0 otherwise

(This simulates a string being struck a sudden blow by a broad, blunt hammer.)
1 3
Find the solution u(x, t) and draw sketches of the graph for t = , 1, , 2, 3, and 5.
2 2

65

You might also like