Inner Product
1. Preliminaries
An inner product space is a vector space $V$ along with a function $\langle \cdot, \cdot \rangle$ called an inner product, which associates each pair of vectors $u, v$ with a scalar $\langle u, v \rangle$, and which satisfies:
(1) $\langle u, u \rangle \ge 0$, with equality if and only if $u = 0$;
(2) $\langle u, v \rangle = \langle v, u \rangle$;
(3) $\langle \alpha u + v, w \rangle = \alpha \langle u, w \rangle + \langle v, w \rangle$.
Combining (2) and (3), we also have $\langle u, \alpha v + w \rangle = \alpha \langle u, v \rangle + \langle u, w \rangle$. Condition (1) is called positive definite, condition (2) is called symmetric, and condition (3) together with the note above is called bilinear. Thus an inner product is an example of a positive definite, symmetric bilinear function or form on the vector space $V$.
Definition 1.0.1. Let $V$ be an inner product space and $u$ and $v$ be vectors in $V$. We make the following definitions:
(1) The norm (or length) of $u$ is $\|u\| = \sqrt{\langle u, u \rangle}$.
(2) The distance between $u$ and $v$ is $\|u - v\|$.
(3) The vectors $u$ and $v$ are orthogonal if $\langle u, v \rangle = 0$.
(4) The angle $\theta$ between $u$ and $v$ is given by $\cos \theta = \dfrac{\langle u, v \rangle}{\|u\| \cdot \|v\|}$.
(5) The orthogonal projection of $u$ onto $v$ is the vector $p = \dfrac{\langle u, v \rangle}{\langle v, v \rangle}\, v$.
Note that, a priori, we do not know that $-1 \le \dfrac{\langle u, v \rangle}{\|u\| \cdot \|v\|} \le 1$, so $\theta$ may not be defined. Later we will show that these bounds are valid and so our definition of $\theta$ is also valid. When referring to (5), we will usually say "the projection of $u$ onto $v$".
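As a quick illustration (a minimal sketch, assuming the standard dot product on $\mathbb{R}^3$, which is introduced in Section 2.1), the quantities of Definition 1.0.1 can be computed numerically:

```python
import numpy as np

# Norm, distance, angle, and projection from Definition 1.0.1,
# using the dot product on R^3 as the inner product.
u = np.array([1.0, 2.0, 2.0])
v = np.array([3.0, 0.0, 4.0])

norm_u = np.sqrt(np.dot(u, u))                    # ||u|| = sqrt(<u, u>)
dist_uv = np.sqrt(np.dot(u - v, u - v))           # distance ||u - v||
cos_theta = np.dot(u, v) / (norm_u * np.sqrt(np.dot(v, v)))
theta = np.arccos(cos_theta)                      # angle between u and v
p = (np.dot(u, v) / np.dot(v, v)) * v             # projection of u onto v

print(norm_u, dist_uv, theta, p)
```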
Now we give some preliminary results:
Theorem 1.0.2 (Pythagorean Theorem). Suppose that $u$ and $v$ are orthogonal vectors. Then
\[
\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + 2\langle u, v \rangle + \langle v, v \rangle = \langle u, u \rangle + \langle v, v \rangle = \|u\|^2 + \|v\|^2,
\]
since $\langle u, v \rangle = 0$.
Theorem 1.0.3. Suppose that $p$ is the orthogonal projection of $u$ onto the space spanned by $v$. Then $p$ and $u - p$ are orthogonal.
Proof: Since $p = \dfrac{\langle u, v \rangle}{\langle v, v \rangle}\, v$, we have
\[
\langle p, u - p \rangle = \langle p, u \rangle - \langle p, p \rangle
= \frac{\langle u, v \rangle}{\langle v, v \rangle}\langle v, u \rangle - \frac{\langle u, v \rangle^2}{\langle v, v \rangle^2}\langle v, v \rangle
= \frac{\langle u, v \rangle^2}{\langle v, v \rangle} - \frac{\langle u, v \rangle^2}{\langle v, v \rangle}
= 0. \qquad \Box
\]
Theorem 1.0.4 (Cauchy-Schwarz Inequality). Suppose that $u$ and $v$ are vectors in an inner product space. Then
\[
|\langle u, v \rangle| \le \|u\| \cdot \|v\|.
\]
Proof: If $v = 0$ both sides are $0$, so assume $v \ne 0$ and let $p$ be the orthogonal projection of $u$ onto $v$. From the previous result, $p$ and $u - p$ are orthogonal. We may then apply the Pythagorean Theorem to get
\[
\|p\|^2 + \|u - p\|^2 = \|u\|^2.
\]
Subtracting $\|u - p\|^2$ from both sides and noting $0 \le \|u - p\|^2$ gives
\[
\frac{\langle u, v \rangle^2}{\langle v, v \rangle} = \|p\|^2 \le \|u\|^2,
\]
so $\langle u, v \rangle^2 \le \|u\|^2 \|v\|^2$, and taking square roots gives the result. $\Box$
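A hedged numerical spot-check of the last three results (an illustration, not a substitute for the proofs), using random vectors under the dot product:

```python
import numpy as np

# Check that p and u - p are orthogonal, that the Pythagorean identity holds
# for that decomposition, and that Cauchy-Schwarz holds.
rng = np.random.default_rng(0)
u, v = rng.standard_normal(5), rng.standard_normal(5)

p = (np.dot(u, v) / np.dot(v, v)) * v       # projection of u onto v
assert abs(np.dot(p, u - p)) < 1e-9         # p and u - p are orthogonal
# Pythagorean Theorem: ||p||^2 + ||u - p||^2 = ||u||^2
assert abs(np.dot(p, p) + np.dot(u - p, u - p) - np.dot(u, u)) < 1e-9
# Cauchy-Schwarz: |<u, v>| <= ||u|| ||v||
assert abs(np.dot(u, v)) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-9
```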
2.1. Example: $\mathbb{R}^n$. Just as $\mathbb{R}^n$ is our template for a real vector space, it serves in the same way as the archetypical inner product space. The usual inner product on $\mathbb{R}^n$ is called the dot product or scalar product on $\mathbb{R}^n$. It is defined by:
\[
\langle x, y \rangle = x^T y
\]
That is, if $x = (x_1, \ldots, x_n)^T$ and $y = (y_1, \ldots, y_n)^T$, then
\[
\langle x, y \rangle = x^T y = (x_1, \ldots, x_n) \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = x_1 y_1 + \cdots + x_n y_n.
\]
****PROOF OF DOT PRODUCT BEING INNER PRODUCT GOES HERE****
****GEOMETRIC PROOF OF ORTHOGONAL PROJECTIONS GOES HERE****
****SPECIFIC EXAMPLE GOES HERE****
In this setting, the orthogonal projection of $x$ onto $y$ is given by:
\[
p = \left( \frac{x^T y}{y^T y} \right) y
\]
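The following sketch spot-checks the inner product conditions for the dot product and the projection formula above on random vectors; it is an illustration, not the omitted proof:

```python
import numpy as np

# Numerical spot-check (not a proof) that <x, y> = x^T y satisfies the three
# inner product conditions, plus the projection formula above.
rng = np.random.default_rng(1)
x, y, z = rng.standard_normal(4), rng.standard_normal(4), rng.standard_normal(4)
a = 2.5

assert x @ x > 0                                         # positive definite (x != 0)
assert np.isclose(x @ y, y @ x)                          # symmetric
assert np.isclose((a * x + z) @ y, a * (x @ y) + z @ y)  # linear in the first slot

p = (x @ y) / (y @ y) * y                                # projection of x onto y
assert np.isclose((x - p) @ y, 0.0)                      # x - p is orthogonal to y
```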
More generally, we can define other inner products on $\mathbb{R}^n$ by
\[
\langle x, y \rangle = x^T A y
\]
where the right-hand side is just matrix multiplication. The $n \times n$ matrix $A$ must be a type of matrix known as a symmetric, positive definite matrix in order for this to satisfy the conditions of an inner product. For example, if we choose $A$ to be a symmetric matrix with positive entries on the main diagonal which is strictly diagonally dominant (each diagonal entry exceeds the sum of the absolute values of the other entries in its row), then it will be such a matrix. (More generally, a symmetric, positive definite matrix is a symmetric matrix with only positive eigenvalues.)
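A small sketch of such a non-standard inner product, with an assumed symmetric, strictly diagonally dominant matrix $A$:

```python
import numpy as np

# A non-standard inner product on R^2 given by <x, y> = x^T A y. The matrix A
# below is symmetric and strictly diagonally dominant with a positive diagonal,
# hence positive definite (all of its eigenvalues are positive).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
assert np.all(np.linalg.eigvalsh(A) > 0)   # confirm positive definiteness

def inner(x, y, A=A):
    """Inner product <x, y> = x^T A y."""
    return x @ A @ y

x = np.array([1.0, -2.0])
y = np.array([3.0,  0.5])
print(inner(x, y), inner(y, x))            # equal, by symmetry of A
print(inner(x, x))                         # positive, since x != 0
```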
2.2. Example: $\mathbb{R}^{m \times n}$. Suppose $A = (a_{ij})$ and $B = (b_{ij})$ are matrices in $\mathbb{R}^{m \times n}$. The usual inner product on $\mathbb{R}^{m \times n}$ is given by:
\[
\langle A, B \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij}
\]
Note that this is the sum of the point-wise products of the elements of $A$ and $B$.
****PROOF OF THIS PRODUCT BEING INNER PRODUCT GOES HERE****
****SPECIFIC EXAMPLE GOES HERE****
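A minimal sketch of this inner product for two assumed $2 \times 3$ matrices (it also checks the identity $\langle A, B \rangle = \operatorname{trace}(A^T B)$, a standard equivalent form):

```python
import numpy as np

# Inner product on R^{m x n}: the sum of the entrywise products of A and B.
A = np.array([[1.0,  2.0, 0.0],
              [3.0, -1.0, 4.0]])
B = np.array([[2.0,  0.0, 1.0],
              [1.0,  5.0, -2.0]])

inner_AB = np.sum(A * B)                   # sum_{i,j} a_ij * b_ij
assert np.isclose(inner_AB, np.trace(A.T @ B))
print(inner_AB)
```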
2.3. Example: $P_n$. Here we will describe a type of inner product on $P_n$ which we will term a discrete inner product on $P_n$. Let $\{x_1, \ldots, x_n\}$ be distinct real numbers. If $p(x)$ is a nonzero polynomial in $P_n$, then it has degree at most $(n-1)$, and therefore has at most $(n-1)$ roots. This means that $p(x_i) \ne 0$ for at least one $i$. We now define an inner product by:
\[
\langle p(x), q(x) \rangle = \sum_{i=1}^{n} p(x_i) q(x_i)
\]
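A small sketch of this discrete inner product, with assumed points $x_1 = -1$, $x_2 = 0$, $x_3 = 1$ and polynomials of degree at most 2:

```python
import numpy as np

# Discrete inner product on P_3: <p, q> = sum_i p(x_i) q(x_i).
xs = np.array([-1.0, 0.0, 1.0])

def inner(p, q, xs=xs):
    """p and q are coefficient arrays (highest degree first), as np.polyval expects."""
    return np.sum(np.polyval(p, xs) * np.polyval(q, xs))

p = np.array([1.0, 0.0, -1.0])   # p(x) = x^2 - 1
q = np.array([0.0, 1.0, 0.0])    # q(x) = x

print(inner(p, q))               # 0 here: p and q are orthogonal for these points
print(inner(p, p))               # positive, since p is nonzero at x_2 = 0
```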
2.4. Example: $C[a, b]$. The usual type of inner product on $C[a, b]$ is given by:
\[
\langle f(x), g(x) \rangle = \int_a^b f(x) g(x) w(x) \, dx
\]
where $w(x)$ is some continuous, positive real-valued function on $[a, b]$. The standard inner product on $C[a, b]$ is the one where $w(x) = 1$ in the above definition.
****PROOF OF THIS PRODUCT BEING INNER PRODUCT GOES HERE****
****SPECIFIC EXAMPLE GOES HERE****
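A numerical sketch of the standard inner product on $C[a, b]$ (with the assumed choices $[a, b] = [-1, 1]$ and $w(x) = 1$), approximating the integral by a simple Riemann sum:

```python
import numpy as np

# Standard inner product on C[-1, 1], with the integral approximated numerically.
def inner(f, g, a=-1.0, b=1.0, num=100000):
    xs = np.linspace(a, b, num, endpoint=False)
    dx = (b - a) / num
    return np.sum(f(xs) * g(xs)) * dx

print(inner(lambda x: x, lambda x: x**2))   # ~0: the integrand x^3 is odd on [-1, 1]
print(inner(lambda x: x, lambda x: x))      # ~2/3, the squared norm of x on [-1, 1]
```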
Definition 3.0.1. Let $U$ and $V$ be subspaces of a vector space $W$ such that $U \cap V = \{0\}$. The direct sum of $U$ and $V$ is the set $U \oplus V = \{u + v \mid u \in U \text{ and } v \in V\}$.
Definition 3.0.2. Let $S$ be a subspace of the inner product space $V$. Then the orthogonal complement of $S$ is the set $S^{\perp} = \{v \in V \mid \langle v, s \rangle = 0 \text{ for all } s \in S\}$.
Theorem 3.0.3.
(1) If $U$ and $V$ are subspaces of a vector space $W$ with $U \cap V = \{0\}$, then $U \oplus V$ is also a subspace of $W$.
(2) If $S$ is a subspace of the inner product space $V$, then $S^{\perp}$ is also a subspace of $V$.
Proof: (1.) Note that $0 + 0 = 0$ is in $U \oplus V$. Now suppose $w_1, w_2 \in U \oplus V$; then $w_1 = u_1 + v_1$ and $w_2 = u_2 + v_2$ with $u_i \in U$ and $v_i \in V$, and $w_1 + w_2 = (u_1 + v_1) + (u_2 + v_2) = (u_1 + u_2) + (v_1 + v_2)$. Since $U$ and $V$ are subspaces, it follows that $w_1 + w_2 \in U \oplus V$. Suppose now that $\alpha$ is a scalar; then $\alpha w_1 = \alpha(u_1 + v_1) = \alpha u_1 + \alpha v_1$. As above, it then follows that $\alpha w_1 \in U \oplus V$. Thus $U \oplus V$ is a subspace of $W$.
For (2.), first note that $0 \in S^{\perp}$. Now suppose that $v_1$ and $v_2 \in S^{\perp}$. Then $\langle v_1, s \rangle = \langle v_2, s \rangle = 0$ for all $s \in S$. So $\langle v_1 + v_2, s \rangle = \langle v_1, s \rangle + \langle v_2, s \rangle = 0 + 0 = 0$ for all $s \in S$. Thus $v_1 + v_2 \in S^{\perp}$. Similarly, if $\alpha$ is a scalar, then $\langle \alpha v_1, s \rangle = \alpha \langle v_1, s \rangle = \alpha \cdot 0 = 0$ for all $s \in S$. Thus $S^{\perp}$ is a subspace of $V$. $\Box$
Theorem 3.0.4. If $U$ and $V$ are subspaces of $W$ with $U \cap V = \{0\}$ and $w \in U \oplus V$, then $w = u + v$ for unique $u \in U$ and $v \in V$.
Definition 3.0.5. Let $S$ be a subspace of the inner product space $V$ and let $\{x_1, \ldots, x_n\}$ be a basis for $S$ such that $\langle x_i, x_j \rangle = 0$ if $i \ne j$; then this basis is called an orthogonal basis. Furthermore, if $\langle x_i, x_i \rangle = 1$ for each $i$, then this basis is called an orthonormal basis.
Definition 3.0.6. Let $S$ be a finite dimensional subspace of the inner product space $V$ and let $\{x_1, \ldots, x_n\}$ be an orthogonal basis for $S$. If $v$ is any vector in $V$, then the orthogonal projection of $v$ onto $S$ is the vector:
\[
p = \sum_{i=1}^{n} \frac{\langle v, x_i \rangle}{\langle x_i, x_i \rangle}\, x_i
\]
Note that if $\{x_1, \ldots, x_n\}$ is an orthonormal basis, then we have the simpler expression:
\[
p = \sum_{i=1}^{n} \langle v, x_i \rangle\, x_i
\]
Also, in the special case where $S$ is spanned by the single vector $x_1$, $p$ is just the usual orthogonal projection of $v$ onto $S$, which is the line spanned by $x_1$.
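A short sketch of this projection in $\mathbb{R}^3$ with the dot product, for an assumed orthogonal basis $\{x_1, x_2\}$ of the $xy$-plane:

```python
import numpy as np

# Orthogonal projection of v onto S = span{x1, x2}, per Definition 3.0.6.
x1 = np.array([1.0,  1.0, 0.0])
x2 = np.array([1.0, -1.0, 0.0])
assert np.isclose(x1 @ x2, 0.0)            # the basis is orthogonal

v = np.array([2.0, 3.0, 5.0])
p = sum((v @ xi) / (xi @ xi) * xi for xi in (x1, x2))

print(p)                                    # the projection lies in the xy-plane
assert np.isclose((v - p) @ x1, 0.0) and np.isclose((v - p) @ x2, 0.0)
```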
Now we can prove the main theorem of this section:
Theorem 3.0.7. Let $S$ be a finite dimensional subspace of the inner product space $V$ and $v$ be some vector in $V$. Moreover, let $\{x_1, \ldots, x_n\}$ be an orthogonal basis for $S$ and $p$ be the orthogonal projection of $v$ onto $S$. Then
(1) $v - p \in S^{\perp}$.
(2) $V = S \oplus S^{\perp}$.
(3) If $y$ is any vector in $S$ with $y \ne p$, then $\|v - p\| < \|v - y\|$.
Note that part (3.) says that $p$ is the vector in $S$ which is closest to $v$. Moreover, an immediate consequence of (2.) is that the orthogonal projection $p$ of $v$ onto $S$ is independent of the choice of orthogonal basis for $S$.
Proof: (1.) We need to show that $p$ and $v - p$ are orthogonal. So consider $\langle p, v - p \rangle = \langle p, v \rangle - \langle p, p \rangle$. Write $p = \sum_{i=1}^{n} c_i x_i$ with $c_i = \dfrac{\langle v, x_i \rangle}{\langle x_i, x_i \rangle}$. Note that $\langle x_i, x_j \rangle = 0$ when $i \ne j$, so that
\[
\langle p, v \rangle = \sum_{i=1}^{n} c_i \langle x_i, v \rangle = \sum_{i=1}^{n} \frac{\langle v, x_i \rangle}{\langle x_i, x_i \rangle}\langle x_i, v \rangle
\quad \text{and} \quad
\langle p, p \rangle = \sum_{i=1}^{n} c_i^2 \langle x_i, x_i \rangle = \sum_{i=1}^{n} \frac{\langle v, x_i \rangle^2}{\langle x_i, x_i \rangle^2}\langle x_i, x_i \rangle.
\]
Thus
\[
\langle p, v \rangle = \sum_{i=1}^{n} \frac{\langle v, x_i \rangle^2}{\langle x_i, x_i \rangle}
\quad \text{and} \quad
\langle p, p \rangle = \sum_{i=1}^{n} \frac{\langle v, x_i \rangle^2}{\langle x_i, x_i \rangle}
\]
and the result follows for this part; indeed, the same computation with $x_j$ in place of $p$ shows $\langle v - p, x_j \rangle = \langle v, x_j \rangle - c_j \langle x_j, x_j \rangle = 0$ for each $j$, so $v - p \in S^{\perp}$. Now let $v$ be any vector in $V$; then $v = p + (v - p)$. Note that $p \in S$ and, from (1.), $v - p \in S^{\perp}$, and $S \cap S^{\perp} = \{0\}$. Therefore we must have $V = S \oplus S^{\perp}$, proving (2.). For part (3.), let $y$ be some vector in $S$ with $y \ne p$. Then $\|v - y\| = \|(v - p) + (p - y)\|$. Since $p - y \in S$ and $v - p \in S^{\perp}$ by (1.), we have
\[
\|v - p\|^2 + \|p - y\|^2 = \|v - y\|^2.
\]
Since $y \ne p$, $\|p - y\| \ne 0$. Therefore $\|v - p\|^2 < \|v - y\|^2$ and $\|v - p\| < \|v - y\|$. $\Box$
Note that by (3.) of the above theorem, if v is actually in S, then p = v.
Definition 3.0.8. Let $S$ be a subspace of the inner product space $V$, $v$ be a vector in $V$ and $p$ be the orthogonal projection of $v$ onto $S$. Then $p$ is called the least squares approximation of $v$ (in $S$) and the vector $r = v - p$ is called the residual vector of $v$.
4. Least squares in $\mathbb{R}^n$
In this section we consider the following situation: Suppose that $A$ is an $m \times n$ real matrix with $m > n$. If $b$ is a vector in $\mathbb{R}^m$, then the matrix equation $Ax = b$ corresponds to an overdetermined linear system. Generally such a system does not have a solution; however, we would like to find an $\hat{x}$ such that $A\hat{x}$ is as close to $b$ as possible. In this case $A\hat{x}$ is the least squares approximation to $b$ and we refer to $\hat{x}$ as the least squares solution to this system. Recall that if $r = b - Ax$, then $r$ is the residual of this system. Moreover, our goal is then to find an $x$ which minimizes $\|r\|$.
Before we continue, we mention a result without proof:
Theorem 4.0.9. Suppose that $A$ is a real matrix. Then $\operatorname{Col}(A)^{\perp} = N(A^T)$ and $\operatorname{Col}(A^T)^{\perp} = N(A)$.
We will use the results of the previous section to find $\hat{x}$, or more precisely $A\hat{x}$. Given $b$, there is a unique vector $p$ in $\operatorname{Col}(A)$ such that $\|b - p\|$ is minimal, by Theorem 3.0.7. Moreover, by the same theorem, $b - p \in \operatorname{Col}(A)^{\perp} = N(A^T)$. Thus:
\[
A^T (b - p) = 0 \implies A^T b - A^T p = 0 \implies A^T p = A^T b
\]
However, $p = A\hat{x}$ for some vector $\hat{x}$ (note: $\hat{x}$ is not necessarily unique, but $A\hat{x}$ is). So
\[
A^T p = A^T b \implies A^T A \hat{x} = A^T b,
\]
and thus $\hat{x}$ is a solution of the system
\[
A^T A \hat{x} = A^T b,
\]
which is necessarily consistent. Note that in this case we did not need to know an orthogonal basis for $\operatorname{Col}(A)$. This is because we never explicitly calculate $p$.
Another general fact about $A$ in this case is that the rank of $A$ is generally $n$. That is, the columns of $A$ will usually be linearly independent. We have the following theorem, which gives us an additional way to solve for $\hat{x}$ in this situation:
Theorem 4.0.10. Suppose that $A$ is an $m \times n$ matrix with rank $n$. Then $A^T A$ is invertible.
Proof: Clearly, $N(A)$ is a subset of $N(A^T A)$. We now wish to show that $N(A^T A)$ is a subset of $N(A)$; this would establish that $N(A) = N(A^T A)$. Let $x \in N(A^T A)$; then $(A^T A)x = 0 \implies A^T (Ax) = 0 \implies Ax \in N(A^T)$. Note also that $Ax \in \operatorname{Col}(A)$, so that $Ax \in N(A^T) \cap \operatorname{Col}(A)$. Since $\operatorname{Col}(A)^{\perp} = N(A^T)$, this forces $Ax = 0$; thus $x \in N(A)$ and $N(A^T A) = N(A)$. By the rank-nullity theorem we see that the rank of $A^T A$ is the same as the rank of $A$, which is assumed to be $n$. As $A^T A$ is an $n \times n$ matrix, it must be invertible. $\Box$
Thus, when $A$ has rank $n$, $A^T A$ is invertible, and
\[
\hat{x} = (A^T A)^{-1} A^T b.
\]
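A hedged sketch of this recipe: solve the system $A^T A \hat{x} = A^T b$ directly (assuming $A$ has full column rank) and compare against numpy's built-in least squares solver. The matrix and vector below are arbitrary illustrative choices:

```python
import numpy as np

def least_squares(A, b):
    """Least squares solution via A^T A x = A^T b (assumes full column rank)."""
    return np.linalg.solve(A.T @ A, A.T @ b)

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 0.0, 2.0])

x_hat = least_squares(A, b)
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)   # reference solution
assert np.allclose(x_hat, x_ref)
print(x_hat)
```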
Example 1: Consider the linear system:
\[
\begin{aligned}
x_1 + x_2 &= 10 \\
2x_1 + x_2 &= 5 \\
x_1 - 2x_2 &= 20
\end{aligned}
\]
This system is overdetermined and inconsistent. We would like to find the least squares approximation to $b$ and the least squares solution $\hat{x}$ to this system. We can rewrite this linear system as a matrix system $Ax = b$ where:
\[
A = \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 1 & -2 \end{pmatrix}
\quad \text{and} \quad
b = \begin{pmatrix} 10 \\ 5 \\ 20 \end{pmatrix}
\]
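A quick numerical computation of $\hat{x}$ and the residual for this example (a sketch via the normal equations):

```python
import numpy as np

# Least squares solution of Example 1 via the normal equations.
A = np.array([[1.0,  1.0],
              [2.0,  1.0],
              [1.0, -2.0]])
b = np.array([10.0, 5.0, 20.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)
residual = b - A @ x_hat
print(x_hat, np.linalg.norm(residual))
```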
Example 2: Suppose some system is modeled by a quadratic function $f(x)$, so that $f(x) = ax^2 + bx + c$. Experimental data is recorded in the form $(x, f(x))$ with the following results:
\[
\begin{array}{c|cccc}
x & 1 & 2 & 3 & 4 \\
\hline
f(x) & 1 & 10 & 9 & 16
\end{array}
\]
We would like to find the best approximation for $f(x)$. Using these data points, we see that:
\[
\begin{aligned}
a(1) + b(1) + c &= 1 \\
a(4) + b(2) + c &= 10 \\
a(9) + b(3) + c &= 9 \\
a(16) + b(4) + c &= 16
\end{aligned}
\]
This corresponds to the matrix equation $Ax = b$ where:
\[
A = \begin{pmatrix} 1 & 1 & 1 \\ 4 & 2 & 1 \\ 9 & 3 & 1 \\ 16 & 4 & 1 \end{pmatrix}
\quad \text{and} \quad
b = \begin{pmatrix} 1 \\ 10 \\ 9 \\ 16 \end{pmatrix}
\]
As in the previous example, the matrix $A$ has full rank, hence $A^T A$ is invertible. Therefore the least squares solution to this system is:
\[
\hat{x} = (A^T A)^{-1} A^T b = \begin{pmatrix} -0.5 \\ 6.9 \\ -4.5 \end{pmatrix}
\]
Therefore $f(x)$ is approximately $-0.5x^2 + 6.9x - 4.5$.
Example 3: The orbit of a comet around the sun is either elliptical, parabolic, or hyperbolic. In particular, the orbit can be expressed by the polar equation:
\[
r = \beta - e(r \cos \theta)
\]
where $\beta$ is some positive constant and $e$ is the eccentricity. Note that the orbit is elliptical if $0 \le e < 1$, parabolic if $e = 1$, and hyperbolic if $e > 1$.
A certain comet is observed over time and the following set of data $(r, \theta)$ was recorded:
Using this data we would like to find the approximate orbital equation of this comet. Plugging these data points into the equation above gives us the linear system:
\[
\begin{aligned}
\beta - e(1.146) &= 1.2 \\
\beta - e(0.761) &= 2.1 \\
\beta - e(-3.513) &= 4.1 \\
\beta - e(-4.983) &= 6.3
\end{aligned}
\]
Once again, the matrix $A$ has full rank, so that the least squares solution is given by:
\[
\hat{x} = (A^T A)^{-1} A^T b = \begin{pmatrix} 2.242 \\ 0.718 \end{pmatrix}
\]
Therefore the orbital equation is approximately $r = 2.242 - 0.718(r \cos \theta)$. This example is similar to one of the
first applications of least squares. Gauss is credited with developing the method of least squares and applying
it to predicting the path of the asteroid Ceres. He did this to a remarkable degree of accuracy in 1801.
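A brief numerical check of this fit, using the values of $r$ and $r\cos\theta$ from the linear system above:

```python
import numpy as np

# Fit beta and e in r = beta - e*(r cos theta) by least squares.
r_cos_theta = np.array([1.146, 0.761, -3.513, -4.983])
r = np.array([1.2, 2.1, 4.1, 6.3])

A = np.column_stack([np.ones_like(r), -r_cos_theta])   # columns for beta and e
x_hat = np.linalg.solve(A.T @ A, A.T @ r)
print(x_hat)    # approximately (2.242, 0.718), matching the text
```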
5. Least squares in $C[a, b]$
In this section we work with an inner product on $C[a, b]$ of the form
\[
\langle f(x), g(x) \rangle = \int_a^b f(x) g(x) w(x) \, dx
\]
where $w(x)$ is some continuous, positive function on $[a, b]$. Suppose that we have a collection of functions $\{f_1(x), \ldots, f_n(x)\}$ which are mutually orthogonal, and let $S$ be the subspace of $C[a, b]$ that they span, so that they form an orthogonal basis for $S$.
Then, given any function $f(x)$ in $C[a, b]$, we can approximate $f(x)$ by a linear combination of the $f_i$. The best such approximation (in terms of least squares) will be given by the orthogonal projection $p(x)$ of $f(x)$ onto $S$.
The most common application of such an approximation is in Fourier Series which will be covered in the next
section.
There is an analytical consideration which has to be made, and that is how good we can make this approximation. In particular, can we enlarge $S$ in a regular way so that the limit of this process is $f(x)$? This question is beyond our scope, but the answer is yes in some cases and no in others.
6. Fourier Series
In this section we consider the function space $C[-\pi, \pi]$, and we wish to know how, given a function $f(x)$ in $C[-\pi, \pi]$, we can approximate this function using functions of the form $\cos mx$ and $\sin mx$. This is obviously useful for periodic functions. Our setup is as follows:
\[
\langle f(x), g(x) \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) g(x) \, dx
\]
and
\[
S_n = \operatorname{Span}\left( \frac{1}{\sqrt{2}}, \sin x, \ldots, \sin nx, \cos x, \ldots, \cos nx \right)
\]
We can check that the following are true:
\[
\langle 1, \cos \alpha x \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \cos \alpha x \, dx = 0
\]
\[
\langle 1, \sin \beta x \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \sin \beta x \, dx = 0
\]
\[
\langle \cos \alpha x, \cos \beta x \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \cos \alpha x \cos \beta x \, dx =
\begin{cases} 1 & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \ne \beta \end{cases}
\]
\[
\langle \sin \alpha x, \sin \beta x \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \sin \alpha x \sin \beta x \, dx =
\begin{cases} 1 & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \ne \beta \end{cases}
\]
\[
\langle \cos \alpha x, \sin \beta x \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} \cos \alpha x \sin \beta x \, dx = 0
\]
for positive integers $\alpha$ and $\beta$. Therefore
\[
\left\{ \frac{1}{\sqrt{2}}, \sin x, \ldots, \sin nx, \cos x, \ldots, \cos nx \right\}
\]
is an orthonormal basis for $S_n$.
Given a function $f(x)$ in $C[-\pi, \pi]$, the least squares approximation of $f(x)$ in $S_n$ will be
\[
\frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)
\]
where
\[
a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \, dx, \qquad
a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx \, dx, \qquad
b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx \, dx.
\]
Note that $a_k$ and $b_k$ are just the inner products of $f(x)$ with the basis vectors of $S_n$, and
\[
\frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)
\]
is the orthogonal projection of $f(x)$ onto $S_n$.
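A sketch of computing these coefficients numerically for the assumed example $f(x) = x$ (which also appears below), approximating each integral with a simple Riemann sum:

```python
import numpy as np

# Fourier coefficients a_k and b_k of f on [-pi, pi], computed numerically.
def fourier_coeffs(f, n, num=100000):
    xs = np.linspace(-np.pi, np.pi, num, endpoint=False)
    dx = 2 * np.pi / num
    a = [np.sum(f(xs) * np.cos(k * xs)) * dx / np.pi for k in range(n + 1)]
    b = [np.sum(f(xs) * np.sin(k * xs)) * dx / np.pi for k in range(1, n + 1)]
    return np.array(a), np.array(b)

a, b = fourier_coeffs(lambda x: x, n=4)
print(np.round(a, 3))   # a_0, ..., a_4 are approximately 0, since f(x) = x is odd
print(np.round(b, 3))   # b_k is close to 2*(-1)^(k+1)/k, matching the series below
```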
It is beyond our scope, but as $n \to \infty$, these approximations do converge to $f(x)$ on the interval $(-\pi, \pi)$. In particular, on that interval, for $f(x) = x$,
\[
x = \sum_{k=1}^{\infty} \frac{2}{k} (-1)^{k+1} \sin kx
\]
and
\[
\frac{x}{2} = \sum_{k=1}^{\infty} \frac{1}{k} (-1)^{k+1} \sin kx.
\]
If we take $x = \pi/2$, then we see that
\[
\frac{\pi}{4} = \sum_{k=1}^{\infty} \frac{1}{k} (-1)^{k+1} \sin \frac{k\pi}{2} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots
\]
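A quick numerical illustration of this series:

```python
import numpy as np

# Partial sums of 1 - 1/3 + 1/5 - 1/7 + ... approach pi/4 (slowly).
k = np.arange(1, 100001)
partial_sums = np.cumsum((-1) ** (k + 1) / (2 * k - 1))
print(partial_sums[-1], np.pi / 4)
```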