Two novel features of Chapter 1 are the converse of Taylor's formula and Danskin's
theorem. The
first result validates the role of Taylor's formula for computing derivatives, and Danskin's theorem is a useful tool in optimization.
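For orientation, here is one common statement of Danskin's theorem (a standard formulation; the chapter's version may differ in generality). If Z is compact and \phi(x, z) is continuous and continuously differentiable in x, then the max-function f(x) = \max_{z \in Z} \phi(x, z) has a directional derivative in every direction d, given by

    f'(x; d) = \max_{z \in Z(x)} \langle \nabla_x \phi(x, z), d \rangle,  where  Z(x) = \{ z \in Z : \phi(x, z) = f(x) \}.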
Chapter 2 develops the first-order and second-order optimality conditions
in unconstrained optimization. Section 2.4 deals with quadratic forms and
symmetric matrices. We recall the spectral decomposition of a symmetric matrix,
give the eigenvalue characterizations of definite and semidefinite matrices,
state Descartes's exact rule of sign (whose proof is given in Appendix B), and
use it as a tool for recognizing definite and semidefinite matrices. We also include
a proof of Sylvester's theorem on the positive definiteness of a symmetric
matrix. (An elegant optimization-based proof is given in an exercise at the end
of the chapter.) In Section 2.5, we give the proofs of the inverse and implicit
function theorems and Lyusternik's theorem using an optimization-based approach
going back at least to Carathéodory. A proof of Morse's lemma is given
in Section 2.6 because of the light it throws on the second-order optimality
conditions.
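To illustrate how Descartes's exact rule of sign recognizes definiteness (a small example of ours, not from the text): the symmetric matrix A with rows (2, 1) and (1, 2) has characteristic polynomial \det(\lambda I - A) = \lambda^2 - 4\lambda + 3, whose coefficients have the sign pattern (+, -, +), hence two sign changes. Because all eigenvalues of a symmetric matrix are real, the rule is exact: the polynomial has exactly two positive roots (here \lambda = 1 and \lambda = 3), so A is positive definite.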
Chapter 3 is devoted to Ekeland's ε-variational principle (and its relatives)
and its applications. We use it to prove the central result on linear inequalities
(Motzkin's transposition theorem), and the basic theorems of nonlinear
analysis in a general setting. Variational principles are fascinating, and their
importance in optimization is likely to grow even more in the future.
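For orientation, one standard formulation of Ekeland's principle (the chapter's version may differ in details): if (X, d) is a complete metric space, f : X \to \mathbb{R} \cup \{+\infty\} is lower semicontinuous and bounded below, and f(x_0) \le \inf_X f + \varepsilon for some \varepsilon > 0, then for every \lambda > 0 there exists a point x_\varepsilon satisfying

    f(x_\varepsilon) \le f(x_0),   d(x_\varepsilon, x_0) \le \lambda,   and   f(x) > f(x_\varepsilon) - (\varepsilon/\lambda)\, d(x, x_\varepsilon)  for all x \ne x_\varepsilon.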
The next three chapters are devoted to convexity. Chapter 4 treats the
fundamentals of convex analysis. We include Section 4.1 on affine geometry
because of its intrinsic importance, and because it helps make certain results
in convexity more transparent.
Chapter 5 delves into the structure of convex sets. A proper understanding
of concepts such as the relative interior, closure, and the faces of convex sets
is essential for proving separation theorems involving convex sets and much
else. The concept of the relative interior is developed in both algebraic and
topological settings.
Chapter 6 is devoted to the separation of convex sets, the essential source
of duality, at least in convex programming. The chapter is divided into two
parts. Sections 6.1–6.5 deal with the separation theorems in finite dimensions
and do not depend heavily on Chapter 5. They are sufficient for a reader interested only in the finite-dimensional situation. Section 6.5 is devoted
to the finite-dimensional version of the Dubovitskii–Milyutin theorem, a
convenient separation theorem, applicable to the separation of several convex
sets. Sections 6.6{6.8 treat the separation theorems involving two or several
convex sets in a very general setting. Chapter 5 is a prerequisite for these
sections, which are intended for more advanced readers.
Chapters 7 and 8 treat the theories of convex polyhedra and linear programming,
respectively. Two sections, Section 7.5 on Tucker's complementarity
theorem and Section 8.5 on the existence of strictly complementary
solutions in linear programming, are important in interior-point methods.
Chapters 9 and 10 treat nonlinear programming. The standard, basic theory
consisting of first-order (Fritz John and KKT) and second-order conditions
for optimality is given in Chapter 9. A novel feature of the chapter is the
inclusion of a first-order sufficient optimality condition that goes back to Fritz
John, and several completely solved examples of nonlinear programs. Chapter
10 gives complete solutions for seven structured optimization problems.
These problems are chosen for their intrinsic importance and to demonstrate
that optimization techniques can resolve important problems.
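For reference (a standard form, with inequality constraints only; the book also treats equality constraints): for the problem \min f(x) subject to g_i(x) \le 0, i = 1, \dots, m, the KKT conditions at a local minimizer x^* satisfying a constraint qualification assert the existence of multipliers \lambda_i \ge 0 with

    \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x^*) = 0   and   \lambda_i\, g_i(x^*) = 0 for i = 1, \dots, m.

The Fritz John conditions attach an additional multiplier \lambda_0 \ge 0 to \nabla f(x^*), require that not all multipliers vanish, and need no constraint qualification.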
Chapter 11 deals with duality theory. We have chosen to treat duality using
the Lagrangian function. This approach is fully general for convex programming, because it is equivalent to the Fenchel duality approach in that context, and it is more general because it is sometimes applicable beyond convex programming. We establish the general correspondence between saddle points and duality in Section 11.2 and apply it to nonlinear programming
in Section 11.3. The most important result of the chapter is the strong duality
theorem for convex programming given in Section 11.4, under very weak
conditions. It is necessary to use sophisticated separation theorems to achieve
this result. After treating several examples of duality in Section 11.5, we turn
to the duality theory of conic programming in Section 11.6. As a novel application,
we give a proof of Hoffman's lemma using duality in Section 11.8.
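In outline (a standard setup, not the chapter's exact development): for the problem \min f(x) subject to g_i(x) \le 0, the Lagrangian is L(x, \lambda) = f(x) + \sum_i \lambda_i g_i(x) with \lambda \ge 0, and the dual problem is \max_{\lambda \ge 0} \inf_x L(x, \lambda). Weak duality, \sup_{\lambda \ge 0} \inf_x L \le \inf_x \sup_{\lambda \ge 0} L, always holds; the strong duality theorem of Section 11.4 gives weak conditions under which the two values coincide and the dual value is attained.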
Semi-infinite programming is the topic of Chapter 12. This subject is not commonly included in optimization textbooks, but many important problems in finite dimensions require it, such as the problem of finding the extremal-volume ellipsoids associated with a convex body in R^n. We derive
the Fritz John optimality conditions for these problems using Danskin's
theorem when the set indexing the constraints is compact. In the rest of the
chapter we solve several specific, important semi-infinite programming problems rather than giving a systematic theory. Another method to treat convex semi-infinite programs, using Helly's theorem, is given in Section 13.2.
Chapter 13 is devoted to several special topics in convexity that we deem
interesting: the combinatorial theory of convex sets, homogeneous convex
functions, decomposition of convex cones, and norms of polynomials. The last topic finds an interesting application to self-concordant functions in interior-point methods.
The focus of Chapter 14 is on algorithms. The development of numerical
algorithms for optimization problems is a highly intricate art and science,
and anything close to a proper treatment would require several volumes. This
chapter is included in our book out of the conviction that there should be a
place in a book on theory for a chapter such as this, which treats in some
depth a few select algorithms. This should help the reader put the theory
in perspective, and accomplish at least three goals: the reader should see
how theory and algorithms fit together, how they are different, and whether there are differences in the thought processes that go into developing each
part. It should also give the reader additional incentive to learn more about
algorithms.
We choose to treat three fundamental optimization algorithms: the steepest-descent (and gradient projection), Newton's, and conjugate-gradient methods.
We develop each in some depth and provide convergence rate estimates where
possible. For example, we provide the convergence rate for the steepest-descent
method for the minimization of a convex quadratic function, and for the
minimization
of a convex function with Lipschitz continuous gradient. The convergence theory of Newton's method is treated, including the theory due to Kantorovich.
Finally, we give a very extensive treatment of the conjugate-gradient
method. We prove its remarkable convergence properties and show its connection
with orthogonal polynomials.
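To give the flavor of the simplest of these methods, here is a minimal sketch in Python (ours, not the book's) of steepest descent with exact line search on a convex quadratic f(x) = (1/2) x^T A x - b^T x with A symmetric positive definite; the exact step along the negative gradient r = b - Ax is t = \|r\|^2 / (r^T A r).

    import numpy as np

    def steepest_descent(A, b, x0, tol=1e-10, max_iter=1000):
        # Minimize f(x) = 0.5 x^T A x - b^T x for symmetric positive definite A;
        # the minimizer solves A x = b. Illustrative sketch, not the book's code.
        x = x0.astype(float)
        for _ in range(max_iter):
            r = b - A @ x                # negative gradient of f at x
            if np.linalg.norm(r) < tol:  # stop once the gradient is small
                break
            t = (r @ r) / (r @ (A @ r))  # exact minimizer of f along x + t*r
            x = x + t * r
        return x

    # Example: the iterates converge to the solution of A x = b.
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    print(steepest_descent(A, b, np.zeros(2)))  # approximately [0.2, 0.4]

The classical rate mentioned above states that the error in the A-norm contracts by at least the factor (\kappa - 1)/(\kappa + 1) per iteration, where \kappa is the condition number of A.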
In Appendix A, we give the theory for the consistency of a system of finitely many linear (both strict and weak) inequalities in arbitrary vector spaces. The algebraic proof has considerable merits: it is very general, needs no prerequisites, and does not use the completeness of the field over which the vector space is defined. Consequently, it is applicable to linear inequalities with rational coefficients.
In Appendix B, we give a short proof of Descartes's exact rule of sign,
and in Appendix C, the classical proofs of the open mapping theorem and
Graves's theorem.