EC400 Optimisation and Fixed Points Course Pack
The course is based on three recorded lectures (available on Moodle), three synchronous Q&A sessions (approximately 1 hour each), and three classes (2.5 hours each). The lectures revise essential mathematical notions for static optimization and fixed point theorems. The classes show how to apply these notions to solve relevant problems.
Course Outline

Lecture 1: Tools for Optimization
Quadratic Forms
Determinant
Taylor's Expansion

Lecture 2: Optimization
Unconstrained Optimization
Lagrange Theorem
Kuhn-Tucker Theorem
Arrow-Enthoven Theorem

Lecture 3: Comparative Statics and Fixed Points
Envelope Theorem
Correspondences

Reading: Ok, E., Real Analysis with Economic Applications, Princeton University Press, 2007.

Exercises
Quadratic forms are useful and important because: (i) they are the simplest functions after linear ones; (ii) conditions for optimization techniques are stated in terms of quadratic forms; and (iii) relevant economic optimization problems have a quadratic objective function, such as risk minimization problems in finance (where riskiness is measured by the variance of the returns on investments).
Definition: A quadratic form on R^n is a real valued function
Q(x1, x2, ..., xn) = ∑_{i≤j} a_ij x_i x_j.
Among all functions of one variable, the simplest functions with a unique global extremum are the pure quadratic forms: x^2 and -x^2. The level curve of a general quadratic form in R^2 is
a11 x1^2 + a12 x1 x2 + a22 x2^2 = b;
such a curve can take the form of an ellipse, a hyperbola, a pair of lines, or the empty set. A quadratic form in R^2, a11 x1^2 + a12 x1 x2 + a22 x2^2, can always be written as
(x1 x2) [ a11 a12 ; 0 a22 ] (x1 ; x2) = x^T A x.
More generally, every quadratic form can be represented as
Q(x) = x^T A x
for a (unique) symmetric matrix A. Conversely, if A is a symmetric matrix, then the real valued function Q(x) = x^T A x is a quadratic form.
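Since the definiteness tests below are stated for symmetric matrices, it is worth seeing the symmetrization step concretely. A minimal Python sketch (added for illustration; the coefficients are arbitrary):

```python
import numpy as np

# Quadratic form Q(x) = 2*x1^2 + 3*x1*x2 + 4*x2^2 in upper-triangular form
B = np.array([[2.0, 3.0],
              [0.0, 4.0]])
A = (B + B.T) / 2        # the unique symmetric representation

x = np.array([1.5, -2.0])
print(x @ B @ x, x @ A @ x)   # identical values: 11.5 11.5
```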
Definiteness of Quadratic Forms

In one dimension, a quadratic form always takes the form ax^2. Then:
1. if a > 0, ax^2 is non-negative and equal to 0 only when x = 0, which implies that the quadratic form is positive definite and that x = 0 is a global minimizer;
2. if a < 0, ax^2 is non-positive and equal to 0 only when x = 0, which implies that the quadratic form is negative definite and that x = 0 is a global maximizer.
In two dimensions there are additional intermediate cases. The quadratic form is:
1. positive definite when it is always non-negative, and positive for all x ≠ 0, eg: x1^2 + x2^2;
2. negative definite when it is always non-positive, and negative for all x ≠ 0, eg: -(x1^2 + x2^2);
3. indefinite when it can take both positive and negative values, eg: x1^2 - x2^2;
4. positive semidefinite when it is non-negative, but equal to zero for some x ≠ 0, eg: (x1 + x2)^2;
5. negative semidefinite when it is non-positive, but equal to zero for some x ≠ 0, eg: -(x1 + x2)^2.
We apply the same terminology to the symmetric matrix A and the implied quadratic form Q(x) = x^T A x.

Definition: Let A be an (n × n) symmetric matrix. Then A is:
(a) positive definite if x^T A x > 0 for all x ≠ 0 in R^n;
(b) positive semidefinite if x^T A x ≥ 0 for all x ≠ 0 in R^n;
(c) negative definite if x^T A x < 0 for all x ≠ 0 in R^n;
(d) negative semidefinite if x^T A x ≤ 0 for all x ≠ 0 in R^n;
(e) indefinite if x^T A x > 0 for some x ≠ 0 in R^n and x^T A x < 0 for some x ≠ 0 in R^n.
Application: A function y = f(x) of one variable is concave if its second derivative f''(x) ≤ 0 on some interval. The generalization of this concept to higher dimensions states that a function is concave on some region if its second derivative matrix is negative semidefinite for all x in that region.
The Determinant of a Matrix

The determinant of a matrix is a unique scalar associated with the matrix. Rather than showing that the many usual definitions of the determinant are all the same by comparing them to each other, I'm going to state four general properties of the determinant that are enough to specify uniquely what number you should get when given a matrix. Then, you may check that all of the definitions of the determinant that you've encountered satisfy those properties. What we're really looking for is a function that takes n vectors (the n columns of the matrix) and returns a number. Consider a matrix A, and denote its ith column by Ai, so that
A = (A1, A2, ..., An).
Assume we're working with real numbers. Standard operations change the value of the determinant as follows:
1. multiplying one column by a constant multiplies the determinant by that constant (more generally, the determinant is linear in each column separately);
2. adding a multiple of one column to another column leaves the determinant unchanged;
3. switching two columns changes the sign of the determinant.
These three properties, together with the fact that the determinant of the identity matrix is one, are enough to define a unique function that takes n vectors (each of length n) as inputs and returns a real number which amounts to the determinant of the matrix given by those vectors.
This observation helps with some of the interpretations of the determinant. In particular, there's a nice geometric way to think of a determinant. Consider the unit cube in n-dimensional space (that is, the set of vectors of length n with coordinates between 0 and 1 in each spot). The determinant of the linear transformation (i.e., of the matrix) T is the signed volume of the region you get by applying T to the unit cube. Consider how that follows from the abstract definition. If you apply the identity to the unit cube, you get back the unit cube, and the volume of the unit cube is 1. If you stretch the cube by a constant factor in one direction only, the new volume is that constant. And if you stack two blocks together aligned on the same direction, their combined volume is the sum of their volumes: this all shows that the signed volume we have is linear in each coordinate when considered as a function of the input vectors. Finally, when you switch two of the vectors that define the unit cube, you flip the orientation.
With this geometric definition of the determinant, if you are familiar with multivariate calculus, you can see why determinants (Jacobians) appear when we change coordinates in integration. Hint: a derivative is a linear approximation of the associated function, so consider a "differential volume element" in your starting coordinate system. It's not too much work to check that the signed area of the parallelogram formed by the vectors (a, b) and (c, d) is det((a, b), (c, d)).
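A quick numerical check of the parallelogram claim (an illustrative sketch; the vectors are arbitrary):

```python
import numpy as np

u = (3.0, 1.0)   # vector (a, b)
v = (1.0, 2.0)   # vector (c, d)

# Signed area of the parallelogram spanned by u and v (vectors as columns)
M = np.column_stack([u, v])
print(np.linalg.det(M))   # 3*2 - 1*1 = 5
```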
Computing the Determinant of a Matrix

With more than three dimensions, standard recursive definitions can be applied for its calculation. To do so, denote by:
Mij the submatrix obtained by removing row i and column j from a matrix A.
The determinant then satisfies, for any row i, the Laplace expansion
det A = ∑_{j=1}^n (-1)^{i+j} a_ij det Mij.
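A direct implementation of this recursion, expanding along the first row (a minimal sketch in Python):

```python
def det(A):
    """Determinant by Laplace expansion along the first row; M_0j is the
    submatrix obtained by removing row 0 and column j, as in the notes."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        M_0j = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(M_0j)
    return total

print(det([[2, 3], [3, 7]]))   # 5
```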
Submatrices and Minors

A kth order principal minor of an (n × n) matrix A is the determinant of the (k × k) submatrix obtained by deleting n - k rows and the corresponding n - k columns of A; the kth order leading principal minor, Lk, retains the first k rows and columns. For instance, a (3 × 3) matrix A possesses one third order principal minor, namely det A, three second order principal minors, and three first order principal minors. Its leading principal minors are
L3 = det A,  L2 = det [ a11 a12 ; a21 a22 ],  L1 = a11.
Testing the Definiteness of a Matrix

A symmetric matrix A is:
positive definite if and only if every leading principal minor is strictly positive;
negative definite if and only if the leading principal minors alternate in sign, with the kth order leading principal minor having the same sign as (-1)^k;
positive semidefinite if and only if every principal minor is non-negative;
negative semidefinite if and only if every principal minor of odd order is non-positive and every principal minor of even order is non-negative.

When A is diagonal, the associated quadratic form is Q(x) = a1 x1^2 + ... + an xn^2. Obviously such a quadratic form will be: positive (negative) definite if and only if every ai is positive (negative); positive (negative) semidefinite if and only if every ai is non-negative (non-positive); and indefinite if there are two entries ai with opposite signs.
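The leading principal minor tests translate directly into code. A sketch (for strict definiteness only; semidefiniteness requires checking all principal minors, not just the leading ones):

```python
import numpy as np

def classify(A, tol=1e-12):
    """Test definiteness of a symmetric matrix via leading principal minors."""
    n = A.shape[0]
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    if all(m > tol for m in minors):
        return "positive definite"
    if all((-1) ** k * m > tol for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither; check all principal minors for semidefiniteness"

print(classify(np.array([[2.0, 3.0], [3.0, 7.0]])))   # positive definite
print(classify(np.array([[2.0, 4.0], [4.0, 7.0]])))   # neither
```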
Q(x1, x2) = (x1 x2) [ a b ; b c ] (x1 ; x2) = a x1^2 + 2b x1 x2 + c x2^2.
If a = 0, then Q cannot be definite since Q(1, 0) = 0. If a ≠ 0 instead, add and subtract b^2 x2^2 / a to get
Q(x1, x2) = a x1^2 + 2b x1 x2 + c x2^2 + (b^2/a) x2^2 - (b^2/a) x2^2
          = a (x1^2 + 2b x1 x2 / a + (b^2/a^2) x2^2) - (b^2/a) x2^2 + c x2^2
          = a (x1 + (b/a) x2)^2 + ((ac - b^2)/a) x2^2.
If both coefficients, a and (ac - b^2)/a, are positive, then Q will never be negative. It will equal 0 only when x1 + (b/a) x2 = 0 and x2 = 0, in other words, when x1 = 0 and x2 = 0. That is, if a > 0 and det A = ac - b^2 > 0, then Q is positive definite. Conversely, if Q is positive definite, then both a and det A = ac - b^2 are positive.
Similarly, Q is negative definite if and only if both coefficients are negative, which occurs if and only if a < 0 and ac - b^2 > 0, that is, when the leading principal minors alternate in sign.
Finally, if ac - b^2 < 0, then the two coefficients will have opposite signs and Q will be indefinite.
Consider A = [ 2 3 ; 3 7 ]. Since |A1| = 2 and |A2| = 5, A is positive definite.
Consider B = [ 2 4 ; 4 7 ]. Since |B1| = 2 and |B2| = -2, B is indefinite.
Consider Q(x) = b x1^2 + a (x2^2 + x3^2) + 2b x2 x3. Its principal submatrices are:
3rd order: [ b 0 0 ; 0 a b ; 0 b a ];  2nd order: [ b 0 ; 0 a ], [ b 0 ; 0 a ], [ a b ; b a ];  1st order: (b), (a), (a).
Q(x) is positive semidefinite if all principal minors are non-negative, which requires:
a ≥ 0,  b ≥ 0,  ab ≥ 0,  a^2 - b^2 ≥ 0,  a^2 b - b^3 ≥ 0.
For these to hold at once it must be that a ≥ 0, b ≥ 0, a ≥ b. Thus, the quadratic form is indefinite if and only if either ab < b^2 or ab < 0.
[Plot: the regions of the (a, b) plane in which Q(x) is indefinite.]
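A numerical cross-check of these regions (an illustrative sketch; for symmetric matrices the eigenvalue signs are equivalent to the principal minor tests):

```python
import numpy as np

def definiteness(a, b, tol=1e-12):
    """Classify Q(x) = b*x1^2 + a*(x2^2 + x3^2) + 2*b*x2*x3 by eigenvalues."""
    A = np.array([[b, 0., 0.],
                  [0., a, b],
                  [0., b, a]])
    ev = np.linalg.eigvalsh(A)
    if ev.min() > tol:
        return "positive definite"
    if ev.max() < -tol:
        return "negative definite"
    if ev.min() >= -tol:
        return "positive semidefinite"
    if ev.max() <= tol:
        return "negative semidefinite"
    return "indefinite"

print(definiteness(2, 1))    # a >= b >= 0: positive (semi)definite
print(definiteness(2, 3))    # ab < b^2: indefinite
print(definiteness(2, -1))   # ab < 0: indefinite
```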
Section 1: Tools for Optimization
Taylor's Expansion
The second tool needed for maximization is Taylor's expansion (or approximation). The first order Taylor approximation of a function f : R^1 → R^1 satisfies
P(h|a) = f(a) + f'(a)h.
The polynomial approximates f around a, since f(a + h) can be written as
f(a + h) = f(a) + f'(a)h + R(h|a),
where R(h|a) is the difference between the two sides of the approximation. The definition of the derivative f'(a) implies that R(h|a)/h → 0 as h → 0.
Geometrically, this formalizes the approximation of f by its tangent line at (a, f(a)). Analytically, it describes the best approximation of f by a polynomial of degree 1.
Definition: The kth order Taylor expansion of f at x = a is
P_k(h|a) = f(a) + f'(a)h + (f''(a)/2!) h^2 + ... + (f^[k](a)/k!) h^k.
Example: Compute the first and second order Taylor polynomials of the exponential function f(x) = e^x at a = 0. Since all the derivatives of f at a = 0 are equal to 1:
P_1(h|0) = 1 + h,
P_2(h|0) = 1 + h + h^2/2.
For h = 0.2, then P_1(0.2|0) = 1.2 and P_2(0.2|0) = 1.22, compared with the actual value of e^0.2 = 1.2214.
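The computation generalizes to any order k (a minimal sketch in Python):

```python
import math

def taylor_exp(h, k):
    """k-th order Taylor polynomial of e^x around a = 0, evaluated at h."""
    return sum(h ** i / math.factorial(i) for i in range(k + 1))

print(taylor_exp(0.2, 1), taylor_exp(0.2, 2), math.exp(0.2))
# 1.2  1.22  1.2214...
```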
Multivariate Taylor's Expansion

For functions of several variables the first order Taylor expansion satisfies
P_1(h|a) = F(a) + (∂F/∂x1)(a) h1 + ... + (∂F/∂xn)(a) hn = F(a) + DF(a) h,
where DF(a) denotes the Jacobian,
DF(a) = ( (∂F/∂x1)(a), ..., (∂F/∂xn)(a) ).
The second order Taylor expansion satisfies
P_2(h|a) = F(a) + DF(a) h + (1/2) h^T D^2 F(a) h,
where D^2 F(a) denotes the Hessian,
D^2 F(a) = [ (∂^2F/∂x1∂x1)(a) ... (∂^2F/∂xn∂x1)(a) ; ... ; (∂^2F/∂x1∂xn)(a) ... (∂^2F/∂xn∂xn)(a) ].
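The following sketch computes P_1 and P_2 symbolically for an arbitrarily chosen function F(x1, x2) = e^{x1} cos(x2) (the example function is an assumption added for illustration, not from the notes):

```python
import sympy as sp

x1, x2, h1, h2 = sp.symbols('x1 x2 h1 h2')
F = sp.exp(x1) * sp.cos(x2)
a = {x1: 0, x2: 0}

DF = sp.Matrix([[F.diff(x1), F.diff(x2)]]).subs(a)   # Jacobian at a
D2F = sp.hessian(F, (x1, x2)).subs(a)                # Hessian at a
h = sp.Matrix([h1, h2])

P1 = F.subs(a) + (DF * h)[0]
P2 = P1 + sp.Rational(1, 2) * (h.T * D2F * h)[0]
print(sp.expand(P1))   # h1 + 1
print(sp.expand(P2))   # h1**2/2 + h1 - h2**2/2 + 1
```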
Section 1: Tools for Optimization
Concavity and Quasi-Concavity

Definition: A set U ⊆ R^n is convex if for any x, y ∈ U and any t ∈ [0, 1],
tx + (1 - t)y ∈ U.
Definition: A function f defined on a convex set U is concave if for any x, y ∈ U and any t ∈ [0, 1],
f(tx + (1 - t)y) ≥ t f(x) + (1 - t) f(y);
it is convex if the reverse inequality holds.
Observations:
(1) A function f is concave if and only if -f is convex.
(2) Linear functions are both convex and concave.
(3) The definitions of concavity and convexity require f to have a convex set as a domain.
More on Concavity

Claim: A differentiable function f on a convex set U is concave if and only if
f(y) - f(x) ≤ f'(x)(y - x)    (1)
for all x, y ∈ U.
Proof: If f is concave, then for any t ∈ (0, 1],
t f(y) + (1 - t) f(x) ≤ f(ty + (1 - t)x)  ⇔  t(f(y) - f(x)) + f(x) ≤ f(x + t(y - x))  ⇔
f(y) - f(x) ≤ (f(x + t(y - x)) - f(x))/t  ⇔  f(y) - f(x) ≤ ((f(x + h) - f(x))/h)(y - x),
for h = t(y - x). Taking limits when h → 0, the latter implies that f(y) - f(x) ≤ f'(x)(y - x).
To prove the converse, let z = ty + (1 - t)x. By applying equation (1), we have that
f(y) - f(z) ≤ f'(z)(y - z)  and  f(x) - f(z) ≤ f'(z)(x - z).
Summing the first inequality multiplied by t to the second multiplied by (1 - t), we find the result, since t(y - z) + (1 - t)(x - z) = 0.

Claim: A twice differentiable function f on an interval is concave if and only if f''(x) ≤ 0.
Proof: If f is concave, applying equation (1) both at x and at y and summing yields
(f'(y) - f'(x)) / (y - x) ≤ 0.
The limit of the last expression when y → x yields f''(x) ≤ 0. To prove the converse, pick y ≥ x, and note that by the Fundamental Theorem of Calculus and f'' ≤ 0, we have that
f(y) - f(x) = ∫_x^y f'(t) dt ≤ ∫_x^y f'(x) dt = f'(x)(y - x).
Claim: Let f1, ..., fk be concave functions defined on the same convex set U ⊆ R^n. Let a1, a2, ..., ak be positive numbers. Then the function
g = a1 f1 + a2 f2 + ... + ak fk
is a concave function on U.
Proof: Immediately observe:
g(tx + (1 - t)y) = ∑_{i=1}^k ai fi(tx + (1 - t)y) ≥ ∑_{i=1}^k ai (t fi(x) + (1 - t) fi(y)) =
= t ∑_{i=1}^k ai fi(x) + (1 - t) ∑_{i=1}^k ai fi(y) = t g(x) + (1 - t) g(y).
Example: Consider a firm whose production function is y = g(x), where y denotes output and x denotes the input bundle. If p denotes the price of output and wi is the cost per unit of input i, then the firm's profit function satisfies
π(x) = p g(x) - (w1 x1 + w2 x2 + ... + wn xn).
Thus, the profit function is a concave function if the production function g is concave, since the previous claim holds and because -(w1 x1 + w2 x2 + ... + wn xn) is linear and, thus, concave.
Quasiconcave and Quasiconvex Functions

Definition: A function f defined on a convex set U ⊆ R^n is quasiconcave if the upper contour set C_a^+ = {x ∈ U : f(x) ≥ a} is convex for every a ∈ R; it is quasiconvex if the lower contour set C_a^- = {x ∈ U : f(x) ≤ a} is convex for every a ∈ R.
The level sets of a quasiconcave function bound convex subsets from below.
The level sets of a quasiconvex function bound convex subsets from above.
Claim: Every concave function is quasiconcave and every convex function is quasiconvex.
Proof: Let x, y ∈ C_a^+, so that f(x) ≥ a and f(y) ≥ a, and let f be concave. If so:
f(tx + (1 - t)y) ≥ t f(x) + (1 - t) f(y) ≥ ta + (1 - t)a = a.
So tx + (1 - t)y is in C_a^+ and hence this set is convex. We have shown that if f is concave, it is also quasiconcave. Try to show that every convex function is quasiconvex.
The Economic Content of Quasiconcavity

Quasi-concavity is simply a desirable property when we talk about economic objective functions such as preferences:
(1) The convexity of upper contour sets is a natural requirement for utility and production functions. For example, consider an indifference curve of the concave utility function f(x, y). Take two bundles on this indifference curve. The set of bundles which are preferred to them is a convex set. In particular, the bundles that mix their contents are in this preferred set. Thus, a consumer with a concave utility function always prefers a mixture of any two bundles with the same utility to either of them.
(2) A more important advantage of this shape of the indifference curve is that it displays a diminishing marginal rate of substitution. As one moves left to right along the indifference curve, increasing the consumption of good x, the consumer is willing to give up more and more units of good x to gain an additional unit of good y. This property holds for concave utility functions because each level set forms the boundary of a convex region.
Properties of Quasiconcavity

Observations:
(1) Any increasing transformation of a concave function is quasiconcave.
(2) A monotone function f(x) on R^1 is both quasiconcave and quasiconvex.
(3) A single peaked function f(x) on R^1 is quasiconcave.
(4) The function min{x, y} is quasiconcave, as C_a^+ is convex.
Example: To show why monotone functions on R^2_+ need not be quasiconcave, consider the function
f(x, y) = x^2 + y^2.
Such a function is clearly increasing in R^2_+ as Df(x, y) = (2x, 2y). However, the function is not quasi-concave. In fact, (1, 0), (0, 1) ∈ C_1^+. However, (1/2, 1/2) ∉ C_1^+ as
f(1/2, 1/2) = 1/2 < 1.
Claim: Let f be a function defined on a convex set U ⊆ R^n. The following are equivalent:
1. f is a quasiconcave function on U;
2. f(tx + (1 - t)y) ≥ min{f(x), f(y)} for all x, y ∈ U and all t ∈ [0, 1].
Proof: To prove that (1) implies (2), take any x, y ∈ U and let min{f(x), f(y)} = k. The latter implies that x, y ∈ C_k^+. But if f is quasiconcave, C_k^+ must be convex and thus tx + (1 - t)y ∈ C_k^+ for all t ∈ [0, 1]. Thus, for any t ∈ [0, 1],
f(tx + (1 - t)y) ≥ k = min{f(x), f(y)}.
To prove the converse, take any x, y ∈ C_k^+ and note that (2) implies that for all t ∈ [0, 1]
f(tx + (1 - t)y) ≥ min{f(x), f(y)} ≥ k.
Thus, tx + (1 - t)y ∈ C_k^+ for all t ∈ [0, 1] and all x, y ∈ C_k^+, meaning that C_k^+ is convex for all k, and hence f is quasiconcave.
Section 2: Optimization

Definition: The ball B(x|ε) centred at x of radius ε is the set of all vectors y in R^n whose distance from x is less than ε,
B(x|ε) = {y ∈ R^n : |y - x| < ε}.
Definition: A point x* is:
(1) A global maximizer for f(x) if f(x*) ≥ f(x) for all x in the domain of f.
(2) A strict global maximizer for f(x) if f(x*) > f(x) for all x ≠ x* in the domain of f.
(3) A local maximizer for f(x) if there is a strictly positive number ε such that f(x*) ≥ f(x) for all x ∈ B(x*|ε).
(4) A strict local maximizer for f(x) if there is a strictly positive number ε such that f(x*) > f(x) for all x ≠ x* in B(x*|ε).
(5) A critical point for f(x) if the first partial derivatives of f(x) exist at x* and
(∂f/∂xi)(x*) = 0 for i = 1, 2, ..., n  ⇔  Df(x*) = 0.

Example: For F(x, y) = x^3 - y^3 + 9xy, critical points solve
(∂F/∂x)(x, y) = 3x^2 + 9y = 0  and  (∂F/∂y)(x, y) = -3y^2 + 9x = 0,
whose solutions are (0, 0) and (3, -3).
Do Extreme Points Exist?

Recall the Extreme Value Theorem: a continuous function defined on a compact set attains both a maximum and a minimum on that set. Throughout, consider real valued functions f(x) that are continuously differentiable on an interval I (so that f'(x) exists and is continuous).
Unconstrained Optimization: Functions of Several Variables

Consider real valued functions f(x) that are continuously differentiable on a subset C ⊆ R^n (so that all first partial derivatives exist and are continuous).
First Order Conditions: If x* is an interior point of C and a local maximizer (or minimizer) of f, then x* is a critical point of f,
Df(x*) = ( (∂f/∂x1)(x*), ..., (∂f/∂xn)(x*) ) = 0.
How does one determine whether a critical point is a local maximum or a local minimum then? To this end one has to consider the Hessian of the map f (the matrix of the second order partial derivatives). By the Clairaut-Schwarz theorem, the Hessian is always a symmetric matrix as cross-partial derivatives are equal (if the function has continuous second order partial derivatives).
Observations:
(1) If x* is an interior point and a global maximum (minimum) of f, then D^2 f(x*) is negative (positive) semidefinite.
(2) If x* is a critical point, and D^2 f(x*) is negative (positive) semidefinite, then x* may not be a local maximum (minimum).
Example: Again consider the example f(x1, x2) = x1^3 - x2^3 + 9 x1 x2. Compute the Hessian:
D^2 f(x1, x2) = [ 6x1 9 ; 9 -6x2 ].
The first order leading principal minor is 6x1 and the second order leading principal minor is det(D^2 f(x1, x2)) = -36 x1 x2 - 81. At (0, 0) the two minors are 0 and -81, and hence the matrix is indefinite. The point is neither a local min nor a local max (it is a saddle point). At (3, -3) the two minors are positive and hence the point is a strict local minimum of f. However, this is not a global minimum (why? set x1 = 0 and x2 → ∞).
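These calculations can be verified symbolically (a minimal sketch using sympy):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - y**3 + 9*x*y
H = sp.hessian(f, (x, y))                  # [[6x, 9], [9, -6y]]

for pt in ({x: 0, y: 0}, {x: 3, y: -3}):
    grad = [sp.diff(f, v).subs(pt) for v in (x, y)]
    minors = (H.subs(pt)[0, 0], H.subs(pt).det())
    print(pt, grad, minors)
# (0, 0):  gradient (0, 0), minors (0, -81)  -> saddle point
# (3, -3): gradient (0, 0), minors (18, 243) -> strict local minimum
```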
Unconstrained Optimization: More on Sufficient Conditions

Suppose that Df(x*) = 0 and that either (i) f is concave, or (ii) f is twice continuously differentiable with D^2 f(x) negative semidefinite for all x in a convex domain. By the gradient inequality (1), for any y,
f(y) - f(x*) ≤ Df(x*)(y - x*) = 0.
It follows then immediately that fulfilling (i) or (ii) implies that x* is a global maximizer of f.
Comment: The property that critical points of concave functions are global maximizers is an important one in economic theory. For example, many economic principles, such as the marginal rate of substitution equals the price ratio, or marginal revenue equals marginal cost, are simply the first order necessary conditions of the corresponding maximization problem. Ideally, one would like such conditions also to be sufficient for guaranteeing that utility or profit is being maximized. This will indeed be the case when the objective function is concave.
Unconstrained Optimization: Summary of Necessary and Sufficient Conditions

Recall that B(x|ε) denotes an open ball of radius ε around x, and let B°(x|ε) = B(x|ε)\{x} denote the same ball with its centre removed. The following table summarizes necessary and sufficient conditions for a maximum:
[Table: necessary and sufficient conditions for local and global maxima.]
Constrained Optimization: A General Problem

We now analyze optimization problems in which constraints limit the choice of variables.
Equality Constraints: Necessary Conditions

Consider the problem of maximizing f(x1, x2) subject to a budget constraint
p1 x1 + p2 x2 = m.
Geometrical Representation:
Draw the constraint on the (x1, x2) plane.
Draw representative samples of level curves of the objective function f.
The goal is to find the highest valued level curve of f which meets the constraint.
Such a level curve cannot cross the constraint set; it therefore must be tangent to it.
To find the slope of the level set f(x1, x2) = k, use total differentiation:
f1 dx1 + f2 dx2 = 0  ⇒  dx2/dx1 = -f1/f2.
Alternatively, form the Lagrangian function associated to the maximization problem:
L(x1, x2, λ) = f(x1, x2) - λ (p1 x1 + p2 x2 - m).
Intuition: The Lagrangian amounts to the objective function diminished by a penalty (the Lagrange multiplier) for violating the constraint. As the constraint holds with equality when ∂L/∂λ = 0, no penalty is ever paid at the optimum. But the Lagrange multiplier characterizes the smallest fine for which it is not worthwhile to violate the constraint. The same analysis easily extends to the case of several equality constraints.
Example: To maximize x1 x2 subject to x1 + 4 x2 = 16, set L = x1 x2 - λ (x1 + 4 x2 - 16). The first order conditions are
x2 - λ = 0,
x1 - 4λ = 0,
x1 + 4 x2 - 16 = 0,
so that λ = 2 and (x1, x2) = (8, 2).
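The same first order system can be generated and solved symbolically (an illustrative sketch using sympy):

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
L = x1 * x2 - lam * (x1 + 4*x2 - 16)       # Lagrangian of the example above

foc = [sp.diff(L, v) for v in (x1, x2, lam)]
print(sp.solve(foc, [x1, x2, lam], dict=True))
# [{lam: 2, x1: 8, x2: 2}]
```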
Equality Constraints: Lagrange Theorem in General

Lagrange Theorem: Let f and hi for all i ∈ {1, ..., m} be continuously differentiable functions from R^n to R. Let x* maximize f(x) subject to h(x) = 0, and suppose that the gradients Dh1(x*), ..., Dhm(x*) are linearly independent. Then there exists a vector λ = (λ1, ..., λm) such that
Df(x*) = λ Dh(x*) = ∑_{i=1}^m λi Dhi(x*),
where Df(x*) is 1 × n, λ is 1 × m, and Dh(x*) is m × n.
Proof: For convenience, consider the norm ||h|| = (h^T h)^{1/2}. For a, c > 0, define the following auxiliary function
f^a(x) = f(x) - (a/2) ||h(x)||^2 - (c/2) ||x - x*||^2.
Consider the neighborhood B(x*|ε) = {x ∈ R^n : ε ≥ ||x - x*||}. Let x^a maximize f^a(x) subject to x ∈ B(x*|ε) (where x^a exists by the extreme value theorem). For any a > 0, by optimality of x^a, for all x ∈ B(x*|ε),
f^a(x^a) ≥ f^a(x*) = f(x*).    (2)
As f(x) is bounded on B(x*|ε) by continuity, we have that
lim_{a→∞} ||h(x^a)||^2 = 0,
or else lim_{a→∞} f^a(x^a) = -∞. Thus, if lim_{a→∞} x^a = x̄, we must have that ||h(x̄)||^2 = 0. Taking limits of equation (2) implies that
f(x̄) - (c/2) ||x̄ - x*||^2 ≥ lim_{a→∞} f^a(x^a) ≥ f(x*).
But if so, x̄ = x*, given that f(x̄) ≤ f(x*) as x* maximizes f(x) subject to h(x) = 0.
Necessary conditions for the optimality of x^a require
Df(x^a) - a h(x^a)^T Dh(x^a) - c (x^a - x*)^T = 0.    (3)
As Dh(x*) has linearly independent rows, Dh(x) must have linearly independent rows for any x ∈ B(x*|ε) when ε is sufficiently small, since h is continuously differentiable. But if so, (Dh(x) Dh(x)^T)^{-1} exists for all x ∈ B(x*|ε). Then, post-multiplying equation (3) by Dh(x^a)^T (Dh(x^a) Dh(x^a)^T)^{-1} implies that
(Df(x^a) - c (x^a - x*)^T) Dh(x^a)^T (Dh(x^a) Dh(x^a)^T)^{-1} = a h(x^a)^T.
Taking limits of equation (3) then delivers the desired result,
Df(x*) = λ Dh(x*),
where λ = lim_{a→∞} a h(x^a)^T = Df(x*) Dh(x*)^T (Dh(x*) Dh(x*)^T)^{-1}.
Inequality Constraints: Introduction

Recall that with equality constraints the necessary conditions require
Df(x*) = λ Dh(x*).
Now consider a problem with two variables and one inequality constraint,
max f(x1, x2) subject to g(x1, x2) ≤ 0.
Graphical Representation:
Either the solution is on the boundary of the constraint set.
If so, the constraint is binding and there is a tangency at the solution.
Or the solution is in the interior of the constraint set.
If so, the constraint is slack and the solution is unaffected.
When the constraint binds, g(x1, x2) = 0, the problem is similar to one with an equality constraint. However, even when the constraint binds, a restriction has to be imposed on the sign of the Lagrange multiplier λ. Gradients still satisfy
Df(x*) = λ Dg(x*).
But now the sign of λ is important. In fact, the gradients must point in the same direction, or else f could increase and still satisfy the constraint. This means that λ ≥ 0. This is the main difference between inequality and equality constraints.
But what about ∂L/∂λ?
(1) If the maximum x* is achieved when g(x*) = 0, then the problem is solved as for an equality constraint and ∂L/∂λ = 0.
(2) If the maximum x* is achieved when g(x*) < 0, the constraint is not binding and the optimal solution is interior. If so, the point x* is an unconstrained local maximum and satisfies
(∂f/∂x1)(x*) = (∂f/∂x2)(x*) = 0,
and we can therefore still use the Lagrangian, provided that we set λ = 0!!
In other words, either the constraint is binding and g(x1, x2) = 0, or it does not bind and λ = 0, as a slack constraint cannot affect the solution to a maximization problem. In short, the following complementary slackness condition has to hold:
λ g(x1, x2) = 0.
Inequality Constraints: Examples

Recall the constrained optimization problem with two variables and one inequality constraint. Let f and g be continuously differentiable functions. Let x* = (x1*, x2*) be a solution to max f(x) subject to g(x) ≤ 0. If g(x*) = 0, suppose that x* is not a critical point of g. Then there is a real number λ* such that
Df(x*) = λ* Dg(x*), λ* ≥ 0 and λ* g(x*) = 0.
Example One Constraint: Consider a profit maximizing firm that produces output y from input x according to the production function y = x^0.5. Let the price of output be 2, and that of input be 1. The firm cannot buy more than a > 0 units of input. This firm's maximization problem amounts to
max_x 2x^0.5 - x subject to x - a ≤ 0,
with Lagrangian L(x, λ) = 2x^0.5 - x - λ(x - a) and first order condition x^{-0.5} - 1 - λ = 0.
(1) If λ > 0, the constraint must bind, so that x* = a and
λ = f'(a) = a^{-0.5} - 1,
which is consistent with λ > 0 only when a < 1.
(2) If λ = 0, the constraint must be slack. From the first order condition,
x^{-0.5} - 1 = 0  ⇔  x* = 1,
which is feasible only when a ≥ 1.
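The case distinction can be wrapped into a small function (an illustrative sketch):

```python
def solve_firm(a):
    """max 2*x**0.5 - x  s.t.  x <= a: returns the optimal input and multiplier."""
    if a < 1:                       # constraint binds
        return a, a ** -0.5 - 1
    return 1.0, 0.0                 # constraint slack

for a in (0.25, 1.0, 4.0):
    x, lam = solve_firm(a)
    print(f"a={a}: x*={x}, lambda={lam}, profit={2 * x**0.5 - x:.4f}")
```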
Inequality Constraints: Several Constraints

Example: Maximize x1 x2 x3 subject to x1 + x2 + x3 ≤ 3 and xi ≥ 0 for i ∈ {1, 2, 3}, with multipliers λ1, λ2, λ3 on the non-negativity constraints and λ4 on the resource constraint.
Before solving the Lagrangian, observe that at the relevant critical points λ1 = λ2 = λ3 = 0. In fact, if λi > 0, complementary slackness would require xi = 0 and thus x1 x2 x3 = 0. Obviously we could improve, for instance by setting xi = 1/2 for all i.
Similarly, the resource constraint must bind at the solution. If instead
x1 + x2 + x3 < 3,
it would be possible to meet the constraint while increasing the value of the function x1 x2 x3 by increasing any one of the three variables.
The remaining first order conditions then require
x1 x2 = x2 x3 = x1 x3 = λ4,
so that x1 = x2 = x3 = 1 and λ4 = 1.
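A numerical check (a sketch using scipy; the SLSQP solver enforces exactly these KKT conditions):

```python
from scipy.optimize import minimize

res = minimize(lambda x: -x[0] * x[1] * x[2],           # maximize x1*x2*x3
               x0=[0.5, 0.5, 0.5],
               bounds=[(0, None)] * 3,
               constraints={'type': 'ineq', 'fun': lambda x: 3 - sum(x)})
print(res.x, -res.fun)   # approximately [1. 1. 1.] and 1.0
```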
Inequality Constraints: Sufficient Conditions

Consider the problem of maximizing f(x) subject to g(x) ≤ 0, where g consists of m constraints g(x) = (g1(x), ..., gm(x))^T. For this problem we have characterized necessary conditions for a maximum. These conditions state that any solution x* of the constrained optimization problem must also be a critical point of the Lagrangian,
L(x, λ) = f(x) - λ g(x).
Are such conditions also sufficient? To answer the latter, let (x*, λ*) satisfy the necessary conditions for a maximum:
∂L(x*, λ*)/∂xi = ∂f(x*)/∂xi - λ* ∂g(x*)/∂xi = 0 for any i ∈ {1, ..., n},
λj* ∂L(x*, λ*)/∂λj = 0, λj* ≥ 0 and gj(x*) ≤ 0 for any j ∈ {1, ..., m}.
Claim: If x* maximizes the Lagrangian L(x, λ*), it also maximizes f subject to g(x) ≤ 0.
Proof: To see this, note that by complementary slackness λ* g(x*) = 0 and thus
L(x*, λ*) = f(x*) - λ* g(x*) = f(x*).
Hence, for any x satisfying g(x) ≤ 0,
f(x) ≤ f(x) - λ* g(x) = L(x, λ*) ≤ L(x*, λ*) = f(x*).
To conclude the argument, recall the main result from unconstrained optimization. If f is a concave function defined on a convex subset of R^n and x0 is a point in the interior at which Df(x0) = 0, then f(x) ≤ f(x0) for all x. You have shown in class that in the constrained optimization problem, if f is concave and g is convex, then the Lagrangian function is also concave, which means that first order conditions are sufficient in this case.
Inequality Constraints: Karush-Kuhn-Tucker Theorem

We need to impose additional constraint qualifications to ensure that the problem is not degenerate. These can be stated in many ways. Two equivalent requirements appear below. Denote by gE = (g1, ..., ge)^T the vector of binding constraints (namely, those for which (λ1, ..., λe) > 0), while letting g_{e+1} to g_m denote the slack ones. Posit that the multipliers associated to all equality constraints differ from 0 (or else, drop those for which this is not the case).
KKT Constraint Qualifications: Consider one of the following two restrictions:
Slater Condition: there exists a point x ∈ R^n such that g(x) < 0 and h(x) = 0, and h is affine.
Rank Condition: w^T [ Dx gE(x*) ; Dx h(x*) ] ≠ 0 for any w ∈ R^{e+m}\{0}.
KKT Theorem: Let f, h and g be differentiable. Let f be concave, hj be convex for all j ∈ {1, ..., k}, gj be convex for all j ∈ {1, ..., m}, and a constraint qualification hold. Then x* solves the constrained optimization problem if and only if (x*, λ, μ) solves
Df(x*) - μ Dh(x*) - λ Dg(x*) = 0,
h(x*) = 0, λ g(x*) = 0, λ ≥ 0 and g(x*) ≤ 0.
Inequality Constraints: Local Sufficient Conditions

Concavity of the objective function and convexity of the constraints imply that any critical point of the Lagrangian is a global solution of the constrained optimization problem. If such conditions are not satisfied, we can still rely on sufficient conditions for local maxima.
Mechanically, one can solve constrained optimization problems in the following way:
(1) Form the Lagrangian L(x, λ) = f(x) - λ g(x), and find a solution to the KKT necessary conditions.
(2) Let g1 to ge denote the binding constraints and g_{e+1} to g_m the slack ones. Define gE = (g1, ..., ge)^T.
(3) Observe that the set {v ∈ R^n : Dx gE(x*) v = 0} identifies the hyperplane tangent to the binding constraint set at the critical point x*. In particular, the hyperplane tangent to constraint i ∈ {1, ..., e} at x* is {x ∈ R^n : Dx gi(x*)(x - x*) = 0}, which we normalize to {v ∈ R^n : Dx gi(x*) v = 0}. Each of these hyperplanes can be interpreted as a linear map from R^{n-1} into R. Now, {v ∈ R^n : Dx gE(x*) v = 0} denotes the intersection of these hyperplanes, which can be interpreted as a linear map from R^{n-e} into R. Note that if n = e, we are looking at a point, as there are no free parameters.
(4) For local second order conditions to hold, we need the Lagrangian to be negative definite for any non-degenerate point in a neighborhood of x* on such a linear subspace tangent to the constraint set. Formally, this requires the Hessian of the Lagrangian with respect to x at the critical point (x*, λ*) to be negative definite on the hyperplane {v ∈ R^n : Dx gE(x*) v = 0}.
(5) To check this condition, consider the bordered Hessian of the Lagrangian,
D^2 L(x*, λ*) = [ D^2_λλ L(x*, λ*)  Dλ Dx L(x*, λ*) ; Dλ Dx L(x*, λ*)^T  D^2_xx L(x*, λ*) ] = [ 0  Dx gE(x*) ; Dx gE(x*)^T  D^2_xx L(x*, λ*) ].
(6) If the largest n - e leading principal minors of D^2 L(x*, λ*) alternate in sign, with the sign of the largest leading principal minor the same as the sign of (-1)^n, then sufficient second order conditions hold for the critical point to be a local maximum of the constrained maximization problem.
Remark: If the largest n - e leading principal minors of D^2 L(x*, λ*) have the same sign as (-1)^e, then sufficient second order conditions hold for the critical point to be a local minimum of the constrained maximization problem.
Inequality Constraints: Arrow-Enthoven Theorem

The KKT Theorem applies to concave objective functions and convex constraints. Many economic objective functions are however only quasi-concave, and many constraints are only quasi-convex. The following result generalizes the KKT Theorem to these settings. For the sake of brevity, we state the result only for inequality constraints. But the generalization is straightforward.
Arrow-Enthoven Theorem: Let f be quasi-concave and gj be quasi-convex for all j ∈ {1, ..., m}, and let (x*, λ) satisfy the KKT first order conditions. Then x* solves the constrained optimization problem provided that Dx f(x*) ≠ 0 and that f(x) is twice differentiable in the neighborhood of x*.
Numerous alternative regularity conditions can replace the latter requirement, which only deals with slopes being well defined.
Section 3: Comparative Statics and Fixed Points
Value Functions and Envelope Theorem

The value function is the objective function of a constrained optimization problem evaluated at the solution. Profit functions and indirect utility functions are examples of maximum value functions, whereas cost functions and expenditure functions are minimum value functions.
Consider the problem of maximizing f(x|b) subject to g(x|b) ≤ 0, where g(x|b) = (g1(x|b), ..., gm(x|b))^T denotes the vector of constraints and b a parameter.
For this constrained optimization problem:
let x(b) = (x1(b), ..., xn(b))^T denote the optimal solution;
let λ(b) = (λ1(b), ..., λm(b)) denote the corresponding Lagrange multipliers;
let L(x, λ|b) denote the Lagrangian of the constrained optimization problem;
let v(b) = f(x(b)|b) denote the value function.
Envelope Theorem: Suppose that (x(b), λ(b)) are differentiable functions and that x(b) satisfies the constraint qualification. If so,
dv(b)/db = ∂L(x(b), λ(b)|b)/∂b.
A General Proof of the Envelope Theorem

Envelope Theorem: Suppose that (x(b), λ(b)) are differentiable functions and that x(b) satisfies the constraint qualification. If so,
dv(b)/db = ∂L(x(b), λ(b)|b)/∂b.
Proof: Recall that the FOC for the constrained optimization problem require that
Dx f(x(b)|b) = λ(b) Dx g(x(b)|b).
Observe that, as all the constraints with positive multipliers bind, total differentiation implies that
Dx g(x(b)|b) Db x(b) + Db g(x(b)|b) = 0.
Therefore:
dv(b)/db = df(x(b)|b)/db = Db f(x(b)|b) + Dx f(x(b)|b) Db x(b) =
= Db f(x(b)|b) + λ(b) Dx g(x(b)|b) Db x(b) =
= Db f(x(b)|b) - λ(b) Db g(x(b)|b) =
= ∂L(x(b), λ(b)|b)/∂b,
where the first equality is definitional, the second holds by simple differentiation, the third holds by the FOC, the fourth by totally differentiating the constraints, and the fifth amounts to the partial of the Lagrangian.
The Meaning of the Lagrange Multiplier

Consider the problem of maximizing f(x) subject to a single constraint h(x) = b, and let x(b) denote its solution. Totally differentiating the constraint h(x(b)) = b yields
Dx h(x(b)) Db x(b) = 1,
so that, by the FOC,
df(x(b))/db = Dx f(x(b)) Db x(b) = λ(b) Dx h(x(b)) Db x(b) = λ(b).
Interpretation: This observation allows us to attach to the multiplier the economic interpretation of a shadow price. For instance, in a model in which a firm maximizes profits subject to some resource constraint, the multiplier identifies how valuable an additional unit of input would be to the firm's profits, or how much the maximum value of the firm changes when the constraint is relaxed. In other words, it is the maximum amount the firm would be willing to pay to acquire another unit of input. This immediately follows as
(d/db) f(x(b)) = λ(b) = (∂/∂b) L(x(b), λ(b)|b).
Envelope Theorem Intuition

Consider the implications of the envelope theorem for a scalar b in an unconstrained problem
max_{x ∈ R^n} f(x|b).
Let x(b) denote the solution of this problem, and assume it to be continuous and differentiable in b. In this case the theorem states that dv(b)/db = ∂f(x(b)|b)/∂b.
Intuition: When we are already at a maximum, changing slightly the parameters of the problem or the constraints does not affect the value through changes in the solution x(b), because the first order conditions hold. Thus, only the partial derivative matters. If there are multiple solutions, you can still use the envelope theorem, but you have to make sure that you do not jump to a different solution in a discrete manner.
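A numerical sanity check on a toy unconstrained problem f(x|b) = -x^2 + b x (an assumed example, chosen so that x(b) = b/2):

```python
from scipy.optimize import minimize_scalar

def v(b):
    """Value function of max_x -x**2 + b*x."""
    return -minimize_scalar(lambda x: -(-x**2 + b*x)).fun

b, eps = 1.5, 1e-6
total_derivative = (v(b + eps) - v(b - eps)) / (2 * eps)   # dv/db, numerically
partial_derivative = b / 2                                 # df/db at x(b) = b/2
print(total_derivative, partial_derivative)                # both approximately 0.75
```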
Implicit Function Theorem and Comparative Statics

The IFT provides sufficient conditions for a set of simultaneous equations
Fi(x|b) = Fi(x1, ..., xn | b1, ..., bm) = 0 for i = 1, ..., n, or equivalently F(x|b) = 0,
to have a solution defined by a set of implicit functions
xi = xi(b) = xi(b1, ..., bm) for i = 1, ..., n, or equivalently x = x(b).
In other words, the IFT disciplines the system of equations so as to ensure that the n equations defining the system can be solved for the n variables, x1, ..., xn, even when an explicit form solution cannot be obtained.
Implicit Function Theorem: Consider the system of equations F(x|b) = 0. Let every function Fi have continuous partial derivatives with respect to all x and b variables. Consider a point x that solves the system of equations at parameter values b and at which the determinant of the (n × n) Jacobian with respect to the x variables is not zero,
|Dx F(x|b)| = |∂F(x|b)/∂x| = | ∂F1/∂x1 ∂F1/∂x2 ... ∂F1/∂xn ; ∂F2/∂x1 ∂F2/∂x2 ... ∂F2/∂xn ; ... ; ∂Fn/∂x1 ∂Fn/∂x2 ... ∂Fn/∂xn | ≠ 0.
If so, there exists an m-dimensional neighborhood of b in which the variables x are functions of b. For every vector b in the neighborhood, F(x(b)|b) = 0, thereby giving to the system of equations the status of a set of identities in this neighborhood. Moreover, the implicit functions x(b) are continuous and have continuous partial derivatives with respect to all the variables b.
Thus, the IFT allows us to find the partial derivatives of the implicit functions x(b) without solving for the x variables. To do so, since in the neighborhood of the solution the equations have the status of identities, we can set the total differential to zero, dF(x(b)|b)/db = 0. For instance, when considering only dbi ≠ 0 and setting dbj = 0 for j ≠ i, the result in matrix notation implies that
(∂F(x(b)|b)/∂x)(∂x(b)/∂bi) + ∂F(x(b)|b)/∂bi = 0  ⇒  ∂x(b)/∂bi = -(∂F(x(b)|b)/∂x)^{-1} (∂F(x(b)|b)/∂bi).
Since |Dx F(x|b)| ≠ 0, a unique nontrivial solution exists to this linear system, which by Cramer's rule satisfies
∂xj(b)/∂bi = |Dx F(x|b)_j| / |Dx F(x|b)|,
where Dx F(x|b)_j is the matrix formed by replacing the jth column of Dx F(x|b) with the vector -D_{bi} F(x|b).
Example: Implicit Function Theorem

Consider the system
F(x|b) = ( x1 - e^{x2} ; x1^2 + x2^2 - b ) = 0.
The Jacobian with respect to x is
Dx F(x|b) = [ 1  -e^{x2} ; 2x1  2x2 ],
with determinant 2x2 + 2x1 e^{x2}. Thus, whenever this determinant is non-zero at the solution, the IFT applies and there exist implicit functions x1(b) and x2(b) in a neighborhood of the solution.
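Assuming the system as reconstructed above, the slopes of the implicit functions follow from one matrix inversion (a sketch using sympy):

```python
import sympy as sp

x1, x2, b = sp.symbols('x1 x2 b', real=True)
F = sp.Matrix([x1 - sp.exp(x2),
               x1**2 + x2**2 - b])

J = F.jacobian([x1, x2])              # Jacobian with respect to x
dx_db = -J.inv() * F.diff(b)          # IFT: dx/db = -(dF/dx)^(-1) dF/db
print(sp.simplify(dx_db))
```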
Implicit Function Theorem and Optimization

The previous argument was designed to hold for arbitrary systems of equations. Optimization problems, however, have a unique feature, namely that the condition |Dx F(x|b)| ≠ 0 holds at any non-degenerate solution, since in an optimization problem such a matrix simply coincides with the Hessian of the Lagrangian (that is, the bordered Hessian).
This means that we can indeed take the maximum value function, or the set of equilibrium conditions, totally differentiate them, and find how the endogenous variables change with the exogenous ones in the neighborhood of the solution.
An Example: Consider a simple optimization problem with two variables and one equality constraint, max f(x) subject to g(x) = b, whose first order conditions amount to:
F1(λ, x|b) = -g(x) + b = 0,
F2(λ, x|b) = f1(x) - λ g1(x) = 0,
F3(λ, x|b) = f2(x) - λ g2(x) = 0.
We need to verify that the Jacobian of the system is different from zero,
|D_{λ,x} F(λ, x|b)| = | 0  -g1  -g2 ; -g1  f11 - λg11  f12 - λg12 ; -g2  f21 - λg21  f22 - λg22 | ≠ 0.
But the determinant of such a matrix is that of the bordered Hessian of the problem.
Totally differentiating the first order conditions with respect to b then yields:
g1 (∂x1/∂b) + g2 (∂x2/∂b) - 1 = 0,
(f11 - λg11)(∂x1/∂b) + (f12 - λg12)(∂x2/∂b) - g1 (∂λ/∂b) = 0,
(f21 - λg21)(∂x1/∂b) + (f22 - λg22)(∂x2/∂b) - g2 (∂λ/∂b) = 0.
At the equilibrium solution, one can then solve for ∂x1/∂b, ∂x2/∂b, and ∂λ/∂b.
Correspondences

As equilibria in economics are generally defined as fixed points of some system of equations, proving equilibrium existence essentially relies on classical fixed point theorems. This is true both in game theory and in general equilibrium theory. We state these results more generally for correspondences (and not just for functions), since the solutions of the optimization problems faced by players may not be unique.
A correspondence h : X ⇉ Y maps every point x ∈ X to a set h(x) ⊆ Y. For example, consider the correspondence h : [0, 1] ⇉ [0, 1] defined by
h(x) = { {0} if x ∈ [0, 1) ; [0, 1] if x = 1 }.
Properties of Correspondences

Definition: A correspondence h : X ⇉ Y is upper-hemicontinuous (UHC) if for any sequence {y^k, x^k}_{k=1}^∞ such that y^k ∈ h(x^k) for all k, x^k → x and y^k → y imply that y ∈ h(x).
Fixed Point Theorem

Kakutani Fixed Point Theorem: Let X be a non-empty, compact and convex subset of R^n. Let h : X ⇉ X be a non-empty, convex-valued, upper-hemicontinuous correspondence. Then h has a fixed point, namely a point x such that x ∈ h(x).
Intuition: Fixed Point Theorem

Proof Intuition in One Dimension: Let X = [0, 1], which is non-empty, compact and convex. If 0 ∈ h(0) or if 1 ∈ h(1), we have a fixed point. If not, y > 0 for all y ∈ h(0) and y < 1 for all y ∈ h(1). If so, take the smallest value x̄ such that y ≤ x̄ for some y ∈ h(x̄). By UHC, there exists y' ∈ h(x̄) such that y' ≥ x̄. To show the latter, consider any sequence {y^k, x^k}_{k=1}^∞ such that y^k ∈ h(x^k), x^k < x̄ for all k, and x^k → x̄. If so, as y^k > x^k for all k, we have that
y' = lim_{k→∞} y^k ≥ lim_{k→∞} x^k = x̄,
and by UHC we further have that y' ∈ h(x̄). But if so, as h is convex-valued and h(x̄) contains a value weakly below x̄ and a value weakly above x̄, we must have that x̄ ∈ h(x̄).
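For a single-valued continuous h, the same one-dimensional argument reduces to bisection on the set {x : h(x) ≥ x}. A minimal sketch (cos is an arbitrary continuous self-map of [0, 1], an assumed example):

```python
import math

def fixed_point(h, lo=0.0, hi=1.0, tol=1e-10):
    """Bisection for a fixed point of continuous h: [0,1] -> [0,1],
    maintaining h(lo) >= lo and h(hi) <= hi as in the argument above."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) >= mid:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(fixed_point(math.cos))   # ~0.7390851, the unique fixed point of cos
```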
More on Continuity and Correspondences
More on Continuity and Theorem of the Maximum

Remark: The theorem plays a key role in invoking fixed point theorems, as it delivers a solution correspondence that is UHC with non-empty and compact values, but not necessarily convex values. If, in addition, the objective function is quasi-concave in x, then the solution correspondence x*(b) is convex-valued.
Extra: Dynamic Consumption-Savings Problem

A player starts life with wealth a0 > 0. Each period t = 0, 1, 2, ..., he decides how much to consume, ct ≥ 0. Consumption is chosen to maximize his lifetime utility,
∑_{t=0}^∞ β^t u(ct),
where β ∈ [0, 1) is the discount factor. Further assume that lim_{c→0} u'(c) = +∞, so that ct > 0 each period. Wealth at the beginning of period t is denoted by at. Wealth at is invested at a constant interest rate r. Wealth evolves according to the law of motion
at+1 = (1 + r) at - ct.
The problem thus amounts to maximizing lifetime utility subject to, for all t ≥ 0,
at+1 = (1 + r) at - ct,
at+1 ≥ 0.
Form the Lagrangian
L = ∑_{t=0}^∞ [ β^t u(ct) - λt (at+1 - (1 + r) at + ct) + μt at+1 ],
where {λt, μt}_{t=0}^∞ are sequences of Lagrange multipliers associated to the double sequence of constraints. The first order conditions of the problem require for all t ≥ 0:
∂L/∂ct = β^t u'(ct) - λt = 0,
∂L/∂at+1 = (1 + r) λt+1 - λt + μt = 0.
Necessary conditions are also sufficient if the problem is well behaved. Wealth at never falls to zero: if aT = 0 for some T, then ct = 0 for all t ≥ T, which cannot be as u'(0) = ∞. Therefore μt = 0 for all t, and we have that for all t ≥ 0:
u'(ct) = β(1 + r) u'(ct+1),
at+1 = (1 + r) at - ct.
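A short simulation sketch under assumed log utility u(c) = ln c, so that the Euler equation becomes c_{t+1} = β(1 + r) c_t and optimal initial consumption has the standard closed form c0 = (1 - β)(1 + r) a0 (both the utility function and this closed form are assumptions added for illustration):

```python
beta, r, a0, T = 0.95, 0.04, 10.0, 10

a = a0
c = (1 - beta) * (1 + r) * a0      # optimal c0 under log utility
for t in range(T):
    print(f"t={t:2d}  wealth={a:8.4f}  consumption={c:6.4f}")
    a = (1 + r) * a - c            # law of motion for wealth
    c = beta * (1 + r) * c         # Euler equation with u'(c) = 1/c
```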
London School of Economics Dr Francesco Nava
Department of Economics Office: 32L.3.20
1. Show that the quadratic form a11 x1^2 + a12 x1 x2 + a22 x2^2 can be written as
(x1 x2) [ a11 a12 ; 0 a22 ] (x1 ; x2),
and find its unique symmetric representation.
2. List all the principal minors of a general (3 × 3) matrix and indicate which of them are the three leading principal minors.
For what values of the parameters a and b is the quadratic form Q(x) indefinite? Plot your answer in R^2.
5. Approximate e^x at x = 0 with Taylor polynomials of order three and four. Then compute the values of these approximations at h = 0.2 and at h = 1, and compare them with the actual values.
6. For each of the following functions on R, determine whether they are quasiconcave, quasiconvex, both, or neither:
State the corresponding theorem for quasiconvex functions and prove it.
8. Prove that a weakly increasing transformation (g : R → R such that g' ≥ 0) of a quasi-concave function is quasi-concave.
9. Prove that any differentiable function f on R satisfying
x (∂f(x)/∂x) ≤ 0 for all x
must be quasi-concave.
London School of Economics Dr Francesco Nava
Department of Economics Office: 32L.3.20
1. Find critical points, local maxima and local minima for each of the following functions:
(a) x^4 + x^2 - 6xy + 3y^2
(b) x^2 - 6xy + 2y^2 + 10x + 2y - 5
(c) xy^2 + x^3 y - xy
Which of the critical points are also global maxima or global minima?
2. Let S ⊆ R^n be an open set and f : S → R be a twice continuously differentiable function. Suppose that Df(x*) = 0. State the weakest sufficient conditions that the Hessian must satisfy at the critical point x* for:
(i) x* to be a local max;
(ii) x* to be a strict local min.
max_{x,y ≥ 0} x^2 + y^2 subject to ax + y = 1.
9. Consider the problem of maximizing xyz subject to x + y + z ≤ 1, x ≥ 0, y ≥ 0, and z ≥ 0. Obviously, the three latter constraints do not bind, and we can concentrate only on the first constraint, x + y + z ≤ 1. Find the solution and the Lagrange multiplier, and show how the optimal value would change if instead the constraint was changed to x + y + z ≤ 9/10.
Consider the problem of maximizing the function
f(x) = { u(x) if g(x) ≤ 0 ; v(x) if g(x) ≥ 0 }.
Further, suppose that: (i) u(x) = v(x) if g(x) = 0; (ii) u and v are differentiable, strictly concave, and possess maximizers in R^n; and (iii) g(x) is differentiable and strictly convex.
Carefully explain how you would solve the problem of maximizing f by choosing x ∈ R^n.
London School of Economics Dr Francesco Nava
Department of Economics Office: 32L.3.20
Consider the constrained optimization problem
max_{x,y} ax + y
s.t. x^2 + (x - y)^2 ≤ 1,
x ≤ a, y ≥ 0.
Using the Kuhn-Tucker approach, write down the necessary first order conditions that must be satisfied by the solution of the constrained optimization problem. Are solutions to these conditions also maximizers of the Lagrangian? Solve the constrained optimization problem in terms of a. Find the slope of the value function with respect to a, possibly relying on the Envelope Theorem.
Consider a consumer with utility function
u(x, y) = x + y^{1/2}.
The consumer has a positive income I > 0 and faces positive prices px > 0 and py > 0. The consumer cannot buy negative amounts of any of the goods.
Consider the system of equations
F(x|b) = ( x1 - e^{x2} ; e^{x1} + x2 - b^2 ) = 0.
Argue whether you can invoke the implicit function theorem. Then, exploit the theorem where possible to find the slopes of the implicit functions x(b).
5. For each of the following correspondences h : R ⇉ R, show whether they are convex-valued, upper-hemicontinuous or lower-hemicontinuous:
Argue whether Kakutani's Fixed Point Theorem applies and find all the fixed points.
Show how you can exploit Kakutani's Fixed Point Theorem despite the domain being open, and find all the fixed points.