

1 Constraint Optimization: Equality Constraints

Reading [Simon], Chapter 18, p. 411-424.

1.1 General Problem


Maximize $f(x_1, \ldots, x_n)$, where $(x_1, \ldots, x_n) \in \mathbb{R}^n$ must satisfy

$$g_1(x_1, \ldots, x_n) \le b_1, \ \ldots, \ g_k(x_1, \ldots, x_n) \le b_k;$$

$$h_1(x_1, \ldots, x_n) = c_1, \ \ldots, \ h_m(x_1, \ldots, x_n) = c_m.$$

The function $f(x_1, \ldots, x_n)$ is called the objective function.

The functions $g_i(x_1, \ldots, x_n)$, $i = 1, \ldots, k$, and $h_j(x_1, \ldots, x_n)$, $j = 1, \ldots, m$, are
called constraint functions.
$g_i(x_1, \ldots, x_n) \le b_i$ are called inequality constraints.
$h_j(x_1, \ldots, x_n) = c_j$ are called equality constraints.

1.2 Equality Constraints, Necessary Conditions


1.2.1 Two Variables and One Equality Constraint
Theorem 1 Suppose $x^* = (x_1^*, x_2^*)$ is a solution of the problem:
maximize $f(x_1, x_2)$ subject to $h(x_1, x_2) = c$.
Suppose further that $(x_1^*, x_2^*)$ is not a critical point of $h$:

$$\nabla h(x_1^*, x_2^*) = \left( \frac{\partial h}{\partial x_1}(x_1^*, x_2^*), \ \frac{\partial h}{\partial x_2}(x_1^*, x_2^*) \right) \ne (0, 0)$$

(this condition is called the constraint qualification at the point $(x_1^*, x_2^*)$).
Then there is a real number $\lambda^*$ such that $(x_1^*, x_2^*, \lambda^*)$ is a critical point of
the Lagrangian function

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda \, [h(x_1, x_2) - c].$$

In other words, at $(x_1^*, x_2^*, \lambda^*)$ we have

$$\frac{\partial L}{\partial x_1} = 0, \quad \frac{\partial L}{\partial x_2} = 0, \quad \frac{\partial L}{\partial \lambda} = 0.$$

Proof. The solution $(x_1^*, x_2^*)$ lies on the highest-valued level curve of $f$ that
meets the constraint set

$$C = \{(x_1, x_2) : h(x_1, x_2) = c\}.$$

In other words, at the point $(x_1^*, x_2^*)$ the level set of $f$ and $C$ have a common
tangent, that is, their tangents have equal slopes, or equivalently, parallel
gradient vectors.

These slopes are

$$-\frac{\frac{\partial f}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial f}{\partial x_2}(x_1^*, x_2^*)}
\qquad \text{and} \qquad
-\frac{\frac{\partial h}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial h}{\partial x_2}(x_1^*, x_2^*)}.$$

The fact that these two slopes are equal at $(x_1^*, x_2^*)$ means (the minus signs cancel)

$$\frac{\frac{\partial f}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial f}{\partial x_2}(x_1^*, x_2^*)}
= \frac{\frac{\partial h}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial h}{\partial x_2}(x_1^*, x_2^*)}.$$

Let us rewrite this equality as

$$\frac{\frac{\partial f}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial h}{\partial x_1}(x_1^*, x_2^*)}
= \frac{\frac{\partial f}{\partial x_2}(x_1^*, x_2^*)}{\frac{\partial h}{\partial x_2}(x_1^*, x_2^*)}.$$

Let us denote by $\lambda^*$ the common value of these two quotients:

$$\frac{\frac{\partial f}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial h}{\partial x_1}(x_1^*, x_2^*)}
= \lambda^*
= \frac{\frac{\partial f}{\partial x_2}(x_1^*, x_2^*)}{\frac{\partial h}{\partial x_2}(x_1^*, x_2^*)}.$$

Rewrite this as the two equations

$$\frac{\partial f}{\partial x_1}(x_1^*, x_2^*) - \lambda^* \frac{\partial h}{\partial x_1}(x_1^*, x_2^*) = 0,$$
$$\frac{\partial f}{\partial x_2}(x_1^*, x_2^*) - \lambda^* \frac{\partial h}{\partial x_2}(x_1^*, x_2^*) = 0.$$

These two equations, together with the third one

$$h(x_1^*, x_2^*) - c = 0,$$

are exactly the conditions

$$\frac{\partial L}{\partial x_1}(x_1^*, x_2^*, \lambda^*) = 0, \quad
\frac{\partial L}{\partial x_2}(x_1^*, x_2^*, \lambda^*) = 0, \quad
\frac{\partial L}{\partial \lambda}(x_1^*, x_2^*, \lambda^*) = 0,$$

and this completes the proof.

This proof was based on the fact that the level curves of $f$ and of $h$
at $(x_1^*, x_2^*)$ have equal slopes. We now present another version of the
proof using the gradient vectors $\nabla f(x_1^*, x_2^*)$ and $\nabla h(x_1^*, x_2^*)$. Since the gradient
is orthogonal to the level curve, the level curves of $f$ and $h$ at $(x_1^*, x_2^*)$ are
tangent if and only if the gradients $\nabla f(x_1^*, x_2^*)$ and $\nabla h(x_1^*, x_2^*)$ line up at
$(x_1^*, x_2^*)$, that is, the gradients are scalar multiples of each other:

$$\nabla f(x_1^*, x_2^*) = \lambda^* \nabla h(x_1^*, x_2^*)$$

(note that $\nabla f(x_1^*, x_2^*)$ and $\nabla h(x_1^*, x_2^*)$ can point in the same direction, in
which case $\lambda^* > 0$, or in opposite directions, in which case $\lambda^* < 0$). This
equality immediately implies the above conditions

$$\frac{\partial f}{\partial x_1}(x_1^*, x_2^*) - \lambda^* \frac{\partial h}{\partial x_1}(x_1^*, x_2^*) = 0,$$
$$\frac{\partial f}{\partial x_2}(x_1^*, x_2^*) - \lambda^* \frac{\partial h}{\partial x_2}(x_1^*, x_2^*) = 0.$$

Remark 1. Let us express this main fact

$$\nabla f(x_1^*, x_2^*) = \lambda^* \nabla h(x_1^*, x_2^*)$$

also in the following crazy manner: the vector $\nabla f(x_1^*, x_2^*)$ is a linear combination
of the linearly independent (i.e. nonzero, and it is so because of the
constraint qualification, is it not?) vector $\nabla h(x_1^*, x_2^*)$.
Actually, the Theorem can be reformulated as follows:

Suppose $(x_1^*, x_2^*)$ is a maximizer of $f(x, y)$ s.t. $h(x, y) = c$, and suppose
$\nabla h(x_1^*, x_2^*) \ne 0$. Then

$$\nabla f(x_1^*, x_2^*) \in \operatorname{span}(\nabla h(x_1^*, x_2^*)).$$

Remark 2. Note that $\nabla f(x_1^*, x_2^*)$ and $\nabla h(x_1^*, x_2^*)$ can point in the same
direction, in which case $\lambda^* > 0$, or in opposite directions, in which case
$\lambda^* < 0$.

Remark 3. It is seen from this proof that the necessary condition for $(x_1^*, x_2^*)$
to be a maximizer is the system of two equalities

$$\frac{\frac{\partial f}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial f}{\partial x_2}(x_1^*, x_2^*)}
= \frac{\frac{\partial h}{\partial x_1}(x_1^*, x_2^*)}{\frac{\partial h}{\partial x_2}(x_1^*, x_2^*)},$$

$$h(x_1^*, x_2^*) - c = 0,$$

the equality of slopes and the constraint. The introduction of a Lagrange
multiplier $\lambda$ as an additional variable looks artificial, but it makes it possible to
apply to the constrained-extremum problem the same first-order condition
used in the free-extremum problem (but for the more complex function $L$). Note
also that $\lambda$ has a certain economic meaning; we will see it later.

Remark 4. If we want to minimize $f$ instead of maximizing it on the same
constraint set $C$, the same conditions are necessary as well. So the above theorem
cannot distinguish a minimizer from a maximizer.

Remark 5. The condition is not sufficient: for $f(x, y) = y^3$ and $h(x, y) =
x = 0$ the point $(x = 0, \ y = 0, \ \lambda = 0)$ is a critical point of the Lagrangian
$L(x, y, \lambda) = y^3 - \lambda x$; nevertheless this function has neither a maximum nor a
minimum on the constraint set.

1.2.2 About the Constraint Qualification

Why the constraint qualification? Simply to have a nonzero gradient vector
$Dh(x^*)$, so that we can say that the gradient $Df(x^*)$ is a scalar multiple of $Dh(x^*)$.

The necessity of the constraint qualification is also shown by the following example.

Example 1. Consider the following constrained minimization problem:

minimize $f(x, y) = y$ subject to $h(x, y) = y^3 - x^4 = 0$.

Actually the constraint set here is

$$C = \{(x, y) : h(x, y) = 0\} = \{(x, y) : y = x^{4/3}\}.$$

Here the naive method of substitution works:

$$f(x, x^{4/3}) = x^{4/3},$$

and it is easy to see that the minimizer of this one-variable minimization problem is
the point $(x^*, y^*) = (0, 0)$.
We claim that there exists NO $\lambda$ for which $(x^*, y^*, \lambda) = (0, 0, \lambda)$ is a
critical point of the Lagrangian

$$L(x, y, \lambda) = f(x, y) - \lambda \, h(x, y) = y - \lambda (y^3 - x^4).$$

Indeed, for our minimizer $(0, 0)$ there exists no $\lambda$ for which $(0, 0, \lambda)$ satisfies
the second equation of the system

$$L_x(x, y, \lambda) = 4\lambda x^3 = 0;$$
$$L_y(x, y, \lambda) = 1 - 3\lambda y^2 = 0;$$
$$L_\lambda(x, y, \lambda) = -y^3 + x^4 = 0.$$

Why did this happen? Well, because the constraint qualification $Dh(x^*, y^*) \ne (0, 0)$
is not fulfilled for our minimizer $(x^*, y^*) = (0, 0)$:

$$Dh(x, y) = (h_x(x, y), \ h_y(x, y)) = (-4x^3, \ 3y^2), \quad \text{so} \quad Dh(0, 0) = (0, 0).$$
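As a quick check, the failure can be confirmed with a small sympy sketch (sympy is used
here only for illustration, it is not part of the Lagrange method itself): the gradient of
h vanishes at (0, 0), and no value of the multiplier can satisfy the first-order system there.

# Sketch: confirm the failure of the constraint qualification in Example 1.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = y
h = y**3 - x**4
L = f - lam * h                               # Lagrangian L = f - lam*(h - 0)

grad_h = [sp.diff(h, v) for v in (x, y)]
print([g.subs({x: 0, y: 0}) for g in grad_h])   # [0, 0]: CQ fails at (0, 0)

foc = [sp.diff(L, v) for v in (x, y, lam)]
print([e.subs({x: 0, y: 0}) for e in foc])      # [0, 1, 0]: the middle equation 1 = 0 has no solution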

1.2.3 Strategy

The above theorem implies the following strategy:

1. Find all critical points of the constraint function $h(x, y)$, that is, solve the
system

$$h_x(x, y) = 0, \qquad h_y(x, y) = 0.$$

Ignore the critical points which do not lie on the constraint set $h(x, y) = c$.
But a critical point which lies on the constraint set must be included in the list
of candidates for a solution, since it violates the constraint qualification.
2. Find the critical points of the Lagrangian. They will also be candidates.
3. If the constraint set is compact, then the constrained max (min) is one
of these candidates.
If not, then a second-order test will be in order; we'll discuss it later.

Example 2. Maximize $f(x, y) = 10 - x^2 - y^2$ subject to $x + y = 1$.

1. There is a naive way to solve this problem: just solve for $y$ from the
constraint, $y = 1 - x$, and substitute into the function $f$:

$$\varphi(x) = 10 - x^2 - (1 - x)^2,$$

then maximize this one-variable function. The solution gives the critical point
$x = 1/2$, and $\varphi''(1/2) < 0$, so $(x, y) = (1/2, 1/2)$ is a maximizer of $f$ subject
to the constraint $x + y = 1$.

2. Now let us solve the problem using the Lagrange method. First we remark
that the qualification is satisfied: $h(x, y) = x + y$ has no critical points at all.

The Lagrangian here is

$$L(x, y, \lambda) = 10 - x^2 - y^2 - \lambda (x + y - 1).$$

So we have the system

$$\frac{\partial L}{\partial x} = -2x - \lambda = 0,$$
$$\frac{\partial L}{\partial y} = -2y - \lambda = 0,$$
$$\frac{\partial L}{\partial \lambda} = -(x + y - 1) = 0,$$

whose solution gives $x = 0.5, \ y = 0.5, \ \lambda = -1$. So the only candidate is the point
$(0.5, 0.5)$. Note that the constraint set is not compact, so the existence of a min
or a max is not guaranteed, and we do not yet have second-order sufficient
conditions, so this candidate could be a min, a max, or neither. Let us believe that
it is the maximizer.
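The same first-order system can also be solved mechanically; here is a short sympy sketch
of this computation (for illustration only):

# Sketch: the first-order system of Example 2, solved with sympy.
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
L = 10 - x**2 - y**2 - lam * (x + y - 1)      # Lagrangian of Example 2

sols = sp.solve([sp.diff(L, v) for v in (x, y, lam)], [x, y, lam], dict=True)
print(sols)     # [{x: 1/2, y: 1/2, lam: -1}] -- the single candidate (1/2, 1/2)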

Example 3. Maximize $f(x, y) = 4x$ subject to $h(x, y) = x^2 + y^2 - 1 = 0$.

The naive method does not work well in this case.

The qualification is not violated: the only critical point $(0, 0)$ of $h$ does not
lie on the constraint set $C = \{(x, y) : h(x, y) = x^2 + y^2 - 1 = 0\}$, which is the
unit circle.
The Lagrangian here is

$$L(x, y, \lambda) = 4x - \lambda (x^2 + y^2 - 1).$$

So we have the system

$$\frac{\partial L}{\partial x} = 4 - 2\lambda x = 0,$$
$$\frac{\partial L}{\partial y} = -2\lambda y = 0,$$
$$\frac{\partial L}{\partial \lambda} = -(x^2 + y^2 - 1) = 0,$$

whose solution gives

$$x = 1, \ y = 0, \ \lambda = 2, \quad f(1, 0) = 4;$$
$$x = -1, \ y = 0, \ \lambda = -2, \quad f(-1, 0) = -4.$$

Note that the constraint set is compact, so the function achieves its min and
max. Thus $(-1, 0)$ is the minimizer and $(1, 0)$ is the maximizer.

Example 4. Maximize $f(x_1, x_2) = x_1^2 x_2$ subject to $2x_1^2 + x_2^2 = 3$.

Solution. First check the constraint qualification: computation shows that
the only critical point of $h$ is $(0, 0)$, but this point does not lie on the constraint
$2x_1^2 + x_2^2 = 3$ (by the way, this is an ellipse), therefore the constraint
qualification is fulfilled at every point of the constraint set.
The Lagrangian looks as

$$L(x_1, x_2, \lambda) = x_1^2 x_2 - \lambda (2x_1^2 + x_2^2 - 3).$$

Compute the partial derivatives

$$\frac{\partial L}{\partial x_1} = 2x_1 x_2 - 4\lambda x_1 = 2x_1 (x_2 - 2\lambda),$$
$$\frac{\partial L}{\partial x_2} = x_1^2 - 2\lambda x_2,$$
$$\frac{\partial L}{\partial \lambda} = -2x_1^2 - x_2^2 + 3,$$

so we must solve the system

$$2x_1 (x_2 - 2\lambda) = 0,$$
$$x_1^2 - 2\lambda x_2 = 0,$$
$$-2x_1^2 - x_2^2 + 3 = 0.$$

The long computation gives the solutions (candidates)

$$(0, \sqrt{3}, 0), \ (0, -\sqrt{3}, 0), \ (1, 1, 0.5),$$
$$(-1, 1, 0.5), \ (1, -1, -0.5), \ (-1, -1, -0.5).$$

Note that the constraint set is compact, so the function achieves its min
and max. So we must seek the max points among these six candidates. The
computation shows that

$$f(1, 1) = 1, \quad f(-1, 1) = 1, \quad f(1, -1) = -1,$$
$$f(-1, -1) = -1, \quad f(0, \sqrt{3}) = 0, \quad f(0, -\sqrt{3}) = 0,$$

so the max occurs at $(1, 1)$ and $(-1, 1)$.
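Since the candidate-hunting above is tedious by hand, here is a sympy sketch (illustration
only) that reproduces the six candidates and the values of $f$ at them:

# Sketch: candidates of Example 4 and the values of f at them.
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
f = x1**2 * x2
L = f - lam * (2*x1**2 + x2**2 - 3)

sols = sp.solve([sp.diff(L, v) for v in (x1, x2, lam)], [x1, x2, lam], dict=True)
for s in sols:
    print((s[x1], s[x2], s[lam]), '  f =', f.subs(s))
# Expect the six candidates listed above; f is maximal (= 1) at (1, 1) and (-1, 1).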

1.2.4 The Meaning of the Multiplier


As mentioned above, the introduction of the Lagrange multiplier $\lambda$ is somewhat
artificial. Nevertheless it has a particular economic meaning, which we explain now.
The solution of our optimization problem

$$\max_{(x_1, x_2)} \ f(x_1, x_2) \quad \text{s.t.} \quad h(x_1, x_2) = c,$$

namely $(x^*, y^*, \lambda^*)$, depends on the constraint $c$, so this solution is an implicit function
of $c$:

$$x^* = x^*(c), \quad y^* = y^*(c), \quad \lambda^* = \lambda^*(c).$$

Substituting the optimal solution into the objective function we obtain the one-variable
function $f(x^*(c), y^*(c))$.

Theorem 2
$$\frac{d f(x^*(c), y^*(c))}{dc} = \lambda^*(c).$$

Proof*. By the Lagrange theorem, for all $c$ we have

(1) $h(x^*(c), y^*(c)) = c$;

(2) $\frac{\partial f}{\partial x}(x^*(c), y^*(c)) = \lambda^*(c) \, \frac{\partial h}{\partial x}(x^*(c), y^*(c))$;

(3) $\frac{\partial f}{\partial y}(x^*(c), y^*(c)) = \lambda^*(c) \, \frac{\partial h}{\partial y}(x^*(c), y^*(c))$.

Differentiating (1) with respect to $c$ we obtain

$$\frac{\partial h}{\partial x}(x^*(c), y^*(c)) \, \frac{dx^*(c)}{dc} + \frac{\partial h}{\partial y}(x^*(c), y^*(c)) \, \frac{dy^*(c)}{dc} = 1.$$

Then, using (2) and (3), we compute

$$\frac{d f(x^*(c), y^*(c))}{dc}
= \frac{\partial f}{\partial x}(x^*(c), y^*(c)) \, \frac{dx^*(c)}{dc} + \frac{\partial f}{\partial y}(x^*(c), y^*(c)) \, \frac{dy^*(c)}{dc}$$
$$= \lambda^*(c) \, \frac{\partial h}{\partial x}(x^*(c), y^*(c)) \, \frac{dx^*(c)}{dc} + \lambda^*(c) \, \frac{\partial h}{\partial y}(x^*(c), y^*(c)) \, \frac{dy^*(c)}{dc}$$
$$= \lambda^*(c) \left[ \frac{\partial h}{\partial x}(x^*(c), y^*(c)) \, \frac{dx^*(c)}{dc} + \frac{\partial h}{\partial y}(x^*(c), y^*(c)) \, \frac{dy^*(c)}{dc} \right]
= \lambda^*(c) \cdot 1 = \lambda^*(c).$$

Remark. From this theorem it follows that increasing the constraint $c$ by 1
increases the maximal value $f(x^*, y^*)$ by approximately $\lambda^*$.
More precisely, for a small relaxation of the constraint, replacing $h(x, y) = c$
by $h(x, y) = c + \Delta$, we obtain a new optimum $(x^{(\Delta)}, y^{(\Delta)})$ for which

$$f(x^{(\Delta)}, y^{(\Delta)}) \approx f(x^*, y^*) + \lambda^* \Delta.$$

Economically this theorem gives an interpretation of the Lagrange multiplier $\lambda$
in the context of consumer maximization: if the consumer is given an extra
dollar (the budget constraint is relaxed) at the optimal consumption level
$(x^*, y^*)$ as above, then the change in maximal utility per dollar of additional
income will be approximately equal to $\lambda^*$. In some texts the Lagrange multiplier $\lambda$
is called the marginal productivity of money, or the shadow price.

Example 5. 1. Solve the following constrained maximization problem:

maximize $f(x, y) = xy$ subject to the constraint $x + y = 20$.

Solution. (a) Construct the Lagrangian:

$$L(x, y, \lambda) = xy - \lambda (x + y - 20).$$

(b) Find the partials:

$$L_x = y - \lambda; \qquad L_y = x - \lambda; \qquad L_\lambda = -x - y + 20.$$

(c) Solve the system

$$y - \lambda = 0, \qquad x - \lambda = 0, \qquad -x - y + 20 = 0
\quad \Longrightarrow \quad x = 10, \ y = 10, \ \lambda = 10.$$

(d) The maximal value is $f(10, 10) = 100$.

2. Now redo this problem, this time using the constraint $x + y = 21$.
A similar solution, or

> LagrangeMultipliers(x*y, [x + y - 21], [x, y], output = detailed);

gives $x = \frac{21}{2}, \ y = \frac{21}{2}, \ \lambda = \frac{21}{2}$, and the new maximal value is
$f\left(\frac{21}{2}, \frac{21}{2}\right) = \frac{441}{4} = 110.25$. So, increasing the constraint from 20 to 21, the maximal value
increases by 10.25.

3. Now solve problem 2 using the shadow price $\lambda = 10$ and $\Delta = 21 - 20 = 1$:

$$f(x^{(\Delta)}, y^{(\Delta)}) \approx f(x^*, y^*) + \lambda^* \Delta,$$

in our case

$$f(x^{(1)}, y^{(1)}) \approx f(x^*, y^*) + 10 \cdot 1 = 100 + 10 = 110.$$

As we see, this result differs only slightly from the result 110.25 of part 2.
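Since the first-order conditions of part 1 give $x = y = c/2$ for a general right-hand side $c$,
the optimal value is $f^*(c) = c^2/4$ and the shadow-price estimate can be compared with the
exact change directly. A minimal plain-Python sketch (illustration only; the helper f_star
just encodes this closed form):

# Sketch: shadow price vs. exact change for max xy s.t. x + y = c.
# From the first-order conditions x = y = c/2, so f*(c) = c**2 / 4 and lambda(c) = c/2
# (lambda = 10 at c = 20, as computed above).
def f_star(c):
    return (c / 2) ** 2

print(f_star(21) - f_star(20))   # exact increase: 10.25
print(10 * 1)                    # estimate lambda * Delta = 10 * 1 = 10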

1.2.5 Several Equality Constraints


Problem: Maximize a function $f(x_1, \ldots, x_n)$ on the constraint set

$$C_h = \{x = (x_1, \ldots, x_n) : h_1(x) = a_1, \ \ldots, \ h_m(x) = a_m\}.$$

We say that $(h_1, \ldots, h_m)$ satisfy the Nondegenerate Constraint Qualification
(NDCQ) at $x^*$ if the Jacobian has maximal rank:

$$\operatorname{rank}
\begin{pmatrix}
\frac{\partial h_1}{\partial x_1}(x^*) & \ldots & \frac{\partial h_1}{\partial x_n}(x^*) \\
\ldots & \ldots & \ldots \\
\frac{\partial h_m}{\partial x_1}(x^*) & \ldots & \frac{\partial h_m}{\partial x_n}(x^*)
\end{pmatrix}
= m.$$

In other words, the NDCQ means that the gradient vectors $Dh_1(x^*), \ \ldots, \ Dh_m(x^*)$
are linearly independent in $\mathbb{R}^n$.

So NDCQ implies that $m \le n$.
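In practice the NDCQ can be checked by computing the rank of this Jacobian. A small sympy
sketch (illustration only; it uses the two constraints of Example 6 below):

# Sketch: checking NDCQ via the rank of the Jacobian of the constraint map.
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
h = sp.Matrix([x + y + z, y - x])            # m = 2 constraints, n = 3 variables
J = h.jacobian(sp.Matrix([x, y, z]))          # 2 x 3 Jacobian
print(J)                                      # Matrix([[1, 1, 1], [-1, 1, 0]])
print(J.rank())                               # 2 = m, so NDCQ holds at every point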

The Lagrangian in this case is defined as the function

$$L(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_m) =
f(x_1, \ldots, x_n) - \lambda_1 [h_1(x_1, \ldots, x_n) - a_1] - \ldots - \lambda_m [h_m(x_1, \ldots, x_n) - a_m].$$

Theorem 3 Suppose $x^* = (x_1^*, \ldots, x_n^*) \in C_h$ is a local max or min of $f$
on $C_h$. Suppose also that NDCQ is satisfied at $x^*$. Then there exists $\lambda^* =
(\lambda_1^*, \ldots, \lambda_m^*) \in \mathbb{R}^m$ such that $(x^*, \lambda^*)$ is a critical point of the Lagrangian
$L(x, \lambda)$, that is $\nabla L(x^*, \lambda^*) = 0$; in other words

$$\frac{\partial L}{\partial x_1}(x^*, \lambda^*) = 0, \ \ldots, \ \frac{\partial L}{\partial x_n}(x^*, \lambda^*) = 0,$$

$$\frac{\partial L}{\partial \lambda_1}(x^*, \lambda^*) = 0, \ \ldots, \ \frac{\partial L}{\partial \lambda_m}(x^*, \lambda^*) = 0.$$

Remark 1. The NDCQ means that the system of vectors

$$\nabla h_1(x^*) = \left( \frac{\partial h_1}{\partial x_1}(x^*), \ \ldots, \ \frac{\partial h_1}{\partial x_n}(x^*) \right),$$
$$\nabla h_2(x^*) = \left( \frac{\partial h_2}{\partial x_1}(x^*), \ \ldots, \ \frac{\partial h_2}{\partial x_n}(x^*) \right),$$
$$\ldots$$
$$\nabla h_m(x^*) = \left( \frac{\partial h_m}{\partial x_1}(x^*), \ \ldots, \ \frac{\partial h_m}{\partial x_n}(x^*) \right)$$

(which consists of $m$ vectors, each of dimension $n$) is linearly independent.

Remark 2. NDCQ fails, for example, in the following situations:

1. $\nabla h_i(x^*) = 0$ for some $i$.
2. The number of constraints $m$ is larger than the dimension $n$ of the vector
$x^*$, that is, there are more constraints than variables. So NDCQ requires
first of all that $m \le n$.

Remark 3. The gradient of the objective function at a maximizer, $\nabla f(x^*)$, is a
linear combination of the gradients $\nabla h_i(x^*)$:

$$\nabla f(x^*) = \lambda_1 \nabla h_1(x^*) + \ldots + \lambda_m \nabla h_m(x^*).$$

Note that this condition, together with the constraints, gives exactly $\nabla L(x^*, \lambda^*) = 0$.
Actually, the Theorem can be reformulated as follows:

Suppose $x^* \in \mathbb{R}^n$ is a maximizer of $f(x)$ s.t. $h_1(x) = a_1, \ \ldots, \ h_m(x) = a_m$,
and suppose the vectors

$$\nabla h_1(x^*), \ \ldots, \ \nabla h_m(x^*)$$

are linearly independent. Then

$$\nabla f(x^*) \in \operatorname{span}(\nabla h_1(x^*), \ \ldots, \ \nabla h_m(x^*)).$$

Example 6. Minimize $f(x, y, z) = x^2 + y^2 + z^2$ subject to the constraints

$$h_1(x, y, z) = x + y + z = 1, \qquad h_2(x, y, z) = y - x = 0.$$

1. Again the naive solution is possible in this case. From the second
constraint we have $y = x$, and from the first constraint we have $z = 1 -
x - y = 1 - 2x$. Substituting into $f$ we obtain a function of one variable,
$6x^2 - 4x + 1$. Its minimizer is $x = 1/3$; thus for our problem we have the
minimizer $(1/3, 1/3, 1/3)$.

2. Now switch to the Lagrange method. Firstly, the qualification is satisfied
since the Jacobian

$$Dh(x, y, z) = \begin{pmatrix} 1 & 1 & 1 \\ -1 & 1 & 0 \end{pmatrix}$$

is a constant matrix whose rank is 2.
The Lagrangian is

$$L(x, y, z, \lambda_1, \lambda_2) = x^2 + y^2 + z^2 - \lambda_1 (x + y + z - 1) - \lambda_2 (y - x).$$

It gives the system

$$\frac{\partial L}{\partial x} = 2x - \lambda_1 + \lambda_2 = 0,$$
$$\frac{\partial L}{\partial y} = 2y - \lambda_1 - \lambda_2 = 0,$$
$$\frac{\partial L}{\partial z} = 2z - \lambda_1 = 0,$$
$$\frac{\partial L}{\partial \lambda_1} = -(x + y + z - 1) = 0,$$
$$\frac{\partial L}{\partial \lambda_2} = -(y - x) = 0.$$

The solution gives $x = 1/3, \ y = 1/3, \ z = 1/3, \ \lambda_1 = 2/3, \ \lambda_2 = 0$.
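For comparison, a sympy sketch of this two-multiplier computation (illustration only):

# Sketch: Example 6 with two multipliers.
import sympy as sp

x, y, z, l1, l2 = sp.symbols('x y z lam1 lam2', real=True)
L = x**2 + y**2 + z**2 - l1*(x + y + z - 1) - l2*(y - x)

sols = sp.solve([sp.diff(L, v) for v in (x, y, z, l1, l2)], [x, y, z, l1, l2], dict=True)
print(sols)     # [{x: 1/3, y: 1/3, z: 1/3, lam1: 2/3, lam2: 0}]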

1.3 Again About the Meaning of the Multiplier


Let $x^* = (x_1^*, \ldots, x_n^*)$ be a solution of the problem

$$\max f(x_1, \ldots, x_n) \quad \text{s.t.} \quad h_1(x_1, \ldots, x_n) = a_1, \ \ldots, \ h_m(x_1, \ldots, x_n) = a_m.$$

Then there exists $\lambda^* = (\lambda_1^*, \ldots, \lambda_m^*) \in \mathbb{R}^m$ such that $(x_1^*, \ldots, x_n^*, \lambda_1^*, \ldots, \lambda_m^*) \in
\mathbb{R}^{n+m}$ is a critical point of the Lagrangian

$$L(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i \, (h_i(x) - a_i).$$

This solution depends on the budget constraints $a = (a_1, \ldots, a_m)$, so we can
assume that $x^*$ and $\lambda^*$ are functions of $a$:

$$(x^*(a), \lambda^*(a)) = (x_1^*(a), \ldots, x_n^*(a), \lambda_1^*(a), \ldots, \lambda_m^*(a)).$$

The optimal value $f(x^*)$ can also be considered as a function of $a$:

$$f(x^*(a)) = f(x_1^*(a), \ldots, x_n^*(a)).$$

A calculation similar to the one used in the two-variable case shows that

$$\frac{\partial}{\partial a_j} f(x_1^*(a), \ldots, x_n^*(a)) = \lambda_j^*(a).$$

This formula has the following meaning:

- $\lambda_j^*$ measures the sensitivity of the optimal value $f(x^*(a))$ to changing the
constraint $a_j$.

- In other words, $\lambda_j^*$ measures how the optimal value is affected by relaxation
of the $j$-th constraint $a_j$.

- One more interpretation: $\lambda_j^*$ measures how an additional dollar invested
in the $j$-th input changes the optimal value.

That is why the Lagrange multiplier is sometimes called the shadow price,
the internal value, or the marginal productivity of money.

Example 7. Consider the problem:

maximize $f(x, y) = x^2 y$ on the constraint set $h(x, y) = 2x^2 + y^2 = 3$.

The first-order condition gives the solution

$$x^*(3) = 1, \quad y^*(3) = 1, \quad \lambda^*(3) = 0.5.$$

The second-order condition allows us to check that this is a maximizer. The
optimal value is $f(1, 1) = 1$.

Now let us change the constraint to

$$h(x, y) = 2x^2 + y^2 = 3.3.$$

The first-order condition gives the new stationary point

$$x^*(3.3) = 1.048808848, \quad y^*(3.3) = 1.048808848, \quad \lambda^*(3.3) = 0.5244044241,$$

and the new optimal value is 1.153689733. So increasing the budget from $a = 3$ to
$a + \Delta a = 3.3$ increases the optimal value by $1.153689733 - 1 = 0.153689733$.
Now estimate the same increase of the optimal value using the shadow price:

$$f(1.048808848, 1.048808848) \approx f(1, 1) + \lambda^* \Delta a = 1 + 0.5 \cdot 0.3 = 1.15,$$

which is a good approximation of 1.153689733.
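The reported numbers are easy to sanity-check by plugging them into the first-order
conditions; a minimal plain-Python sketch (illustration only):

# Sketch: check the reported stationary point of Example 7 for a = 3.3 and compare
# the exact increase of the optimal value with the shadow-price estimate.
x, y, lam, a = 1.048808848, 1.048808848, 0.5244044241, 3.3

print(2*x*y - 4*lam*x)       # dL/dx, approximately 0
print(x**2 - 2*lam*y)        # dL/dy, approximately 0
print(2*x**2 + y**2 - a)     # constraint, approximately 0
print(x**2 * y - 1)          # exact increase of the optimal value: about 0.15369
print(0.5 * 0.3)             # shadow-price estimate lambda * Delta a = 0.15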

1.3.1 Income Expansion Path
Back to the problem:

Maximize $f(x, y)$ subject to $h(x, y) = a$.

The solution $(x^*, y^*)$ of this problem depends on $a$, so assume $x^* = x^*(a)$
and $y^* = y^*(a)$.

The parameterized curve $\mathbb{R} \to \mathbb{R}^2$ given by $a \mapsto (x^*(a), y^*(a))$ is called the
income expansion path. This is the path along which the optimal solution moves
when the constraint $a$ changes.

Example. Let $f(x, y) = xy$ and $g(x, y) = x + 2y$. Consider the problem

$$\min f(x, y) \quad \text{s.t.} \quad g(x, y) = a.$$

Let us try to write the equation of the income expansion path for this problem.

$$L(x, y, \lambda) = xy - \lambda (x + 2y - a);$$

$$L_x(x, y, \lambda) = y - \lambda = 0,$$
$$L_y(x, y, \lambda) = x - 2\lambda = 0,$$
$$L_\lambda(x, y, \lambda) = -(x + 2y - a) = 0,$$

so $x = 2y$ and $4y = a$, which gives $x^* = \frac{a}{2}, \ y^* = \frac{a}{4}, \ \lambda^* = \frac{a}{4}$.

Thus the income expansion path is $\left( x(a) = \frac{a}{2}, \ y(a) = \frac{a}{4} \right)$ (parametric form),
or $y = \frac{1}{2} x$ (nonparametric form).
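The same path can be produced symbolically; a sympy sketch (illustration only):

# Sketch: the income expansion path of this example, computed symbolically.
import sympy as sp

x, y, lam, a = sp.symbols('x y lam a', positive=True)
L = x*y - lam*(x + 2*y - a)

sol = sp.solve([sp.diff(L, v) for v in (x, y, lam)], [x, y, lam], dict=True)[0]
print(sol[x], sol[y])                 # a/2, a/4 -- the path x(a) = a/2, y(a) = a/4
print(sp.simplify(sol[y] / sol[x]))   # 1/2, i.e. the ray y = x/2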

Later we'll prove the

Theorem 4 If the objective function is homogeneous and the constraint function
is linear, then the income expansion path is a ray from the origin.

Exercises

1. Minimize f(x, y, z) = x^2 + y^2 + z^2 subject to h(x, y, z) = 2xy + 3z = 28.

2. Maximize and minimize f(x, y, z) = 2x + 4y + 4z subject to h(x, y, z) =
x^2 + y^2 + z^2 = 9.

3. Maximize f(x, y) = 4 - x^2 - y^2 subject to the constraint h(x, y) =
y - x^2 + 1 = 0.

4. A manufacturing firm has budgeted $60,000 per month for labor and
materials. If $x thousand is spent on labor and $y thousand is spent on
materials, and if the monthly output (in units) is given by N(x, y) = 4xy - 8x,
how should the $60,000 be allocated to labor and materials in order to
maximize N? What is the maximum N?

5. The Cobb-Douglas production function for a new product is given by

f(x, y) = 16 x^0.25 y^0.75,

where x is the number of units of labor and y is the number of units of capital
required to produce f(x, y) units of the product. Each unit of labor costs $50
and each unit of capital costs $100. If $500,000 has been budgeted for the
production, how should this amount be allocated between labor and capital
in order to maximize production? What is the maximum number of units
that can be produced?

6. A consulting firm for a manufacturing company arrived at the following
Cobb-Douglas production function for a particular product: N(x, y) =
50 x^0.8 y^0.2, where x is the number of units of labor and y is the number of
units of capital required to produce N(x, y) units of the product. Each unit
of labor costs $40 and each unit of capital costs $80.
(A) If $400,000 is budgeted for production of the product, determine
how this amount should be allocated to maximize production, and find the
maximum production.
(B) Find the marginal productivity of money in this case, and estimate the
increase in production if an additional $50,000 is budgeted for the production
of this product.

7. The research department for a manufacturing company arrived at
the following Cobb-Douglas production function for a particular product:
N(x, y) = 10 x^0.6 y^0.4, where x is the number of units of labor and y is the
number of units of capital required to produce N(x, y) units of the product.
Each unit of labor costs $30 and each unit of capital costs $60.
(A) If $300,000 is budgeted for production of the product, determine
how this amount should be allocated to maximize production, and find the
maximum production.
(B) Find the marginal productivity of money in this case, and estimate the
increase in production if an additional $80,000 is budgeted for the production
of this product.

8. Find the maximum and minimum distance from the origin to the
ellipse x^2 + xy + y^2 = 3. Then estimate the answer for the same problem for
the ellipse x^2 + xy + y^2 = 3.3 using the shadow price.

9. Find the point on the parabola y = x^2 that is closest to the point
(2, 1). (Estimate the solution of the cubic equation which results.)

10. The standard beverage can has a volume of 12 oz, or 21.66 in^3. What
dimensions yield the minimum surface area? Find the minimum surface area.

11. Find the general expression (in terms of all the parameters) for the
commodity bundle (x1, x2) which maximizes the Cobb-Douglas utility function
U(x1, x2) = k x1^a x2^(1-a) on the budget set p1 x1 + p2 x2 = I.

12. Find the point closest to the origin in R^3 that is on both of the planes
3x + y + z = 5 and x + y + z = 1.

13. Find the max and min of f(x, y, z) = x + y + z^2 subject to x^2 + y^2 + z^2 =
1 and y = 0.

14. Maximize f(x, y, z) = yz + xz subject to y^2 + z^2 = 1 and xz = 3.

15. Maximize the Cobb-Douglas utility function U(x, y) = x^0.5 y^0.5 subject
to the budget constraint px + qy = I.

16. Maximize the Cobb-Douglas utility function U(x, y) = x^a y^(1-a) subject
to the budget constraint px + qy = I.

17. Maximize the Cobb-Douglas utility function U(x, y) = x^a y^b subject
to the budget constraint px + qy = I.

18. Write the income expansion path for the problem

min x^2 + y s.t. x + y = a.

Homework
Exercises 3, 5, 7, 10, 14.

