Notes 02 - Producer Theory
September 2015
1. Firms are price takers. This “competitive firm” assumption applies to both
input and output markets and makes it reasonable to ask questions about
(1) what happens to the firm’s choices when a price changes and (2) what
can be inferred about a firm’s technology from its choices at various price
levels. For output markets, the assumption fits best when each firm has
many competitors who produce perfectly substitutable products, and a parallel
condition applies to input markets. Of course, even the most casual
empiricism suggests that many firms sell differentiated products and have at
least some flexibility in setting prices, and even small firms may have market
power in buying local inputs, such as hiring workers who live near a mine or
factory, so the results of the theory need to be applied with care. Even so,
the pattern of analysis established in this way is often partially extendable
to situations in which firms are not price takers.

Footnote 1: We begin with producer theory because it proves to be mathematically simpler. The
simplicity comes from the fact that the parameters (prices) enter a firm’s objective function (profits)
but not its feasible set (production set). In consumer theory, conversely, prices determine the
consumer’s feasible set (budget set) but not his objective (utility function). Nevertheless, there
is a close connection between consumer and producer theory, which we will highlight later.
(i) the firm is “competitive”, i.e., cannot affect prices for any of its inputs
or outputs (assumption (1) above),
(ii) there is no uncertainty about profits (e.g., the firm can buy all inputs
and sell all outputs before uncertainty is resolved, thus ensuring profits),
and
(iii) the firm’s managers are perfectly controlled by the owners/shareholders.
Under (i)-(iii), all of the firm’s shareholders would agree to maximize profits,
since this would then maximize each shareholder’s income, which he could
spend optimally according to his preferences.
If (i) is violated, an owner who is also an input supplier or an output con-
sumer would have an interest in raising/lowering the relevant price. (E.g., a
worker-owner in the absence of a perfect labor market would want to deviate
from profit-maximization to drive up the wages.) If (ii) is violated, profits are
uncertain, and the optimal decisions depend on the owners’ beliefs regarding
possible realizations of uncertainty, or their attitudes towards risk, and these
beliefs and/or risk attitudes may differ across owners. Regarding (iii): Since
the time of Adam Smith, if not earlier, many observers have emphasized
that corporations are characterized by a separation between ownership (the
stockholders) and control (management), and that this separation weakens
the incentives of managers to maximize profits. The problem of motivat-
ing managers to act on behalf of owners has been a main concern for the
economics (and law) of agency theory.
Students sometimes wonder about the role of assumptions such as these, par-
ticularly when they are contrary to the facts of the situation. Economists have
taken a range of positions concerning how to think about simplifying assumptions,
and there is no consensus about the “correct” view. One extreme position is to
deny the relevance of any inference based on such models, because the premises
of the model are false. At the opposite extreme, some practicing economists seem
willing to accept “standard” or “customary” assumptions uncritically. Both of
these extreme positions are rejected by thoughtful people.
All economic modeling abstracts from reality by making simplifying but untrue
assumptions. Experience in economics and other fields shows that models built on
such assumptions can serve useful purposes. One purpose is to support tractable models
that isolate and highlight important effects for analysis by suppressing other
effects. Another purpose is to serve as a basis for numerical calculations, possibly for
use in estimating magnitudes, deciding economic policies, or designing economic
institutions. For example, one might want to estimate the effect of a tax policy
change on overall investment or hiring. The initial calculations based on a simplified
model might then be adjusted to account for the effects suppressed in the model.
For a model to serve these practical purposes, its relevant predictions must be
reasonably accurate. The accuracy of predictions can sometimes be checked by
testing using data. Sometimes, the “robustness” of predictions can be evaluated
partly by theoretical analyses. In no case, however, should models or assumptions
be regarded as adequate merely because they are “usual” or “standard.” Although
this seems to be an obvious point, it needs to be emphasized because the temptation
to skip the validation step can be a powerful one. Standard assumptions often make
the theory fall into easy, recognizable patterns, while checking the suitability of
the assumptions can be much harder. The validation step is not dispensable.
nothing to do with good $k$, then $y_k = 0$. The production possibilities of the firm
are described by a set $Y \subseteq \mathbb{R}^n$, where any $y \in Y$ is a feasible production plan. Figure
1 illustrates a production possibility set with one input and one output as the area
below the curve in the second quadrant, with $y$ labeling a point in the set.
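[Figure 1: a production possibility set with one input and one output; horizontal axis x1, vertical axis x2, with y a point in the set.]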
Throughout our analysis, we will make the innocent technical assumptions that
Y is non-empty (so as to have something to study!) and closed (to help ensure
the existence of optimal production plans). Consider some more interesting and
substantive economic properties production sets might have:
• Shut Down. The production set $Y$ has the shut-down property if $0 \in Y$; that
is, the firm has the option of using no resources and producing nothing.
• Nondecreasing Returns to Scale. The production set $Y$ has nondecreasing
returns to scale (loosely, “increasing returns to scale”) if $y \in Y$ implies that
$\alpha y \in Y$ for all $\alpha \geq 1$.
• Convexity. The production set $Y$ is convex if for all $y, y' \in Y$ and all $t \in (0, 1)$,
$ty + (1 - t)y' \in Y$. This condition incorporates a kind of “nonincreasing
returns to specialization,” meaning that if two “extreme” plans are feasible,
their combination will be as well. In addition, if $0 \in Y$, then convexity
implies nonincreasing returns to scale.
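
As a small numerical illustration of the last claim, the sketch below samples plans from a hypothetical single-input technology, $Y = \{(q, -z) : z \geq 0,\ q \leq \sqrt{z}\}$, and checks convexity and nonincreasing returns to scale ($\alpha y \in Y$ for $\alpha \in [0, 1]$) on the sample. The technology, the sampling ranges, and the function names are assumptions made for illustration only; a sampled check is of course not a proof.

```python
import math
import random

def in_Y(y, tol=1e-12):
    """Membership test for the illustrative set Y = {(q, -z): z >= 0, q <= sqrt(z)}."""
    q, neg_z = y
    z = -neg_z
    return z >= 0 and q <= math.sqrt(z) + tol

def random_plan(rng):
    """Draw a feasible plan: pick an input level z, then any output q <= sqrt(z)."""
    z = rng.uniform(0.0, 10.0)
    q = rng.uniform(-5.0, math.sqrt(z))  # free disposal allows q below the frontier
    return (q, -z)

rng = random.Random(0)
plans = [random_plan(rng) for _ in range(1000)]

# Convexity: t*y + (1-t)*y' stays in Y for sampled pairs and weights.
convex_ok = all(
    in_Y((t * y[0] + (1 - t) * yp[0], t * y[1] + (1 - t) * yp[1]))
    for y, yp in zip(plans[::2], plans[1::2])
    for t in (0.1, 0.5, 0.9)
)

# Nonincreasing returns to scale: alpha*y stays in Y for alpha in [0, 1].
nirs_ok = all(
    in_Y((a * y[0], a * y[1])) for y in plans for a in (0.0, 0.25, 0.5, 0.75)
)

print(convex_ok, nirs_ok)
```

Since $0 \in Y$ here, scaling a plan down by $\alpha$ is just a convex combination of the plan with $0$, which is why the second check follows from the first.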
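[Figure (production set and transformation frontier): axes x1 and x2; the regions on the two sides of the frontier are labeled T(y) < 0 and T(y) > 0; the slope of the frontier at the point y equals MRT(y).]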
use widgets to make gadgets, with $y_k$ being the net amount of widgets produced.
Often, it is convenient to separate inputs and outputs, letting $q = (q_1, \dots, q_l)$ denote
the vector of the firm’s outputs, and $z = (z_1, \dots, z_m)$ the vector of inputs (where
$l + m = n$).
If the firm has only a single output, we can describe the transformation frontier
by writing output as a function of the inputs used, $q = f(z)$. Formally, allowing
for free disposal, the production set can then be described as
$$Y = \{(q, -z) \in \mathbb{R} \times \mathbb{R}^m_- : q \leq f(z)\}.$$
In this case, we refer to $f(\cdot)$ as the firm’s production function. Equivalently, this
production set can be described with the transformation function $T(q, -z) =
q - f(z)$. The marginal rate of transformation between inputs $k$ and $l$, also known
as the marginal rate of technical substitution, can then be computed as
$$MRT_{k,l}(y) = \frac{\partial f(z)/\partial z_l}{\partial f(z)/\partial z_k}.$$
This expression tells us how many units of input $k$ must be used in place of one
unit of input $l$ to maintain the same level of output. It is illustrated in Figure 3.
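
To make the formula concrete, here is a minimal sketch that approximates the ratio of marginal products by finite differences. The Cobb-Douglas function $f(z) = z_1^{0.25} z_2^{0.75}$, the evaluation point, and the helper names are assumptions chosen for illustration; they do not come from the notes.

```python
def f(z1, z2):
    """Hypothetical Cobb-Douglas production function f(z) = z1**0.25 * z2**0.75."""
    return z1 ** 0.25 * z2 ** 0.75

def partial(func, z, i, h=1e-6):
    """Central finite-difference approximation of the partial derivative of func in z[i]."""
    up = list(z)
    down = list(z)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)

def mrts(func, z, k, l):
    """MRT_{k,l} = (df/dz_l) / (df/dz_k): units of input k that replace one unit of input l."""
    return partial(func, z, l) / partial(func, z, k)

z = (2.0, 3.0)
print(mrts(f, z, 0, 1))  # analytically 3 * z1 / z2 = 2 at this point
```

At $z = (2, 3)$ the ratio is $3 z_1 / z_2 = 2$: giving up a small amount of input 2 requires roughly twice that amount of input 1 to keep output unchanged.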
[Figure 3: an isoquant {z : f(z) = q} in (z1, z2) space; the slope of the isoquant at the point y equals MRTS(y).]
returns to scale and has the shutdown property, then at any $p$, either $\pi(p) = 0$ or
$\pi(p) = +\infty$. (Exercise: show this.)
Suppose that we don’t know the firm’s production set $Y$, but we observe some
of the firm’s supply decisions $y(p) \subseteq Y^*(p)$ for $p \in \mathbb{R}^n$. (This formulation is quite
general: it allows that some prices $p$ may not be observed at all and so for them
$y(p) = \emptyset$, for other prices we may observe only some but not all optimal decisions,
and so $y(p)$ could be a strict subset of $Y^*(p)$.) We can ask three questions:
1. What can we infer from the observations about the underlying production
set?
Note that these questions are parallel to those asked in “revealed preference”
theory, with one difference: In revealed preference theory, we observed the
decision-maker’s choices and the feasible sets and wanted to infer his objective function.
Here we observe the firm’s choices and the objective function (profits) and want
to infer the feasible set (production set). The roles of the objective function and
the feasible set in the two problems are swapped.
$$Y^I = \bigcup_{p \in \mathbb{R}^n} y(p).$$
Similarly, we can use idea (2) to construct an “outer bound” on $Y$, which only
includes plans that don’t give the firm higher profits at any given price vector $p$
than what it obtained given its observed choices (see footnote 3):
Remark 2 Note the parallel to revealed preference in constructing the outer bound:
there from the fact that an alternative is feasible we inferred that it can’t be better
than the chosen point. Here from the fact that an alternative is better than the
chosen point we infer that it can’t be feasible.
It turns out that $Y^I$ and $Y^O$ summarize all that can be inferred about the
production set:
Proof. The “only if” part holds by construction of $Y^I$ and $Y^O$, as argued in the
text. For the “if” part, note that with production set $Y$, for any price vector
$p \in \mathbb{R}^n$ and any $y \in y(p)$, we have $y \in Y^I \subseteq Y$, and also $p \cdot y \geq p \cdot y'$ for all
$y' \in Y \subseteq Y^O$, and so $y \in Y^*(p)$. Q.E.D.
If $Y$ has free disposal, then knowing that $Y^I \subseteq Y$ implies that $Y^I_{FD} \subseteq Y$. Note also
that with free disposal, it does not make sense to face the firm with negative prices,
since whenever $p_l < 0$ the firm can make unbounded profits by taking $y_l \to -\infty$.
Thus, we focus on nonnegative nonzero prices: $p \in \mathbb{R}^n_+ \setminus \{0\}$.

Footnote 3: In general convex analysis terms, this construction of the outer bound is known as a “Fenchel
duality.” For a general and deep treatment of duality, see Rockafellar’s (1970) Convex Analysis.
It turns out that closed convex production sets with free disposal are fully
inferred if the data is “complete” in the following sense:
Proposition 2 Suppose the production set $Y$ is convex and closed and has free
disposal. Then
(i) if $y(p) \neq \emptyset$ for all $p \in \mathbb{R}^n_+ \setminus \{0\}$ (i.e., we observe some optimal choice at
each price) then $Y^O = Y$.
(ii) if $y(p) = Y^*(p)$ for all $p \in \mathbb{R}^n_+ \setminus \{0\}$ (i.e., we observe all optimal choices
at each price) and $Y \neq \mathbb{R}^n$, then $Y^I_{FD} = Y$.
completely: There is insufficient information in $y(p)$ to decide whether the points
in the set difference $Y^O \setminus Y^I$ are in the set $Y$.
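
The two bounds are easy to compute from a finite data set. The sketch below is an illustration under stated assumptions: the three observations are hypothetical (they were generated from the technology $q \leq \sqrt{z}$, with plans written as net outputs $(q, -z)$), and the outer bound is coded directly from its verbal description above, i.e., a plan belongs to it if at every observed price it earns no more than the observed choice. The function names are made up for this example.

```python
def dot(p, y):
    """Inner product p . y, i.e., the profit of plan y at prices p."""
    return sum(pi * yi for pi, yi in zip(p, y))

# Hypothetical observations (price vector, chosen plan); plans are net outputs (q, -z).
observations = [
    ((1.0, 1.0), (0.5, -0.25)),
    ((2.0, 1.0), (1.0, -1.0)),
    ((4.0, 1.0), (2.0, -4.0)),
]

def in_inner_bound(y):
    """y is in Y^I if it was actually chosen at some observed price."""
    return any(y == chosen for _, chosen in observations)

def in_outer_bound(y, tol=1e-12):
    """y is in Y^O if it never earns more than the observed choice at any observed price."""
    return all(dot(p, y) <= dot(p, chosen) + tol for p, chosen in observations)

for candidate in [(1.0, -1.0), (1.5, -2.25), (2.0, -1.0)]:
    print(candidate, in_inner_bound(candidate), in_outer_bound(candidate))
```

The plan $(1.5, -2.25)$ lies in $Y^O$ but not in $Y^I$, so these data cannot tell us whether it is feasible, whereas $(2, -1)$ is ruled out because it would have earned more than the observed choice at prices $(2, 1)$.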
Remark 3 We can infer from the complete data whether $Y$ is convex: it is convex
if and only if $Y^I_{FD} = Y^O$. However, this inference relies on observing all
profit-maximizing choices $Y^*(p)$ at a given price $p$. If we only observe a subset of optimal
choices, $y(p) \subseteq Y^*(p)$, we may not be able to tell whether some choices in $Y^O$
that would be optimal for price $p$ (such as point $x$ in Figure 4) are not chosen
because they are unavailable due to nonconvexity (so the production set is $Y$ as
depicted in the Figure), or they are available but some other equally profitable
choice $y$ was made instead. Distinguishing between the two cases, however, is
important for many economic issues, e.g., whether a competitive equilibrium with
such production technology exists.
It should be clear that to construct the “outer bound” as in part (i) of the
Proposition above (this is in contrast to the “inner bound”), we actually do not
need to observe any of the firm’s choices, as long as we observe its profit function
$\pi(p) = p \cdot y(p)$ (which must be single-valued). We can describe the outer bound
with the “gain function” $\Gamma : \mathbb{R}^n \times Y \to \mathbb{R}$, defined as
$$\Gamma(p, y) = p \cdot y - \pi(p).$$
In other words, the “outer bound” production set $Y^O$ can be described by means
of the “transformation function” $T(y) = \sup_{p \in P} \Gamma(p, y)$.
Thus, the outer bound $Y^O$ can be described by means of the production function
Note that observing the set of prices $P$ amounts to having “complete data” (i.e.,
fixing the output price at 1 is just a normalization), since $\pi$ must be homogeneous
of degree one, so $\pi(p, w) = p\,\pi(1, w/p)$ (see below). Thus, if the firm’s actual
production function $f$ is concave, then by Proposition 2, we have $f^O = f$. More
generally, we have $f^O \geq f$, and $f^O$ will be the lowest concave function that is
nowhere below $f$.
Now we proceed to Question 3: which observations are rationalizable.
Proposition 1 immediately implies
Thus, when given a supply correspondence, we only need to check that (i) each
selection from it is a rationalizable supply function, and (ii) the profit function
$\pi(p) = p \cdot y(p)$ at any given $p \in P$ does not depend on which selection is chosen.
Since checking (ii) is trivial, from now on we focus on checking rationalizability of
a given supply function (rather than correspondence).
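(ii) Any profit function $\pi$ given by (1) is homogeneous of degree one, i.e., $\pi(\lambda p) =
\lambda \pi(p)$ for all $p \in \mathbb{R}^n$, $\lambda > 0$.
(ii)
$$\pi(\lambda p) = \max_{y \in Y} \lambda p \cdot y = \lambda \max_{y \in Y} p \cdot y = \lambda \pi(p).$$
$$Y^*(\lambda p) = \{y \in Y : \lambda p \cdot y = \pi(\lambda p)\} = \{y \in Y : \lambda p \cdot y = \lambda \pi(p)\}
= \{y \in Y : p \cdot y = \pi(p)\} = Y^*(p).$$
Q.E.D.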
Proof. Differentiate the identity $f(\lambda p) = \lambda^k f(p)$ with respect to $\lambda$ and set $\lambda = 1$.
Q.E.D.
WAPM holds. Writing $Y^O$ in the form (4) above, this means that for all $p \in P$,
$\sup_{p' \in P} \Gamma(p', y(p)) \leq 0$ (it is now convenient to denote the maximization variable
by $p'$ rather than $p$). Since we also know that $\Gamma(p, y(p)) = 0$ by the definition of
$\pi(p)$, this is equivalent to
$$\max_{p' \in P} \Gamma(p', y(p)) = \Gamma(p, y(p)) = 0 \text{ for all } p \in P. \quad \text{(D)}$$
Intuitively, the gain from choosing a production plan $y$ that is optimal for price $p$
when the actual price is $p'$ must be nonpositive, and is exactly zero when $p' = p$.
(D) can be viewed as a dual problem to the profit-maximization problem: its
solution is a price vector $p$ supporting a given production plan $y$ as an optimal
choice.
Since the set $P$ is open, all of its points are interior. At any $p \in P$ at which
$\pi$ is differentiable, so is the objective function $\Gamma(\cdot, y(p))$ in (D), and therefore the
following FOC must be satisfied:
these assumptions – instead we only had to make the (weaker) assumption that the
value function $\pi(\cdot)$ is differentiable. Later we will dispense with this assumption
as well. For a more general statement of the Envelope Theorem, see Milgrom and
Segal (Econometrica 2002).
Recall, in particular, that the profit function $\pi$ cannot depend on which
selection $y(p) \in Y^*(p)$ is chosen, and so Hotelling’s Lemma implies that the firm could
only have a unique optimal supply decision (i.e., $Y^*(p)$ must be a singleton) at
each price vector $p$ at which $\pi$ is differentiable.
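
As a numerical sanity check of these properties, the sketch below uses a hypothetical single-input technology $q \leq \sqrt{z}$, for which the profit function and supply can be written in closed form, and compares a finite-difference gradient of $\pi$ with the supply vector (Hotelling’s Lemma, $\nabla \pi(p) = y(p)$) and tests homogeneity of degree one. The technology, the test prices, and the function names are assumptions for illustration.

```python
def profit(p, w):
    """Profit function pi(p, w) = p^2 / (4w) for the illustrative technology q <= sqrt(z)."""
    return p ** 2 / (4 * w)

def supply(p, w):
    """Optimal net-output plan y = (q, -z) at output price p and input price w."""
    q = p / (2 * w)       # from the FOC p * f'(z) = w with f(z) = sqrt(z)
    return (q, -q ** 2)   # input use is z = q^2

def gradient(func, p, w, h=1e-6):
    """Central finite-difference gradient of func with respect to (p, w)."""
    d_p = (func(p + h, w) - func(p - h, w)) / (2 * h)
    d_w = (func(p, w + h) - func(p, w - h)) / (2 * h)
    return (d_p, d_w)

p, w = 3.0, 2.0
grad = gradient(profit, p, w)
y = supply(p, w)

# Hotelling's Lemma: the gradient of the profit function equals the supply vector.
hotelling_gap = max(abs(g - yi) for g, yi in zip(grad, y))
# Homogeneity of degree one: scaling all prices by 2.5 scales profit by 2.5.
homogeneity_gap = abs(profit(2.5 * p, 2.5 * w) - 2.5 * profit(p, w))
print(hotelling_gap, homogeneity_gap)
```

Both gaps should be zero up to numerical error; the gradient with respect to the input price is negative because the input enters the plan as a negative net output.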
Observe that the convexity of $\pi$ implies the concavity of the objective function
$\Gamma(\cdot, y(p))$ in (D), and so (along with the convexity of the feasible set $P$) implies
that any price vector satisfying the FOC for problem (D) must solve this problem.
Thus, for the special case where $\pi$ is differentiable everywhere on $P$, we can state:
Proof.
Let ⇡ (p) = p · y (p), and use the chain rule to write
Note that the condition $Dy(p)\,p = 0$ is nothing but Euler’s Law for the degree-0
homogeneity of the supply function $y$. (Recall that by Hotelling’s Lemma, the
differentiability of $\pi$ implies that $Y^*(p) = \{y(p)\}$, and we know it must be
homogeneous of degree 0.)
Remark 5 One can wonder what we can say about rationalizability if we only
observe the profit function (1) but not the supply choices. We already showed that
any rationalizable profit function must be homogeneous of degree 1 and convex.
It turns out that any profit function $\pi : P \to \mathbb{R}$ satisfying these two
conditions on an open convex set $P \subseteq \mathbb{R}^n$ is in fact rationalizable. Exercise: prove
this characterization of rationalizability for differentiable profit functions, using
Proposition 7.
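Footnote 4: Note that
$$p \cdot Dy(p) = \left( \sum_j p_j \frac{\partial y_j(p)}{\partial p_i} \right)_{i=1}^n, \text{ while } \quad Dy(p)\,p = \left( \sum_j p_j \frac{\partial y_i(p)}{\partial p_j} \right)_{i=1}^n,$$
so in general they need not coincide, but they do coincide when $Dy(p)$ is a symmetric matrix.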
5 Rationalizability “in the Large”
5.1 Law of Supply
Now we develop “finite-change” analogues of the positive semi-definiteness and
symmetry of the substitution matrix. The goal of this is twofold: (i) the
finite-change analogues permit more intuitive interpretations, and (ii) they will permit
a general characterization of rationalizability, which dispenses with any
differentiability assumptions.
We begin with positive semidefiniteness of the substitution matrix, which has a
simple interpretation: Take a small change $dp$ in the prices. The resulting small
change in supply will be $Dy(p)\,(dp)$. Positive semidefiniteness means that for any
$dp$, $(dp) \cdot Dy(p)\,(dp) \geq 0$, i.e., the change of supply in the direction of the price
change is nonnegative. In particular, if only the price of good $i$ changes, then
$\partial y_i(p)/\partial p_i \geq 0$ (so the supply curve of good $i$ is upward-sloping). To obtain a
finite-change version of this condition, write a double application of WAPM:
and compare the first and last expressions and rearrange terms to get
We will now derive a “large-change” implication of Hotelling’s Lemma (which,
as we have seen, is equivalent to symmetry of the substitution matrix, under the
extra assumption of homogeneity of degree zero). Consider a smooth path $\rho$
connecting two price vectors $p', p'' \in \mathbb{R}^n$. Formally, the path is described by a smooth
(i.e., continuously differentiable) function $\rho : [0, 1] \to \mathbb{R}^n$ such that $\rho(0) = p'$
and $\rho(1) = p''$. Assuming the profit function $\pi$ is differentiable, we can use the
Fundamental Theorem of Calculus and the Chain Rule to write
$$\pi(p'') - \pi(p') = \int_0^1 \frac{d}{d\tau} \pi(\rho(\tau))\, d\tau
= \int_0^1 \nabla \pi(\rho(\tau)) \cdot \rho'(\tau)\, d\tau
= \int_\rho \nabla \pi(p) \cdot dp.$$
(The last expression is a “shorthand” for writing a path integral, like the previous
expression.) Note in particular that the path integral cannot depend on the smooth
path $\rho$ chosen to connect $p'$ and $p''$. By Hotelling’s Lemma, this implies
$$\pi(p'') - \pi(p') = \int_\rho y(p) \cdot dp.$$
This expression is known as the “Producer Surplus Formula”. The “path
independence” of the path integral is mathematically equivalent to the symmetry of the
substitution matrix $(\partial y_i(p)/\partial p_j)_{i,j}$.
For example, consider the special “one-dimensional” case, in which only one
price is changing from $p_i = p_i'$ to $p_i = p_i''$ and the other prices $p_{-i}$ are fixed.
(Formally, the path is given by $\rho(\tau) = ((1 - \tau) p_i' + \tau p_i'', p_{-i})$.) In this case, the
Producer Surplus Formula yields
$$\pi(p_i'', p_{-i}) - \pi(p_i', p_{-i}) = \int_{p_i'}^{p_i''} y_i(p_i, p_{-i})\, dp_i.$$
This one-dimensional PSF simply gives the profit change as the area below the
supply curve for good i. Note that it allows us to calculate how the firm’s profits
change in response to changes in the price of good i knowing only the supply
function for good i, without knowing the prices or supply choices for other goods
(so the profits p · y (p) could not be calculated). This is very useful for empirical
work in “partial equilibrium,” which focuses on some markets and ignores other
markets.
When more than one price changes at the same time, there is no “natural”
path to choose, and we have many options for calculating the change in profits.
E.g., we could change dimensions one by one, and the result should not depend
on the order in which we change prices. Say, with two dimensions we can write
$$\pi(p_1'', p_2'') - \pi(p_1', p_2') = \int_{p_1'}^{p_1''} y_1(p_1, p_2')\, dp_1 + \int_{p_2'}^{p_2''} y_2(p_1'', p_2)\, dp_2
= \int_{p_2'}^{p_2''} y_2(p_1', p_2)\, dp_2 + \int_{p_1'}^{p_1''} y_1(p_1, p_2'')\, dp_1.$$
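
The order-independence can be checked numerically. The sketch below is an illustration under assumptions not in the notes: it uses the supply functions of a hypothetical technology $q \leq \sqrt{z}$ (good 1 is output with price $p$, good 2 is the input with price $w$ and net supply $-z$), a simple midpoint rule for the integrals, and made-up start and end prices.

```python
def y1(p, w):
    """Output supply q(p, w) = p / (2w) for the illustrative technology q <= sqrt(z)."""
    return p / (2 * w)

def y2(p, w):
    """Net supply of the input, -z(p, w) = -p^2 / (4 w^2)."""
    return -(p ** 2) / (4 * w ** 2)

def profit(p, w):
    """Closed-form profit function pi(p, w) = p^2 / (4w) for the same technology."""
    return p ** 2 / (4 * w)

def integrate(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    step = (b - a) / n
    return sum(func(a + (k + 0.5) * step) for k in range(n)) * step

p0, w0 = 1.0, 1.0   # initial prices p'
p1, w1 = 3.0, 2.0   # final prices p''

# Order 1: change the output price first (input price held at w0), then the input price.
path_a = integrate(lambda p: y1(p, w0), p0, p1) + integrate(lambda w: y2(p1, w), w0, w1)
# Order 2: change the input price first (output price held at p0), then the output price.
path_b = integrate(lambda w: y2(p0, w), w0, w1) + integrate(lambda p: y1(p, w1), p0, p1)

direct = profit(p1, w1) - profit(p0, w0)
print(path_a, path_b, direct)
```

All three numbers should agree up to integration error (about 0.875 for these prices), even though the individual integrals along the two orderings are quite different.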
(i) (Producer Surplus Formula): $\pi(p) = p \cdot y(p)$ satisfies, for any $p, p' \in P$, and
any smooth path $\rho : [0, 1] \to P$ such that $\rho(0) = p$ and $\rho(1) = p'$,
$$\pi(p') = \pi(p) + \int_0^1 y(\rho(t)) \cdot \rho'(t)\, dt.$$

Footnote 5: Indeed, suppose in negation that there exist $y, y' \in Y^*(p)$ such that $y \neq y'$. Then we have
$y'' = \frac{1}{2} y + \frac{1}{2} y' \in \text{interior}(Y)$ and $p \cdot y'' = \frac{1}{2} p \cdot y + \frac{1}{2} p \cdot y' = \pi(p)$, hence $y'' \in Y^*(p)$. This is
impossible, because a non-trivial linear function (one with $p \neq 0$) has no local maximum.
$$\left. \frac{\partial \Gamma(t', t)}{\partial t'} \right|_{t' = t} = \rho'(t) \cdot y(\rho(t)) - \phi'(t) = 0.$$
Thus, we must have $\phi'(t) = \rho'(t) \cdot y(\rho(t))$ at each $t$ at which the derivative exists.
Now we observe that
where the first inequality obtains from (5), while the second inequality obtains
because by the Law of Supply, $q \cdot y(p + tq)$ is nondecreasing in $t \in [0, 1]$. Hence $\phi$
is Lipschitz continuous on $[0, 1]$, which implies that it is absolutely continuous, which
in turn implies that it is differentiable almost everywhere and can be represented
as the integral of its derivative. Together with the expression derived for $\phi'(t)$
wherever it exists, this gives the Producer Surplus Formula.
“If”: For all $p, p' \in P$, write
$$\begin{aligned}
\pi(p') - p' \cdot y(p) &= [\pi(p') - p' \cdot y(p)] - [\pi(p) - p \cdot y(p)] \\
&= \pi(p') - \pi(p) - (p' - p) \cdot y(p) \\
&= \int_0^1 (p' - p) \cdot y(p + t(p' - p))\, dt - (p' - p) \cdot y(p) \\
&= \int_0^1 (p' - p) \cdot [y(p + t(p' - p)) - y(p)]\, dt \geq 0,
\end{aligned}$$
where the third equality is by the Producer Surplus Formula for the linear path
$\rho(t) = p + t(p' - p)$, and the inequality is by the Law of Supply. By Corollary 3, this
implies rationalizability of $y$. Q.E.D.
Since the general proof is cumbersome, consider the following simpler proof
and some intuition for the special “one-dimensional” case, in which the set $P$ is
such that only one price $p_i$ varies and the other prices $p_{-i}$ stay fixed (and so we
omit them from the arguments). From double WAPM inequalities (5), we see
that $|\pi(p_i') - \pi(p_i)| \leq \max\{|y_i(p_i')|, |y_i(p_i)|\} \cdot |p_i - p_i'|$. By the Law of Supply,
$y_i(p_i)$ is nondecreasing in $p_i$, and therefore bounded by $\max\{|y_i(a)|, |y_i(b)|\}$ on
an interval $[a, b]$. Thus, $\pi$ is Lipschitz continuous on $[a, b]$, which implies that it is
differentiable a.e. and can be written as an integral of its derivative. At any $p_i$ at
which $\pi'(p_i)$ exists, Hotelling’s Lemma (which is the FOC for problem (D)) yields
$\pi'(p_i) = y_i(p_i)$. Thus, we obtain the one-dimensional PSF:
$$\pi(b) - \pi(a) = \int_a^b y_i(p_i)\, dp_i.$$
Remark 6 When multiple optimal supply choices exist, the profit ⇡ (p) cannot
depend on which selection is used, and so the integral in the Producer Surplus
Formula cannot depend on it either. This implies that the supply correspondence
must be single-valued a.e. on any straight line.
To show the sufficiency part of the Proposition in the one-dimensional case,
by Corollary 3, we verify that PSF and the Law of Supply imply WAPM: for any
pi , p0i ,
where the first equality is by definition of $\pi(p_i')$, the second equality is by PSF,
and the inequality is by the Law of Supply ($\operatorname{sign}[y_i(p_i') - y_i(t)] = \operatorname{sign}[p_i' - p_i]$).
The proof in the multidimensional case makes exactly the same one-dimensional
arguments along other straight lines that are not necessarily parallel to any of the
axes. (They can be interpreted as changing the price of one “good” in a different
coordinate system in the commodity space.)
$$Y = \{(q, -z) : z \in \mathbb{R}^m_+,\ q \leq f(z)\}.$$
With a positive output price $p > 0$, profit-maximization requires choosing $q =
f(z)$, and so the profit maximization problem can be written as:
$$\max_{z \in \mathbb{R}^m_+} p f(z) - w \cdot z,$$
where $w \in \mathbb{R}^m_+$ is the vector of input prices.
The profit-maximization problem can be separated into two subproblems:
(ii) find an output level that maximizes the difference between its revenue and its
cost function.
$$c(q, w) = \inf_{z \in \mathbb{R}^m_+ :\, f(z) \geq q} w \cdot z,$$
$$Z^*(q, w) = \{z \in \mathbb{R}^m_+ : f(z) \geq q,\ w \cdot z = c(q, w)\}.$$
The value function $c(q, w)$ for this problem is called the cost function, and
the minimizer set $Z^*(q, w)$ is called the conditional factor demand correspondence (to
indicate that it is conditional on a fixed output level $q$).
Once problem (i) is solved, problem (ii) can then be written as $\max_{q \geq 0}\ pq -
c(q, w)$.
Note that the cost-minimization problem can be viewed as the profit-maximization
problem on the restricted production set $Y_q = \{y = (q, -z) : z \in \mathbb{R}^m_+,\ q \leq f(z)\}$
(this is an “upper level set” of the production function). Thus, the properties of
the cost function and conditional factor demand as functions of the input prices
$w$ exactly mirror those of the profit function and the supply correspondence,
respectively, with the obvious sign reversals. For example, Proposition 7 can be
restated for this case as
is differentiable in $w$. Then $z$ is rationalizable by some production function if and
only if
(i) (Shephard’s Lemma) $\nabla_w c(q, w) = z(q, w)$.
(ii) $c(q, \cdot)$ is concave.
Other properties of the cost function and conditional factor demand as functions
of $w$ follow as well from the corresponding properties of profit-maximization, e.g.,
(a) $c(q, \cdot)$ is homogeneous of degree one in $w$, (b) $Z^*(q, \cdot)$ is homogeneous of degree
zero, and (c) if $Z^*(q, \cdot)$ is a differentiable function, then the matrix $D_w Z^*(q, w) =
D_w^2 c(q, w)$ is symmetric and negative semi-definite.
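
These properties can be verified numerically for a specific technology. The sketch below assumes a hypothetical two-input production function $f(z) = \sqrt{z_1 z_2}$, whose cost function $c(q, w) = 2q\sqrt{w_1 w_2}$ and conditional factor demands $z_1 = q\sqrt{w_2/w_1}$, $z_2 = q\sqrt{w_1/w_2}$ follow from the cost-minimization FOC. The technology, test values, and function names are illustrative assumptions.

```python
import math

def cost(q, w1, w2):
    """Cost function c(q, w) = 2 q sqrt(w1 w2) for the illustrative f(z) = sqrt(z1 z2)."""
    return 2 * q * math.sqrt(w1 * w2)

def factor_demand(q, w1, w2):
    """Conditional factor demands (z1, z2) for the same technology."""
    return (q * math.sqrt(w2 / w1), q * math.sqrt(w1 / w2))

def grad_w(q, w1, w2, h=1e-6):
    """Central finite-difference gradient of the cost function in the input prices."""
    d1 = (cost(q, w1 + h, w2) - cost(q, w1 - h, w2)) / (2 * h)
    d2 = (cost(q, w1, w2 + h) - cost(q, w1, w2 - h)) / (2 * h)
    return (d1, d2)

q, w1, w2 = 2.0, 1.0, 4.0
z = factor_demand(q, w1, w2)

# Shephard's Lemma: the price gradient of the cost function equals factor demand.
shephard_gap = max(abs(g - zi) for g, zi in zip(grad_w(q, w1, w2), z))
# Homogeneity of degree one in w.
homogeneity_gap = abs(cost(q, 3 * w1, 3 * w2) - 3 * cost(q, w1, w2))
# Concavity in w, checked at the midpoint of the price vectors (1, 4) and (4, 1).
midpoint_cost = cost(q, 2.5, 2.5)
average_cost = 0.5 * cost(q, 1.0, 4.0) + 0.5 * cost(q, 4.0, 1.0)
print(shephard_gap, homogeneity_gap, midpoint_cost >= average_cost)
```

The midpoint comparison illustrates why concavity is natural here: when relative prices change, the firm can substitute between inputs, so cost rises by less than it would with a fixed input mix.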
What about the properties of the cost function c (q, w) as a function of the out-
put q? Under free disposal, it should be nondecreasing in q. Additional assump-
tions on the production function yield additional properties of the cost function,
e.g.,
The Lagrangian is $L(y, \lambda) = p \cdot y - \lambda T(y)$ where $\lambda \geq 0$ denotes the dual variable
associated with the constraint. By the Kuhn-Tucker Theorem, the following FOC is then
necessary for profit-maximization:
$$\lambda \nabla T(y) = p.$$
Geometrically, this means that at the optimal production plan $y$, the price vector
is normal to the production possibility frontier (since the gradient of the
transformation function is the normal vector to the frontier).
For a single-output firm with $m$ inputs and production function $f$, the problem
can be written as
$$\max_{z \in \mathbb{R}^m_+} p f(z) - w \cdot z,$$
where $p > 0$ is the price of output and the vector $w \in \mathbb{R}^m$ reflects the input prices.
If $f$ is differentiable, an interior optimal vector of factor demands must satisfy the
following FOC: for all $i$,
$$p \frac{\partial f(z)}{\partial z_i} \leq w_i, \text{ with equality if } z_i > 0.$$
Remark 8 Applying the Kuhn-Tucker Theorem formally, we could write the FOC
as $p \frac{\partial f(z)}{\partial z_i} + \mu_i = w_i$, where $\mu_i \geq 0$ is the dual variable with the constraint $z_i \geq 0$,
satisfying the Complementary Slackness Condition $\mu_i z_i = 0$. It is customary to
suppress the dual variables with the nonnegativity constraints and write the FOC
in the above-displayed form.
Finally, let us separate the firm’s problem into cost-minimization and profit-
maximization using a cost function. The cost-minimization problem
Letting $\lambda \geq 0$ denote the dual variable with the production constraint $f(z) \geq q$,
the FOC for cost minimization is
$$\lambda \frac{\partial f(z)}{\partial z_i} \leq w_i, \text{ with equality if } z_i > 0.$$
Thus, with $f$ concave, one can think of profit maximization as the special
case of cost minimization in which the shadow price of output is the market price
$p$. There is more to this account. From the envelope theorem for parameterized
constraints, we have:
$$\frac{\partial c(q, w)}{\partial q} = \lambda.$$
Thus, at the solution to the cost minimization problem, the shadow value of output
is exactly the marginal cost of production.
Returning to our characterization of the firm’s problem, suppose the firm solves
the cost minimization problem for every q, yielding a cost function c(q, w). The
profit maximization problem can then be seen as:
Example 2 Consider a single-output firm with a cost function $c(q)$ that has a
“U-shaped” marginal cost $c'(q)$. (See MWG Figure 5.D.3 on p.144). Intuitively, the
firm has economies of scale at low production and diseconomies at high production.
When $p > \min c'(q)$, the FOC for profit-maximization then gives two output levels
$q_l < q_h$ satisfying $p = c'(q)$, plus $q = 0$ when $c'(0) > p$. Which of these outputs is
profit-maximizing?
The SOC for profit-maximization is $c''(q) \geq 0$, which rules out the lower output
$q_l$, at which the marginal cost curve is downward-sloping (and so in fact it is a local
profit-minimizer rather than maximizer). It remains to compare $q_h$ to 0. For this
comparison, it is enough to compare the average cost at $q_h$ to $p$. The firm will
produce $q_h$ if and only if $c(q_h)/q_h \leq p$.
We can now construct the firm’s supply curve from its AC and MC curves.
Note that
$$AC'(q) = \left( \frac{c(q)}{q} \right)' = \frac{c'(q)\, q - c(q)}{q^2} = \frac{1}{q} \left( MC(q) - AC(q) \right).$$
Thus, AC is downward-sloping where $MC < AC$, upward-sloping where $MC >
AC$, and minimized where $MC = AC$. For example, AC could be given by a
U-shaped curve, at whose bottom it must intersect the MC curve. Let $q^m$ denote
the output that minimizes AC (called the firm’s “most efficient scale”). Then the
firm’s supply correspondence is
$$Q^*(p) = \begin{cases}
0 & \text{if } p < \min_q AC, \\
\{0, q^m\} & \text{if } p = \min_q AC, \\
\text{the higher solution of } c'(q) = p & \text{if } p > \min_q AC.
\end{cases}$$
A similar supply curve obtains if MC is downward sloping but production involves
a positive fixed cost, so the AC curve is still U-shaped (see MWG Figure 5.D.4 on
p.145).
Thus, with nonconvex technology, the firm should consider “discrete”
changes (e.g., whether to shut down), as well as changes “on the margin.” More
generally, some or all components of the firm’s decision set may be discrete (e.g.,
which product to produce), and so convexity or differentiability are inapplicable.
This makes the profit-maximization problem much harder to solve, but we can still
obtain some of its qualitative properties without solving it.
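
The supply correspondence with a U-shaped marginal cost can be worked out for a concrete case. The sketch below assumes the hypothetical cost function $c(q) = q^3/3 - q^2 + 2q$, for which $MC(q) = q^2 - 2q + 2$ and $AC(q) = q^2/3 - q + 2$; average cost is minimized at $q^m = 1.5$, where $AC = MC = 1.25$. The cost function and the function names are assumptions made for this illustration.

```python
import math

def cost(q):
    """Hypothetical cost function with U-shaped marginal cost: c(q) = q^3/3 - q^2 + 2q."""
    return q ** 3 / 3 - q ** 2 + 2 * q

def supply(p):
    """Profit-maximizing outputs at price p (a list, since the maximizer may not be unique)."""
    min_ac = 1.25  # AC(q) = q^2/3 - q + 2 is minimized at q_m = 1.5, where AC = MC = 1.25
    if p < min_ac:
        return [0.0]  # price below minimum average cost: shut down
    q_high = 1 + math.sqrt(p - 1)  # larger root of MC(q) = q^2 - 2q + 2 = p
    if p == min_ac:
        return [0.0, q_high]  # indifferent between shutting down and producing q_m
    return [q_high]

for price in (1.0, 1.25, 2.0):
    print(price, supply(price))
```

Note the jump in supply at $p = 1.25$: output moves discretely from 0 to $q^m = 1.5$, which is exactly the kind of non-marginal change that the first-order condition alone does not detect.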
8 Monopoly
See MWG Section 12.B. (See footnote 6.)

Footnote 6: As discussed in the beginning, profit-maximization is harder to justify for a firm that is not a
price-taker, because the firm’s owners may be at the same time consuming the firm’s outputs or
inputs or other goods whose prices may in general be affected by the firm’s behavior. Justification
is possible in a “partial equilibrium” setting where we may assume that the output price set by
the firm does not affect prices in other markets and that the firm’s owners consume negligible
amounts of its output.

9 Comparative Statics
An important question in economics is the comparative statics question: How do
endogenous variables in the economy respond to changes in exogenous variables?
For example, in producer theory, exogenous variables could be prices or
technological parameters, and the endogenous variables are the firm’s profit-maximizing
production choices.
We could ask the comparative statics question in a general maximization
problem: Let $F : X \times T \to \mathbb{R}$, where $X, T \subseteq \mathbb{R}$, and consider the problem
• Convexity of $X$.
• Strict Concavity: $F_{xx} < 0$. (In particular, together with the previous bullet,
this ensures that the maximizer is unique: $X^*(t) = \{x(t)\}$.)
Under these assumptions, the unique maximizer $x(t)$ is the unique solution to
the following First-Order Condition:
$$F_x(x(t), t) = 0. \quad \text{(FOC)}$$
Thus, $x(t)$ is a function given “implicitly” by the FOC. We can now apply the
Implicit Function Theorem, which amounts to differentiating (FOC) with respect
to $t$, which yields
$$F_{xx}(x(t), t)\, x'(t) + F_{xt}(x(t), t) = 0.$$
This yields
$$x'(t) = -\frac{F_{xt}(x(t), t)}{F_{xx}(x(t), t)}.$$
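
As a quick check of this formula, the sketch below takes the hypothetical objective $F(x, t) = tx - x^3/3$ on $x > 0$ (so $F_x = t - x^2$, $F_{xx} = -2x < 0$, $F_{xt} = 1$), solves the FOC by bisection, and compares the implicit-function formula with a finite-difference derivative of the maximizer. The objective, the bracketing interval, and the function names are assumptions chosen for illustration.

```python
def F_x(x, t):
    """First derivative in x of the illustrative objective F(x, t) = t*x - x**3/3."""
    return t - x ** 2

def F_xx(x, t):
    return -2 * x

def F_xt(x, t):
    return 1.0

def maximizer(t, lo=1e-9, hi=100.0):
    """Solve F_x(x, t) = 0 by bisection; F_x is strictly decreasing in x on x > 0."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if F_x(mid, t) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t = 4.0
x = maximizer(t)
formula = -F_xt(x, t) / F_xx(x, t)                            # implicit-function formula
h = 1e-4
numeric = (maximizer(t + h) - maximizer(t - h)) / (2 * h)     # finite-difference derivative
print(x, formula, numeric)
```

Here the maximizer is $x(t) = \sqrt{t}$, so at $t = 4$ both the formula and the finite difference should be close to $1/(2\sqrt{t}) = 0.25$.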
The advantage of this approach is that, if the function $F$ is exactly known
and the above conditions are satisfied, then we can calculate the value of $x'(t)$ exactly.
However, this approach is not useful in many theoretical studies, because
(a) $F$ is not known exactly; we may only know some of its qualitative properties.
We want to have predictions that are robust to specification of $F$.
(b) $F$ and/or $X$ may not satisfy the assumptions. For example, $F$ may be
non-smooth, or non-concave (e.g., a firm with fixed costs), or $X$ may not be
convex (e.g., a nonconvex production set).
(c) The theoretical models are in many cases not calibrated to give quantitative
predictions. Instead, we are only interested in qualitative predictions: in
what direction do endogenous variables respond to changes in exogenous
variables? E.g., when can we say that $x'(t) \geq 0$? What qualitative features
of $F$ are important for this conclusion?
At the same time, the above formula relies on the smoothness of F and strict
concavity of F in x. Are these assumptions important?
is clearly equivalent to the original problem, and $\tilde{x}^*(t)$ is nondecreasing in $t$ if and
only if $x^*(t) = \psi(\tilde{x}^*(t))$ is nondecreasing in $t$. For example, suppose that $x$ is the
variance of a certain distribution to be chosen optimally, and $\tilde{x}$ is its standard
deviation (and we can write $x = \psi(\tilde{x}) = \tilde{x}^2$). Clearly, the optimal variance should
be nondecreasing in the parameter $t$ if and only if the optimal standard deviation
is nondecreasing in $t$.
Exercise 1 Show that differentiable monotone rescaling does not preserve
concavity of the objective function, but it preserves the complementarity condition
$F_{xt} \ge 0$.
Exercise 2 Using the Fundamental Theorem of Calculus, prove that if $X$ and $T$ are
intervals and the function $F : X \times T \to \mathbb{R}$ is sufficiently smooth, then $F$ has
increasing differences if and only if
(c) $F_{xt}(x, t) \ge 0$ for all $(x, t)$.
Now we can formulate the simplest monotone comparative statics result, which
is proven using the “Revealed Preference” approach:
If $x > x'$, then using Increasing Differences, the two inequalities imply,
respectively,
[Figure: points on the real line grouped into three regions labeled $A \setminus B$, $A \cap B$, and $B \setminus A$.]
The above Theorem then simply says that the set of maximizers $X^*(t)$ is
nondecreasing in $t$ in the strong set order. This implies, in particular, that the extreme
points of the set, $\sup X^*(t)$ and $\inf X^*(t)$, are nondecreasing in $t$. Clearly, all these
statements are equivalent when the maximizer is unique.
Sometimes we can obtain the stronger result that the maximizer cannot go
down at all when the parameter goes up:
Example 5 Suppose we want to know the effect of a unit tax $t$ on the optimal
price set by a monopolist. Here we adopt the convention that the tax is paid by the
firm and consider the effect on the “before-tax” price $p$ received by the firm (i.e.,
the price paid by consumers); in another exercise we will examine the effect on the
“after-tax” price $\bar{p} = p - t$. If the monopolist faces a downward-sloping demand
curve $D(p)$, his profit is $F(p, t) = (p - t) D(p) - c(D(p))$. This function has
strictly increasing differences, since $F_t(p, t) = -D(p)$ is strictly increasing in $p$.
Thus, the Monotone Selection Theorem implies that any optimal price selection
$p^*(t)$ is nondecreasing in $t$, and therefore the corresponding output $D(p^*(t))$ is
nonincreasing in $t$.
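The conclusion of Example 5 can be checked numerically with a small grid search. The sketch below assumes a linear demand $D(p) = 10 - p$ and a linear cost $c(q) = q$; both are illustrative choices and not part of the example itself, and the function names are hypothetical.

```python
# Illustrative primitives (assumptions): linear demand and linear cost.
def demand(p):
    return max(10.0 - p, 0.0)

def cost(q):
    return 1.0 * q

def profit(p, t):
    # F(p, t) = (p - t) D(p) - c(D(p)), with the unit tax t paid by the firm.
    return (p - t) * demand(p) - cost(demand(p))

PRICES = [i / 100 for i in range(0, 1001)]    # candidate prices 0.00, 0.01, ..., 10.00

def optimal_price(t):
    """Profit-maximizing price on the grid for a given unit tax t."""
    return max(PRICES, key=lambda p: profit(p, t))

taxes = [0.0, 1.0, 2.0, 3.0]
prices = [optimal_price(t) for t in taxes]    # should be nondecreasing in t
outputs = [demand(p) for p in prices]         # should be nonincreasing in t
```

With these primitives the profit is $(p - t - 1)(10 - p)$, so the optimal price is $(11 + t)/2$, which rises with the tax while output falls.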
of functions for which monotone comparative statics obtains includes all functions
of the form F , but is wider than that.
Example 6 Consider the effect of a tax on the monopolist’s output on the
“after-tax” price $\bar{p}$ received by the monopolist (so the price faced by the consumers is
$p = \bar{p} + t$). Assume that the firm has a constant marginal cost $c$, and write
its profits as $F(\bar{p}, t) = (\bar{p} - c) D(\bar{p} + t)$. While it is hard to ensure ID for this
function, note that
$$\frac{\partial \log F(\bar{p}, t)}{\partial t} = \frac{D'(\bar{p} + t)}{D(\bar{p} + t)} = -\frac{\varepsilon(\bar{p} + t)}{\bar{p} + t},$$
where $\varepsilon(p) = -p D'(p)/D(p)$ is the elasticity of demand at price $p$. Thus, when
$\varepsilon(p)/p$ is increasing/decreasing, $\log F(\bar{p}, t)$ has increasing differences in $(\bar{p}, -t)$ /
$(\bar{p}, t)$, and so the after-tax price $\bar{p}$ is decreasing/increasing in the tax. (For constant
$\varepsilon(p)/p$, which corresponds to demand functions of the form $D(p) = A e^{-Bp}$, the
after-tax price received by the monopolist does not depend on the tax.)
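The borderline case in Example 6 is easy to verify numerically. The sketch below assumes illustrative parameter values $A = 5$, $B = 0.5$, $c = 1$ (not from the notes) and checks by grid search that the after-tax price does not move with the tax when demand is exponential; analytically the optimum is $\bar{p} = c + 1/B$.

```python
import math

# Illustrative parameter values (assumptions, not from the notes).
A, B, c = 5.0, 0.5, 1.0

def demand(p):
    return A * math.exp(-B * p)           # D(p) = A e^{-Bp}

def profit(p_bar, t):
    # F(p_bar, t) = (p_bar - c) D(p_bar + t): consumers pay p_bar + t.
    return (p_bar - c) * demand(p_bar + t)

GRID = [i / 100 for i in range(0, 1001)]  # candidate after-tax prices 0.00 .. 10.00

def optimal_after_tax_price(t):
    return max(GRID, key=lambda pb: profit(pb, t))

after_tax_prices = [optimal_after_tax_price(t) for t in (0.0, 0.5, 1.0, 2.0)]
```

Here the tax only rescales profit by the factor $e^{-Bt}$, which is why the maximizer is the same for every $t$.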
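Definition 4 Function $F : X \times T \to \mathbb{R}$ with $X, T \subseteq \mathbb{R}$ satisfies the Single-Crossing
Condition (SCC) if for all $x, x' \in X$, $t, t' \in T$ such that $x' > x$ and $t' > t$,
$$F(x', t) \ge F(x, t) \;\Rightarrow\; F(x', t') \ge F(x, t') \quad \text{and} \quad F(x', t) > F(x, t) \;\Rightarrow\; F(x', t') > F(x, t').$$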
SCC can be understood as saying that when $x' > x$, the function $\Delta(t) =
F(x', t) - F(x, t)$ crosses the horizontal axis at most once, and from below
(although the function is allowed to stay zero on an interval). The second
implication in the definition of SCC is sometimes more useful in its contrapositive form,
$F(x', t') \le F(x, t') \Rightarrow F(x', t) \le F(x, t)$. Strict SCC strengthens SCC by
requiring that $\Delta(t)$ cannot be zero at more than one point. Note that (strict) SCC
is a relaxation of (strict) ID, which requires that $\Delta(t)$ be (strictly) increasing. In
contrast to ID, these conditions are purely ordinal: they only make ordinal
comparisons of the values of $F$ at different points, not cardinal comparisons (i.e., they ask
only whether $F$ is increased or decreased, not by how much), and so they are
invariant to strictly increasing transformations of the objective function. Also,
unlike ID, these conditions are not symmetric in $(x, t)$.
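To see the ordinal nature of SCC concretely, the following sketch checks ID and SCC by brute force on a finite grid. The functions `F` and `H` are hypothetical examples chosen for illustration: `H` is a strictly increasing transformation of `F`, which keeps SCC but destroys ID.

```python
from itertools import combinations

def has_id(F, xs, ts):
    """Increasing differences on a grid: F(x', t) - F(x, t) is nondecreasing in t for x' > x."""
    return all(F(x1, t1) - F(x0, t1) >= F(x1, t0) - F(x0, t0)
               for x0, x1 in combinations(xs, 2)     # x0 < x1 (xs is sorted)
               for t0, t1 in combinations(ts, 2))    # t0 < t1 (ts is sorted)

def has_scc(F, xs, ts):
    """Single-crossing condition on a grid (both the weak and the strict implication)."""
    for x0, x1 in combinations(xs, 2):
        for t0, t1 in combinations(ts, 2):
            d0 = F(x1, t0) - F(x0, t0)
            d1 = F(x1, t1) - F(x0, t1)
            if (d0 >= 0 and d1 < 0) or (d0 > 0 and d1 <= 0):
                return False
    return True

def F(x, t):
    return (x + 1) * (t + 1)          # illustrative function with ID

def H(x, t):
    return -1.0 / F(x, t)             # strictly increasing transformation of F (F > 0)

xs = ts = [0, 1, 2, 3]
checks = {
    "F has ID": has_id(F, xs, ts),
    "F has SCC": has_scc(F, xs, ts),
    "H has ID": has_id(H, xs, ts),
    "H has SCC": has_scc(H, xs, ts),
}
```

On this grid `F` satisfies both conditions, while `H` satisfies SCC only: its differences are always positive but shrink as `t` grows.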
Proof. We prove the first statement (the proof of the second statement is similar
and left as an exercise).
The “if” part: We want to show that when $t' > t$, $x \in X^*(t)$, and $x' \in X^*(t')$,
then $\min\{x, x'\} \in X^*(t)$ and $\max\{x, x'\} \in X^*(t')$. If $x \le x'$, then the statement
is trivial, so suppose $x > x'$, and so $\min\{x, x'\} = x'$ and $\max\{x, x'\} = x$.
Since $x \in X^*(t)$, we have $F(x, t) \ge F(x', t)$, but then by the first part of SCC
$F(x, t') \ge F(x', t')$, which in conjunction with $x' \in X^*(t')$ implies $x \in X^*(t')$.
Similarly, since $x' \in X^*(t')$, we have $F(x, t') \le F(x', t')$, but then by the
second part of SCC $F(x, t) \le F(x', t)$, which in conjunction with $x \in X^*(t)$
implies $x' \in X^*(t)$.
The “only if” part: Let $S = \{x, x'\}$ with $x' > x$, and $t' > t$. If $F(x', t) \ge
F(x, t)$, then $x' \in X^*(t)$. Since by assumption $X^*(t') \ge X^*(t)$ in the strong set
order, this implies $x' \in X^*(t')$ (indeed, otherwise $x \in X^*(t')$ and then again
$\max\{x, x'\} = x' \in X^*(t')$), and therefore $F(x', t') \ge F(x, t')$. Similarly, if
$F(x', t') \le F(x, t')$, then $x \in X^*(t')$. Since by assumption $X^*(t) \le X^*(t')$ in
the strong set order, this implies $x \in X^*(t)$ (indeed, otherwise $x' \in X^*(t)$ and
then again $\min\{x, x'\} = x \in X^*(t)$), and therefore $F(x', t) \le F(x, t)$.
While SCC is the “right” condition for MCS in the sense stated above, it has two
shortcomings: (i) it is difficult to check, since it can’t be verified by checking the
sign of some derivatives, and (ii) it does not ensure robustness to perturbations of
the objective function. Specifically, Milgrom-Shannon consider objective functions
of the form F (x, G (x) , t). For example, G(x) could be the monetary benefit (or
cost) of choosing action x, which is independent of the parameter t. In this setting,
the “right” property of F to ensure “robust” monotone comparative statics is that
F (x, G (x) , t) have SCC for any perturbation G.
With some extra assumptions about the shape of F , the “right” property takes
familiar forms in the following two cases.
Similarly, for strict SCC, we need $f(x', t) - f(x, t)$ to be strictly increasing, i.e., $f$
to have strict ID. Thus, the property of (strict) ID is the right condition to ensure
(strict) monotone comparative statics that is robust to additive perturbations of
the objective function.
Here $F$ takes the form $F(x, y, t) = f(x, t) \cdot y$, with the restriction $f(x, t), y > 0$.
Since SCC is invariant to strictly increasing transformations of values, we can check
that $\log[f(x, t) \cdot G(x)] = \log[f(x, t)] + \log[G(x)]$ has SCC for all positive
functions $G$. But this means that the function $\psi(x) = \log[G(x)]$ could be an
arbitrary function, and therefore the “right” condition is that the function $\log[f(x, t)]$
has ID. (This property of $f$ is also known as “log-supermodularity”.)
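The robustness claim for multiplicative perturbations can be illustrated on a grid. In the sketch below, `f` is a hypothetical log-supermodular function and `f_bad` a hypothetical function that is not log-supermodular; the perturbations `G` are random positive values, and the specific `G_bad` is constructed by hand. All names and numbers are assumptions made for illustration.

```python
import math
import random
from itertools import combinations

def has_scc(F, xs, ts):
    """Single-crossing condition checked by brute force on a grid."""
    for x0, x1 in combinations(xs, 2):
        for t0, t1 in combinations(ts, 2):
            d0 = F(x1, t0) - F(x0, t0)
            d1 = F(x1, t1) - F(x0, t1)
            if (d0 >= 0 and d1 < 0) or (d0 > 0 and d1 <= 0):
                return False
    return True

def f(x, t):
    return math.exp(x * t)       # log f = x*t has ID, so f is log-supermodular

def f_bad(x, t):
    return x + t + 1.0           # log f_bad has a negative cross-partial

xs = ts = [0, 1, 2, 3]
rng = random.Random(0)           # seeded for reproducibility

# SCC of f(x, t) * G(x) should survive every positive perturbation G.
robust = []
for _ in range(200):
    G = {x: rng.uniform(0.1, 10.0) for x in xs}
    robust.append(has_scc(lambda x, t, G=G: f(x, t) * G[x], xs, ts))

# For f_bad, one hand-picked positive G is enough to break SCC:
# at x = 0 vs x = 1 the difference is +0.25 at t = 0 but -0.5 at t = 1.
G_bad = {0: 1.75, 1: 1.0, 2: 1.0, 3: 1.0}
broken = has_scc(lambda x, t: f_bad(x, t) * G_bad[x], xs, ts)
```

The list `robust` should contain only `True`, while `broken` should be `False`.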
Proof. First we show the “if” part of the first statement, letting for
definiteness $F_y > 0$ (if $F_y < 0$ we can replace $y$ with $-y$). Denote by $\hat{y}(x \mid t, \alpha)$ the
value of $y \in Y$ satisfying $F(x, y, t) = \alpha$, which is at most unique under our
assumption. Thus $\hat{y}(x \mid t, \alpha)$ describes an isoquant of $F$, and $\hat{y}'(x \mid t, \alpha) =
-F_x(x, \hat{y}(x \mid t, \alpha), t) / F_y(x, \hat{y}(x \mid t, \alpha), t)$. Observe that when $t'' > t'$, writing
$\hat{y} = \hat{y}(x \mid t', \alpha)$ for brevity,
$$\begin{aligned}
\frac{d}{dx} F(x, \hat{y}(x \mid t', \alpha), t'')
&= F_x(x, \hat{y}, t'') + F_y(x, \hat{y}, t'')\, \hat{y}'(x \mid t', \alpha) \\
&= F_x(x, \hat{y}, t'') - F_y(x, \hat{y}, t'')\, \frac{F_x(x, \hat{y}, t')}{F_y(x, \hat{y}, t')} \\
&= F_y(x, \hat{y}, t'') \left[ \frac{F_x(x, \hat{y}, t'')}{F_y(x, \hat{y}, t'')} - \frac{F_x(x, \hat{y}, t')}{F_y(x, \hat{y}, t')} \right] \\
&\ge 0.
\end{aligned}$$
(In words, increasing $x$ while moving along the isoquant of type $t'$ benefits type
$t''$.)
Now, suppose that $x'' > x'$ and $F(x'', G(x''), t') \ge F(x', G(x'), t')$. Then,
since compensation is possible, there exist $y', y''$ such that $F(x'', y'', t') = F(x', y', t') \equiv
\alpha$, and furthermore since $F_y > 0$ we can choose them so that $y' \ge G(x')$ and $y'' \le
G(x'')$. Then, noting that $y'' = \hat{y}(x'' \mid t', \alpha)$ and $y' = \hat{y}(x' \mid t', \alpha)$ and using the
previous display, we have
$$F(x'', G(x''), t'') \ge F(x'', y'', t'') \ge F(x', y', t'') \ge F(x', G(x'), t'').$$
Similarly, starting from the premise $F(x'', G(x''), t') > F(x', G(x'), t')$, some of
the above inequalities become strict to yield the conclusion $F(x'', G(x''), t'') >
F(x', G(x'), t'')$. This establishes that $F$ has SCC. Similarly, the strict
Spence-Mirrlees condition yields strict SCC.
To see the “only if” part, note that if the weak Spence-Mirrlees condition fails,
then for some $t'' > t'$, $F_x(x, y, t'') / |F_y(x, y, t'')| < F_x(x, y, t') / |F_y(x, y, t')|$ at
some point $(x, y)$, and therefore by continuity on some open square $\bar{X} \times \bar{Y} \subseteq X \times Y$.
But this implies, by the strict “if” part, that $F(x, G(x), t)$ has strict SCC in $(x, -t)$
on $\bar{X} \times \{t', t''\}$ for all $G : \bar{X} \to \bar{Y}$, which contradicts SCC in $(x, t)$.
Remark 9 The “only if” statement for strict SCC is not true, as shown by
Edlin and Shannon (1998): the strict Spence-Mirrlees condition is not necessary
for $F(x, G(x), t)$ to satisfy strict SCC in $(x, t)$ for all $G : X \to Y$.
Remark 11 In the special case where $F$ takes the quasilinear form $F(x, y, t) =
f(x, t) + y$, $F_y \equiv 1$ and so the Spence-Mirrlees condition means that $f_x(x, t)$ is
nondecreasing in $t$, i.e., that $f$ has ID. This is consistent with the previous part
(robustness to additive perturbations) except that here it imposes smoothness of the
function.
Remark 12 The proofs of the “only if” results do not use all possible perturbation
functions $G$. Instead, any “sufficiently rich” family of functions $G$ that allows one to
assign arbitrary values at two given points $x', x''$ will do to obtain the necessity
of ID or of the Spence-Mirrlees condition in the respective setting. In particular,
it suffices to consider affine perturbation functions $G(x) = a + bx$ with arbitrary
parameters $a, b$. Thus, to have monotone comparative statics that is robust to such
perturbations, $F$ must satisfy appropriate conditions, which in turn ensures that
the comparative statics is robust to arbitrary perturbations $G$.
customer output q and price p in the firm’s profits. Thus, for any demand function
the firm might face, it would respond to a growing market by raising per customer
output and reducing price when its cost function is concave, and doing the reverse
when its cost function is convex.
$$\max_{(x_1, x_2) \in X \subseteq \mathbb{R}^2} F(x_1, x_2, t)$$
Univariate Topkis’s Theorem implies that if $F$ has ID in $(x_1, t)$, then the optimal
value of $x_1$ holding $x_2$ fixed is nondecreasing in $t$. Similarly, if $F$ has ID in $(x_2, t)$,
then the optimal value of $x_2$ holding $x_1$ fixed is nondecreasing in $t$. However, now
both variables are chosen simultaneously, and we need to think of indirect effects
(“feedbacks”) arising from the interaction between $x_1$ and $x_2$. For example, how
does the fact that $x_2$ increases in response to an increase in $t$ affect the optimal
value of $x_1$?
Intuitively, if we assume in addition that $F$ has ID in $(x_1, x_2)$, then the indirect
effects will work in the same direction as the direct effects. For example, under
this assumption, the fact that $x_2$ optimally increases in response to an increase in
$t$ further increases our incentive to raise $x_1$ (as in the picture below). So, in the
end, when all the feedbacks play out, both $x_1$ and $x_2$ are increased.
[Diagram: an increase in $t$ raises $x_1$ (+) and raises $x_2$ (+), and $x_1$ and $x_2$ reinforce each other (+).]
More generally, when $F$ has ID in all pairs of variables, all indirect effects will
reinforce the direct effects and each other. Formally, ID in all pairs of variables is
characterized by a property called “supermodularity,” which we now define.
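For $x, y \in \mathbb{R}^n$, define operations meet and join, respectively, as follows:
$$x \wedge y = (\min\{x_1, y_1\}, \ldots, \min\{x_n, y_n\}), \qquad x \vee y = (\max\{x_1, y_1\}, \ldots, \max\{x_n, y_n\}).$$
(They can also be called “greatest lower bound” and “least upper bound” of $\{x, y\}$,
respectively.) A set $X \subseteq \mathbb{R}^n$ is a sublattice if for all $x, y \in X$, we have $x \wedge y \in X$
and $x \vee y \in X$.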
Graphically, when $X \subseteq \mathbb{R}^2$, the sublattice property means that when two
non-ordered corners of a rectangle whose edges are parallel to the axes are in $X$, then
the other two corners are also in $X$. Intuitively, $X$ being a sublattice means
that the feasible set induces a (weak) complementarity in the dimensions of $x$:
if it is possible to increase [reduce] dimension $x_i$ of $x \in X$ (i.e., find $y \in X$
s.t. $y_i >$ [$<$] $x_i$), this can always be done without reducing [increasing] any other
dimension $x_j$, simply by going to $x \vee y$ [$x \wedge y$] (but sometimes this might involve
increasing [reducing] the other dimensions).
Here are some examples of sublattices:
In case (1), increasing one dimension does not affect the feasibility of increasing
another dimension, while in case (2) increasing dimension $i$ helps make an
increase in dimension $j$ feasible (and vice versa). In fact, it has been shown (by Topkis)
that any sublattice of $\mathbb{R}^n$ can be described as an intersection of sets of the form
(1) and (2).
For a set that is NOT a sublattice, take a consumer’s budget set: $\{(x_1, x_2) \in \mathbb{R}^2_+ : p_1 x_1 + p_2 x_2 \le w\}$,
where $p_1, p_2 > 0$ are prices of the two goods, and $w > 0$ is the consumer’s wealth.
Intuitively, here increasing $x_1$ might necessitate a reduction in $x_2$ so as to preserve
the consumer’s budget constraint.
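The sublattice property is easy to test by brute force on a finite grid. The sketch below uses componentwise `min` and `max` for meet and join; the three sets (`box`, `ordered`, `budget`) and their bounds are illustrative assumptions, with unit prices and wealth 4 chosen for the budget set.

```python
from itertools import product

def meet(x, y):
    """Componentwise minimum (greatest lower bound in the vector order)."""
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    """Componentwise maximum (least upper bound in the vector order)."""
    return tuple(max(a, b) for a, b in zip(x, y))

def is_sublattice(points):
    """True if the finite set of points is closed under meet and join."""
    pts = set(points)
    return all(meet(x, y) in pts and join(x, y) in pts for x in pts for y in pts)

grid = list(product(range(5), repeat=2))                # integer points in [0, 4]^2

box = [z for z in grid if z[0] <= 3 and z[1] <= 2]      # separate upper bounds
ordered = [z for z in grid if z[0] <= z[1]]             # constraint x1 <= x2
budget = [z for z in grid if z[0] + z[1] <= 4]          # budget set, prices (1, 1), wealth 4
```

For instance, `(4, 0)` and `(0, 4)` are both affordable in `budget`, but their join `(4, 4)` is not, so the budget set fails the test while the other two sets pass.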
Remark 13 We have defined “meet” and “join” operations on $\mathbb{R}^n$, but the
theory of supermodularity applies to any partially ordered set $X$ on which the “meet”
and “join” operations are defined as the infimum (greatest lower bound) and the
supremum (least upper bound), respectively. If these two operations are well-defined
within the set, it is called a “lattice,” and the study of such sets is called “lattice
theory.” We could use different lattices to examine monotone comparative statics
on choice sets other than subsets of $\mathbb{R}^n$ and/or in partial orders that are different
from the vector ordering on $\mathbb{R}^n$. To give one example, if $X$ is the set of all
subsets of some set $Y$, and the partial order on $X$ is given by set inclusion
($\subseteq$), the “meet” and “join” operations become the set intersection $\cap$ and set union
$\cup$ operations, respectively.
But this inequality means that the benefit of increasing the first argument from
$y_1$ to $x_1$ can only go up when the second argument increases from $x_2$ to $y_2$. Thus,
supermodularity is here implied by ID. In fact, when $X = X_1 \times X_2 \subseteq \mathbb{R}^2$, then
supermodularity on $X$ also implies ID, by writing, for each $x_1 > y_1$ and $y_2 > x_2$,
the supermodular inequality for $(x_1, x_2)$ and $(y_1, y_2)$. More generally, when $X$ is a
product set in $\mathbb{R}^n$ with $n \ge 2$, supermodularity is characterized by ID in each pair
of variables holding the others fixed:
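The equivalence between supermodularity and pairwise ID on a product set can be checked by brute force on a small grid. In the sketch below, the functions `F_super` and `F_mixed` are hypothetical three-variable examples; the checker names are assumptions made for illustration.

```python
from itertools import product, combinations

def meet(x, y):
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    return tuple(max(a, b) for a, b in zip(x, y))

def is_supermodular(F, points):
    """F(x v y) + F(x ^ y) >= F(x) + F(y) for all x, y in the (product) grid."""
    return all(F(join(x, y)) + F(meet(x, y)) >= F(x) + F(y)
               for x in points for y in points)

def has_pairwise_id(F, axis, n):
    """ID in every pair of coordinates (i, j), holding the other coordinates fixed."""
    for i, j in combinations(range(n), 2):
        for base in product(axis, repeat=n):
            for a, a2 in combinations(axis, 2):        # a < a2 in coordinate i
                for b, b2 in combinations(axis, 2):    # b < b2 in coordinate j
                    def at(u, v):
                        z = list(base)
                        z[i], z[j] = u, v
                        return F(tuple(z))
                    if at(a2, b2) - at(a, b2) < at(a2, b) - at(a, b):
                        return False
    return True

axis = [0, 1, 2]
points = list(product(axis, repeat=3))

def F_super(z):
    return z[0] * z[1] + z[1] * z[2] + z[0] * z[2]     # all cross effects positive

def F_mixed(z):
    return z[0] * z[1] - z[1] * z[2]                   # negative cross effect in (z2, z3)
```

On this grid the two checks agree: both hold for `F_super` and both fail for `F_mixed`.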
Proof: “Only if”: Take any $x_i, x_i' \in X_i$, $x_j, x_j' \in X_j$ with $x_i' > x_i$ and $x_j' > x_j$,
and $x_{-ij} \in X_{-ij} = \prod_{l \ne i,j} X_l$. Writing the supermodular inequality for $(x_i', x_j, x_{-ij})$
and $(x_i, x_j', x_{-ij})$ yields
The inequality is by ID (since $M_j \ge y_j$ and $x_j \ge m_j$ for all $j$). For the second
equality, note that for each $i$, either $M_i = y_i$ and $m_i = x_i$, or $M_i = x_i$ and $m_i = y_i$
(and in the latter case both differences are zero). QED
On the other hand, by the definition of meet and join and the supermodularity
inequality,
Hence, $F(x \wedge x', t) = F(x, t)$ and $F(x \vee x', t') = F(x', t')$. QED
In particular, if $X^*(t) = \{x\}$ and $X^*(t') = \{x'\}$, then the statement says
that $x \wedge x' = x$ and $x \vee x' = x'$, which means that $x \le x'$, i.e., the maximizer is
nondecreasing in $t$. In the general case, the conclusion of the theorem is often stated
as “$X^*(t)$ is nondecreasing in $t$ in the strong set order,” with the appropriate
definition of the strong set order: $A \le B$ in the strong set order if for all $a \in A$, $b \in
B$, we have $a \wedge b \in A$ and $a \vee b \in B$.
Note: The theorem also applies to the case $t' = t$, in which it says that the set
of maximizers $X^*(t)$ is a sublattice.
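The strong set order for finite sets of vectors can be written directly from its definition. The sketch below is a minimal illustration; the sets `A`, `B`, and `C` are hypothetical, and comparing a set with itself reproduces the sublattice test mentioned in the note.

```python
def meet(x, y):
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    return tuple(max(a, b) for a, b in zip(x, y))

def strong_set_leq(A, B):
    """A <= B in the strong set order: a ^ b in A and a v b in B for all a in A, b in B."""
    A, B = set(A), set(B)
    return all(meet(a, b) in A and join(a, b) in B for a in A for b in B)

A = {(0, 0), (1, 1)}
B = {(1, 1), (2, 2)}
C = {(1, 0), (0, 1)}        # two unordered points: not a sublattice

comparisons = {
    "A <= B": strong_set_leq(A, B),
    "B <= A": strong_set_leq(B, A),
    "A <= A": strong_set_leq(A, A),   # holds because A is a sublattice
    "C <= C": strong_set_leq(C, C),   # fails because C is not a sublattice
}
```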
2. Some inputs $S \subseteq \{1, \ldots, n\}$ are held fixed at some levels $z_S \in \mathbb{R}^S_+$ (which
could be interpreted as “short-run” optimization), or
The first result reported below is for the profit maximization problem with a
single output and all inputs free to vary.
Proof. Since $f(\cdot)$ is increasing and supermodular, the firm’s objective function
$p \cdot f(z) - w \cdot z$ is supermodular in $(z, p)$. Also, the choice set $\mathbb{R}^n_+$ is a lattice. So by
Topkis’s Monotonicity Theorem, $z(p, w)$ must be nondecreasing in $p$. Similarly, the
firm’s objective is also supermodular in $(z, -w)$. So $z(p, w)$ is nonincreasing in $w_i$.
Q.E.D.
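The comparative statics in this proof can be illustrated with a grid search. The sketch below assumes a hypothetical increasing and supermodular production function $f(z) = z_1^{0.3} z_2^{0.3}$ and illustrative prices; none of these numbers come from the notes.

```python
from itertools import product

GRID = [i / 4 for i in range(0, 41)]     # input levels 0, 0.25, ..., 10

def f(z):
    # Illustrative production function (an assumption): increasing and supermodular.
    return (z[0] ** 0.3) * (z[1] ** 0.3)

def input_demand(p, w):
    """Grid-search maximizer of p*f(z) - w.z over z in GRID x GRID."""
    return max(product(GRID, repeat=2),
               key=lambda z: p * f(z) - w[0] * z[0] - w[1] * z[1])

# Input demands as the output price rises (input prices held at (1.0, 1.3)).
by_price = [input_demand(p, (1.0, 1.3)) for p in (4.0, 6.0, 8.0)]

# Input demands as the price of input 1 rises (output price held at 6.0).
by_wage = [input_demand(6.0, (w1, 1.3)) for w1 in (1.0, 1.5, 2.0)]
```

Both inputs should move up along `by_price` and down along `by_wage`, the latter reflecting that a higher price of input 1 also lowers the demand for the complementary input 2.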
Proof. Left as an exercise. (Hint: Suppose that all but 2 inputs are fixed. How
does f vary in the remaining two inputs?) Q.E.D.
By Topkis’s Monotonicity Theorem, if $f$ is supermodular, then $z(p, w, x_S)$ is
nonincreasing in $w$, for all $S$, even without the assumptions that $f$ is increasing
and concave. So, the import of the proposition is that for the case of an increasing
concave production function $f$, inputs are complements in the strong sense defined
by the proposition if and only if $f$ is supermodular.
The substitutes case is similar for the two-input case. A function $f$ is called
submodular if $(-f)$ is supermodular.
Note well that this characterization applies only to the two-input case. With
more than two inputs, the following problem arises: submodularity of $f$ does not
ensure that the indirect effects do not counter each other. E.g., suppose $w_1$ goes up,
which makes the firm reduce $z_1$ (by the Law of Supply). By the univariate Topkis’s
theorem, this would lead the firm to raise each of $z_2$ and $z_3$ if the other were held
fixed. However, the increase in $z_2$ would then lead the firm to reduce $z_3$:
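[Diagram: $w_1 \uparrow$ leads to $z_1 \downarrow$, which raises $z_2$ (+) and raises $z_3$ (+), with an interaction between $z_2$ and $z_3$.]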
Proposition 21 Suppose the profit function $\pi(p, w)$ is differentiable in $w$. For
two inputs $i \ne j$, $z_i(p, w)$ is non-increasing [non-decreasing] in $w_j$ if and only if
the profit function $\pi(p, w)$ has increasing differences in $(w_i, w_j)$ [$(w_i, -w_j)$].
Proof. By Hotelling’s lemma, $z_i(p, w) = -\frac{\partial}{\partial w_i} \pi(p, w)$, which is
non-increasing [non-decreasing] in $w_j$ if and only if $\pi$ has increasing differences in
$(w_i, w_j)$ [$(w_i, -w_j)$]. Q.E.D.
policies are frequently used to forecast their long-run e↵ects, and such forecasts
can influence policymakers.
We begin our analysis with an example to show that the Samuelson-LeChatelier
principle does not apply to large price changes. Consider a single-output firm with
the production function $f(k, l) = 10$ if either $l \ge 2$ or $k, l \ge 1$, and $0$ otherwise.
Thus, the firm can produce 10 units of output either by using two units of labor,
or by using one unit of each input. Suppose that the output price is 1.
Suppose that initial long-run input prices are given by $w = (3, 2)$. At the
corresponding initial long-run optimum, the firm achieves its maximum profit by
buying two units of labor: $z^0 = (0, 2)$. Suppose that the price of labor rises to
6, so the new price vector is $w' = (3, 6)$. If the use of capital is fixed in the
short run at zero, the firm can no longer make a positive profit since $2 \cdot 6 > 10$,
so it shuts down, hence $z^{SR} = (0, 0)$. The firm’s long-run choice at price vector
$w'$ is using both capital and labor: $z^{LR} = (1, 1)$ (yielding a profit of 1). In this
example, when the price of labor rises from 2 to 6, the demand for labor changes
in the short run from $z_l^0 = 2$ to $z_l^{SR} = 0$, but then recovers in the long run to
$z_l^{LR} = 1$. So, the short-run change is larger than the long-run change, contrary
to the Samuelsonian conclusion. Although the production function may seem
unusual, it can be modified to be concave, smooth, and strictly increasing, yielding
the same input demand functions.
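The numbers in this counterexample can be verified by enumerating input plans. The sketch below restricts attention to small integer input levels, which is an assumption made for convenience (with this step production function, no other plan can do better); the function name `best_plan` is hypothetical.

```python
from itertools import product

def f(k, l):
    """Production function from the example: 10 units if l >= 2 or (k >= 1 and l >= 1)."""
    return 10 if (l >= 2 or (k >= 1 and l >= 1)) else 0

def best_plan(w, p=1, fixed_k=None):
    """Profit-maximizing (k, l) on a small integer grid; capital optionally held fixed."""
    ks = range(0, 4) if fixed_k is None else [fixed_k]
    plans = product(ks, range(0, 4))
    return max(plans, key=lambda z: p * f(*z) - w[0] * z[0] - w[1] * z[1])

z0 = best_plan((3, 2))                 # initial long-run optimum
z_sr = best_plan((3, 6), fixed_k=0)    # short run: capital stuck at its initial level 0
z_lr = best_plan((3, 6))               # new long-run optimum
```

Labor demand is the second component of each plan: it falls from 2 to 0 in the short run and then recovers to 1 in the long run.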
There is an interesting set of economic models in which it is always true that
long-run responses to price changes are larger than short run responses. Intuitively,
these are models in which a “positive feedbacks” argument applies, as follows.
Consider the profit-maximization problem for a single-output firm with two
inputs, capital and labor, and production function $f(k, l)$ in which the inputs are
substitutes, in the sense of submodularity ($f_{kl} \le 0$). Suppose that capital is fixed
in the short run. By the law of demand, if the wage increases, the firm will use
less labor both in the short run and in the long run. Since the two inputs are
substitutes, the increased wage implies an increased use of capital in the long run.
Since $f_{kl} \le 0$, the additional capital used in the long run will reduce the marginal
product of labor, so in the long run the firm will use still less labor. In summary,
the long-run effect is larger than the short-run effect because, in the short run, the
firm responds only to a higher wage, but in the long run, it responds both to a
higher wage and to an increased capital stock that reduces the marginal product of
labor. Graphically, the additional effect in this example can be represented by a
positive feedback loop.
Next, suppose that the two inputs, capital and labor, are complements, i.e.,
$f_{kl} \ge 0$. Again, by the law of demand, if the wage increases, the firm will use
less labor input, both in the short run and in the long run. Since the inputs are
complements, the increased wage implies a reduced use of capital in the long run.
Since $f_{kl} \ge 0$, the reduced capital used in the long run will reduce the marginal
product of labor, so in the long run the firm will use still less labor. Again, we
have a positive feedback loop.
The general positive feedback argument for two inputs (due to Milgrom and
Roberts (1996)) goes as follows. Let $X$ and $Y$ be lattices (for example, let $X =
Y = \mathbb{R}$). Define:
$$x(y, t) = \arg\max_{x \in X} F(x, y, t)$$
and
$$y(t) = \arg\max_{y \in Y} F(x(y, t), y, t).$$
Proof. By Topkis’s Theorem applied to $\max_{(x, y) \in X \times Y} F(x, y, t)$, the function $y(t)$ is
nondecreasing. Then, since $t' \ge t$, $y(t') \ge y(t)$. Similarly, by Topkis’s Theorem, the
function $x(y, t)$ is nondecreasing (in both arguments). The claims in the theorem
follow immediately from that and the inequalities $t' \ge t$ and $y(t') \ge y(t)$. Q.E.D.
Now let’s apply the result. If capital and labor are “complements” in the sense
that the production function $f(k, l)$ of the capital input $k$ and labor input $l$ is
supermodular, then we let $x = l$, $y = k$, and $t = -w_l$, where $w_l$ is the price of
labor. The firm’s objective function is
$$F(x, y, t) = p f(y, x) + t x - w_k y,$$
$$F(x, y, t) = p f(-y, x) + t x - w_k y,$$