Growth Theory
1. Classical Growth Theory: From Smith to Marx
When Adam Smith wrote his famous 1776 treatise, he called it An Inquiry into
the Nature and Causes of the Wealth of Nations. Some have taken this as
indicating that he was concerned primarily with economic growth. In this way,
Smith moved away from the Cantillon-Physiocratic system which concentrated
on "natural equilibrium" of circular flows, and brought back into economics
what had been the Mercantilists' pet concern.
Smith posited a supply-side driven model of growth. Succinctly, we can lay out
the story via the simplest of production functions:
Y = ƒ(L, K, T)
where Y is output, L is labor, K is capital and T is land.
Population growth, Smith proposed in the traditional manner of the time, was
endogenous: it depends on the sustenance available to accommodate the
increasing workforce. Investment was also endogenous: determined by the rate
of savings (mostly by capitalists); land growth was dependent on conquest of
new lands (e.g. colonization) or technological improvements of fertility of old
lands. Technological progress could also increase growth overall: Smith's
famous thesis that the division of labor (specialization) improves growth was a
fundamental argument. Smith also saw improvements in machinery and
international trade as engines of growth as they facilitated further
specialization.
Smith also believed that "division of labor is limited by the extent of the
market" - thus positing an economies of scale argument. As division of labor
increases output (increases "the extent of the market") it then induces the
possibility of further division of labor and thus further growth. Thus, Smith
argued, growth was self-reinforcing as it exhibited increasing returns to scale.
Finally, because savings of capitalists is what creates investment and hence
growth, he saw income distribution as being one of the most important
determinants of how fast (or slow) a nation would grow. However, savings is in
part determined by the profits of stock: as the capital stock of a country
increases, Smith posited, profit declines - not because of decreasing marginal
productivity, but rather because the competition of capitalists for workers will
bid wages up. So lowering the living standards of workers was another way to
maintain or improve growth (although the counter-effect would be to reduce
labor supply growth).
Despite increasing returns, Smith did not see growth as eternally rising: he
posited a ceiling (and floor) in the form of the "stationary state" where
population growth and capital accumulation were zero.
For Marx, wages were determined by bargaining between capitalists and workers,
and this process would be influenced by the number of unemployed laborers in
the economy (the "reserve army of labor", as he put it). Marx also saw profits
and "raw instinct" as the determinants of savings and capital accumulation.
Thus, contrary to Smith, he saw a declining rate of profit doing nothing to stem
capital accumulation and bring the stationary state about, but only as an
inducement for capitalists to further reduce wages and thus increase the
misery of labor.
Like the Classicals, Marx believed there was a declining rate of profit over the
long-term. The long-run tendency for the rate of profit to decline is brought
about not by competition increasing wages (as in Smith), nor by the
diminishing marginal productivity of land (as in Ricardo), but rather by the
"rising organic composition of capital".
Marx defined the "organic composition of capital" as the ratio of what he called
constant capital to variable capital. It is important to realize that constant
capital is not what we today call fixed capital, but rather circulating capital
such as raw materials. Marx's "variable capital" is defined as advances to
labor, i.e. total wage payments, or heuristically, v = wL (where w is wages and
L is labor employed).
The rate of profit, Marx claimed, is defined as:
r = s/(v+c)
where r is the rate of profit, s is the surplus, and (v+c) are total advances
(constant and variable). Surplus, s, is the amount of total output produced
above total advances, or s = y - (v+c), where y is total output. It is important
to note that for Marx only labor produces surplus value. This was to become a
sore point of debate between the Neo-Ricardians and the Neo-Marxians in later
years. Marx called the ratio of surplus to variable capital, s/v, the "exploitation
rate" (surplus produced for every dollar spent on labor).
Marx referred to the ratio of constant to variable capital, c/v, as the organic
composition of capital (which can be viewed as a sort of capital-labor
ratio). Notice that dividing numerator and denominator of r by v we obtain:
r = (s/v)/(1 + c/v) = (s/v)(v/(v+c))
so the rate of profit can be expressed as a positive function of the exploitation
rate (s/v) and a negative function of the organic composition of capital (c/v).
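The decomposition can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names and the figures (a surplus of 50, wage advances of 100, constant capital of 150) are hypothetical and not taken from Marx.

```python
def marx_profit_rate(s: float, v: float, c: float) -> float:
    """Rate of profit on total advances: r = s / (v + c)."""
    return s / (v + c)


def marx_profit_rate_from_ratios(exploitation: float, organic: float) -> float:
    """Same rate in terms of s/v and c/v: r = (s/v) / (1 + c/v)."""
    return exploitation / (1.0 + organic)


# Hypothetical figures, chosen only for illustration.
s, v, c = 50.0, 100.0, 150.0
r_direct = marx_profit_rate(s, v, c)
r_ratios = marx_profit_rate_from_ratios(s / v, c / v)

# Hold the exploitation rate at 0.5 and let the organic composition rise.
organic_values = (1.0, 1.5, 2.0, 3.0)
falling_rates = [marx_profit_rate_from_ratios(0.5, q) for q in organic_values]
```

Both routes give the same r, and with s/v held fixed each increase in c/v lowers the computed rate.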
Marx then argued that the exploitation rate (s/v) tended to be fixed, while the
organic composition of capital (c/v) tended to rise over time, thus the rate of
profit has a tendency to decline. Why? The basic logic can be thought of as
follows. For simplicity, assume a static economy (no labor supply growth). As
the surplus accrues to capitalists and, necessarily, capitalists invest that
surplus into expanding production, then output will rise over time while the
labor supply remains constant. Thus, the labor market gets gradually "tighter"
and so wages will rise. Thus, v (= wL) rises and r falls.
But this decline in r is temporary. There are forces at work which will restore
the profit rate. What are these forces? In Cantillon, Smith, et al., a rise in wages
would induce population growth which would then loosen the labor markets
and bring wages back down again. Marx does not accept this story. For Marx,
wages are set by "bargaining" in the labor market. Thus, there is no "extra
supply of labor" being encouraged by the higher wages. However, Marx
argued, capitalists can boost their profit rate back up by introducing labor-
saving machinery into production -- thereby releasing labor into
unemployment.
There are two effects of this. Firstly, notice that v declines because labor (L) is
released. But, concurrently, the employment of machinery implies that
constant capital, c, rises. Thus, the introduction of labor-saving machinery
does not seem to change anything: the fall in v from less labor is counteracted
by the rise in c, so it seems that c/v stays constant. This is where the second
effect kicks in: the concurrent expansion in the unemployed -- the "reserve
army of labor" -- will, by itself, influence the labor bargaining process and
reduce wages down to subsistence. Thus v declines further. So, on the whole,
the net effect of a labor-saving technology is to raise c/v, i.e. to reduce the rate
of profit.
But notice that v declines further because labor is released. So, both the w and
the L part of v = wL decline. But, concurrently, the employment of machinery
implies that constant capital rises, thus c rises. Thus, the fall in L is
counteracted by the rise in c, so that, on the whole, v
So, in sum, the organic composition of capital, c/v, falls. Profits, consequently,
are increased.
Thus, the L part of v = wL declines and so r = s/(v+c) comes back up. There is
a double effect in that, of course, the release of labor is not automatically
absorbed by higher investment so that a "reserve army of labor" is created. In
this manner, at the bargaining table, firms will be at an advantage relative to
their employees, so that wages decline (or at least are prevented from rising
further).
But this is merely a temporary respite. Profits will be reinvested, output will
grow again, labor markets will tighten once more and the whole process will
repeat itself. The problem is that the second time around, there is less labor to
lay off. Recall, L was already reduced in the first round. Introducing more
machinery reduces L further -- and, via several rounds, further and further --
until there is hardly any more L that can be released. When the system gets to
the point that there are no more laborers to be fired, then there is nothing to
bring s/v back up. The profit rate declines and firms will begin going bankrupt.
The bankruptcy of firms means a sudden release of even more labor and
capital into the market, depressing prices tremendously. Firms which remain
active will thus be able to buy the bankrupt smaller firms and thus acquire
more labor and capital at very cheap rates -- indeed, cheaper than their proper
"value". This increases the
The unemployed, thus, act as a "reserve army of labor" and bring wages back
down to a manageable level.
However, the introduction of labor-saving capital and laying off of workers
means that c rises while v falls, i.e. the organic composition of capital rises. It
is easy to notice that a constant s/v and a rising c/v will necessarily reduce the
profit rate (to see this, just notice that r can be rewritten as: r =
(s/v)(v/(v+c))). Thus, there is a natural tendency for the rate of profit to fall.
One way to prevent this decline in r would be to increase the exploitation rate
in proportion to which variable capital declines relative to constant capital. The
manner of increasing the exploitation rate, Marx claimed, was up to the devilish
imagination of the capitalist. Technological progress in the form of machinery or
division of labor were not wholly beneficial ways of improving growth either.
Marx took on Ricardo's idea that machinery is labor-saving and leads to a
disproportional adjustment: the rate of release of labor does not accompany
the rate of re-absorption of that labor, so that there tends to be permanent
"technological" unemployment which can be used to bring down the wage. One
does not even need to undertake it: technological improvement is also a way
capitalists can increase their leverage over labor merely by threatening it with
mechanization. Marx likewise contended that division of labor was a way of
generating the "alienation" of the working classes and thus tying them more
dependently to the production process - thereby, again, reducing the
bargaining position of labor.
The issue of trade, another possible check to the decline in profit rate, was
seen by Marx as an inducement to produce on an even greater scale - thereby
increasing the organic composition of capital further (and reducing profit
quicker). The idea that trade with non-capitalist economies might prevent the
decline in the profit rate was left for later Marxians like Rosa Luxemburg
(1913) to propose in their theories of imperialism.
However, despite all their efforts, Marx claimed that there were social limits to
the extent to which capitalists could increase the exploitation rate, while no
such thing limited the growing organic composition of capital. Consequently,
Marx envisioned greater and greater cut-throat competition among
capitalists for that declining profit. Then a crisis occurs: large firms buy up the
small firms at cheaper rates (i.e. below their proper "value"), and thus the total
number of firms declines. This will boost the surplus value as firms can now
purchase capital at very cheap rates. The increasing tendency for capital to be
concentrated in fewer and fewer hands, combined with the greater misery of
labor, would culminate in ever greater "crises" which would destroy capitalism
as a whole. Marx had only temporary "stationary states", punctuating the
secular tendency to breakdown.
Marx's frightening vision did not carry over into Neoclassical theory. But then, it
is hard to say the early Neoclassicals had a substantial theory of growth at all.
The possible exception was Marshall, but even he improved little upon the
Classical system (of Smith and Ricardo, not Marx). That was only to be really
developed in later years.
Concern with growth was then largely confined to the German and English
Historical Schools, although these thinkers did little more than improve the
recording and collection of facts on economic history. They did explore, for
instance, institutional and cultural roots of productivity and factor changes
(especially regarding population growth and the social-cultural habits that
induced capital accumulation), but the essence of their systems amounted only to
footnotes to the Classical theory. The American Institutionalists also did little
beyond this - except that their massive empirical efforts on business cycles and
national income accounts might have spurred new interest in the
phenomenon of growth. Simon Kuznets, in particular, was instrumental in this
respect.
However, in the 1920s and 1930s, three new sets of stories emerged which
improved upon the Classical theory substantially. They all drew, to a good
extent, from Karl Marx's theoretical schema that had been channeled by a
European tradition that ran through Tugan-Baranovsky, Spiethoff and Aftalion.
Specifically, two themes ran through the new stories: firstly, that the economy
should be considered explicitly in its disaggregated, multi-sectoral structure;
secondly, the concept of a steady-state growth path is introduced as a
reference point for such an economy.
The entrepreneur's innovations drive development but their motive, as Marx
had argued, was "raw instinct" - profit-derived wealth being merely an "index"
of that instinct. Innovation, again as in Marx, was not wholly exogenous: quite
the contrary, competition for small profits "induced" entrepreneurs to innovate,
whereas uncompetitive periods with high profits were a brake on the rate of
innovation.
One may think of this as more of a cycle theory than a growth theory, but
Schumpeter claimed that there were ratchet effects in innovation so that
entrepreneurial-driven spurts of economic activity led to progressively higher
levels of income. And there is no long run need to slow down: unlike Ricardo,
Schumpeter claimed that there were no diminishing returns to innovation. The
only reason one may be driven towards a slower steady state is that all the
entrepreneurs in a generation might be already "used up".
John von Neumann, in particular, followed the Classical idea that surplus is the
determinant of growth but, contrary to the Classicals, did not concern himself
with any falling rates of profit. John von Neumann's concern was in the
formalization of steady-state growth, but without reference to any Classical
constraints that might bring the surplus down and bring the economy to a
stationary state without growth. To some extent, this was due to the fact that,
as a mathematician, von Neumann abstracted much from the "social
considerations" that often went into the Classical theories, i.e. he did not
concern himself with possible resource constraints presented by land, or
changes in fertility, or "entrepreneurial" behavior or any other such concepts.
His exercise was a thoroughly mathematical one -- foreshadowing the later
formalization of the Classical theory by Sraffa and Leontief in many ways.
As a result, the dynamic models of Gustav Cassel and John von Neumann have
a multi-sectoral, steady-state growth rate which they saw as
perpetual and constant. They identified the rate of growth to be identical to
the rate of profit - the "Golden Rule" already implicit in the Classicals,
Schumpeter and Walras. For more on all this, see our reviews of the Walras-
Cassel model and the von Neumann system.
The final set of growth theories that emerged in the 1920s and 1930s are the
"structural" theories of growth developed by the Soviet economist Grigory
Fel'dman (1928) and the Kiel School (e.g. Adolph Lowe (1926, 1954, 1976),
Fritz Burchardt (1928, 1931), Alfred Kähler (1933), Emil Lederer (1931), Hans
Neisser (1933, 1942), Wassily Leontief (1941)). They effectively take the story
up where Cassel-von Neumann drop off. They are more explicitly indebted to
Marx's theory, particularly his schema of extended reproduction and his
recognition of technological unemployment.
The Kiel School was particularly interested in what happens off the steady-state
path. They focus on technological change as the big crucial variable that is
constantly leading to increases in the rate of return on capital and thus higher
investment. The difference is that the resulting growth is not steady, but
rather "disproportionate". For instance, after technical progress, investment
goods sector output increases while that of consumer goods industries lags
behind, leading thereby to changes in relative prices during the process of
adjustment. These changes in relative prices can lead to technological
unemployment in certain industries (e.g. consumer goods) while growth
proceeds at bursting speed in others. There is, as Neisser expressed it, "a
race" between technical displacement of labor in some sectors and the rate of
absorption of labor in other sectors from capital accumulation. Notice that
"traditional" recipes for curing unemployment, e.g. lowering wages or
stimulating demand, will not affect the technological unemployment problem
as these are aggregate measures, not designed for the specific sectoral
problems. As Lederer noted:
Several aspects of the structural theories of growth of the Kiel School were
absorbed by dynamic input-output models (e.g. Leontief, 1953; Samuelson
and Solow, 1953; Morishima, 1964, 1973). They were more directly
influential on the development of Friedrich von Hayek's (1928, 1931) theory of
macrofluctuations and John Hicks's (1973) theory of the disequilibrium
growth "traverse".
2. Keynesian Growth: The Cambridge Version
The issue is that Keynes did not extend his theory of demand-determined
equilibrium into a theory of growth. This was left for the Cambridge Keynesians
to explore. The first to come up with an extension was Sir Roy F. Harrod who
(concurrently with Evsey Domar) introduced the "Harrod-Domar" Model of
growth (Harrod in 1939, Domar in 1946).
Y = (1/s)I
I/Y = (I/K)(K/Y) = gv
But recall our goods market equilibrium term from the multiplier, i.e. Y = (1/s)I
which can be rewritten I/Y = s. Thus, the condition for full employment steady-
state growth is gv = s, or simply:
g = s/v
Thus, s/v is the "warranted growth rate" of output. However, Harrod and Domar
originally held s and v as constants - determined by institutional structures.
This gives rise to the famous Harrodian "knife-edge": if actual growth is slower
than the warranted rate, then effectively we are claiming that excess capacity
is being generated, i.e. the growth of an economy's productive capacity is
outstripping aggregate demand growth. This excess capacity will itself induce
firms to invest less - but, then, that decline in investment will itself reduce
demand growth further - and thus, in the next period, even greater excess
capacity is generated.
Similarly, if actual growth is faster than the warranted growth rate, then
demand growth is outstripping the economy's productive capacity. Insufficient
capacity implies that entrepreneurs will try to increase capacity through
investment - but that itself is a demand increase, making the shortage
even more acute. With demand always one step ahead of supply, the Harrod-
Domar model guarantees that unless we have demand growth and output
growth at exactly the same rate, i.e. demand is growing at the warranted rate,
then the economy will either explode or collapse indefinitely.
The "knife-edge", thus, means that the steady-state growth path is unstable:
steady growth is sustained only if the actual growth rate is equal to s/v
permanently. Any slight shock that leads actual growth to deviate from this
path ensures that we will not gravitate back towards that path but will
rather move further away from it.
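The knife-edge can be illustrated with a toy simulation. This is a minimal sketch, not Harrod's own formalization: the linear reaction rule (firms widen any gap between actual and warranted growth by a fixed fraction each period) and the parameter values are assumptions made purely for illustration.

```python
def warranted_rate(s: float, v: float) -> float:
    """Warranted growth rate g = s/v (savings rate over capital-output ratio)."""
    return s / v


def simulate_growth(g0: float, s: float, v: float,
                    reaction: float = 0.5, periods: int = 10) -> list:
    """Path of actual growth under a hypothetical reaction rule.

    Assumed rule: each period the gap between actual and warranted growth
    widens by the fraction `reaction`, mimicking the cumulative response of
    investment to excess or insufficient capacity.
    """
    g_w = warranted_rate(s, v)
    path = [g0]
    for _ in range(periods):
        g = path[-1]
        path.append(g + reaction * (g - g_w))
    return path


s, v = 0.2, 4.0                       # hypothetical values, so g_w is 5%
g_w = warranted_rate(s, v)
on_edge = simulate_growth(g_w, s, v)  # starts exactly on the warranted path
below = simulate_growth(g_w - 0.001, s, v)
above = simulate_growth(g_w + 0.001, s, v)
```

A path that starts exactly at s/v stays there, while a deviation of a tenth of a percentage point in either direction grows every period.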
S = sP + s'W
For goods market equilibrium, it must be that investment is equal to savings, I
= S. Following the Keynesian axiom that investment is independent, then
investment determines savings (or, to word it differently, aggregate demand
determines aggregate supply). However, as noted profits are positively related
to savings. Hence, by substitution:
I = sP + s'(Y - P)
Dividing through by Y and solving for the profit share:
P/Y = [1/(s - s')](I/Y) - s'/(s - s')
In other words, given the marginal propensities to save of each class, the
relative size of profits in income is dependent only on the investment decision,
I/Y. Naturally, the more investment, the greater the necessary slice profit takes
out of income.
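Rearranging I = sP + s'(Y - P) gives the profit share as P/Y = (I/Y - s')/(s - s'). The sketch below evaluates that expression; the function name and the numbers are hypothetical, and it assumes capitalists save a larger fraction than workers (s > s').

```python
def kaldor_profit_share(investment_share: float, s_cap: float,
                        s_work: float = 0.0) -> float:
    """Profit share P/Y implied by I = s*P + s'*(Y - P).

    investment_share is I/Y, s_cap is s and s_work is s'.
    """
    if s_cap <= s_work:
        raise ValueError("requires s > s' (capitalists save more than workers)")
    return (investment_share - s_work) / (s_cap - s_work)


# Hypothetical economy: investment is 20% of income.
share_general = kaldor_profit_share(0.20, s_cap=0.5, s_work=0.1)
share_classical = kaldor_profit_share(0.20, s_cap=0.5)  # workers save nothing

# A higher investment share requires a higher profit share.
share_boom = kaldor_profit_share(0.25, s_cap=0.5, s_work=0.1)
```

Setting s' = 0 reduces the expression to P/Y = (1/s)(I/Y).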
If we assume workers save nothing, so that s' = 0, then we quickly reach the
conclusion that:
P/Y = (1/s)I/Y
where P/Y depends on I/Y. Note that this is reminiscent of Keynes' famous
"widow's cruse" remark:
What about growth? Recall that I/Y = (I/K)(K/Y), where I/K is the rate of capital
accumulation (equal to the rate of growth of productive capacity, g) and K/Y is
the capital-output ratio (v). Thus, we can write I/Y = gv.
Now recall Kaldor's relationship, P/Y = (1/s)I/Y. Thus:
P/Y = gv/s
or, as v = K/Y, multiplying both sides by Y/K and rearranging:
g = s(P/K)
But we should note that the ratio P/K is merely the rate of profit, r. Calling it
thus, we can rewrite:
r = g/s
i.e. the rate of profit is equal to the growth rate divided by the savings rate of
capitalists - which is also known as the "Cambridge rule" for growth. In a von
Neumann model, recall, workers consume everything (as here), but he also
has it that capitalists save everything (so s = 1). But note that in this case, we
have r = g, or "Golden Rule" growth. Thus, we immediately see the affinity
between Cambridge growth models and von Neumann growth models.
Morishima's (1960, 1964) extension of von Neumann models which allowed for
capitalist consumption produces precisely this "Cambridge rule" for von
Neumann.
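The chain from P/Y = gv/s to r = g/s can be traced numerically. In this sketch the function names and parameter values are hypothetical; the point is only that the capital-output ratio v cancels out of the profit rate.

```python
def cambridge_profit_share(g: float, v: float, s_cap: float) -> float:
    """Profit share P/Y = g*v/s when workers save nothing."""
    return g * v / s_cap


def cambridge_profit_rate(g: float, v: float, s_cap: float) -> float:
    """Profit rate r = P/K = (P/Y)/(K/Y), which reduces to g/s."""
    return cambridge_profit_share(g, v, s_cap) / v


# Hypothetical values: 3% growth, capitalists save 60% of profits.
r_low_v = cambridge_profit_rate(0.03, v=3.0, s_cap=0.6)
r_high_v = cambridge_profit_rate(0.03, v=5.0, s_cap=0.6)

# von Neumann case: capitalists save everything, so r equals g.
r_von_neumann = cambridge_profit_rate(0.03, v=3.0, s_cap=1.0)
```

The profit share depends on v, but the profit rate does not.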
However, we know from the Kaldor relationship, P/Y = (1/s)I/Y or r = g/s, that
profits are themselves generated by investment. Thus, Robinson's question can
be asked: when is it true that the profits generated by the investment in the
Kaldor relationship will themselves generate investment decisions that, in turn,
generate the original profits? Alternatively, what is there that guarantees that
the profits generated by the Kaldor relationship will themselves generate the
amount of investment needed to sustain them? This is a question of stability.
Consider Robinson's (1962: p.48) diagram of the concave Kalecki function and the
linear increasing risk function. Assuming all is well, then
we should have two equilibria where rs = g = f(r). Consider the rightmost
equilibrium first. To the right of that equilibrium, Robinson posited that the
economy was generating less profits than planned and thus investment plans
will be shelved, inducing deaccumulation of capital and hence reducing growth.
To the immediate left of it, the economy is generating more profits than
planned, and thus firms will revise their expectations upwards and invest more,
thereby increasing accumulation and growth. Hence, the right equilibrium is
stable. A similar exercise will show that the left equilibrium is, for the same
reasons, unstable.
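The stability argument can be mimicked with a simple iteration. Everything specific here is an assumption for illustration: the concave form f(r) = 0.25*sqrt(r) - 0.02, the value s = 0.5, and the adjustment rule in which the realized profit rate g/s becomes next period's expected rate. None of this is Robinson's own specification. With these numbers the two equilibria are r = 0.01 and r = 0.16.

```python
import math

S_CAP = 0.5  # hypothetical capitalist propensity to save


def desired_accumulation(r: float) -> float:
    """Hypothetical concave function g = f(r) of the expected profit rate."""
    return 0.25 * math.sqrt(r) - 0.02


def realized_profit_rate(g: float, s: float = S_CAP) -> float:
    """Profit rate generated by accumulation g via the relation r = g/s."""
    return g / s


def iterate_expectations(r0: float, steps: int = 200) -> float:
    """Feed the realized profit rate back in as the expected rate."""
    r = r0
    for _ in range(steps):
        g = max(0.0, desired_accumulation(r))  # accumulation floored at zero
        r = realized_profit_rate(g)
    return r


from_between = iterate_expectations(0.05)    # starts between the equilibria
from_above = iterate_expectations(0.30)      # starts right of the high one
from_just_below = iterate_expectations(0.009)  # starts left of the low one
```

Starting points on either side of the high equilibrium are drawn to it, while a start just below the low equilibrium drifts away from it.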
Robinson (1962) went on to enrich her analysis by introducing labor growth and
to consider the implications of including unemployment and inflation and the
method of adjustment explicitly in the model. She discusses the various types
of growth situations that could be encountered - Golden Rule and otherwise.
It is necessary that workers be paid a rate of interest on their capital just in the
same manner as capitalists receive a rate of profit on theirs. By competition
and arbitrage, Pasinetti argued that the rate of profit/interest for both capitalist
and workers on their capital is equalized. Or:
P/K = P'/K' = r
where P' is workers' profits. For savings, let S be capitalist savings and S'
worker savings out of profits. Therefore, for steady state growth:
S/K = S'/K' = g
In the long run, for a steady state, the rate of accumulation must
be equal for both capitalists and workers, i.e.
P/S = P'/S'
otherwise, if the rate of wealth accumulation is faster for either of the classes,
then there will be a change in distribution and, as a result, a change in the
composition of aggregate demand. In long-run equilibrium, aggregate demand
must be stable; therefore, this is a necessary assumption.
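Writing capitalist savings as S = sP and worker savings as S' = s'(W + P'),
this condition becomes:
P/(sP) = P'/(s'(W + P'))
where s and s' are the marginal propensities to save of capitalists and workers.
Note again that workers save out of wages, W, as well as profits, P',
whereas capitalists only receive and save out of profits. Cross-multiplying:
sP' = s'(W + P')
Investment must equal total savings, so:
I = s'(W + P') + sP = sP' + sP = s(P + P')
Letting P* = P + P' denote total profits, this gives:
P* = (1/s)I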
So it must be that:
r = (P*/K) = (1/s)I/K = g/s
i.e., for long run Golden Rule steady-state growth, only the capitalist's
propensity to save needs to be considered - workers' saving propensities can
be dropped by the wayside. Thus, even with worker savings, the "Cambridge
rule" is iron-clad. Only capitalists' savings propensity matters. As Pasinetti
notes:
"In the long run, workers' propensity to save, though influencing the distribution
of income between capitalists and workers, does not influence the distribution of
income between profits and wages. Nor does it have any influence on the rate of
profit!" (L.Pasinetti, 1962)
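Pasinetti's claim can be checked with a small numerical steady state. The sketch below is an illustration under stated assumptions: it takes the Cambridge rule r = g/s as given, uses the steady-state condition S'/K' = g with S' = s'(W + rK') to solve for the workers' capital K', and uses hypothetical numbers throughout (the function and key names are also mine).

```python
def pasinetti_steady_state(s_cap: float, s_work: float, g: float,
                           capital: float, output: float) -> dict:
    """Steady-state split of capital between capitalists and workers."""
    r = g / s_cap                         # Cambridge rule
    profits = r * capital                 # total profits P*
    wages = output - profits
    # Workers' capital from s'(W + r*K') = g*K'.
    k_work = s_work * wages / (g - s_work * r)
    if not 0.0 < k_work < capital:
        raise ValueError("no steady state with both classes owning capital")
    k_cap = capital - k_work
    savings_cap = s_cap * r * k_cap
    savings_work = s_work * (wages + r * k_work)
    return {
        "r": r,
        "k_work": k_work,
        "k_cap": k_cap,
        "total_savings": savings_cap + savings_work,
        "worker_accumulation": savings_work / k_work,
    }


# Hypothetical economy: K = 100, Y = 40, growth of 4%, capitalists save 40%.
thrifty = pasinetti_steady_state(0.4, 0.05, 0.04, capital=100.0, output=40.0)
spendthrift = pasinetti_steady_state(0.4, 0.02, 0.04, capital=100.0, output=40.0)
```

Changing s' alters how much of the capital stock workers own, but the profit rate is the same in both cases and total savings still equal the required investment gK.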
But there were important assumptions in the model yet undiscussed. Pasinetti
posits one of his conditions to guarantee existence to be:
so that profits cannot take "a null or negative share of wages" (Pasinetti, 1962).
This, in essence, defines the mechanism for adjustment. Distribution must
somehow be organized such that there will be a "correct" level of profits to give
us the savings necessary to be in equilibrium, i.e. to make I/K = s/v. The
question that must be asked here is not only whether one can calculate, for a
given investment level, what the profit level will be, but also whether there
are pressures that might bring this into equilibrium. The greatest difficulty in
this model remains the adjustment towards the steady-state path: how do profits
adjust so that one will achieve the steady-state savings rate? Within certain
limits, Kaldor argues, variations can take place such that P/Y is a function of
the change in the I/Y ratio. According to Kaldor, prices respond to relative
money wage rates as a consequence of demand. Assume, for instance, that given
an excess demand for goods, prices will increase but not wages. As a
consequence, there is a shift in distribution such that there will be an
increase in the profit share. Since profits increase, this implies there will be
a substantial growth in savings.
However, as J.E. Meade (1961, 1963, 1966) points out, if prices rise relative to
wages, then the real wage decreases. By substitution between capital and
labor, there will be a change in the capital-output ratio (v). Therefore, for
Kaldorian adjustment to be applied, there is an implicit dependence on a
constant capital-output ratio. However, a constant v necessarily means that
we cannot be in long-run equilibrium since technique would otherwise be
entirely flexible. One can perhaps regard it as a vintage model, but here prices
would have to change faster than wages.
But a more general criticism can be made. We can note that given a stock of
capital, labor and output, if prices move faster than wages, then profits will
increase whereas if wages move faster than prices, then profits will fall -
without changing techniques. The variety of consequences of this has led
several economists, such as Meade (1961) and, later, Nell (1982), to argue that
at least for a long-run model, Kaldor's theory has a rather poor price-
adjustment mechanism. "Mr. Kaldor's theory of distribution is more appropriate
for the explanation of short-run inflation than of long-run growth." (Meade,
1961: x).
3. Neoclassical Growth
3.1.4.1. The Cobb-Douglas Solution
3.1.4.2. The General Solution
3.1.5. The Adjustment Process: Solow vs. Harrod
3.2. Empirical Implications
3.2.1. Introduction
3.2.2. The Solow Paradox
3.2.3. The Convergence Hypothesis
3.2.3.1. Absolute Convergence
3.2.3.2. Conditional Convergence
3.2.4. Poverty Traps
3.2.4.1. The Technological Trap
3.2.4.2. The Population Trap
3.3. Technical Change
3.3.1. Adding Technical Change
3.3.2. Empirical Implications
3.4. References
3.1.1. Introduction
In the Harrod-Domar growth model, steady-state growth was unstable. In the
popular term of the day, it was a "knife-edge" in the sense that any deviation
from that path would result in a further move away from that path. However,
Robert M. Solow (1956), Trevor Swan (1956) and, a bit later, James E. Meade
(1961) contested this conclusion. They claimed that the capital-output ratio of
the Harrod-Domar model should not be regarded as exogenous. In fact, they
proposed a growth model where the capital-output ratio, v, was precisely the
adjusting variable that would lead a system back to its steady-state growth
path, i.e. that v would move to bring s/v into equality with the natural rate of
growth (n). The resulting model has become famously known as the "Solow-
Swan" or simply the "Neoclassical" growth model.
A brief word or two on historical precedence is warranted. James Tobin (1955)
introduced a growth model similar to Solow-Swan which also included money
(and was thus a predecessor of monetary growth theory). However, Tobin did
not solve explicitly for the stability of the steady-state. Also, it has become
increasingly common to credit Jan Tinbergen (1942) as presenting effectively
the same model as Solow-Swan, including even empirical estimates of the
relevant coefficients. Finally, Harold Pilvin (1953), before Solow, had argued
that the Harrodian knife-edge problem could be resolved if a flexible capital-
output ratio were introduced, but he did not formulate the concept of a steady-
state. Harrod's (1953) response to Pilvin is quite instructive in stressing that
flexible technology does not, in fact, resolve the Harrodian knife-edge as he
originally conceived it. But more on this later.
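Savings are assumed to be a constant fraction, s, of income:
S = sY
In macroeconomic equilibrium, investment equals savings, so:
I = sY
Dividing through by the labor force, L:
I/L = s(Y/L)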
so, letting i = I/L and y = Y/L, we see that the macroeconomic equilibrium
condition becomes:
i = sy
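Output is produced from capital and labor according to the production function:
Y = F(K, L)
Assuming constant returns to scale, we can divide through by L to obtain:
Y/L = F(K/L, 1)
or, letting k = K/L denote the capital-labor ratio:
y = ƒ(k)
where ƒ(·) is the "intensive" or "per capita" form of the production function F(·).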
As a result, the macroeconomic equilibrium condition can be rewritten as:
i = sƒ(k)
Fig. 1 - Intensive Production Function
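Suppose labor grows at the natural rate n:
g_L = (dL/dt)/L = n
For the capital-labor ratio to stay constant, capital must grow at the same rate:
g_K^r = (dK/dt)/K = n
where we have attached the superscript "r" to indicate that this is the required
growth rate of capital to keep the capital-labor ratio, k, steady. As investment is
defined as I = dK/dt, then we can rewrite this as:
I^r = nK
or, dividing through by L, in per capita terms:
i^r = nk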
Recall that Cassel's definition of a steady state requires
"proportional" growth in such a manner that there are no induced changes in relative
prices over time. It is obvious (from Figure 1, for instance) that a change in k
will change the marginal products of capital and labor. Assuming the marginal
productivity theory of distribution holds, so that capital and labor are, in
equilibrium, priced at their marginal products, then if we allow k to change over
time in our solution, then we are also allowing relative factor prices to fluctuate
over time. As Cassel's definition of a "steady-state" growth equilibrium does not
allow this, then, consequently, we must focus on getting k to stay constant.
We can depict the steady-state k in Figure 2, by superimposing the required
investment function, i^r = nk, on top of our old diagram. Notice that only at k* is
actual investment equal to required investment, i = i^r. At any other k, i ≠ i^r.
dk/dt = i - i^r
dk/dt = sƒ(k) - nk
i^r = (n+δ)k
as our required investment rate, where δ is the capital depreciation rate. The
fundamental Solowian differential equation needs to be rewritten as:
dk/dt = sƒ(k) - (n+δ)k
In terms of Figure 2, all that will happen when we insert capital depreciation is
that the required investment line i^r will become steeper (with slope n+δ) and
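so the steady-state ratio k* will be lower (see Figure 4). Notice, however, that
the steady-state growth rates of the level variables -- capital stock, output and
consumption -- remain n: gross investment must now also replace depreciated
capital, so required investment is (n+δ)k, but the capital stock itself still
grows only as fast as labor.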
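To see the effect of depreciation on the steady-state ratio concretely, here is a minimal sketch. It assumes the Cobb-Douglas form ƒ(k) = k^α, for which setting sƒ(k) = (n+δ)k gives k* = [s/(n+δ)]^(1/(1-α)). All parameter values are hypothetical.

```python
# Illustrative (hypothetical) parameter values.
ALPHA, S, N, DELTA = 0.3, 0.2, 0.02, 0.05

def k_star(s, n, delta=0.0, alpha=ALPHA):
    """Steady state of dk/dt = s k^alpha - (n + delta) k."""
    return (s / (n + delta)) ** (1.0 / (1.0 - alpha))

def dk_dt(k, s, n, delta=0.0, alpha=ALPHA):
    """Right-hand side of the fundamental equation with depreciation."""
    return s * k**alpha - (n + delta) * k

k_no_dep = k_star(S, N)        # steady state without depreciation
k_dep = k_star(S, N, DELTA)    # steady state with depreciation

print(k_dep < k_no_dep)                        # the steeper line lowers k*
print(abs(dk_dt(k_dep, S, N, DELTA)) < 1e-9)   # dk/dt = 0 at the new k*
```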
The explicit solution to the fundamental Solowian differential equation, dk/dt =
sƒ (k) - nk, is particularly easy if we assume a specific form for the production
function. Let us assume that F(K, L) is Cobb-Douglas, so that:
Y = K^α L^(1-α)
y = k^α
dk/dt = sk^α - nk
which is a non-linear differential equation (see earlier Figure 3). To resolve this,
we can linearize it by defining a new term z = k^(1-α), which upon differentiation
with respect to t yields:
dz/dt = (1-α)(dk/dt)/k^α
(dk/dt)/k^α = s - nk/k^α
Now, as k/k^α = k^(1-α) = z by definition, then this equation can be expressed as:
dz/dt = (1-α)(s - nz)
which is linear in z and has the solution:
z(t) = Ce^(-(1-α)nt) + z*
where C is a constant of integration and z* is the steady-state value of z, at
which dz/dt = 0, so:
z* = s/n.
Thus:
z(t) = Ce^(-(1-α)nt) + s/n
To translate this to obtain the solution k(t), we just re-transform it back, i.e. as z
= k^(1-α), then k = z^(1/(1-α)). So:
k(t) = [Ce^(-(1-α)nt) + s/n]^(1/(1-α))
To decipher C, we need to assume some initial value of k, call it k(0) = k0. So,
when t = 0:
k(0) = [C + s/n]^(1/(1-α)) = k0
so, rearranging:
C = k0^(1-α) - s/n.
thus, written out in full, the solution to the Solowian differential equation is:
k(t) = {[k0^(1-α) - s/n]e^(-(1-α)nt) + s/n}^(1/(1-α))
Now, as n > 0 and 0 < α < 1 by assumption, then it is evident that this
equation is stable, i.e. [k0^(1-α) - s/n]e^(-(1-α)nt) → 0 as t → ∞, so that in the end, k(t)
→ k*, where:
k* = (s/n)^(1/(1-α))
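The closed-form solution can be checked against a direct numerical integration of dk/dt = sk^α - nk. The sketch below is only an illustration: the parameter values (s, n, α, k0) are hypothetical, and the fourth-order Runge-Kutta integrator is our choice of method, not something taken from the text.

```python
import math

def k_closed_form(t, k0, s, n, alpha):
    """k(t) = {[k0^(1-a) - s/n] e^(-(1-a) n t) + s/n}^(1/(1-a))."""
    z_star = s / n                       # steady-state value of z = k^(1-a)
    C = k0 ** (1 - alpha) - z_star       # constant fixed by the initial condition
    z = C * math.exp(-(1 - alpha) * n * t) + z_star
    return z ** (1 / (1 - alpha))

def k_numerical(t_end, k0, s, n, alpha, steps=20000):
    """Integrate dk/dt = s k^alpha - n k with classical fourth-order Runge-Kutta."""
    def rhs(k):
        return s * k ** alpha - n * k
    dt = t_end / steps
    k = k0
    for _ in range(steps):
        a = rhs(k)
        b = rhs(k + 0.5 * dt * a)
        c = rhs(k + 0.5 * dt * b)
        d = rhs(k + dt * c)
        k += dt * (a + 2 * b + 2 * c + d) / 6
    return k

# Hypothetical parameter values, chosen only for illustration.
s, n, alpha, k0 = 0.2, 0.02, 0.3, 1.0
k_star = (s / n) ** (1 / (1 - alpha))

for t in (10.0, 50.0, 200.0):
    print(t, k_closed_form(t, k0, s, n, alpha), k_numerical(t, k0, s, n, alpha))
print("k* =", k_star)
```

The two columns should agree closely at each date, and both approach k* as t grows.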
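Stability can also be checked by considering the function:
V(t) = [k(t) - k*]^2
where k* is the steady-state capital-labor ratio. Notice that if k(t) = k*, i.e. when
we are in steady-state, then V(t) = 0, but outside of steady-state, k(t) ≠ k*, so V(t) > 0. The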
system is considered "stable" if dV(t)/dt < 0, i.e. as time progresses, the
difference between k(t) and k* is reduced.
Now, let z = k(t) - k*, so that the function becomes V(t) = z^2. So differentiating
V(t) with respect to time:
dV(t)/dt = 2z·(dz/dt)
Now, the concavity of the production function implies that [ƒ(z+k*) - ƒ(k*)]/z ≤
ƒ′(k*), or ƒ(z+k*) ≤ ƒ(k*) + zƒ′(k*), so:
Since sƒ (k*) = nk* (by definition of steady-state equilibrium), then this reduces
to:
Notice that as we have assumed constant returns to scale, then the term [ƒ (k*)
- k*ƒ ′ (k*)] is the marginal product of labor, which is positive, while the term
[2nz^2/ƒ(k*)] is unambiguously non-negative, thus:
dV(t)/dt < 0
and thus we know that the steady-state capital-labor ratio k* is globally stable.
In other words, beginning from any capital-labor ratio (other than 0), we will
converge to the steady-state ratio k*.
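The stability claim can be illustrated numerically by tracking V(t) = [k(t) - k*]^2 along simulated paths. The sketch below assumes ƒ(k) = k^α with hypothetical parameter values and uses a simple Euler step. It is a check for two particular starting points, not a proof.

```python
def simulate_V(k0, s=0.2, n=0.02, alpha=0.3, dt=0.1, steps=2000):
    """Return V(t) = (k(t) - k*)^2 along an Euler path of dk/dt = s k^alpha - n k."""
    k_star = (s / n) ** (1 / (1 - alpha))
    k = k0
    V = []
    for _ in range(steps):
        V.append((k - k_star) ** 2)
        k += dt * (s * k ** alpha - n * k)
    return V

# One path starting below k* and one starting above it (both values hypothetical).
for k0 in (1.0, 60.0):
    V = simulate_V(k0)
    shrinking = all(later < earlier for earlier, later in zip(V, V[1:]))
    print(k0, shrinking)
```

In both cases the squared distance from k* falls at every step, as dV(t)/dt < 0 requires.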
The argument of the Solow-Swan growth model, as we have presented it,
seems straightforward enough. But it gives the impression that the reason k
adjusts to steady-state k* comes out merely from the "technical" aspects of the
model -- from the properties of a constant returns to scale production function
and national income accounts identities. If there is an "economic theory" in
there, it seems to be well-disguised.
But there is a lot of economics in it -- we just have to look at the details. From
the outset, the first thing we need to fix in our minds is that the production
function ƒ (k) is not merely a "technical" thing; it contains within it an intricate
and completely Neoclassical economic theory.
ƒ (k*)/k* = n/s
s/v* = n
which is the Harrod-Domar equilibrium for the second knife-edge. Now, let v1
be the capital-output ratio associated with the disequilibrium capital-labor ratio
k1 (in Figure 1). Obviously, ƒ (k1)/k1 = 1/v1. As n and s are constant and 1/v1 >
1/v* (slope of the ray associated with k1 is steeper than that associated with
k*), then we know 1/v1 > n/s, or simply:
s/v1 > n.
What is the exact economic mechanism that permits this? Nothing less than
assuming that the factor markets clear at all times and that the marginal
productivity theory of distribution holds. Underlying the Solowian production
function is a detailed factor market adjustment process. To see this in action,
examine Figure 2. For a constant returns to scale intensive production function,
at any k, the slope of a tangent ray is ƒ_k = F_K, the marginal product of
capital, while the intercept of the tangent ray on the vertical axis is merely
y - ƒ_k·k = F_L, the marginal product of labor.
Fig. 2 - Factor Price Adjustment
Let r denote the real rate of return on capital and let w denote the real wage.
By the Neoclassical theory of distribution, the marginal productivity of a factor
will constitute the demand for that factor. In equilibrium, factor demand equals
factor supply, and thus at the market-clearing factor prices, F_K = r and F_L = w.
Thus, in Figure 2, we denote the slope of the tangent ray as r and the intercept
on the vertical axis as w. One more thing can be deciphered from the diagram:
namely, the point where the tangent curve intersects the horizontal axis in the
left quadrant yields the factor price ratio, ω = w/r.
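The relationship between k and factor prices can be made concrete with a small sketch. It assumes ƒ(k) = k^α with a hypothetical α, so that r = ƒ′(k) = αk^(α-1) and w = ƒ(k) - kƒ′(k). The function names are ours, introduced only for illustration.

```python
ALPHA = 0.3  # hypothetical capital share

def f(k):
    """Intensive production function (assumed Cobb-Douglas)."""
    return k ** ALPHA

def r(k):
    """Real return on capital: slope of the tangent, f'(k)."""
    return ALPHA * k ** (ALPHA - 1)

def w(k):
    """Real wage: vertical intercept of the tangent, f(k) - k f'(k)."""
    return f(k) - k * r(k)

def omega(k):
    """Factor price ratio w/r."""
    return w(k) / r(k)

for k in (2.0, 4.0, 8.0):
    print(k, r(k), w(k), omega(k))
```

As k rises, r falls while w and ω = w/r rise, which is the factor price adjustment the diagram describes. In this Cobb-Douglas case ω = [(1-α)/α]k, so the factor price ratio is simply proportional to k.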
Now, suppose capital supply increases to K2 and labor supply to L2. Obviously,
as we see in Figure 3, capital increases a lot more than labor, so the new factor
supply position will be at point 2 = (L2, K2), and thus the capital-labor ratio is
higher (on ray k2). Now, suppose market prices do not adjust. In other words,
suppose that the wage-profit ratio remains ω1. In Figure 3, this is represented
by the isocost line ω1′ , which has the same slope as the isocost curve ω1, and
thus represents the "old" factor prices, but passes through the new factor
supply point 2. Notice then, that at point 2, with the old factor prices ruling, the
marginal rate of technical substitution exceeds the factor-price ratio, ƒ_L/ƒ_K >
w/r.
At these old factor prices, the profit-maximizing firms in our economy will
attempt to produce output Y2′. As shown in Figure 3, their demand for capital
will consequently be K2′ and their demand for labor is L2′ . So, with old factor
prices ω1′, new factor supplies are at 2 while new factor demands are at 2′ .
Notice immediately that K2′ < K2 and L2′ > L2, so there is excess supply of
capital and excess demand for labor. We have factor market disequilibrium.
Consequently, assuming in a typical Neoclassical fashion that factor markets
adjust automatically, then wages must rise and profits must fall, so the factor
price ratio must increase from ω1′ to ω2. Notice that at these new factor prices,
ω2, firms will produce Y2 and demand exactly what is supplied, K2 and L2. We
have restored factor market equilibrium.
We can also depict this adjustment process in the intensive production function
diagram in Figure 2. Specifically, if k1 increases to k2 but factor prices remain
constant at ω1, then firms will be attempting to produce y2′ in Figure 2, which
lies above the intensive production function. At this (y2′, k2) position, we see
that w/r = ω1 and ƒ_L/ƒ_K = ω2, thus ƒ_L/ƒ_K > w/r again. This is equivalent to the
factor market disequilibrium we depicted in Figure 3. If we permit the
wage-profit ratio to adjust freely so that w/r = ω2, then ƒ_L/ƒ_K = w/r is restored, and
firms are now producing y2. This is equilibrium once again.
In sum, we see that a Neoclassical factor price adjustment process is exactly
what is captured by the "curvature" of the intensive production function. Thus,
when we posited the straight-line production function in our depiction of the
modern Harrod-Domar model, the critical feature was not so much that we were
assuming a single technology, but rather that we were not assuming that there
was an underlying Neoclassical factor market clearing process.
This is the important point. Harrod-Domar did not believe that factor prices
were driven by factor-market clearing, thus they did not incorporate a Solow-
Swan type of CRS production function with flexible technology. Specifically, as
Roy Harrod (1948, 1953, 1973) explains, following the Keynesian schema, the
rate of interest, r, is governed by monetary phenomena; the real wage, well, by
an assortment of other things, e.g. unions, etc. So, ω = w/r is not determined
by factor market clearing as Neoclassical theory (and Solow-Swan) assume. As
Harrod did not know how (or why) the monetary authorities, labor unions, firms,
etc., would adjust r and w so that the economy could be guided to the steady-
state capital-labor ratio, he therefore assumed that the capital-output ratio v
was a constant. [Note: in a revision of his model, Harrod (1960) does attempt
to account for the influence of changing interest, importing, incidentally, the
supply of saving from Ramsey (1928).]
Seen in this light, it is difficult to accept the common refrain that Solow-Swan
"generalized" the Harrod-Domar model just because they allowed for flexible
technology whereas Harrod-Domar did not. This is certainly what Solow
insinuated, arguing that the "bulk of this paper is devoted to a model of long-
run growth which accepts all the Harrod-Domar assumptions except that of
fixed proportions." (Solow, 1956: p.66). But, as this discussion has hopefully
made clear, it is not technology that is critically different. It is the adjustment
process.
From the perspective of Ockham's razor, it may very well be that the Harrod-
Domar model is "more general" than Solow-Swan. Harrod-Domar make fewer
restrictive assumptions. Firstly, they do not assume an instantaneously stable
macroeconomic equilibrium (as Solow-Swan do). Secondly, they do not assume
any particular factor price adjustment mechanism (as Solow-Swan do).
The question of generality, then, can only be resolved by an empirical race
between the two models. If Harrod-Domar can explain the same things as
Solow-Swan, then, by Ockham's razor, Harrod-Domar is clearly "superior"
because it does so with fewer assumptions. If Solow-Swan explains the data
better than Harrod-Domar, then we would lean towards declaring Solow-Swan
the "better" model.
However, as we shall see, it turns out that Solow-Swan performs poorly when
confronted with empirical evidence. Substantial modifications have to be added
(particularly regarding technical progress) to make it comply with the data.
Interestingly, the kind of modifications to the Solow-Swan growth model that
"endogenous growth theory" has proposed in recent years turn out to generate
a reduced-form dynamical system that is virtually identical to the Harrod-
Domar model. Taking an analogy from astronomy, economists have had to add
epicycles upon epicycles upon epicycles to the Solowian growth model in order
to have it explain what could be more simply explained by the Harrod-Domar
model. The conclusion imposes itself.
Finally, we should remind ourselves why Solow-Swan is a "Neoclassical" and not
a "Keynesian" growth model. From the outset, we have a very Neoclassical
factor market equilibrium adjustment process. But, perhaps more strikingly, the
macroeconomics are very different. A complete Keynesian growth model would
have investment as a function of financial conditions and savings derived from
investment via the multiplier. In the Solow-Swan model, not only are all
Keynesian "financial" factors omitted, but the direction of causality between
savings and investment is reversed. This is equivalent to re-imposing Say's
Law. Thus, the Solow-Swan growth model is "Neoclassical" in every respect,
and not an extension of "Keynesian" macroeconomics, as has occasionally been
advertised. For attempts at introducing more "Keynesian" features into a
growth model, see our review of Cambridge growth theory and Keynes-Wicksell
monetary growth models.
________________________________________________________
________________________________________________________
3.2.1. Introduction
Growth theory and development theory may seem like natural bedfellows. As it
happens, however, they have had a rather tempestuous relationship over the
past forty years. Growth theory focuses on how a nation's output-labor ratio
grows. Development theory tries to explain why nations possess very different
standards of living and what can be done about them. Of course, "standards of
living" is not quite the same as national income per capita. Increasing the latter
is not very meaningful if it is very unevenly distributed, does nothing to
alleviate mass poverty, exacerbates structural problems or is unsustainable in
the longer term.
Be that as it may, during the 1950s and early 1960s, the consensus among
economists was that development and growth were virtually one and the same
thing. The essential difference between a developed nation and an
underdeveloped nation, it was felt, was that one had a high income per capita
while the other had a lower ratio. In terms of growth models, the process of
"development" was identified merely as the attempt by a nation to increase its
capital-labor ratio, i.e. to accumulate capital at a faster rate than population, so
that its income per capita would "catch up" with the industrialized world.
In the 1960s, this view gradually disappeared. The idea that underdeveloped
nations were merely pint-sized, antiquated versions of industrialized nations
was seen as untenable. Underdeveloped nations in the modern world face
many unique challenges and problems which industrialized nations never had
to contend with when they were "growing up". For instance, a country trying to
develop while, at the same time, integrating itself into an advanced
international economic order is something quite unprecedented. As a result,
development economists began focusing on the particular experiences of
underdeveloped nations on their own terms, with all their peculiar features and
structural problems.
In recent years, however, the "development-as-growth" perspective has once
again emerged to the fore. Today, economists and policy-makers repeatedly
appeal to growth theory to explain the differences between the experiences of
nations and to guide development policy. Apparently, the theory of choice in
most studies is the Solow-Swan growth model (or one of its variants). Given the
importance of the policy questions involved, it is worthwhile to spend some
time on the implications of Neoclassical growth theory for economic
development.
What are the empirical implications of the Solow-Swan growth model? The first
thing to notice is that the population growth rate "dictates" the steady-state
growth rates of all the variables in the economy. In other words, at steady-
state, all level variables -- output, Y, consumption, C, capital, K, and labor, L --
grow at the same natural rate n. This holds with or without depreciation, which
raises required investment but not the steady-state growth rate. As a result, all
the per capita ratio variables -- output per
person, y, capital per person, k and consumption per person, c -- do not grow at
all. They are constant over time.
The policy implications are intriguing. If, at steady-state, we wish to change the
rate of growth of any of the level terms, then population growth must change.
A change in n implies a change in the required investment line as we see in
Figure 1 (n1 > n2) and thus an accompanying movement in the steady-state
ratio. So, a change in n changes the growth rates of the level variables but the
per capita ratios will, once the new steady state is achieved, remain constant.
Fig. 1 - A Rise in Population Growth
This leads us to our second point, the Solowian paradox of thrift. This claims
that a permanent change in the rate of savings, s, will not permanently change
the economy's growth rate. For instance, an increase in the savings rate (from
s1 to s2 in Figure 2), will "swing" the investment curve up, so that we move from
the steady-state ratio k1* to the new steady-state ratio k2*. Now, before this
shift, all level variables were growing at the rate n. Immediately after the
change in the savings rate, capital grows a little bit faster than n, so that k
increases from k1* (and output and consumption grow a bit faster too). But as k
approaches k2*, the growth rate of capital slows down. When we are at k2*,
capital growth (and output and consumption growth) returns to n. So increasing
the savings rate permanently will only increase growth rates temporarily. In the
longer run, it will have no effect on growth rates.
Fig. 2 - Changing the Savings Rate
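The Solowian paradox of thrift can be traced out numerically. The sketch below assumes ƒ(k) = k^α, no depreciation, and hypothetical values for s1, s2, n and α. It starts the economy at the old steady state k1*, switches the savings rate to s2 for good, and records the growth rate of the capital stock, (dK/dt)/K = sƒ(k)/k, along the way.

```python
def capital_growth_path(s_old=0.2, s_new=0.3, n=0.02, alpha=0.3,
                        dt=0.1, t_end=1000.0):
    """Growth rate of K over time after a permanent rise in the savings rate."""
    k = (s_old / n) ** (1 / (1 - alpha))     # old steady state k1*
    path = []
    for _ in range(int(t_end / dt)):
        path.append(s_new * k ** (alpha - 1))   # (dK/dt)/K = s f(k)/k
        k += dt * (s_new * k ** alpha - n * k)  # Euler step for dk/dt
    return path

path = capital_growth_path()
print(path[0])                 # just after the change: above n
print(path[len(path) // 2])    # part-way through the adjustment
print(path[-1])                # long run: back to (approximately) n
```

With these illustrative numbers capital growth jumps above n at the moment of the change and then decays back towards n, so the higher savings rate raises the steady-state ratio but not the long-run growth rate.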
Associated with this is also the paradox of output, i.e. a country which has a
"higher" production function, ƒ (k), will also fail to permanently increase growth
rates. To see this, go through the same exercise in Figure 2, but leave s the
same and change only ƒ (k) -- the results are almost geometrically identical.
Intuitively, one can think of the Solow-Swan model as a person ("capital")
attempting to run on a treadmill ("labor"). The growth of capital is the speed of
the runner, while the growth of labor is the speed of the treadmill. We have a
"steady-state" if the runner manages to stay in the same place. If he runs too
slow, then he will fall behind (k declines); if he runs too fast, he will move
forward (k rises). Thus, in order to stay in the same place, capital has to run
exactly as fast as the treadmill. Anything that increases the speed of the
treadmill (a rise in n or δ), forces capital to run faster just to stay in the same
place. Stretching the analogy, the savings rate merely determines where on
the treadmill a person will be running-in-place (close to the front, in the middle,
close to the back, etc.), but regardless of where he chooses to be on the
treadmill, he still has to run at the same speed.
This result is "paradoxical" because one of the old saws of development theory
was that increasing the savings rate would accelerate growth. W. Arthur Lewis
(1954, 1955) was one of the primary proponents of this idea. As he writes "the
countries which are now relatively developed have at some time in the past
gone through a rapid acceleration in the course of which their rate of net
investment has moved from 5 per cent [of national income] or less to 12 per
cent or more" (Lewis, 1955, p.208). Consequently, "The central problem in the
theory of economic growth is to understand the process by which a community
is converted from being a 5 per cent to a 12 per cent saver." (1955, p.226).
However, Lewis relied on Classical growth theory and the Harrod-Domar model
to derive his conclusion. What the Solow-Swan model tells us is that the Lewis
thesis is only temporarily correct, i.e. there is a short-run acceleration in
growth, but in the long-run, growth settles down once again to its previous
rate.
A third thing that should be noticed from Figure 2 is that steady-state growth is,
in general, consumption inefficient. In other words, the economy's steady-state
does not necessarily yield the highest consumption per capita forever. In Figure
2, notice that c1* > c2*. So if we begin with savings propensity s2, steady state
k2* rules and consumption per capita is c2*. Clearly, the steady-state k2* is not
the maximum consumption possible. But recall that steady-states are stable, so
there are no inherent economic reasons to move away from this inefficient
position.
If, and this is a big if, everyone in the economy could somehow collectively
decide to decrease the savings rate from s2 to s1, then per capita consumption
could increase permanently from c2* to c1*. Of course we obtain this because of
the way we drew our diagram. It is conceivable that decreasing s might lead to
lower c, for instance. But the principle should be clear: it is quite possible that
the steady-state we end up at will be consumption inefficient in the sense that
somehow changing the propensity to save will improve consumption per capita
permanently. We shall return to this point when discussing "optimal growth". It
shall also crop up again when discussing monetary growth models, as
consumption-inefficiency makes growth theory amenable to government policy.
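A small numerical example may help fix the idea of consumption inefficiency. It assumes ƒ(k) = k^α with no depreciation, so that k* = (s/n)^(1/(1-α)) and steady-state consumption per capita is c* = (1-s)ƒ(k*). The values of n and α and the three savings rates compared are hypothetical.

```python
def c_star(s, n=0.02, alpha=0.3):
    """Steady-state consumption per capita, c* = (1 - s) f(k*), with f(k) = k^alpha."""
    k = (s / n) ** (1 / (1 - alpha))
    return (1 - s) * k ** alpha

# Compare a high, a moderate and a low savings rate.
print(c_star(0.5), c_star(0.3), c_star(0.1))

# Search a coarse grid of savings rates for the highest steady-state consumption.
grid = [i / 100 for i in range(1, 100)]
best_s = max(grid, key=c_star)
print(best_s)
```

In this example lowering s from 0.5 to 0.3 raises c* permanently, while lowering it further to 0.1 reduces c*, so whether a cut in savings helps depends on where the economy starts. On this grid the highest c* occurs at s = α, a property of the assumed Cobb-Douglas form.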
- Absolute Convergence
- Conditional Convergence
Absolute convergence is depicted in Figure 1, where we can assume that k1
represents the capital-labor ratio of a poor country and k2 the capital-labor ratio
of a rich country. As they are otherwise identical, the stability of the Solow-
Swan model predicts that both the poor and rich countries will approach the
same k*. Notice that this means that the poor country will grow relatively fast
(capital and output grow faster than n), while the rich nation will grow quite
slowly (capital and output grow slower than n). Stated differently in adjustment
terms, as k1 < k2, then ƒ ′ (k1) > ƒ ′ (k2), so the marginal product of capital
relative to labor is higher in the poor nations than in the rich ones, thus the
poor will accumulate more capital and grow at a faster rate than the rich.
This may seem a farfetched proposition -- but many Neoclassicals argue that it
is not entirely ludicrous. Consider, say, the end of World War II, when the
capital stocks (but not the labor) of Japan and Germany were destroyed by
Allied bombing and other war-related actions. Notice that the other features of
the defeated nations, namely their technological possibilities, savings rates,
population growth rates, etc. were pretty much still the same as before the war.
Or, more pertinently, they were virtually the same as other countries in the
industrialized world. So, relative to other industrialized countries with similar
parameters, post-war Germany and Japan had exceptionally low capital-labor
ratios, k (akin to k1 in Figure 1). In accordance with the absolute convergence
hypothesis, the Solow-Swan model would predict that these two nations would
subsequently grow faster than other industrialized countries in the immediate
post-war period -- as indeed they did.
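The absolute convergence hypothesis can be sketched with two economies that share every parameter and differ only in their initial capital-labor ratio. The example below assumes ƒ(k) = k^α; the parameter values and the two starting ratios are hypothetical and not calibrated to any actual country.

```python
def growth_of_k(k, s=0.2, n=0.02, alpha=0.3):
    """(dk/dt)/k: the growth rate of capital in excess of n."""
    return s * k ** (alpha - 1) - n

def simulate_k(k0, s=0.2, n=0.02, alpha=0.3, dt=0.1, t_end=1500.0):
    """Capital-labor ratio after t_end, using Euler steps on dk/dt = s k^alpha - n k."""
    k = k0
    for _ in range(int(t_end / dt)):
        k += dt * (s * k ** alpha - n * k)
    return k

k_poor, k_rich = 2.0, 60.0     # same parameters, different starting ratios
print(growth_of_k(k_poor), growth_of_k(k_rich))   # poor: positive, rich: negative
print(simulate_k(k_poor), simulate_k(k_rich))     # both end up near the same k*
```

The economy starting with the low ratio has capital growing faster than n, the one starting with the high ratio has capital growing slower than n, and both settle at the same k*.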
Of course, for the world as a whole, the absolute convergence hypothesis is
bound not to hold as nations are not as similar to one another as in the
aforementioned example. It is difficult to presume that, say, Mozambique and
Denmark ought to "converge" to the same ratios and growth rates. Their
savings propensities, technological possibilities and population growth rates are
just so very different.
growth rates, albeit with different steady-state capital-labor ratios, and thus
different income/consumption per capita.
There are two types of poverty traps: technologically-induced poverty traps and
demographically-induced poverty traps. We shall consider each of them
separately. Both cases involve the inclusion of a non-linearity into the system.
Both, incidentally, were considered by Robert Solow (1956).
In the 1940s, it was quickly realized that poor nations are poor, that they are
already saving what they can and still do not seem able to "accelerate"
anything. A consensus emerged that underdeveloped countries might be
caught in a "poverty trap", a vicious circle of low savings and few investment
opportunities. How can this be explained?
Allyn Young (1928) recalled Adam Smith's old idea about how the "division of
labor is limited by the extent of the market". Dynamized into a growth context,
this highlights the importance of externalities and increasing returns to scale in
generating and sustaining an accelerated rate of growth. Nations that did not
manage to achieve increasing returns were left behind. Those that did would
"take-off" into ever-increasing standards of living.
Paul Rosenstein-Rodan (1943, 1961), Hans Walter Singer (1949), Ragnar Nurkse
(1953), Gunnar Myrdal (1957) and Walt Whitman Rostow (1960) appropriated
this idea for development theory. They argued that increasing returns only set
in after a nation has achieved a particular threshold level of output per capita.
Poor countries, they argued, were caught in a poverty trap because they had
been hitherto unable to push themselves above that threshold. In contrast,
successful developing nations had benefited, at some earlier point, from a
massive and wide-spread injection of capital, just enough to push them over
the threshold and thereafter to "take-off". Their policy recommendations for
underdeveloped nations therefore focused on recreating this Big Push
artificially, whether by inflows of foreign capital or debt-financed government
investment.
Fig. 1 - Technological Trap
The features of the technological trap can be read from the diagram directly.
We have four steady-states -- 0, k1*, k2* and k3*. Of these, k1* and k3* are
stable, while 0 and k2* are unstable. The implication is that if a country begins
with a capital-labor ratio that is below k2*, then it will inexorably approach the
stagnant steady-state ratio k1*. If its initial capital-labor ratio is above k2*, then
it will approach the much better steady-state k3*. The ratio k2*, then, is the
"threshold" which a nation has to reach to "take-off" and achieve the higher
steady-state.
Here is where the "Big Push" story comes in. It is argued that developed
nations had, at some point in their history, a "Big Push" in terms of massive
capital investment (or a demographic collapse) which pushed them over the k2*
edge, which then, by the regular forces of the Solow-Swan model, drove them
further up to the high k3* steady-state. Underdeveloped nations failed to
experience this "Big Push" and so remained stuck in the orbit of k1*. It is not
that they did not try, of course. Efforts by developing nations to push the
capital-labor ratio up with public and private investment schemes did not work
simply because these were not bold enough. They might have pushed
themselves above k1*, but not enough to cross over the k2* threshold. For this
to happen, a really big "Big Push" is needed.
There are a few alternative policy options for stagnant nations in light of the
technological trap. The first is that temporarily increasing the savings rate
might actually serve as a policy option in this case. Specifically, consider Figure
2 and suppose that we have a country with savings rate s1 stuck at the
stagnant steady-state ratio k1*. To manipulate itself into a Big Push, the country
can raise its savings rate from s1 to s2, which results in a situation where there
is only one stable steady-state ratio -- the very high k4* in Figure 2. Maintaining the s2
savings rate for a while, the nation will enjoy a rapid rise in the capital-labor
ratio from k1* towards k4*. However, it need not maintain this savings rate
forever. Once the capital-labor ratio has gone past k2*, it can lower the savings
rate back down to s1, and now the country is within the orbit of the high capital-
labor ratio, k3*, and will move inexorably towards it by the standard properties
of Solow-Swan adjustment. Thus, a temporary rise in the savings rate is one
way for a nation to pull itself out of the technological trap.
Fig. 2 - Temporary Rise in the Savings Rate
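The temporary-savings argument can be simulated with a toy version of the technological trap. Everything specific in the sketch below is an assumption made for illustration: the production function (a concave term plus an S-shaped term, giving a middle range of increasing returns), the values of n, s1 and s2, and the cut-off at which the savings rate is lowered again. With these numbers the low savings rate gives positive steady states near 0.46, 1.57 and 3.76, while the high savings rate leaves only one.

```python
A = 0.2                      # scale of the hypothetical production function
N = 0.03                     # population growth rate (hypothetical)
S_LOW, S_HIGH = 0.10, 0.15   # s1 and s2 (hypothetical)
CUTOFF = 2.0                 # revert to s1 once k exceeds this; chosen above k2*

def f(k):
    """Hypothetical production function with an increasing-returns middle range."""
    return A * (k ** 0.5 + 4 * k ** 4 / (k ** 4 + 16))

def simulate(k0, policy, dt=0.1, t_end=1500.0):
    """Euler path of dk/dt = s f(k) - n k, where s = policy(k) may depend on k."""
    k = k0
    for _ in range(int(t_end / dt)):
        s = policy(k)
        k += dt * (s * f(k) - N * k)
    return k

# Without a push, the economy stays in the orbit of the low steady state.
no_push = simulate(0.4, lambda k: S_LOW)

# With a temporary rise in savings, held only until k passes the cut-off.
big_push = simulate(0.4, lambda k: S_HIGH if k < CUTOFF else S_LOW)

print(no_push, big_push)
```

The first run settles at the stagnant ratio, while the second ends at the high ratio even though the savings rate has long since returned to s1.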
One disappointing feature of the technological trap model is that, in the end,
different countries may be at different steady-state ratios, but they still exhibit
identical growth rates. In other words, in Figure 1, a poor economy at steady-
state k1* and a rich economy at steady-state k3* would still experience the
same growth rates of level variables and no growth in per capita variables. In a
way, then, this result is similar to the conditional convergence case.
This result can be easily circumvented if, following the arguments of the early
development theorists, we decide to simply omit the upper diminishing returns
portion of the production function, i.e. if we posit that:
ƒ′′(k) < 0 for 0 < k < ka
ƒ′′(k) > 0 for ka < k
as shown in Figure 4. In this case, increasing returns hold from ka onwards.
The consequence of this modification is that once a nation passes over the k2*
threshold, it will grow forever. Thus, now, a poor nation is trapped in the sense
that it will be stuck at its low steady-state and experience no growth in per
capita ratios (the true meaning of "stagnation"), while a rich country's per
capita income, constrained by no steady-state at all, will continue rising forever
after. In this kind of situation, there is no "convergence" of growth rates
between poor and rich nations.
development because it generates a poverty trap without having to assume
anything about technology.
In the Solow-Swan model, the rate of population growth was given exogenously.
However, recall that in Classical models of growth, population growth is
endogenous. Following Robert Malthus (1798), it was posited that the rate of
growth of population is dependent on income per capita. Specifically, as
income per capita rises, then the population growth rate rises. This is known as
the Malthusian theory of demographic transition.
Solow (1956) introduced the Malthusian demographic transition into his model.
He followed the Classicals in allowing that when income per capita was very
low, then population declined i.e. n was negative. But as income per capita
increases, population growth would increase. Solow topped off the story by also
accounting for fertility declines at very high income per capita.
As population growth n is a function of y and y = ƒ(k), then population growth is
indirectly a function of the capital-labor ratio, i.e. n = n(k). We can summarize
the demographic relationship by defining critical values ka and kb, where:
Fig. 5 - Population Trap
Notice that in Figure 5, we have three steady-state equilibria: 0, k1* and k2*.
However, of these, only k1* is stable; the origin and k2* are both unstable. Thus,
for any capital-labor ratio between 0 and k2*, the system will tend to bring it
back to k1*. Once again, the interesting implication is if, by some "Big Push",
the economy can be elevated to a capital-labor ratio above k2*, then there will
be a constantly increasing income per capita thereafter. In this demographic
transition model, we do not have convergence in levels or growth rates
between poor and very rich countries.
The Malthusian "population trap" story was emphasized in development theory
by Harvey Leibenstein (1954, 1957) and R. Nelson (1956). However, we should
note that, empirically, there is no relationship between population growth and
income per capita. More precisely, it is argued that this relationship is no longer
valid because national and international health efforts of the past few
decades have helped push down death rates and improve birth rates in
underdeveloped countries. So, even if this story could explain past experience,
it is not really "policy-effective" anymore. If anything, population growth is
today more correlated with income distribution rather than income levels.
___________________________________________________________________
"Is the "residual factor" a measure of the contribution of
knowledge or is it simply a measure of our ignorance of the
causes of economic growth?"
______________________________________________________________________
A property of the Solow-Swan growth model which is a bit disturbing is the fact
that, at steady-state, all ratios -- the capital-labor ratio, output per person and
consumption per person -- remain constant. This is a bit of a disappointment for
it implies that standards of living do not improve in steady-state growth. This is
not only dispiriting, it is also empirically dubious: it contradicts at least two of
the "stylized facts" of industrialized economies laid out in Kaldor (1961) --
namely, that the capital-labor and output-labor ratios have been rising over
time and that the real wage has been rising.
Of course, just because industrialized countries, and others besides, have
experienced ever-increasing per capita consumption and output over the past
three centuries does not, by itself, "contradict" the Solow-Swan model. After all,
out of steady-state, we can easily have changing ratios. So, one possible
explanation for the "stylized facts" that is consistent with the Solow-Swan
model is simply that industrialized nations are still in the process of adjusting
and just have not reached their steady-state equilibrium yet. And why not? It is
not unreasonable to assume that adjustment to steady state might take a very
long time (cf. Sato, 1963; Atkinson, 1969).
But economists are a rather impatient sort. They like to believe that economies
tend to be at or around their steady-states most of the time (a noble exception
is Meade (1961)). As a consequence, in order to reconcile the Solow-Swan
model with the stylized facts, it is tempting to argue that there has been some
sort of "technical progress" in the interim that keeps pushing the steady-state
ratios outwards.
The simplest way to capture this is to add time to the production function:
Y = F(K, L, t)
or, in intensive form:
y = ƒ(k, t)
The impact of technical progress on steady-state growth is depicted in Figure 1,
where the production function ƒ (·, t) swings outwards from ƒ (·, 1) to ƒ (·, 2) to
ƒ (·, 3) and so on, taking the steady-state capital ratio with it from k1* to k2* and
then k3* respectively. So, at t = 1, ƒ (·, 1) rules, so that beginning at k0, the
capital-labor ratio will rise, approaching the steady-state ratio k1*. When
technical progress happens at t = 2, then the production function swings to
ƒ (·, 2), so the capital-labor ratio will continue increasing, this time towards k2*.
At t = 3, the third production function ƒ (·, 3) comes into force and thus k rises
towards k3*, etc. So, if technical progress happens repeatedly over time, the
capital-labor ratio will never actually settle down. It will continue to rise,
implying all the while that the growth rates of level variables (i.e. capital,
output, etc.) are higher than the growth of population for a rather long period
of time.
Before proceeding, the first thing that must be decided is whether this is a
"punctuated" or "smooth" movement. Is technical progress a "sudden" thing
that happens only intermittently (i.e. we swing the production function out
brusquely and drastically and then let it rest), or is it something that is
happening all the time (and so we swing the production function outwards
slowly and steadily, without pause)? Joseph Schumpeter (1912) certainly
favored the exciting "punctuated" form of technical progress, but modern
growth theorists have adhered almost exclusively to its boring, "smooth"
version. In other words, most economists believe that ƒ(·, t) varies continuously
and smoothly with t.
The simple method of modeling production by merely adding time to the
production function may not be very informative as it reveals very little about
the nature and character of technical progress. Now, as discussed elsewhere,
there are various types of "technical progress" in a production function. The
one we shall consider here is Harrod-neutral or labor-augmenting technical
progress. In fact, as Hirofumi Uzawa (1961) demonstrated, Harrod-neutral
technical progress is the only type of technical progress consistent with a
stable steady-state ratio k*. This is because, as we prove elsewhere, only
Harrod-neutral technical progress keeps the capital-output ratio, v, constant
over time.
Formally, the easiest way to incorporate smooth Harrod-neutral technical
progress is to add an "augmenting" factor to labor, explicitly:
Y = F(K, A(t)·L)
where A(t) is a shift factor which depends on time, where A > 0 and dA/dt > 0.
To simplify our exposition, we can actually think of A(t)·L as the amount of
effective labor (i.e. labor units L multiplied by the technical shift factor A(t)). So,
output grows due not only to increases in capital and labor units (K and L), but
also by increasing the "effectiveness" of each labor unit (A). This is the simplest
way of adding Harrod-neutral technical progress into our production function.
Notice also what the real rates of return on capital and labor become: as Y =
F(K, A(t)·L) then the rate of return on capital remains r = FK, but the real wage
is now w = A(t)·[∂F/∂(A(t)·L)] = A·FAL.
Modifying the Solow-Swan model to account for smooth Harrod-neutral
technical progress is a simple matter of converting the system into "per
effective labor unit" terms, i.e. whenever L was present in the previous model,
replace it now with effective labor, A(t)·L (henceforth shortened to AL). So, for
instance, the new production function, divided by AL becomes:
Y/AL = F(K/AL, 1)
ye = ƒ (ke)
where ye and ke are the output-effective labor ratio and capital-effective labor
ratio respectively. Notice that as F(K, AL) = AL·ƒ (ke), then by marginal
productivity pricing, the rate of return on capital is:
r = FK = ∂ (AL·ƒ (ke))/∂ K
But as ƒ (ke) = ƒ (K/AL), then ∂ (AL·ƒ (ke))/∂ K = AL·ƒ ′ (ke)·(∂ ke/∂ K), and since
∂ ke/∂ K = 1/AL, then ∂ (AL·ƒ (ke))/∂ K = ƒ ′ (ke), i.e.
r = ƒ ′ (ke)
the slope of the intensive production function in per effective units terms is still
the marginal product of capital.
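The two marginal-product claims made so far (r = FK = ƒ′(ke), and w = A·FAL) can be checked numerically. The sketch below assumes a Cobb-Douglas form F(K, AL) = K^a·(AL)^(1-a); the functional form, the exponent and the input values are illustrative assumptions, not part of the argument above.

```python
# Numerical check of marginal products under labor-augmenting technical progress.
# ASSUMPTION: Cobb-Douglas F(K, N) = K**ALPHA * N**(1 - ALPHA), with N = A*L.
# ALPHA and the values of K, A, L are made up for illustration.

ALPHA = 0.3

def F(K, N):
    """Output from capital K and effective labor N = A*L."""
    return K**ALPHA * N**(1 - ALPHA)

def f(ke):
    """Intensive form: output per unit of effective labor, f(ke) = F(ke, 1)."""
    return F(ke, 1.0)

def derivative(func, x, h=1e-5):
    """Central finite-difference approximation to func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

K, A, L = 50.0, 2.0, 10.0
ke = K / (A * L)

# Return on capital: dF/dK computed directly, and as the slope f'(ke).
r_direct = derivative(lambda K_: F(K_, A * L), K)
r_intensive = derivative(f, ke)

# Real wage: dF/dL computed directly, and as A times the marginal
# product of effective labor (A * F_AL).
w_direct = derivative(lambda L_: F(K, A * L_), L)
w_via_effective = A * derivative(lambda N: F(K, N), A * L)

print("r:", r_direct, r_intensive)
print("w:", w_direct, w_via_effective)
```

Both pairs of numbers should agree up to finite-difference error, whatever positive values of K, A and L are chosen.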
What about the real wage? Well, continuing to let the marginal productivity
theory rule, then notice that:
w = ∂F(K, AL)/∂L
or simply:
I/AL = s(Y/AL)
or:
ie = sye = sƒ (ke)
gAL = gA + gL = θ + n
Now, for steady-state growth, capital must grow at the same rate as effective
labor grows, i.e. for ke to be constant, then in steady state gK = θ + n, or:
Ir = dK/dt = (θ +n)K
ire = (θ +n)ke
where ire is the required rate of investment per unit of effective labor.
The resulting fundamental differential equation is:
dke/dt = ie - ire
or:
dke/dt = sƒ(ke) - (θ + n)ke
which is virtually identical with the one we had before. The resulting diagram
(Figure 2) will also be the same as the conventional one. The significant
difference is that now the growth of the technical shift parameter, θ , is
included into the required investment line and all ratios are expressed in terms
of effective labor units.
Consequently, at steady state, dke/dt = 0, and we can define a steady-state
capital-effective labor ratio ke* which is constant and stable. All level terms --
output, Y, consumption, C, and capital, K -- grow at the rate n+θ .
receive the income and consume. In other words, to assess the welfare of the
economy, we want to look at output and consumption per physical labor unit.
Now, the physical population L is only growing at the rate n, but output and
consumption are growing at rate n + θ . Consequently, output per person, y =
Y/L, and consumption per person, c = C/L, are not constant; they are growing at
the steady rate θ , the rate of technical progress. Thus, although steady-state
growth has effective ratios constant, actual ratios are increasing: actual people
are getting richer and richer and consuming more and more even when the
economy is experiencing steady-state growth.
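A small simulation can illustrate the two claims together: ke settles at a constant ke*, while output per person keeps growing at the rate θ. The sketch assumes a Cobb-Douglas intensive form ƒ(ke) = ke^alpha and made-up parameter values; neither comes from the text.

```python
import math

# ASSUMPTIONS: f(ke) = ke**alpha, and the parameter values below are
# illustrative only. theta is the growth rate of A, n the growth rate of L.
s, n, theta, alpha = 0.2, 0.01, 0.02, 0.3

def simulate_ke(ke0=1.0, steps=40000, dt=0.01):
    """Euler integration of dke/dt = s*f(ke) - (theta + n)*ke."""
    ke = ke0
    for _ in range(steps):
        ke += dt * (s * ke**alpha - (theta + n) * ke)
    return ke

# Closed-form steady state for the assumed Cobb-Douglas case.
ke_star = (s / (theta + n)) ** (1.0 / (1.0 - alpha))
ke_T = simulate_ke()

def y_per_person(t, ke, A0=1.0):
    """Output per physical worker: y = A(t) * f(ke), with A(t) = A0*exp(theta*t)."""
    return A0 * math.exp(theta * t) * ke**alpha

# Growth rate of y over one unit of time once ke has settled at ke_star.
g_y = math.log(y_per_person(401.0, ke_star) / y_per_person(400.0, ke_star))

print("simulated ke:", ke_T, "steady-state ke*:", ke_star)
print("growth of output per person:", g_y, "theta:", theta)
```

With these numbers the adjustment is slow (the gap to ke* closes at roughly (1 - alpha)(θ + n) per period), which is in the spirit of the earlier remark that convergence to steady state may take a very long time.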
The main implication of all this is that the Solow-Swan growth model can only
explain steadily-increasing standards of living (growing y and c) via technical
progress.
There is an entire body of empirical literature, known as "growth accounting",
which attempts to address the empirical validity of this modified Solow-Swan
model. Unlike the model just described, they usually assume that the technical
progress factor A(t) is outside the production function, i.e.:
Y = A(t)·F(K, L)
where A > 0 and dA/dt > 0, is the technical progress parameter (in this context,
A is referred to as the "Total Factor Productivity" or "TFP" parameter). Thus,
unlike the model above, growth accounting literature assumes that technical
progress is Hicks-neutral or TFP-augmenting rather than Harrod-neutral/labor-
augmenting.
[Note: Hello, does this not contradict Uzawa's (1961) proof? Not quite. It is
possible for technical progress to be both Hicks-neutral and Harrod-neutral if
the production function has constant unit elasticity of substitution, i.e. σ = 1. As
we prove elsewhere, the Cobb-Douglas form of the production function is the
only functional form that fulfills this. And you were wondering why it was so
popular?]
The growth accounting literature asks the simple question: given the history of
output growth, how much of it was due to growth of capital inputs (gK), growth
of labor inputs (gL) and technical progress (gA)? Output growth, labor growth and
capital growth are observable, but technical progress is not. How do we
estimate it?
Empirical growth accounting began with the famous studies of Moses
Abramovitz (1956, 1962) and Robert Solow (1957). Their procedure in
calculating gA was to deduct from output growth the growth rates of capital and labor
(weighted by their respective factor shares) and ascribe the "residual" to technical
progress. For example, if we assume Cobb-Douglas form, so that the production
function is:
Y = A·K^α·L^(1-α)
then, taking logarithms and differentiating with respect to time:
gY = gA + α·gK + (1-α)·gL
where, as gY, gK, gL and α are more or less observable, then gA can be imputed
residually. In fact, total factor productivity growth, gA, is often referred to simply
as the Solow residual.
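The residual calculation can be sketched in a few lines. The data below are synthetic: output is generated from an assumed Cobb-Douglas function with a known rise in A, so that the imputed residual can be compared with the true value. The function names and all numbers are hypothetical.

```python
import math

def growth_rate(x0, x1):
    """Continuously compounded growth rate between two observations."""
    return math.log(x1 / x0)

def solow_residual(gY, gK, gL, alpha):
    """Impute gA from gY = gA + alpha*gK + (1 - alpha)*gL."""
    return gY - alpha * gK - (1 - alpha) * gL

# SYNTHETIC DATA (made up): A rises by 2%, K by 4%, L by 1%.
alpha = 0.3
A0, A1 = 1.0, 1.02
K0, K1 = 100.0, 104.0
L0, L1 = 50.0, 50.5

Y0 = A0 * K0**alpha * L0**(1 - alpha)
Y1 = A1 * K1**alpha * L1**(1 - alpha)

gA = solow_residual(growth_rate(Y0, Y1),
                    growth_rate(K0, K1),
                    growth_rate(L0, L1),
                    alpha)

print("imputed gA:", gA, "true gA:", math.log(A1 / A0))
```

With log growth rates and an exactly Cobb-Douglas technology the decomposition is exact, so the imputed and true values coincide. With real data the residual also absorbs measurement error and any misspecification of α.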
The striking feature of the early investigations of growth accounting was the
size of the Solow residual. Solow (1957), for instance, calculates that only
12.5% of growth in output per capita in the 1909-1949 period in the United
States was due to factor accumulation -- leaving 87.5% to be explained by
technical progress! This is a bit dispiriting as it implies that the overwhelming
majority of the growth that is empirically observed is "outside" the explanatory
power of the Solow-Swan growth model!
In a series of famous studies, Edward Denison (1962), Zvi Griliches (1963) and
Dale W. Jorgenson and Zvi Griliches (1967) argued that there were errors in
measurement in the early growth accounting work. For instance, if we remind
ourselves that technical progress usually arrives "embodied" in new capital
goods, then a lot more of growth can be ascribed to the "qualitative growth" of
capital inputs. Thus, the importance of the Solow residual -- the growth in "total
factor productivity" -- was argued to be substantially less than that estimated
by earlier researchers. We will turn to technical progress again when examining
endogenous growth theory.
M. Abramovitz (1956) "Resource and Output Trends in the United States since
1870", American Economic Review, Vol. 46
A.B. Atkinson (1969) "The Time Scale of Economic Models: How long is the long
run?", Review of Economic Studies, Vol. 36 (2), p.137-52.
E.F. Denison (1962) The Sources of Economic Growth in the United States and
the Alternatives Before Us. New York: Committee on Economic Development.
F.H. Hahn (1987) "`Hahn Problem'", in Eatwell, Milgate and Newman, editors,
The New Palgrave: A dictionary of economics. London: Macmillan.
F.H. Hahn and R.C.O. Matthews (1964) "The Theory of Economic Growth: A
survey", Economic Journal, Vol. 74, p.779-902. As reprinted in 1969 Surveys of
Economic Theory: Vol. II - Growth and development. London: Macmillan.
R.F. Harrod (1953) "Full Capacity vs. Employment Growth: Comment", Quarterly
Journal of Economics, Vol. 67 (4), p.553-9.
R.F. Harrod (1960) "A Second Essay in Dynamic Theory", Economic Journal, Vol.
70, p.277-93.
W.A. Lewis (1955) The Theory of Economic Growth. Homewood, Ill: Irwin.
T.R. Malthus (1798) An Essay on the Principle of Population. 1960 reprint of
1798 and 1892 editions, New York: Modern Library.
R.R. Nelson (1956) "A Theory of the Low Level Equilibrium Trap", American
Economic Review, Vol. 46, p.894-908.
P. Rosenstein-Rodan (1961) "Notes on the Theory of the Big Push", in H.S. Ellis
and H.C. Wallich, editors, Economic Development in Latin America. New York:
Macmillan.
R.M. Solow (1956) "A Contribution to the Theory of Economic Growth" Quarterly
Journal of Economics. Vol. 70 (1) pp. 65-94.
R.M. Solow (1957) "Technical Change and the Aggregate Production Function",
Review of Economics and Statistics, Vol. 39, pp. 312-20.
R.M. Solow (1970) Growth Theory: An exposition. 1988 edition, Oxford: Oxford
University Press.
T.W. Swan (1956) "Economic Growth and Capital Accumulation", Economic
Record, Vol. 32 (2), p.334-61.
4.2.2. Case I: Consumer Goods are More Capital-Intensive
4.2.3. Case II: Investment Goods are More Capital-Intensive
4.2.4. Conclusion
4.3. Selected References
IV - Multisector Growth
"It is evident that in all these constructions the condition that the
equilibrium at a moment in time be unique is crucial. The rest of
the story is really concerned with ensuring that there is a steady
state with positive factor prices. But the assumptions required to
establish uniqueness of momentary equilibrium are all terrible
assumptions."
________________________________________________________
in the Review of Economic Studies, on the two-sector growth model. Then, as
suddenly as it had appeared, this line of research evaporated in the 1970s.
The principal equations of the two-sector model can thus be set out as follows:
Yc = Fc(Kc, Lc) - consumer sector production function (1)
Yi = Fi(Ki, Li) - investment sector production function (2)
Y = Yc + pYi - aggregate output (3)
Lc + Li = L - labor market equilibrium (4)
Kc + Ki = K - capital market equilibrium (5)
w = dYc/dLc = p·(dYi/dLi) - labor market pricing (6)
r = dYc/dKc = p·(dYi/dKi) - capital market pricing (7)
gK = Yi/K - capital growth (8)
gL = n - labor growth (9)
These equations should be self-evident. The consumer goods sector and the
investment goods sector each use both capital and labor to produce their
output. We capture this with equations (1) and (2), where Fc(·) is the consumer
goods industry production function and Fi(·) the investment goods industry
production function. Both production functions Fc(·) and Fi(·) are nicely
"Neoclassical", in the sense of exhibiting constant returns to scale, continuous
technical substitution, diminishing marginal productivities to the factors, etc.
Equation (3) is merely the definition of aggregate output, expressed in terms of
the consumer good. Equations (4) and (5) are also self-evident: the market
demand for labor is Lc + Li and the market demand for capital is Kc + Ki. As L
and K are the respective supplies, then (4) and (5) are merely the factor
markets equilibrium conditions so that demand equals supply in each market.
Now, we assume no barriers to competition in the factor markets, so that there is
free movement of labor and capital across sectors. This implies that the wage
rate w and the profit rate r must be the same in both the consumer goods and
investment goods industry. Neoclassical economic theory tells us that the
marginal productivity schedules for each factor in each industry form those
industries' demand functions for the factors. As such, in labor market
equilibrium, the return to labor (w) must be equal to the marginal product of
labor in the consumer goods sector (dYc/dLc) and the marginal product of labor
in the investment goods sector p·(dYi/dLi). This is equation (6). Equation (7)
asserts the analogous condition in capital market equilibrium, i.e. that the rate
of return on capital (r) is equal to the marginal product of capital in both
sectors.
Finally, as the investment goods industry produces all the new capital goods in
the economy, then, ignoring depreciation, we can define the change in the total
stock of capital as that sector's output, i.e. dK/dt = Yi, so the growth rate of
capital is gK = (dK/dt)/K = Yi/K, which is equation (8). Labor supply is assumed
to grow exogenously at the exponential rate n, thus the growth rate of labor is
gL = (dL/dt)/L = n, which is equation (9).
We would now like to express everything in intensive form, i.e. in per capita or
per labor unit terms. This gets a bit tricky. But defining:
yc = Yc/L
λ c = Lc/L
kc = Kc/Lc
ƒ c(kc) = Fc(Kc, Lc)/Lc
yi = Yi/L
λ i = Li/L
ki = Ki/Li
ƒ i(ki) = Fi(Ki, Li)/Li
y = Y/L
k = K/L
yc = λcƒc(kc) - consumer sector intensive production function (1′)
yi = λiƒi(ki) - investment sector intensive production function (2′)
Equations (1′ ) and (2′ ) are the intensive production functions. These are
derived as follows. Recall from (1) that Yc = Fc(Kc, Lc), then dividing through by
Lc, we obtain Yc/Lc = Fc(Kc/Lc, 1) = ƒc(kc). But Yc/Lc = (Yc/L)(L/Lc) = yc/λ c. Thus yc
= λ cƒc(kc), which is (1′ ). The transformation from (2) to (2′ ) follows a similar
procedure.
Each of these intensive production functions has simple properties. For
instance, their first derivatives are the marginal products of capital, i.e. ∂Fc/∂Kc
= ƒc′(kc) and ∂Fi/∂Ki = ƒi′(ki), so diminishing marginal productivity implies
ƒc′′(kc) < 0 and ƒi′′(ki) < 0.
The production functions also fulfill the famous "Inada conditions",
formulated by Ken-Ichi Inada (1963). Specifically:
ƒc(0) = 0, ƒc(∞ ) = ∞
for the intensive production function for the consumption good. The equivalent
Inada conditions apply to the intensive production function for the investment
good:
ƒi(0) = 0, ƒi(∞ ) = ∞
Equations (6′ ) and (7′ ) use Euler's theorem. Now, it is a simple matter to show
that dYc/dKc = ƒc′ (kc) and dYi/dKi = ƒi′(ki). So, the competitive condition in (7) is
converted to r = ƒc′ = p·ƒi′. By constant returns to scale, we know from Euler's
theorem that Yc = (dYc/dKc)·Kc + (dYc/dLc)·Lc, thus dividing through by Lc and
rearranging: dYc/dLc = (Yc/Lc) - (Kc/Lc)·(dYc/dKc) = ƒc(kc) - kc·ƒc′. The corresponding
transformation can be done for dYi/dLi. This is how we convert (6) to (6′).
Finally, equation (8′) is obtained simply by multiplying (8) through by 1 = L/L,
so gK = (Yi/L)/(K/L) = yi/k.
Now, following Uzawa's notation, let us define ω (omega) as the wage-profit
ratio, i.e. ω = w/r. Thus, combining equations (6′ ) and (7′ ):
ω = w/r = [ƒc(kc) - kcƒc′]/ƒc′ = [ƒi(ki) - kiƒi′]/ƒi′
or simply:
ω = (ƒ c(kc)/ƒc′ ) - kc = (ƒ i(ki)/ƒi′ ) - ki
Thus, ω is positively related to kc and ki. It is not difficult to see that these are
monotonic relationships. Consequently we can define the functions:
kc = kc(ω), ki = ki(ω)
which will be used extensively as they will form the boundaries of our
equilibrium path.
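The relationship ω = ƒ(k)/ƒ′(k) - k and its inverse can be made concrete. The sketch below assumes Cobb-Douglas intensive forms ƒc(k) = k^0.4 and ƒi(k) = k^0.25 (illustrative choices, with the consumer good the more capital-intensive) and inverts ω(k) by bisection, which only relies on the monotonicity noted above. The helper names are hypothetical.

```python
# ASSUMPTION: Cobb-Douglas intensive production functions with made-up exponents.
ALPHA_C, ALPHA_I = 0.4, 0.25

def make_sector(alpha):
    """Return (f, f_prime) for f(k) = k**alpha."""
    return (lambda k: k**alpha), (lambda k: alpha * k**(alpha - 1))

def omega(k, f, f_prime):
    """Wage-profit ratio implied by sectoral capital-labor ratio k."""
    return f(k) / f_prime(k) - k

def k_of_omega(w, f, f_prime, lo=1e-9, hi=1e9, iters=200):
    """Invert omega(k) = w by bisection; valid because omega is increasing in k."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if omega(mid, f, f_prime) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

f_c, fp_c = make_sector(ALPHA_C)
f_i, fp_i = make_sector(ALPHA_I)

w_star = 3.0
kc_star = k_of_omega(w_star, f_c, fp_c)
ki_star = k_of_omega(w_star, f_i, fp_i)

print("kc(omega*):", kc_star, "ki(omega*):", ki_star)
```

For the Cobb-Douglas case the inverse is available in closed form, k(ω) = αω/(1 - α), so the bisection is only there to show that nothing beyond monotonicity is needed.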
The growth story can be quickly told. At steady-state, the capital-labor ratio k
must be constant. As k = K/L, then:
gk = gK - gL
(dk/dt)/k = yi/k - n
so:
dk/dt = yi - nk
Of course, this is not the end of the story, for we have yet to consider the
question of macroeconomic equilibrium. Specifically, note that while we have
laid out the supply of consumer and investment goods, we have said nothing
so far about the demand for these outputs. As it turns out, this will depend
crucially on the consumption-savings behavior of households. Specifically, the
demand for consumer goods will depend on the amount of income households
consume, while the demand for investment goods will depend on the amount
of savings. Now, we can follow the "Classical" economists and presume that all
wages are consumed and all profits are saved (as Uzawa (1961) did); or we
allow for some saving out of both wages and profits (as Uzawa (1963) allows)
and we can even impose that the propensity to save out of these two
categories of income is different (as Drandakis (1963) presumes).
Whatever the case, the model will not be closed until we consider the demands
for outputs explicitly. This is, after all, a Neoclassical model, which means that
the imputation theory should hold: output demands will determine output
supplies and consequently factor market equilibrium. Causality thus runs from
preferences of households to factor market equilibrium.
So, with k*, we produce yc* as our aggregate output. As we know from our
discussion of intensive production functions, the slope of the ray tangent to the
production function is ƒc′ (= rc), the marginal product of capital, and the point where
that ray intersects the vertical axis is ƒc - kcƒc′ (= wc), the marginal product of
labor. Most importantly, the point where the tangent ray intersects the
horizontal axis is ωc = wc/rc, the ratio of marginal products. If we specialize in
producing consumer goods, then the ratio ωc can be seen as the resulting
equilibrium factor price ratio, i.e. the factor prices that clear the capital and
labor markets, where the initial supply of capital and labor is captured by the
given capital-labor ratio k*.
Fig. 1 - Two-Sector Model (for given k*)
equilibrium will be ω c; if we allocated all factors into the investment goods
sector, the equilibrium will be ωi. The assumption of strict concavity of
production functions guarantees this uniqueness. If we do not know the
sectoral composition of output, we cannot determine what the actual
equilibrium factor-price ratio ω is: it can range from a minimum of ω c (complete
specialization in consumer goods) to a maximum of ω i (complete specialization
in investment goods), i.e. for a given k,
ωmin ≤ ω ≤ ωmax
where ω min = ω c and ω max = ω i. The Inada conditions guarantee us that these
exist.
Notice also that varying the given k, these upper and lower boundaries for
equilibrium factor prices will vary. Specifically, note that in Figure 1, if we
increase k above k*, then both ωc and ωi will increase. We depict the resulting
boundaries in Figure 2 as the upward-sloping curves ωc(k) and ωi(k). We draw
them as straight lines, but this is not necessarily the case. The only things that
are posited are (i) that the relationship between k and ω c and ω i is unique and
monotonically increasing (by assumptions of strict concavity and constant
returns to scale for the production function) and (ii) that the investment goods
boundary ωi(k) will always lie above the consumer goods boundary ωc(k) (from
the assumption that consumer goods industry is more capital-intensive than
the investment goods industry). Naturally, if we change the assumption of
Fig. 2 - Factor Prices and Quantities
There is, of course, another way of depicting the curves. Specifically, as ωi(k)
and ωc(k) are monotonically increasing and unique, then they are invertible, i.e.
we can specify the identical curves ki(ω) and kc(ω), where for a given factor
price ratio (e.g. ω *), we have the resulting capital-labor ratios in both sectors
(ki and kc respectively). This is identically depicted in Figure 2, but now we read
ω as the independent variable and k as the dependent variable.
To see this inversion in production function space, examine Figure 3. The given
factor-price ratio ω* will set a point on the horizontal axis from which emanate
two rays, one with slope rc and another with slope ri, corresponding to the
marginal products of capital for the consumer goods and investment goods
industries. These rays form tangencies with the intensive production functions
of both the consumer goods and capital goods industries at points e c and ei
respectively, which translate into resulting capital-labor ratios kc and ki. Thus,
this indicates the relationships kc(ω) and ki(ω) that we find depicted in Figure 2.
(notice also that kc > ki, which is another indicator of the relative factor
intensity of the sectors -- use the same formula, k/ω , and notice that kc/ω * >
ki/ω *).
Notice in Figure 3 that although the factor-price ratio is the same for both
sectors, so ω * = wi/ri = wc/rc, we have it that ri ≠ rc and wi ≠ wc, so it seems that
the rates of return to capital and wages are not equal across sectors. But we
must not forget the price of investment goods, p. Specifically, p will be such
that p·ri = rc and p·wi = wc.
Finally, notice that the amount that will be produced when factor prices are ω *
can be deciphered from yc/λ c and yi/λ i on the vertical axes. Notice that both
industries have positive output per capita (yc, yi > 0), so we are not specializing
exclusively in either of them. Both sectors are allowed to operate and the
particular amounts they produce will be dictated by the equilibrium factor price
ratio we begin with, ω *.
This, of course, does not end the story. If we allow both ω and k to vary, then
the entire shaded region in Figure 2 becomes available as a solution. This is
where the rest of the Uzawa model comes into play. Stepping ahead of
ourselves a little bit, it might be worthwhile to sketch out what we are aiming
for.
We will proceed in reference to Figure 4. Suppose we are given an initial
aggregate capital-labor ratio k0. As we say nothing about allocation between
sectors, then there is a whole range of equilibrium factor prices ω which are
consistent with that allocation (within some maximum/minimum range). Pick
one of these factor price ratios. This will then determine a sectoral allocation of
kc(ω) and ki(ω). But this may not be necessarily market-clearing, i.e. it may be
that λ ckc(ω ) + λ iki(ω ) ≠ k0, so that our demands for factors are not equal to our
initial supplies of the factor, which implies that the factor price ratio we have
chosen is not appropriate. So, for the initial k0, we must search for a market-
clearing factor price ratio that makes demands equal to supplies.
The line k(ω) in Figure 4 maps out the locus of equilibrium factor prices for
every aggregate capital-labor ratio. The fact that this is upward-sloping
everywhere and lies between the boundaries is important. Suppose we begin at
k0. The locus k(ω ) tells us that ω 0 is the market-clearing factor price ratio.
Thus, the corresponding sectoral allocations kc(ω 0) and ki(ω 0) are equilibrium
allocations, i.e. λckc(ω0) + λiki(ω0) = k0. In contrast, for initial capital stock k0, the
wage-profit ratio ω1 is not market clearing, so λckc(ω1) + λiki(ω1) ≠ k0. However,
for initial capital stock k1, ω1 is the market-clearing wage-profit ratio (i.e.
λckc(ω1) + λiki(ω1) = k1). So positions a and c represent factor market equilibrium,
while positions such as b and d are factor market disequilibrium. The curve k(ω)
is merely the locus of equilibrium positions.
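A partial check on any candidate pair (k, ω) is whether the sectoral labor shares it implies are admissible. Combining the capital condition λckc + λiki = k with the labor condition λc + λi = 1 (the latter obtained by dividing Lc + Li = L through by L) gives λc = (k - ki)/(kc - ki). The sketch below assumes the Cobb-Douglas inverses kc(ω) and ki(ω) with made-up exponents; it tests feasibility only, i.e. whether the point lies in the shaded band, and says nothing about which point in the band the demand side selects.

```python
# ASSUMPTION: Cobb-Douglas sectors, so k_j(omega) = alpha_j * omega / (1 - alpha_j).
# The exponents are illustrative, with consumer goods more capital-intensive.
ALPHA_C, ALPHA_I = 0.4, 0.25

def kc(w):
    return ALPHA_C * w / (1 - ALPHA_C)

def ki(w):
    return ALPHA_I * w / (1 - ALPHA_I)

def labor_shares(k, w):
    """Solve lam_c*kc + lam_i*ki = k and lam_c + lam_i = 1 for (lam_c, lam_i)."""
    kc_w, ki_w = kc(w), ki(w)
    if kc_w == ki_w:
        raise ValueError("equal factor intensities: shares are not pinned down")
    lam_c = (k - ki_w) / (kc_w - ki_w)
    return lam_c, 1.0 - lam_c

def is_feasible(k, w):
    """True when both implied labor shares lie between 0 and 1."""
    lam_c, lam_i = labor_shares(k, w)
    return 0.0 <= lam_c <= 1.0

w = 3.0                      # here kc(w) is about 2 and ki(w) about 1
inside = labor_shares(1.5, w)
print("k = 1.5:", inside, is_feasible(1.5, w))
print("k = 2.5:", labor_shares(2.5, w), is_feasible(2.5, w))
```

A capital-labor ratio between ki(ω) and kc(ω) yields shares in the unit interval; a ratio outside that range would require a negative labor share in one sector, which is why such combinations fall outside the band.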
reasons, we call this simply a "factor market equilibrium". However, this says
nothing about the long-run position of the system. Specifically, any capital-
labor ratio k is in factor market equilibrium if we have the right factor prices for
it. But a factor market equilibrium does not presuppose or imply steady-state
growth.
Suppose (k0, ω0) is a factor market equilibrium today but it is not a steady-
state. Consequently, the production of goods will proceed, and factors will grow
at their own rates (labor by the natural rate, n, capital by the sectoral
allocation to investment goods production). Nothing we have said so far
presupposes that capital and labor will grow at the same rate. Thus, it is likely
that tomorrow the capital-labor ratio may be different, say, it may increase
from k0 to k1. Consequently, tomorrow, new factor equilibrium prices will be
obtained (ω1), which determines sectoral allocation, which in turn determines
capital growth, etc.
If labor is growing at an exogenous natural growth rate n and we posit some
exogenous propensity to save, then (hopefully) there exists a capital-labor ratio
k* consistent with steady-state. This is going to be one of the points along the
horizontal axis in Figure 4. But, more importantly, as long as our factor
equilibrium locus is monotonic, there will be associated with k* a unique set
of "steady-state" equilibrium factor prices, ω * and consequently, steady-state
sectoral capital-labor ratios kc(ω*) and ki(ω*). If there is not, we have serious
problems.
The questions before us are several. Firstly, can we define a factor market
equilibrium locus k(ω ) that possesses nice properties? By "nice" we mean that
it sits "within" the shaded area of Figure 4, and is upward-sloping and
monotonic, so that we can define a unique factor price equilibrium ω for every
aggregate capital-labor ratio k. This is crucial. Suppose not. Suppose we have a
situation like the one depicted in Figure 5, where we have a bizarre-looking k(ω)
locus. For capital-labor ratio k0, we have three factor market equilibrium prices,
ω a, ω b and ω c, which means that from k0, we are not sure which factor market
equilibrium will obtain. As each is associated with a different sectoral allocation
(kc(ω ), ki(ω )), we do not know which direction we are heading in!
Fig. 5 - Troubling Equilibria
Let us analyze this troubling case in a bit more detail. As we know, under
competitive conditions, r = p·ƒi′ = ƒc′. As a result, we can express p, the price of
investment goods in terms of consumer goods, as the ratio of marginal
products of capital:
p = ƒc′(kc)/ƒi′ (ki)
Now, as ƒc′′ < 0 and ƒi′′ < 0, then p is positively related to the relative capital-labor
ratio ki/kc, which we write loosely as
p ∝ ki/kc
so if ki is high relative to kc, then ƒi′(ki) is low relative to ƒc′(kc), which implies
that p is relatively high. Thus, the higher ki/kc, the higher p will be.
Now, consider Figure 5 again, where we have three factor market equilibria.
Each of these equilibria will be associated with a different price p. So, consider
combination a = (k0, ωa). We can deduce diagrammatically that at this point,
ki/kc is quite high and thus the corresponding price, call it pa, will be quite high.
Conversely, consider combination c = (k0, ωc), which has a relatively low ki/kc,
and thus the corresponding price pc will be relatively low. So, rather loosely, we
can infer that pa > pb > pc.
We plot the production possibilities frontier associated with k0 (call it PPF0) in
Figure 6, which shows the feasible combinations of yc and yi associated with the
capital-labor ratio k0. We see that the three equilibrium prices pa, pb and pc
associated with the respective factor market equilibrium combinations a, b and
c in Figure 5 are represented by three price lines with different slopes. In
general, as total output is yc + pyi and k0 is fixed, then the slope of a price line
in PPF space is -p. Recalling that pa > pb > pc, then we see that pa is the
steepest and pc is the flattest line. All three lines form equilibrium points at
their tangency with the PPF, i.e. factor market equilibrium conditions a, b and c
in Figure 5 have their corresponding equilibrium output allocations a, b and c in
Figure 6.
So, now let us note the following: as pa is the steepest, then equilibrium a
corresponds to an output allocation where the output produced by investment
goods sector is relatively high, while the output of the consumer goods sector
is relatively low. Conversely, as pc is the flattest, then at equilibrium c, the
output of investment goods relative to consumer goods is small. Thus, factor
be a steady-state. Similarly, if factor market equilibrium c holds, then yic < yib
and thus gK < gL, so the capital-labor ratio falls and we are not in steady-state
either.
Thus, in sum, multiplicity of factor market equilibria at a given aggregate
capital-labor ratio k0 causes real problems. Even if k0 is a steady-state capital-
labor ratio, it is only steady-state if factor equilibrium price ω b obtains. If, for
some reason, the other factor market equilibrium prices ω a or ω c happen to
hold at k0, then k0 is not a steady-state capital-labor ratio.
The implications of this can be understood by examining the resulting
differential equation in Figure 7. Notice that at k0, only equilibrium b
corresponds to dk/dt = 0. Equilibrium a corresponds to dk/dt > 0 and
equilibrium c corresponds to dk/dt < 0. So, at k0, we can have steady-state
growth, increasing growth or decreasing growth. All three are possible.
A further issue made stark in Figure 7 is that we can also have multiple steady-
state capital-labor ratios. There is no reason to assume that k0 and associated
factor market equilibrium b is the only steady-state. A different capital-labor
ratio, k1, with associated factor market equilibrium d can also be a steady-
state. The implications for the stability of steady-state are knotty: it could be
that some steady-states will be stable, some unstable, some may be only half-
stable. In fact, as we shall see later, a differential equation as depicted in Figure
7 can actually yield us a limit cycle, so that we oscillate continuously, but never
quite approach a steady-state.
These sorts of trouble will impair the ability of the economy to approach
steady-state growth. We would like to place sufficient restrictions on our model
such that most of these difficulties are ruled out. What we want to end up with
is a k(ω) locus that looks more like the nice, monotonic one in Figure 4 rather
than the squiggly one in Figure 5. However, as it turns out, these restrictions
are actually quite severe.
So, in sum, the two-sector model poses two essential questions: (1) can we
guarantee a unique factor-market equilibrium ω for every capital-labor ratio k?
(2) is there a unique steady-state growth path k* and can we guarantee that,
starting from any capital-labor ratio, the system will approach k* over time?
Much of what follows is an attempt to establish sufficient conditions so that we
can answer these questions in the affirmative.
The solution to the two-sector model depends crucially on the kind of savings
assumptions we make. In his first version, Uzawa (1961) considered the
"Classical hypothesis" that workers consume all their income and capitalists
save all their profits. In other words, wL is the demand for consumer goods and
rK the demand for new capital goods. In equilibrium, demand equals supply for
the consumer goods and investment goods markets, i.e.
wL = Yc
rK = pYi
w = yc
rk = pyi
Recalling that yc = λcƒc(kc) and yi = λiƒi(ki) by definition and that w = ωƒc′(kc) and
r = pƒi′(ki) by competition, these conditions become:
λc = ω·ƒc′(kc)/ƒc(kc)
λi = k·ƒi′(ki)/ƒi(ki)
But we also know from the wage-profit ratio equation that ω + kc = ƒc(kc)/ƒc′(kc)
and ω + ki = ƒi(ki)/ƒi′(ki), so:
λc = ω/(ω + kc)
λi = k/(ω + ki)
Now, recall that the capital-market clearing condition was k = λckc + λiki.
Consequently, rewriting this for λc:
λc = (k - λiki)/kc
λc = [k - kik/(ω + ki)]/kc
or:
ω/(ω + kc) = kω/[(ω + ki)·kc]
and rearranging:
k = kc·(ω + ki)/(ω + kc)
Taking logarithms and differentiating with respect to ω:
(dk/dω)·(1/k) = (dkc/dω)·[1/kc - 1/(ω+kc)] + (dki/dω)/(ω+ki) + [1/(ω+ki) - 1/(ω+kc)]
Now, as dkc/dω, dki/dω > 0 and as [1/kc - 1/(ω+kc)] = ω/[kc·(ω+kc)] > 0, the first
two terms are positive, so the sign of this equation depends crucially on the sign
of [1/(ω+ki) - 1/(ω+kc)] = (kc - ki)/[(ω+ki)·(ω+kc)]. So, if we assume that consumer
goods are more capital-intensive than investment goods, so that kc > ki, then we are
guaranteed that dk/dω > 0. If, in contrast, ki > kc, so that investment goods are
more capital intensive, then the sign of dk/dω is ambiguous.
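As a numerical check on the sign argument above, the following is a minimal sketch that evaluates the locus k = kc·(ω + ki)/(ω + kc) and the three terms of its log-derivative. It assumes Cobb-Douglas sectors ƒj(kj) = kj^αj, a functional form the text does not impose; the function names and parameter values are illustrative only.

```python
# Sketch: factor-market locus k(omega) under the Classical savings hypothesis.
# Assumption (not in the text): Cobb-Douglas sectors f_j(k_j) = k_j**alpha_j,
# for which omega = f/f' - k gives k_j(omega) = omega*alpha_j/(1 - alpha_j).

def sector_ratio(omega, alpha):
    """Sectoral capital-labor ratio chosen at wage-profit ratio omega."""
    return omega * alpha / (1.0 - alpha)

def aggregate_k(omega, alpha_c, alpha_i):
    """k(omega) = kc*(omega + ki)/(omega + kc)."""
    kc = sector_ratio(omega, alpha_c)
    ki = sector_ratio(omega, alpha_i)
    return kc * (omega + ki) / (omega + kc)

def log_slope(omega, alpha_c, alpha_i):
    """(dk/domega)/k assembled from the three terms of the derivative."""
    kc = sector_ratio(omega, alpha_c)
    ki = sector_ratio(omega, alpha_i)
    dkc = alpha_c / (1.0 - alpha_c)   # dkc/domega for Cobb-Douglas
    dki = alpha_i / (1.0 - alpha_i)   # dki/domega for Cobb-Douglas
    own_c = dkc * (1.0 / kc - 1.0 / (omega + kc))               # always positive
    own_i = dki / (omega + ki)                                  # always positive
    intensity = 1.0 / (omega + ki) - 1.0 / (omega + kc)         # sign of (kc - ki)
    return own_c + own_i + intensity

if __name__ == "__main__":
    for alpha_c, alpha_i in [(0.5, 0.3), (0.3, 0.5)]:
        print(alpha_c, alpha_i, log_slope(2.0, alpha_c, alpha_i))
```

With Cobb-Douglas technology the elasticity of substitution is one in both sectors, and the locus comes out increasing under either ordering of capital intensities: the third term changes sign but never dominates. An example in which dk/dω actually turns negative would need a technology with less substitutability, which this sketch does not attempt.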
We shall refer to the assumption that kc > ki for all ω, i.e. that consumer goods
are more capital-intensive than investment goods, as the "Uzawa capital-
intensity condition". As we have seen, this is a sufficient condition for
uniqueness of factor market equilibria, i.e. by adopting the Uzawa capital-
intensity condition we can conclude with certainty that dk/dω > 0 for all
ω ∈ [ωmin, ωmax]. In other words, the Uzawa capital-intensity assumption makes the
dk/dt = yi - nk
As we have made the "Classical hypothesis" that all profits are saved and all
wages are spent, remember that this implies that rk/p = yi, and by the marginal
productivity assumption, r/p = ƒi′(ki), so this reduces to:
dk/dt = k·ƒi′(ki) - nk
Is this stable? Stability of the steady-state capital-labor ratio, k*, requires that
d(dk/dt)/dk < 0 around k*. But the capital-labor ratio of the investment goods
sector, recall, is a function of equilibrium factor prices, ω , and these, in turn,
are determined by the aggregate capital-labor ratio, k. So by intuitive chain
rule logic:
Determining the sign of this is the crucial step. Obviously, ƒ′(ki) is negatively
related to ki by simple diminishing marginal productivity, i.e. dƒ′(ki)/dki < 0. We
know already that dki/dω > 0. So the question boils down to dω/dk. We have
proved that if Uzawa's capital-intensity condition holds, then dω /dk > 0, and
thus we are home free because this implies that for values of k above n,
d(dk/dt)/dk < 0 and thus our system is stable.
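The convergence claim can be illustrated with a small simulation. The sketch below uses yi = k·ƒi′(ki), which follows from rk = pyi and r = pƒi′(ki), so that dk/dt = k·[ƒi′(ki) - n]. The Cobb-Douglas technology, the parameter values and the Euler step are assumptions made for illustration, not part of the original model.

```python
# Sketch: convergence of k under the Classical hypothesis.
# Assumptions (not in the text): Cobb-Douglas sectors with alpha_c > alpha_i
# (consumer goods more capital-intensive), n = 0.05, explicit Euler steps.

ALPHA_C, ALPHA_I, N = 0.5, 0.3, 0.05
A_C = ALPHA_C / (1.0 - ALPHA_C)   # kc = A_C * omega for Cobb-Douglas
A_I = ALPHA_I / (1.0 - ALPHA_I)   # ki = A_I * omega for Cobb-Douglas

def omega_of_k(k):
    """Invert k = kc*(omega + ki)/(omega + kc), which is linear in omega here."""
    return k * (1.0 + A_C) / (A_C * (1.0 + A_I))

def k_dot(k):
    """dk/dt = yi - n*k with yi = k*f_i'(ki)."""
    ki = A_I * omega_of_k(k)
    return k * (ALPHA_I * ki ** (ALPHA_I - 1.0) - N)

def simulate(k0, dt=0.05, steps=20000):
    """Integrate dk/dt forward from k0 and return the terminal k."""
    k = k0
    for _ in range(steps):
        k += dt * k_dot(k)
    return k

def steady_state():
    """k* at which f_i'(ki*) = n."""
    ki_star = (ALPHA_I / N) ** (1.0 / (1.0 - ALPHA_I))
    omega_star = ki_star / A_I
    return A_C * omega_star * (1.0 + A_I) / (1.0 + A_C)

if __name__ == "__main__":
    print(steady_state(), simulate(1.0), simulate(100.0))
```

Starting below or above k*, the simulated path settles on the same capital-labor ratio, which is what uniqueness of the momentary equilibrium plus dω/dk > 0 would lead one to expect.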
This is interesting. The Uzawa capital-intensity condition was imposed to
guarantee uniqueness of factor-market equilibrium for every k. But it also
implies uniqueness and stability of the steady-state growth path. Why does
relative capital-intensity matter for stability of growth equilibrium? Because of
the infamous "Wicksell Effects". To see why, recall that by the Classical
hypothesis, wL = Yc and rK = pYi, then rK/wL = p·(Yi/Yc), or:
K/L = (w/r)·p·(Yi/Yc)
k = ω ·p·(yi/yc)
k(ω ) = ω ·p(ω)·(yi(ω)/yc(ω))
Evidently, lots of things can happen now. So let us try it again: suppose ω rises,
then k(ω ) must rise unless p(ω) falls and/or yi(ω)/yc(ω) falls sufficiently, in which
case k(ω ) falls. So, what rules this possibility out? Uzawa's capital-intensity
assumption. To see this, let us begin by proving that p cannot fall in response
to a rise in ω if the consumer goods sector is more capital-intensive. As we
know the price of investment goods p can be expressed as:
p = ƒc′ /ƒi′
But as the marginal products are themselves functions of kc and ki (which are,
in turn, functions of ω), then we can write p as a function of ω:
dp/dω = {ƒc′′·(dkc/dω)·ƒi′ - ƒi′′·(dki/dω)·ƒc′}/[ƒi′]²
which seems pretty ugly. But recall that, from before, dkc/dω = -(ƒc′)²/(ƒc′′·ƒc) and
dki/dω = -(ƒi′)²/(ƒi′′·ƒi). So, plugging in:
dp/dω = {-ƒc′′·((ƒc′)²/(ƒc′′·ƒc))·ƒi′ + ƒi′′·((ƒi′)²/(ƒi′′·ƒi))·ƒc′}/[ƒi′]²
dp/dω = {((ƒi′)²/ƒi)·ƒc′ - ((ƒc′)²/ƒc)·ƒi′}/[ƒi′]²
or:
dp/dω = ƒc′/ƒi - (ƒc′)²/(ƒc·ƒi′)
Recalling that p = ƒc′/ƒi′, then multiplying through by 1/p = (ƒi′/ƒc′):
(dp/dω)·(1/p) = ƒi′/ƒi - ƒc′/ƒc
so, finally, remembering from before that ω + ki = ƒi/ƒi′ and ω + kc = ƒc/ƒc′, then:
(dp/dω)·(1/p) = 1/(ω + ki) - 1/(ω + kc) = (kc - ki)/[(ω + ki)·(ω + kc)]
So, if kc > ki (consumer goods more capital-intensive), then dp/dω > 0, while if
ki > kc (investment sector more capital-intensive), then dp/dω < 0. This should
not be a surprising result. It is reminiscent of the famous Stolper-Samuelson
theorem: specifically, that a rise in the price of a good is positively related with
a rise in the return to the factor in which that good is intensive. So, if the
investment-goods industry is capital-intensive, a relative rise in the return to
capital (a fall in ω) will be associated with a rise in the price of investment
goods (p). Conversely, if the investment-goods industry is labor-intensive, then
a relative rise in the wage (a rise in ω ) will be associated with a rise in p.
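The link between relative capital intensity and the sign of dp/dω can be checked numerically. The derivation above ends in the closed form (dp/dω)/p = ƒi′/ƒi - ƒc′/ƒc = 1/(ω + ki) - 1/(ω + kc); the sketch below compares it with a finite-difference slope of p = ƒc′/ƒi′. Cobb-Douglas sectors and all parameter values are assumptions for illustration.

```python
# Sketch: sign of dp/domega versus relative capital intensity.
# Assumption (not in the text): Cobb-Douglas sectors f_j(k_j) = k_j**alpha_j.

def sector_ratio(omega, alpha):
    """k_j(omega) solving omega = f_j/f_j' - k_j."""
    return omega * alpha / (1.0 - alpha)

def price(omega, alpha_c, alpha_i):
    """p = f_c'(kc)/f_i'(ki) at the capital-labor ratios chosen at omega."""
    kc = sector_ratio(omega, alpha_c)
    ki = sector_ratio(omega, alpha_i)
    return (alpha_c * kc ** (alpha_c - 1.0)) / (alpha_i * ki ** (alpha_i - 1.0))

def log_price_slope(omega, alpha_c, alpha_i):
    """Closed form (dp/domega)/p = 1/(omega + ki) - 1/(omega + kc)."""
    kc = sector_ratio(omega, alpha_c)
    ki = sector_ratio(omega, alpha_i)
    return 1.0 / (omega + ki) - 1.0 / (omega + kc)

def numeric_log_price_slope(omega, alpha_c, alpha_i, h=1e-6):
    """Central finite-difference estimate of (dp/domega)/p."""
    up = price(omega + h, alpha_c, alpha_i)
    down = price(omega - h, alpha_c, alpha_i)
    return (up - down) / (2.0 * h * price(omega, alpha_c, alpha_i))

if __name__ == "__main__":
    # Consumer goods more capital-intensive, then the reverse ordering.
    for alpha_c, alpha_i in [(0.5, 0.3), (0.3, 0.5)]:
        print(alpha_c, alpha_i,
              log_price_slope(2.0, alpha_c, alpha_i),
              numeric_log_price_slope(2.0, alpha_c, alpha_i))
```

Swapping the two exponents flips the sign of the slope, in line with the Stolper-Samuelson reading given in the text.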
So, Uzawa's capital-intensity assumption implies that dp/dω > 0. That's half the
problem solved. Now, we only need to make sure that yi/yc cannot fall
sufficiently in response to a rise in ω to make k fall as well. We resort to basic
intuition: a rise in ω is automatically related to an increase in the capital-
intensity of both sectors, i.e. ki and kc rise and thus yi and yc rise. Furthermore,
by Uzawa's capital-intensity assumption, then if yi/yc falls, we are releasing
factors from a labor-intensive investment goods industry into a capital-
intensive consumer goods industry, so we are increasing the average capital-
intensity of the economy. Thus k simply cannot fall in response to a rise in ω
because not only do both sectors become more capital-intensive, but we are
transferring factors from labor-intensive to more capital-intensive industries.
The aggregate capital-labor ratio k must rise.
However, if we violate Uzawa's capital-intensity condition, and assume that
investment goods are more capital intensive, then note that dp/dω < 0 and if
yi/yc falls, we are releasing factors from a capital-intensive sector into a labor-
intensive one. Consequently, there is a strong countervailing tendency: it is
quite possible that the fall in pyi/yc more than outweighs the rise in ω so that k
falls. In other words, an increase in the wage-profit rate can lead to a fall in the
capital-labor ratio, i.e. we are employing more of the factor which has become
relatively more expensive! In geometric terms, the market demand for capital
is not everywhere negatively related to profit and/or the market demand for
labor is not negatively related to wages. The market factor demand curves will
have curious shapes with upward-sloping portions. This is the kind of thing that
yields multiplicity of factor market equilibria and makes our system
indeterminate. In other words, at k, we do not know which factor prices will
result, and thus we may end up at a higher or lower capital-labor ratio
tomorrow, which means that we cannot tell whether we will be moving towards
or away from the steady-state capital-labor ratio.
Of course, the Uzawa capital-intensity condition is sufficient for stability, but
not necessary. One can construct many examples where the investment goods
sector is more capital-intensive than consumer goods and still have stability of
the steady-state. But, and this is more important, without it, we can also
construct many examples with instability, and these are arguably the more
plausible ones.
[Note: This intuition is derived largely from Solow (1961) and Hahn (1965).
Uzawa (1963) traces the source of his capital-intensity condition to Knut
Wicksell's discussion of "Åkerman's problem" (Wicksell, 1923). Morishima
(1969: p.45) calls this the "Shinkai-Uzawa" condition, in recognition of Y.
Shinkai's (1960) work on two-sector models with fixed coefficients of
production, where he obtained the result that growth equilibrium is stable if
and only if the consumer goods industry is more capital intensive than the
investment goods sector. When we have flexible production functions, as in the
Uzawa model, this is merely a sufficient, but not necessary, condition for
stability. John Hicks (1965) finds this condition, but does not dwell much over it.
He finds it again and gives it more attention in his Neo-Austrian model (Hicks,
1973).]
(1-sw)wL + (1-sr)rK = Yc
(1-sw)w + (1-sr)rk = yc
Dividing through by p and recalling that yi = λ iƒi(ki) by definition and r/p = ƒi′
(ki) by competition then:
dk/dt = yi - nk
λi+λc=1
λckc + λiki = k
which are the conditions for labor market clearing and capital market clearing
respectively. Combining:
k = (1-λi)kc + λiki
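which, together with the savings condition, forms a linear system in λi and k:
ƒi(ki)·λi - srƒi′(ki)·k = swωƒi′(ki)
(kc - ki)·λi + k = kc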
By Cramer's Rule, the solution k is:
k = |A2|/|A|
and:
so:
or:
so:
Which is, effectively, the form of our factor market equilibrium locus k(ω).
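A minimal sketch of the Cramer's-rule step, for readers who want to see the locus computed. It solves the two conditions used in this section, the savings condition (swω + srk)·ƒi′(ki) = λiƒi(ki) and the capital-clearing condition k = (1-λi)kc + λiki, for λi and k at a given ω, and then compares the result with the special case sw = sr = s, where k + ω = (kc + ω)(ki + ω)/[s(kc + ω) + (1-s)(ki + ω)]. Cobb-Douglas sectors, the function names and the parameter values are assumptions for illustration.

```python
# Sketch: solving the momentary-equilibrium system for (lambda_i, k) at given omega.
# Assumptions (not in the text): Cobb-Douglas sectors f_j(k_j) = k_j**alpha_j
# and the illustrative parameter values used in the demo at the bottom.

def sector_ratio(omega, alpha):
    """k_j(omega) solving omega = f_j/f_j' - k_j."""
    return omega * alpha / (1.0 - alpha)

def solve_momentary(omega, s_w, s_r, alpha_c, alpha_i):
    """Return (lambda_i, k) by Cramer's rule from
         f_i*lambda_i - s_r*f_i'*k = s_w*omega*f_i'
         (kc - ki)*lambda_i + k    = kc
    """
    kc = sector_ratio(omega, alpha_c)
    ki = sector_ratio(omega, alpha_i)
    f_i = ki ** alpha_i
    df_i = alpha_i * ki ** (alpha_i - 1.0)
    a11, a12, b1 = f_i, -s_r * df_i, s_w * omega * df_i
    a21, a22, b2 = kc - ki, 1.0, kc
    det = a11 * a22 - a12 * a21
    lam_i = (b1 * a22 - a12 * b2) / det   # first unknown: replace column 1
    k = (a11 * b2 - b1 * a21) / det       # second unknown: replace column 2
    return lam_i, k

def uzawa_closed_form(omega, s, alpha_c, alpha_i):
    """k from k + omega = zc*zi/(s*zc + (1-s)*zi), with z_j = k_j + omega."""
    zc = sector_ratio(omega, alpha_c) + omega
    zi = sector_ratio(omega, alpha_i) + omega
    return zc * zi / (s * zc + (1.0 - s) * zi) - omega

if __name__ == "__main__":
    lam_i, k = solve_momentary(2.0, 0.2, 0.2, 0.5, 0.3)
    print(lam_i, k, uzawa_closed_form(2.0, 0.2, 0.5, 0.3))
```

Ordering the unknowns as (λi, k) makes k the second unknown, which matches the |A2|/|A| notation; that ordering is itself an assumption about the lost matrix layout.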
Notice that if we take Uzawa's (1963) special case that sw = sr = s (which we
shall call the Uzawa savings assumption), then this reduces to:
k + ω = [(kc + ω)(ki + ω)]/[(ki + ω) - s(ki - kc)]
= [(kc + ω)(ki + ω)]/[s(kc + ω) + (1-s)(ki + ω)]
dg(ω)/dω = [(zc′zi + zczi′)·(szc + (1-s)zi) - (szc′ + (1-s)zi′)·zczi]/[szc + (1-s)zi]² - 1
or:
Now, recall that zc = kc + ω and zi = ki + ω , which implies that zc′ = kc′ + 1 and
zi′ = ki′ + 1 where, as kc′ > 0 and ki′ > 0, implies then that zc′ > 1 and zi′ > 1.
Thus:
But examine the fraction in this expression. Now, we know that (zc - zi)² ≥ 0, so:
or:
Or as s(1-s) + s² = s and s(1-s) + (1-s)² = (1-s), and noticing that the term on
the right is merely [szc + (1-s)zi]², then:
(swω+ srk)ƒi′(ki)/k = n
Now, recall that dki/dω = -(ƒi′)²/(ƒi·ƒi′′), so ƒi′′·(dki/dω) = -(ƒi′)²/ƒi, so:
and:
so:
rearranging:
So the sign of d(dk/dt)/dk depends crucially on the sign of
[(sw - λi)·(dω/dk) - sw(ω/k)]. If:
(sw - λ i)(dω /dk) < sw(ω /k)
then d(dk/dt)/dk < 0 around equilibrium and we will have local stability.
Let us now deduce the implications of this. We know that dk/dt = y i - nk, or
simply:
dk/dt = λiƒi(ki) - nk
λi = nk*/ƒi(ki*)
Now, we know that d(dk/dt)/dk < 0 if (sw - λi)(dω /dk) < sw(ω /k). So, plugging in,
we see that around steady-state
So, if sw < nk*/ƒi(ki*), then this inequality holds for certain, so we will be assured
that d(dk/dt)/dk < 0 around the steady-state k*.
Alternatively, consider the following. As (swω + srk)ƒi′(ki) = λiƒi(ki) by
macroeconomic equilibrium, then:
sw = λiƒi(ki)/(ωƒi′(ki)) - srk/ω
or:
sw - λi = (λi ki - srk)/ω
Thus, if λiki < srk, then it must be that sw - λi < 0. But we know that
(sw - λi)(dω/dk) < sw(ω/k) is a sufficient condition for d(dk/dt)/dk < 0. So, if
λiki < srk, then d(dk/dt)/dk < 0. Now, as we know, at steady-state,
λi = nk*/ƒi(ki*), thus substituting in, if:
sr > nki*/ƒi(ki*)
(i) sw ≤ nk*/ƒi(ki*)
or:
(ii) sr ≥ nki*/ƒi(ki*)
rearranging:
nk*/ƒi(ki*) > sw
which is precisely the condition (i) for stability. Condition (ii) follows by
extension. Thus, the first Drandakis mixed condition (sr > sw and kc > ki) is
indeed sufficient for stability.
An interesting observation is to realize that the sufficient condition for stability,
(sw - λi)(dω /dk) < sw(ω /k), can be rewritten as:
But recall from our discussion of production theory the elasticity of substitution
between factors is σ = (dk/dω)·(ω /k). Notice, then that if σ ≥ 1, then this
inequality is guaranteed. Thus, σ ≥ 1 is also a sufficient condition for stability.
Finally, an even more famous sufficiency condition derived by Drandakis (1963)
is that the elasticity of substitution of the consumer-goods sector be greater
than 1, i.e. that
Let us summarize what we have found. Remember that the Uzawa two-sector
growth model poses two essential questions: (1) can we guarantee a unique
factor-market equilibrium ω for every capital-labor ratio k? (2) can we
guarantee that k approaches a steady-state growth path k* over time? We
have found sufficient conditions for (1) and (2).
In Uzawa's (1961) Classical hypothesis case (all wages spent, all profits saved),
the sufficient condition for (1) was that the Uzawa capital-intensity condition
hold true (i.e. that the consumer goods industry is more capital-intensive than
the investment goods industry). This, as we saw intuitively, was also sufficient
to guarantee uniqueness and stability of the steady state (i.e. sufficient for (2)).
In the more flexible savings case, we found that a sufficient condition for (1)
was that sr = sw = s (Uzawa's (1963) savings condition). Another sufficient
condition (stated, but not proved) was that sr > sw and kc > ki and still another
was that sr < sw and kc < ki (Drandakis's (1963) mixed conditions). To
guarantee (2), we must add the Uzawa capital-intensity assumption to Uzawa's
savings condition (not shown, but deducible). Alternatively, we have shown
that the first of Drandakis' mixed conditions is also sufficient for (2).
Of course, the conditions we have outlined thus far are all sufficient
conditions for stability, but not necessary conditions. In other words, other
configurations of savings and capital-intensity could yield us uniqueness and/or
stability, but these would by no means be guaranteed.
In Figure 8, there are two steady-states, b and d. Immediately, we have
indeterminacy of steady-state equilibrium as either could obtain. It is
immediately noticeable that both steady states b and d are locally unstable, in
the sense that after any slight nudge away from them we will not return to them.
Thus, we have no stable steady-states. At k0, we also have indeterminacy of
factor market equilibria: points a, b and c are all possible factor market
equilibria at k0. There is no reason to accept one over the other.
A cycle can be traced from the passage along the points (1, 2, 3, 4).
Specifically, suppose we begin at point c, with capital stock k0 and factor price
equilibrium represented by c. In this case, dk/dt < 0, and so the capital-labor
ratio declines below k0. We continue declining until we reach kL (point 1 in
Figure 8). At kL, we still have it that dk/dt < 0, so there will be a tendency to
continue to reduce k. But notice that the slightest reduction in k below k L leads
to an immediate jump from point 1 to point 2, i.e. there is a catastrophic jump
from dk/dt < 0 to dk/dt > 0.
Thereafter the capital-labor ratio begins rising from kL towards k0. At k0,
however, we will be at factor equilibrium point a: we have no reason to jump
down towards b to get steady-state. But staying at a implies that dk/dt > 0, so
k continues to rise. It will do so until we reach kU. At this capital-labor ratio, a
slight increase in k will lead to an immediate catastrophic jump from point 3 to
point 4, and thus a drastic reversal from dk/dt > 0 to dk/dt < 0 around kU. Now,
the capital-labor ratio begins falling, from kU towards k0. But at k0, we will have
the factor market equilibrium implied by c, and not b. Thus, we continue declining
towards kL.
Thus, as we see in Figure 8, we have a constant cycle fluctuating back and
forth between kL and kU, passing over k0 in the process but never actually
stopping there because the steady-state factor market equilibrium b never gets
the chance to realize itself.
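The mechanics of this cycle can be mimicked with a deliberately stylized two-branch system. The sketch below is not derived from the model: the text gives no functional forms for Figure 8, so the turning points K_L and K_U, the constant speed on each branch and the function names are all hypothetical. It only illustrates how staying on the current branch until it disappears, and then jumping, produces a perpetual oscillation that passes over k0 without stopping.

```python
# Sketch: the kL-kU cycle as a two-branch (hysteresis) system.
# Everything numeric here is hypothetical: each branch is given a constant
# speed purely for illustration, and K_L, K_U are arbitrary turning points.

K_L, K_U = 1.0, 3.0      # hypothetical turning points kL and kU
SPEED = 0.5              # hypothetical |dk/dt| on either branch

def step(k, branch, dt):
    """Advance k along the current branch and jump branches at kL or kU.

    branch = +1: 'a-type' factor market equilibria, dk/dt > 0
    branch = -1: 'c-type' factor market equilibria, dk/dt < 0
    """
    k += branch * SPEED * dt
    if branch < 0 and k <= K_L:
        k, branch = K_L, +1      # jump from point 1 to point 2
    elif branch > 0 and k >= K_U:
        k, branch = K_U, -1      # jump from point 3 to point 4
    return k, branch

def simulate(k0, branch0, dt=0.01, steps=5000):
    """Return the path of k and the number of branch jumps along it."""
    k, branch = k0, branch0
    path, jumps = [k], 0
    for _ in range(steps):
        k, new_branch = step(k, branch, dt)
        if new_branch != branch:
            jumps += 1
        branch = new_branch
        path.append(k)
    return path, jumps

if __name__ == "__main__":
    path, jumps = simulate(2.0, -1)   # start at k0 = 2 on the c-type branch
    print(min(path), max(path), jumps)
```

In this toy version the path never leaves the interval between the two turning points and never settles, which is the qualitative behaviour described for Figure 8.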
How anomalous is this case? Inada (1963) provides a precise example of the
kind of cyclical behavior we see. Naturally, none of the uniqueness conditions
we imposed above hold here -- e.g. Uzawa's capital-intensity assumption must
have been violated in order to obtain this kind of differential equation. Uzawa
(1963) considered the possibility of cycles. He noted that the kind of jumps
required (from 1 to 2, from 3 to 4) implied discontinuous jumps in the factor
price equilibria. What if these were not accommodated? In other words, what if
we forced factor prices to adjust only slowly? This requires that we allow for
unemployment of capital and/or labor for extended periods of time. This
modification will not be pursued here.
However, if we were to impose the Uzawa capital-intensity condition on top of
the Uzawa savings assumption, then Figure 9 could no longer hold: we would
have uniqueness and stability of steady-state equilibrium.
There are other modifications to the standard Uzawa two-sector model. For
instance, Drandakis (1963) endogenized the labor supply by making it
responsive to wages (thus getting actually closer to the Classical way of
thinking). Morishima (1969) did the same, but found that the Uzawa capital-
intensity condition was no longer sufficient for stability.
Other modifications include the following. Yasui and Uzawa (1964) make capital
depreciation endogenous. Inada (1966) introduced fixed capital which cannot
be transferred from one use to another. Two-sector models with fixed
proportions technology, as noted, have also been considered by Y. Shinkai
(1960) and John Hicks (1965). Inada (1963) has allowed for the savings rate
itself to be a function of income, so s = s(y).
Technological change has been one of the bigger headaches because now we
must ensure that technological change is such that all sectors grow at the
same rate. Diamond (1965) and Takayama (1965) developed a classification
scheme analogous to the one-sector case where we have Harrod-neutral
technical change where there is constant capital per effective labor unit in each
sector as well as in the aggregate. Charles Kennedy (1962) proposed a different
type of neutral technical progress where it is the value of capital per labor unit
that remains constant, which has different implications. We consider this
elsewhere.
Finally, we should try to decipher the main lesson from all this. Robert Solow
claims that it "seems paradoxical to me that such an important characteristic of
the equilibrium path [i.e. stability] should depend on such a casual property of
the technology." (Solow, 1961). Perhaps it is not that paradoxical. As Frank
Hahn notes, the Uzawa capital-intensity condition and attendant savings
hypotheses "are all terrible assumptions" (Hahn, 1965) which probably never or
only rarely hold in the real world. It is thus reasonable to conclude from this
exercise that, if anything, these models have taught us that the real world is a
bit more complicated than one-sector models let on. Rather than the smooth
convergence to a single steady-state path we find in the Solow-Swan model,
the Uzawa two-sector model indicates that we ought to expect a good amount
of indeterminacy, instability and cyclical behavior in growth paths. Now that
sounds a little bit more like the world we live in.
The central result of the Uzawa (1961, 1963) two-sector growth model is that
the uniqueness of factor market ("momentary") equilibria and the stability of
the steady-state growth path depend crucially on the relative factor-intensities
of the two sectors and attendant savings hypotheses. The most celebrated
result is that we are only assured stability if the consumer goods sector is more
capital-intensive than the investment goods sector. Alternative configurations
are generally not sufficient to guarantee nice results. Consequently, Hirofumi
Uzawa (1964) and T.N. Srinivasan (1964) sought to find a way of pinning things
down without such a radical capital-intensity assumption. They did so looking
for the "optimal" growth path, specifically the growth path that maximized the
integral of the consumption path.
We shall not bother to repeat definitions and notations here. We shall merely
set out the basic equations for the Uzawa two-sector model:
w = ƒc - kcƒc′ = p·(ƒi - kiƒi′) - labor market prices (6)
r = ƒc′ = p·ƒi′ - capital market prices (7)
s.t.
dk/dt = yi - nk
k(0) = k0
yc = λcƒc(kc)
yi = λiƒi(ki)
λc+λi=1
λ ckc + λ iki = k
kc, ki, λ c, λ i ≥ 0
Notice that consumption per capita is merely the output of the consumer goods
sector, yc. The term ρ is the time preference of the social planner (and notice
that there is no other "utility" element involved). The first constraint, dk/dt = yi
- nk is the fundamental differential equation and this is obtained by recognizing
that gk = gK - gL, and then substituting in (8) and (9).
So far so good. But things can be reduced further. Notice that combining (4)
and (5), we obtain:
λc = (k - ki)/(kc - ki)
λi = (kc - k)/(kc - ki)
We can then plug these terms into (1) and (2) to obtain
Finally, we have not used (6) or (7) yet. These will yield (as we showed in the
last section) the functions:
kc = kc(ω)
where kc′ = -(ƒc′)²/(ƒc′′·ƒc) > 0
ki = ki(ω)
where ki′ = -(ƒi′)²/(ƒi′′·ƒi) > 0
which will form boundaries of our factor market equilibrium. As we know, the
boundaries imply that the wage-profit ratio ω ∈ [ω min, ω max], where:
ωmin = ωc(k), ω max = ω i(k) if kc(ω ) ≥ ki(ω ) for all ω
i.e. if the consumer goods sector is more capital-intensive. Alternatively:
ωmax = ωc(k), ωmin = ωi(k) if kc(ω ) ≤ ki(ω ) for all ω
i.e. if the investment goods sector is more capital-intensive. All this is explained
in greater detail in the previous section.
Plugging our new terms for yc and yi, omitting the equations already used and
adding in our boundaries, our program becomes:
s.t.
The control variable is the wage-profit ratio, ω ; the state variable is k. Notice
that the last line indicates that we have assumed that the consumer-goods
industry is more capital-intensive. If the investment-goods industry was more
capital-intensive, we would replace that line with ωi(k) ≤ ω ≤ ωc(k). Therefore, let
us divide our analysis into two parts, one for each case.
where λ is the current-value costate variable. Notice that kc and ki are implicitly
functions of ω . First order conditions are (after a lot of ugly algebra):
dH/dω = [λƒi′ - ƒc′]·{(dki/dω)·[(k - ki)(ω + ki)/(kc - ki)²] + (dkc/dω)·[(kc - k)(ω + kc)/(kc - ki)²]} = 0
recalling our definitions of λc and λi, this can be written as:
[λƒi′ - ƒc′]·[(dki/dω)·λc·(ω + ki)/(kc - ki) + (dkc/dω)·λi·(ω + kc)/(kc - ki)] = 0
Now, if we assume consumer goods are more capital-intensive, kc > ki, then (kc
- ki) > 0. Then as λ c, λ i ≥ 0 and dkc/dω > 0 and dki/dω > 0, then obviously the
entire second term is positive. Alternatively, if we assume that investment
goods are more capital intensive, then (kc - ki) < 0, and the entire second term
is negative. In either case, it must be that:
λƒi′ - ƒc′ = 0
(corner solutions would allow this to be different, but then λi or λc would be set
to zero). Notice that this means:
λ = ƒc′ /ƒi′
which should be familiar to us. Recall that p = ƒc′ /ƒi′ , thus the costate variable
λ is nothing other than the (shadow) price of the investment goods.
Continuing with our Hamiltonian, notice that:
or, rearranging:
so rearranging:
or simply:
dλ /dt = λ (n + ρ ) - ƒc′
dλ /dt = λ (n + ρ - ƒi′ )
or:
dω /dt = (dλ /dt)/(dp/dω )
The question that emerges is what is p/(dp/dω )? Well, we know from before
that:
n + ρ = ƒi′ [ki(ω)]
ω* = ki⁻¹[ƒi′⁻¹(n + ρ)]
As n+ρ and ƒi′(·) and ki(·) are given and do not vary with k, then there is a
unique ω* for which this holds true. Thus, the dω/dt = 0 isokine is a horizontal
line in (ω, k) space. The implicit dynamics can be found as follows. Defining
∏ = (dp/dω)·(1/p), then notice that (10) can be rewritten as:
But, evaluated near ω *, we know that n + ρ = ƒi′ [ki(ω *)], thus this reduces to:
d(dω /dt)/dω |ω * = -ƒi′′ /∏ > 0
which is positive by assumption that ƒi′′ < 0 and by the Uzawa capital-intensity
hypothesis, ∏ > 0. Thus, a small increase in ω above ω * will lead to a rise in ω ,
while a fall in ω below ω* will lead to a further fall. Hence the vertical directional
arrows point away from the dω/dt = 0 isokine in Figure 1.
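To make the condition n + ρ = ƒi′[ki(ω)] concrete, the following sketch locates ω* by bisection and compares it with the closed form available under Cobb-Douglas technology. The functional form ƒi(ki) = ki^αi, the parameter values and the function names are assumptions for illustration; the bisection relies only on ƒi′(ki(ω)) being decreasing in ω.

```python
# Sketch: locating omega* from n + rho = f_i'(ki(omega)).
# Assumptions (not in the text): Cobb-Douglas f_i(ki) = ki**alpha_i and the
# numerical values of n, rho and alpha_i used in the demo.

def ki_of_omega(omega, alpha_i):
    """ki(omega) solving omega = f_i/f_i' - ki."""
    return omega * alpha_i / (1.0 - alpha_i)

def marginal_product(ki, alpha_i):
    """f_i'(ki) for the Cobb-Douglas intensive production function."""
    return alpha_i * ki ** (alpha_i - 1.0)

def omega_star(n, rho, alpha_i, lo=1e-9, hi=1e9, tol=1e-12):
    """Bisection on f_i'(ki(omega)) - (n + rho), which falls as omega rises."""
    target = n + rho
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if marginal_product(ki_of_omega(mid, alpha_i), alpha_i) > target:
            lo = mid          # marginal product still too high: raise omega
        else:
            hi = mid
        if hi - lo < tol * max(1.0, mid):
            break
    return 0.5 * (lo + hi)

def omega_star_closed_form(n, rho, alpha_i):
    """Invert f_i' and then ki(.) analytically."""
    ki_star = (alpha_i / (n + rho)) ** (1.0 / (1.0 - alpha_i))
    return ki_star * (1.0 - alpha_i) / alpha_i

if __name__ == "__main__":
    print(omega_star(0.02, 0.03, 0.3), omega_star_closed_form(0.02, 0.03, 0.3))
```

Neither function takes k as an argument, which is the computational counterpart of the dω/dt = 0 isokine not depending on k.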
What about (11)? The isokine dk/dt = 0 is established as follows. Note that
dk/dt = 0 implies that:
or:
so:
k(ω) = ƒ i(ki)kc/{n(kc - ki) + ƒi(ki)} (12)
This will form the shape of our dk/dt = 0 isokine. It is necessary to decipher
what this isokine looks like. Notice that we can rewrite (12) as:
(kc - k) = [nk/ƒi(ki)]·(kc - ki)
Now, nk/ƒi(ki) > 0 by assumption, thus the sign of (kc - k) depends critically on
the sign of (kc - ki), i.e. on which sector is more capital-intensive. Now, by our
assumption that consumer goods are more capital intensive than investment
goods, then kc(ω) > ki(ω) for all admissible ω. Consequently, we necessarily
have it that:
k(ω) < kc(ω)
so the isokine of dk/dt = 0 will lie everywhere to the left of the kc(ω) curve (see
Figure 2).
However, the dk/dt = 0 isokine does not necessarily lie everywhere to the right
of ki(ω). Specifically, notice that the dk/dt = 0 isokine intersects the ki(ω) line at
ωn, so that k(ω) > ki(ω) holds only for ω < ωn.
Fig. 2 - The dk/dt = 0 Isokine
So, k lies to the right of ki whenever it is the case that ƒi(ki) > nk. Of course, it is
not true that this holds for all ω . Nonetheless, we know that for low ω this will
be true. To see why, let us proceed slowly. We first want to prove that if ƒi(ki) >
nki then ƒi(ki) > nk. To see this, note that (12) can be rewritten as:
so assuming neither k nor kc are zero, then necessarily, ƒi(ki) > nki implies ƒi(ki)
> nk. The condition ƒi(ki) > nki can be depicted in Figure 3 where we have the
intensive production function for investment goods ƒi(ki) depicted as well as nki,
a ray from the origin with slope n. The point en depicts the intersection of the
intensive production function and the nki ray. At this intersection, the capital-
labor ratio in the investment goods industry is kin, thus ƒi(kin) = nkin. So, for all ki
< kin, we have it that ƒi(ki) > nki, but for all ki > kin, we have it that ƒi(ki) < nki.
Now, associated with this critical point is a factor price ratio ωn. We can obtain
this by extending the line tangent to ƒi(ki) at the point en to the horizontal
axis. Where this tangent line intersects the axis is the maximum factor price
ratio, ω n. If ω > ω n, then notice that this implies that the corresponding k i is
greater than kin, or ki > kin, but then ƒ i(ki) < nki and thus the condition that
dk/dt = 0 isokine lies to the right of ki(ω ) no longer holds. Thus, for all factor
price ratios ω up to ω n, we have it that k(ω ) > ki(ω ). For factor price ratios ω
above ω n, we have it that k(ω ) < ki(ω ), thus the dk/dt = 0 locus has exceeded
the left boundary. This is what we see in Figure 2. The point kn is the aggregate
capital-labor ratio that corresponds to the maximum factor price ratio, ω n.
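The position of the dk/dt = 0 isokine relative to the two boundary loci can be verified numerically. The sketch below evaluates equation (12) and the critical ratio ωn at which ƒi(ki) = nki, assuming Cobb-Douglas sectors with the consumer goods sector more capital-intensive; the functional forms, parameter values and function names are assumptions for illustration.

```python
# Sketch: the dk/dt = 0 isokine (12) and its crossing with ki(omega).
# Assumptions (not in the text): Cobb-Douglas sectors with alpha_c > alpha_i
# (so kc > ki for every omega) and n = 0.05.

ALPHA_C, ALPHA_I, N = 0.5, 0.3, 0.05

def sector_ratio(omega, alpha):
    """k_j(omega) solving omega = f_j/f_j' - k_j."""
    return omega * alpha / (1.0 - alpha)

def isokine_k(omega):
    """Equation (12): k = f_i(ki)*kc / (n*(kc - ki) + f_i(ki))."""
    kc = sector_ratio(omega, ALPHA_C)
    ki = sector_ratio(omega, ALPHA_I)
    f_i = ki ** ALPHA_I
    return f_i * kc / (N * (kc - ki) + f_i)

def omega_n():
    """Wage-profit ratio at which f_i(ki) = n*ki, i.e. ki = n**(-1/(1 - alpha_i))."""
    ki_n = N ** (-1.0 / (1.0 - ALPHA_I))
    return ki_n * (1.0 - ALPHA_I) / ALPHA_I

if __name__ == "__main__":
    w_n = omega_n()
    for omega in (0.5 * w_n, w_n, 2.0 * w_n):
        print(omega,
              sector_ratio(omega, ALPHA_I),   # left boundary ki(omega)
              isokine_k(omega),               # dk/dt = 0 isokine
              sector_ratio(omega, ALPHA_C))   # right boundary kc(omega)
```

Below ωn the isokine sits between the two loci; at ωn it touches ki(ω); above ωn it falls to the left of ki(ω) while remaining to the left of kc(ω) throughout.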
Thus, we have established that the dk/dt = 0 isokine is a locus k(ω) where:
Now, let us examine the dynamic properties. From equation (11), we can see
immediately that:
The stable arm is actually a bit more complex, due to the boundaries formed by
the kc(ω ) and ki(ω ) loci. We can trace it as follows: for values of k between 0 to
kL, the stable arm is the ki(ω ) locus; for k values between kL to kU, the stable
arm is the dω /dt = 0 isokine, and for k above kU, the stable arm is the kc(ω )
locus. Thus, the thick black line in Figure 5 denotes the full stable arm of the
economy.
The logic is the following. If k < k L, then the economy needs to grow quickly to
catch up to k*. Consequently, it will specialize completely in the production of
investment goods -- thus we "jump" to the ki(ω ) locus. In contrast, if k > kU, the
economy needs to slow down on capital-accumulation so that k declines, thus it
jumps to complete specialization in consumer goods and cuts production of
investment goods to zero, thus for such high values of k we jump to the kc(ω )
locus.
For capital-labor ratios in between kL and kU, we do not "jump" to complete
specialization in either consumer goods or investment goods, but produce a
little bit of both -- thus for k ∈ (kL, kU), we will choose points in the interior of
the space in Figure 5.
But why choose ω* in particular? Because at ω*, dω/dt = 0, so there is no
change in ω over time and the dynamics are such that we glide smoothly and
asymptotically to the balanced growth point, e = (ω *, k*). If we chose a ω
higher than ω * but still in the interior of the area, notice that the underlying
dynamics would push ω upwards over time, even if k approached k* over time
(which it might not!). Eventually, when k finally hits k* (or if we hit a boundary),
ω would be so far above ω * that there would have to be a sudden and drastic
correction in ω , an enormous jump down to ω *. Similarly, if we initially choose
a ω below ω *, ω would be pushed further downwards, so that there would have
to be an eventual drastic correction in factor prices. Such late catastrophic
jumps in factor prices are not necessarily "optimal" things. Far better to jump
early onto ω * and just let the natural dynamics of the economy keep ω
constant at ω * while we gradually approach k*. That is why the stable arm will
be chosen for k ∈ (kL, kU). This can be deduced from the transversality
conditions of the solution.
n + ρ = ƒi′ [ki(ω )]
because ∏ = (dp/dω )·(1/p) < 0 when investment goods are more capital-
intensive. Thus, unlike before, the dω/dt = 0 isokine is stable in ω, so if ω > ω*,
then ω declines, while if ω < ω*, then ω rises. The vertical directional arrows
thus approach the dω /dt = 0 isokine.
How about the dk/dt = 0 isokine? Setting (11′ ) to zero, we can resolve this for
k to yield:
So, since nk/ƒi(ki) > 0 by assumption, and since investment goods are more
capital-intensive than consumer goods, then (kc - ki) < 0 and thus (kc - k) < 0,
i.e.
k(ω) > kc(ω)
so the isokine of dk/dt = 0 will lie everywhere to the right of the kc(ω) curve
(see Figure 5).
However, like before, the isokine does not lie everywhere on one side of the ki(
ω ) curve. There is an intersection point between the dk/dt = 0 isokine and the
ki(ω ) locus at a critical wage-profit ratio ω n. This is in fact identical to before,
i.e. ω n solves ƒ i(ki(ω n)) = nki(ω n), so if ω < ω n, then ƒ i(ki) > nki and so k(ω ) <
ki(ω), so that the dk/dt = 0 isokine lies to the left of the ki(ω) locus and thus
within the bounds. In contrast, if ω > ωn, then ƒi(ki) < nki and therefore k(ω)
> ki(ω ) so that the isokine lies to the right of the ki(ω ) locus and thus outside
the bounds. We see this in Figure 5.
The dynamics of the dk/dt = 0 isokine are easy to decipher. Specifically, note
that:
Fig. 5 -Dynamics of Optimum Growth -- ki(ω) > kc(ω) case
Effectively, the same analysis applies as before. Notice that the stable arm of
the saddlepoint intersects the ki(ω) locus at kL and the kc(ω ) locus at kU. Now, if
k < kL, then k is so much below k* that it makes sense to specialize completely
in the production of investment goods (thus jumps to the k i(ω ) locus) so that k
climbs quickly. If, in contrast, k > kU, then k is so much higher than k*, that we
want to stop accumulating capital and specialize completely in the production
of consumer goods, so that k falls quickly. Finally, if we start at a k between kL
and kU, we will jump onto the saddlepoint stable arm, and glide slowly towards
the steady-state equilibrium, e = (k*, ω *).
4.2.4. Conclusion
As we see from the Uzawa-Srinivasan exercise, adding an optimality criterion
removes many of the difficulties we found in the conventional Uzawa two-
sector growth model. Specifically, we no longer have the Uzawa capital-
intensity requirement for stability. Consumer goods can be more or less capital-
102
intensive than the investment goods, but that will not affect the "stability" of
the system. The system, after all, is driven by the social planner, and his sole
criterion is the optimality of the consumption path. Thus, the social planner will
drive us straight to the balanced growth path, and by-pass all the "real-world"
difficulties we had in our simpler two-sector growth model. As it happened, the
Uzawa-Srinivasan attempt to find "optimal growth" in a two-sector model
preceded and was in fact the impetus for the resurrection of the Ramsey one-
sector optimal growth model by David Cass (1965) and Tjalling C. Koopmans
(1965).
However, before declaring victory, we should note some peculiarities about the
social planner. Firstly, the social planner is maximizing consumption per capita
and not utility. Thus, the traditional Benthamite justification of social utility is
not really used (or, rather, we have replaced a diminishing marginal utility with
constant marginal utility for the social welfare function). Secondly, we obtain
"saddlepoint" stability, which is not quite "stability". In principle, beginning with
an arbitrary k and ω, we will not go to balanced growth, but rather move away
from it. Thus, there needs to be a guide to set the initial wage-profit ratio on the
stable arm to ensure that we go to steady-state.
Before the lamentable rise of the "representative agent" reasoning we have
today, it used to be argued that the government could perform many of the
functions of the social planner for these intertemporal optimization problems.
Specifically, by manipulating various fiscal, monetary and pricing policy
instruments, the government could attempt to guide us to the steady-state
growth path.
In fact, the two-sector model lends itself rather nicely to treatment of
government activity. As Hirofumi Uzawa (1969) and Kenneth J. Arrow and
Mordecai Kurz (1970) demonstrate, we can think of a mixed economy as one
where there is a private sector producing one kind of good and a public sector
producing another (roads, bridges, dams, etc.) which can be used by the first
sector and vice-versa. Add a government objective to the story, and this is
effectively an optimal two-sector growth model.
Models of monetary growth, stemming from the contributions of James Tobin
(1965) onwards, for instance, can be considered to be a type of two-sector
model with room for government activity -- but now "money creation" is our
second "sector". However, before we proceed with these models, it is
necessary to consider multi-sectoral models where we have more than one
type of capital good. These "heterogeneous capital" growth models shall be
taken up in our next section.
5. Optimal Growth
5.1. Optimal Growth: Introduction
5.2. The Ramsey Exercise
5.3. Golden Rule Growth
5.4. Intertemporal Social Welfare
5.4.1. Intertemporal Social Welfare Functions
5.4.2. The Defense of Discounting
5.4.2.1. The Tastes Defense
5.4.2.2. The Dynastic Defense
5.4.2.3. The Decentralization Defense
5.4.3. The Koopmans Axiomatization
5.4.4. Population Growth
5.4.5. Overlapping Generations
5.4.6. Varying Time Preference
5.4.7. Intertemporal Justice
5.4.7.1. Rawlsian Social Welfare
5.4.7.2. Rawlsian Altruism
5.5. (The Cass-Koopmans Optimal Growth Model)
5.6. (Optimal Two-Sector Growth)
5.7. Optimal Growth: Conclusion
5.8. Selected References
5. Optimal Growth
5.1. Optimal Growth: Introduction
At least as far back as Eugen von Böhm-Bawerk (1889), economists had
entertained the idea that people are "myopic" in the sense that they tend to
underestimate their future needs and desires and therefore "discount" their
future utilities. This was seen by Böhm-Bawerk and many of his contemporaries
as an irrationality, a result of a deficient cognitive process.
From this proposition, the Cambridge economist Arthur C. Pigou (1920) posed
an interesting conundrum: if, indeed, agents tend to underestimate their future
utility, they will probably not make proper provision for their future wants and
thus personally save less than they would have wished had they made the
calculation correctly. In other words, Pigou proposed, the very fact that people
possess defective "telescopic faculties" probably means that savings, as a
whole, are less than what is "optimal". This, Pigou conjectured, implies that
there is a "market failure" of sorts in the market for savings.
Yet in order to confirm that the rate of savings thrown up by a market system
with myopic agents was indeed suboptimal, one must first determine what the
optimal savings rate might be. It is at this point that the Cambridge philosopher
Frank P. Ramsey (1928) picked up on Pigou's pregnant suggestion. Ramsey
applied standard Benthamite utilitarian calculus to derive the "optimal rate of
savings" for a society. He proposed an intertemporal social welfare function and
then tried to obtain the "optimal" rate of savings as the rate which maximized
"social utility" subject to some underlying economic constraints. Of course,
Ramsey deliberately excluded discounting of future utility from this social
welfare function: just because people are individually short-sighted, does not
mean that society should be similarly "short-sighted". This is a normative, not a
positive exercise.
Ramsey's conclusion was to confirm Pigou's suggestion: the optimal rate of
savings is higher than the rate that myopic agents in a market economy would
choose. Yet Ramsey's arguments fell largely on deaf ears for three reasons.
Firstly, Ramsey's use of the calculus of variations in his argument was quite
beyond the mathematical understanding of most contemporary economists.
Secondly, even the economic parts of the argument were subtle and unfamiliar.
Recall that Irving Fisher's Theory of Interest was only written in 1930, so most
economists' understanding of the concept of "intertemporality" was still
rudimentary. Thirdly, and perhaps more importantly, Ramsey's exercise was an
unfashionable one. The 1930s were the years of the Paretian revival of
ordinalism and the "unholy alliance" of economics and Benthamite
utilitarianism was gradually unraveling. The "New Welfare Economics" stayed
clear of anything which implied any sort of interpersonal comparisons of
cardinal utility. Ramsey's social utility function was certainly regarded as a
damnable construction.
During the 1950s and 1960s, capital and growth theory were coming into their
own and questions about "efficient" programs of accumulation were being
asked (e.g. Malinvaud, 1953). Some people seemed to recollect that Ramsey
had a thing or two to say about this. In a rather anachronistic, but highly
commendable effort, Paul Samuelson and Robert Solow (1956) brought forth an
extension of Ramsey's original model to a multi-commodity scenario.
Overall, it was practical considerations that resurrected the question of the
"optimal" savings rate. For development economists -- who were in those days
obsessed with government planning and accumulation -- the question was both
natural and urgent. Jan Tinbergen (1956) was perhaps the first to try his hand.
Yet, almost from the outset, the exercise was attacked. As Peter T. Bauer (1957)
and Jan de V. Graaff (1957) argued, the determination of the "optimal"
savings rate is irrelevant for policy on at least two grounds. Firstly, it is not
implementable -- people save what they will save, period. Secondly, even if it
could be implemented (e.g. via Social Security schemes and what not), it is not
up to a "social planner" (i.e. government) to dictate it according to some
simple ethical criterion set up by economists. If anything, it is a political
decision, and will be the outcome of the political culture of a nation. And the
"social planner" envisaged in optimal growth theory is frighteningly
authoritarian.
However, these objections were largely overruled by the dirigiste spirit of the
times. The optimal savings question was first applied to Keynesian (Harrod-Domar)
growth models by Jan Tinbergen (1956, 1960) and Richard Goodwin
(1961). But the Neoclassical (Solow-Swan) growth model had recently become
available too. This model was particularly interesting because its steady-state
path was generally "consumption-inefficient". Since the rate of savings is one
of the critical parameters in determining the Solow-Swan steady-state, the
question of "what is the optimal savings rate?" emerged quite naturally.
In the early 1960s, numerous researchers independently examined the
question of optimal savings for the Neoclassical model. The answer seemed
simple: the optimal rate of savings will be that which makes the rate of return
on capital equal to the natural rate of population growth. This "Golden Rule" for
efficient growth, as it has been called, was set forth simultaneously by Edmund
S. Phelps (1961), Jacques Desrousseaux (1961), Maurice Allais (1962), Joan
Robinson (1962), Christian von Weizsäcker (1962) and Trevor Swan (1963).
The derivation of the Golden Rule did not employ Ramsey's old Benthamite
trappings of "social utility" and all that. These were, however, brought back into
prominence after the subtle but influential work of Tjalling Koopmans (1960).
Taking Ramsey's construction seriously (and piling on generous coats of
"ordinalist" polish), Koopmans made it hip to consider intertemporal social
welfare functions once again. Across the hallway in the two-sector growth
model world, Hirofumi Uzawa (1964) and T.N. Srinivasan (1964) demonstrated
how intertemporal optimality and growth theory could be combined fruitfully.
After some false starts and a flurry of activity, David Cass (1965), Tjalling
Koopmans (1965), Edmond Malinvaud (1965), James A. Mirrlees (1967), Karl
Shell (1967) and others finally pieced together the canonical one-sector
optimal growth model. Although this is sometimes (and erroneously) called the
"Ramsey" model, we prefer to refer to it by its other name, the "Cass-
Koopmans" optimal growth model.
Optimal growth theory began to recede in the 1970s for a variety of reasons.
Firstly, the inconsistencies in capital theory unearthed during the Cambridge
Controversy were a source of despair for growth theorists across the board and
the optimal growth theorists were not immune to it. Furthermore, economists
realized that the "social planner" did not really exist and developments in
microeconomic theory indicated that any appeal to "representative agents"
should be greeted with suspicion. But, above everything, it was the
"saddlepoint dynamics" of optimal growth models that made them seem inherently
inapplicable. There was no good economic reason to suppose that an economy
would "stumble" upon the optimal growth path. Consequently, as the 1970s
progressed, optimal growth models were discarded as ultimately inapplicable
constructions, however beautiful and utopian they may seem.
The tune changed in the 1980s with the rise of the rational expectations
revolution. Saddlepoint dynamics began being regarded as an asset rather than
a liability of a model. Specifically, rational expectations were precisely the
mechanism by which an economy would jump onto the stable arm of a
saddlepoint. Indeed, saddlepoints were necessary if one were to obtain a
precise solution to a model with rational expectations!
It was also during the 1980s that the "decentralization" argument began being
put forth more forcefully. As a result, optimal growth models stopped being
conceived of as "normative" exercises about the way the economy should
work, and started being regarded as a "positive" exercise about the way the
economy does work. Real business cycle theory, one of the principal
macroeconomic enterprises of the late 1980s and 1990s, built itself up
precisely on that premise.
Our brief survey of optimal growth theory concentrates almost exclusively on
its connection with one-sector, Neoclassical growth theory. We begin with
Ramsey's 1928 exercise and then jump some three decades to the 1960s and the
"Golden Rule" of growth. We then take a rather leisurely digression on
intertemporal social welfare functions and the ethical implications of time
preference. All this leads us to the Cass-Koopmans optimal growth model, the
version of optimal growth theory that is closest to Solow-Swan. Finally, we turn
to a brief discussion of turnpikes and the infamous "decentralization"
argument.
5.2. The Ramsey Exercise
As outlined in our introduction, Arthur C. Pigou's (1920) assertion that myopic
agents might "save too little" was taken up by the brilliant young Cambridge
philosopher, Frank P. Ramsey (1928) -- long before growth theory came into
its own. Ramsey's main concern was to determine the optimal rate of savings
and then show how myopic agents would not achieve that optimum.
But what is the optimal rate of savings? Ramsey's exercise was explicitly
grounded in Benthamite utilitarianism. He sought to find the allocation across
generations which maximized social welfare -- with "social welfare" defined as
the sum of utilities of people in a society. From the social welfare-maximizing
allocation, we will be able to determine an "optimal" rate (or rather, path) of
saving.
However, this comes up against the traditional Benthamite problem of defining
exactly who constitutes "society". This is particularly pertinent in a growth
context as the issue of balancing the interests of current and future
members of society is a critical ingredient. It is obvious that we can maximize
the social welfare of the current generation by having them simply consume all
their income, but then there would be no savings and thus no capital to
generate income for the next generation.
Like Arthur Pigou (1920), Frank Ramsey argued that "society" is composed of
everybody in every generation, current and future, and that they all should be
given equal weight in the social welfare function. Now, as we outline later, a
direct Benthamite sum of utilities yields the problem that this sum can be
infinite -- and infinities cannot be compared, and thus an optimum might not be
found. Adding a time discount factor would solve the problem, but Frank
Ramsey considered time discounting as "a practice which is ethically
indefensible and arises merely from the weakness of the imagination" (Ramsey,
1928). Time preference, as Pigou (1920: Pt I, Ch. 2) originally asserted, is a
personal weakness which should not be imported into a normative exercise.
(see our review of "intertemporal social welfare" for more details).
However, Ramsey recognized that by omitting time preference, the problem of
non-comparability of infinite sums emerged. In its stead, he introduced the
ingenious device of "bliss points". Specifically, he defined a social unwelfare
function of the following sort:
$$R = \sum_{t=0}^{\infty} [B - U(C_t)]$$
or, in continuous time:
$$R = \int_{0}^{\infty} [B - U(C_t)] \, dt$$
Fig. 1 - Utility Function with Bliss Point
Consequently, for a social optimum, we wish to find the consumption allocation
that minimizes R, subject to what is feasible. The capital accumulation equation:
dK/dt = F(K, L) - C
forms the "real economy" constraint which our society faces. Thus,
consumption paths must be feasible in this manner. Now, Ramsey (1928)
included the disutility of labor supply into his utility function, so letting U(C) be
the utility of consumption and V(L) the disutility of labor supply, Ramsey
combined them in an additively separable manner, i.e. u(C, L) = U(C) - V(L),
where U′ > 0, U′′ < 0, V′ > 0 and V′′ > 0. Consequently, defining B as
"bliss", we mean that [U(Ct) - V(L)] ≤ B for all acceptable Ct and thus, note,
U′(C) = 0 when U(C) - V(L) = B (i.e. at bliss, marginal utility of further
consumption is zero). The distance from bliss for any generation t is
[B - U(C(t)) + V(L(t))]. Ramsey then set out his social welfare problem in
minimization form as:
$$\min \int_{0}^{\infty} [B - U(C) + V(L)] \, dt$$
s.t.
dK/dt = F(K, L) - C
K(0) = K0
where K0 is the initial capital stock.
The problem can be thought of as follows. Social welfare is maximized if
every generation is "at bliss". However the society's initial capital stock K0 (and
thus the initial income) may be too low to achieve bliss levels of consumption
immediately. Consequently, we might be interested in achieving bliss as quickly
as possible by holding back consumption and producing a lot of capital today.
The more we save now, the sooner society will reach bliss. But this is a bit
unfair for the first few generations who must sacrifice their own consumption to
ensure quicker convergence to bliss. Consequently, the "cost" of achieving bliss
quicker is the consumption utility that the initial generations lose in their
sacrifice. Balancing quick convergence to bliss with the utility cost of foregone
consumption is the heart of the Ramsey problem.
The solution follows standard calculus of variations techniques -- a
mathematical tool which, incidentally, Ramsey (1928) was among the first to
introduce into economics. For simplicity of notation, let K′ = dK/dt and L′ =
dL/dt and let the integrand be denoted:
I = B - U(C) + V(L)
I = B - U[F(K, L) - K′ ] + V(L)
Defining IL = dI/dL and IL′ = dI/dL′ and equivalently for IK and IK′ , then:
IL = -U′ ·FL + V′
IL′ = 0
IK = -U′ ·FK
IK′ = U′
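Euler equations are the first-order necessary conditions for calculus of
variations problems. For our particular problem, we have the following pair of
Euler equations:
IL - d(IL′)/dt = 0
IK - d(IK′)/dt = 0
which hold for all t ≥ 0. As IL′ = 0, the first Euler equation reduces simply to:
V′ = U′·FL
while the second Euler equation becomes:
FK = -(dU′/dt)/U′ for all t
Moreover, as the integrand does not depend explicitly on t, the Euler equations
imply that I - K′IK′ - L′IL′ is constant along the optimal path, i.e.
[B - U(C) + V(L)] - K′U′ = α
where α is a constant.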
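But what is α? For this we need the transversality conditions. There are two of
them. The first is merely:
limt→∞ (I - L′IL′) = 0
and the second is:
limt→∞ (I - K′IK′) = 0
As I - K′IK′ = α along the optimal path, the latter implies:
limt→∞ α = 0
and, since α is a constant, α = 0 at all times. Thus:
[B - U(C) + V(L)] - K′U′ = 0
or, rearranging and, for simplicity, omitting the disutility of labor V(L):
K′ = [B - U(C)]/U′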
What does this mean? Intuitively, suppose that we decide to "speed up"
society's convergence to bliss. In other words, let us force the current
generation to save amount K′ so that one extra generation will be at bliss in the
future. The net gain of that future generation is B (its new, blissful utility) minus
U(C) (the utility it would have otherwise), i.e. B - U(C). As this is positive, it
might always seem worthwhile to do so. But the cost of "speeding up", as we
stipulated, was the consumption foregone by the initial generation. If they save
amount K′ , then U′ ·K′ is the cost of speeding up convergence by one
generation. Optimally, then, the marginal (net) benefit of speeding up, B - U(C),
must be equal to the marginal cost of doing so, U′ ·K′ . That is the Keynes-
Ramsey rule.
This can be illustrated heuristically by appealing to Figure 2. We have depicted
two utility paths I and II, both of which achieve bliss B after some time. Path I
(solid line) converges to bliss at time T, but Path II (dashed line) converges to
bliss one period earlier, at T-1. But notice that Path I also starts at a higher
consumption and thus utility than Path II, U(C_0^I) > U(C_0^II). Path II converges to
bliss faster than Path I, but it sacrifices the utility of the early generations.
Thus, the "gain" of using Path II is the speeded-up convergence to bliss and
thus the utility gains of the generations around T-1 (captured by the lightly
shaded area in Figure 2), while the "loss" of using Path II is the foregone higher
utility of the earlier generations (captured by the darkly-shaded area in Figure
2). The Keynes-Ramsey rule asserts merely that the optimal path will be that
where the marginal gain of speeding up is equal to the marginal cost of doing
so.
[Note: the prefix "Keynes" is added to this result because of Ramsey's gracious
acknowledgment to John Maynard Keynes for coming up with the interpretation
we just provided].
this Ramsey problem does not fulfill the traditional transversality condition of
infinite-horizon problems, limt→ ∞ λ (t) = 0. This is not quite right, however.]
There are a few interesting features worth noting. Firstly, note that the Keynes-
Ramsey rule is independent of the pricing of factors, specifically it is
independent of the marginal product of capital (as Ramsey (1928) was quick to
emphasize). Intuitively, all we need to know for the rule is the initial capital
stock, K0, the shape of the utility function, U(C), and the bliss level, B, and we
can derive the optimal consumption path. To see this, examine Figure 3, where
we have redrawn our utility function. As K′ = Y - C, then the Keynes-Ramsey
rule becomes
Y - C = [B - U(C)]/U′
So, for a given Y, we can deduce the optimal consumption C* by finding the
level of consumption for which this is true. This is shown in Figure 3.
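As a minimal numerical sketch of this step, suppose (purely as an illustrative assumption, not a form taken from Ramsey's text) that utility is U(C) = B - C^(-a) with a > 0, so that bliss B is approached asymptotically and U′ → 0 as C grows, and that the disutility of labor is ignored. Then [B - U(C)]/U′ = C/a, so the rule Y - C = [B - U(C)]/U′ has the closed-form solution C* = aY/(1 + a). The snippet below solves the rule by bisection for hypothetical values of Y and a and compares the result with that closed form.

```python
def utility_gap(c, a):
    """Distance from bliss, B - U(C), for the assumed U(C) = B - C**(-a)."""
    return c ** (-a)


def marginal_utility(c, a):
    """U'(C) = a * C**(-a - 1) for the assumed utility function."""
    return a * c ** (-a - 1)


def keynes_ramsey_residual(c, y, a):
    """Saving (Y - C) minus [B - U(C)] / U'(C); zero where the rule holds."""
    return (y - c) - utility_gap(c, a) / marginal_utility(c, a)


def optimal_consumption(y, a, tol=1e-12):
    """Solve the Keynes-Ramsey rule for C by bisection on (0, Y)."""
    lo, hi = 1e-9 * y, y
    # The residual is positive for C near 0 and negative at C = Y.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if keynes_ramsey_residual(mid, y, a) > 0:
            lo = mid   # saving still exceeds the bliss gap term: raise C
        else:
            hi = mid
    return 0.5 * (lo + hi)


y, a = 100.0, 2.0              # hypothetical income and curvature parameter
c_star = optimal_consumption(y, a)
closed_form = a * y / (1 + a)  # C* = aY / (1 + a) under the assumed utility
print(c_star, closed_form)
```

Note that the production function never enters the calculation: given Y, only the shape of U(C) and the bliss level matter, which is the independence property stressed above.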
dU′ /dt = -FK·U′
dK/dt = F(K) - C
so above the dU′ /dt = 0 isokine, U′ has a tendency to fall, whereas below the
isokine, U′ will rise. Thus, the vertical directional arrows in Figure 4.
sloping dK/dt = 0 isokine in Figure 4. The disequilibrium dynamics for the dK/dt
= 0 isokine can be deciphered by noting that:
d(dK/dt)/dK = FK > 0
so that to the right of the dK/dt = 0 isokine, K rises, while to the left of it, K
falls. Thus the horizontal directional arrows in Figure 4.
Together, the isokines dK/dt = 0 and dU′ /dt = 0 have underlying dynamics
which resemble a "saddlepoint": there is a single stable arm going to the
steady-state equilibrium (0, KB), but all other paths move away from it. Notice
that the steady-state (U′ , K) = (0, KB) is "bliss" as U′ = 0, so KB is the level of
capital corresponding to bliss. Saddlepoint dynamics imply that, beginning at
some initial capital level K0, we can define a unique level of marginal utility, U′0,
which puts us on the stable arm (and this chosen U′0 corresponds to choosing
an optimal initial consumption, C0, as we had earlier in Figure 3). As time
progresses, and we continue choosing consumption rates and thus utilities so
that we stay on the stable arm, we move towards the steady-state (0, KB), so
that K → KB and U′ → 0, i.e. we are approaching bliss asymptotically.
The Ramsey (1928) exercise remained dormant for the next quarter-century. It
was revived in the 1950s and 1960s, when the emergence of growth theory
resurrected some of the questions Ramsey had asked. The first and simplest
result was the determination of the "Golden Rule" of growth.
5.3. Golden Rule Growth
the ideal investment policy reduces to finding the best value of s,
the fixed investment ratio." "It's fair," Solovians all said. The King
agreed. So he established a prize for discovery of the optimum
investment ratio."
to this problem the "Golden Rule" of growth. This is in reference to the old
Biblical adage to "do unto others as you would have them do unto you" --
where the "others", in this case, are the future generations of society.
Obviously, if a society could choose a savings rate that maximized its own
consumption, it would save nothing and consume everything. But that would
leave future generations in the lurch as no capital would have been built to
enhance future output and consumption. If, conversely, the current generation
saved so much that future generations would in fact be better off than the
current, then we are also violating the "Golden Rule" as we are not doing unto
ourselves what we have done for posterity. Thus, the "Golden Rule" condition is
that the collectively-chosen or policy-imposed savings propensity is such that
future generations can enjoy the same level of consumption per capita as the
initial one.
Mathematically, finding the conditions for "Golden Rule" growth translates itself
into finding the saving propensity that maximizes consumption per capita
which is consistent with steady-state growth. The procedure is simple. Recall
that consumption per capita is merely the difference between output per capita
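and investment/savings per capita, i.e. c = y - sy, or:
c = ƒ(k) - sƒ(k)
In steady state, actual investment per capita equals required investment per
capita, sƒ(k) = nk, so the problem becomes:
max c = ƒ(k) - nk
which yields the first-order condition:
dc/dk = ƒk - n = 0
In other words, we are at the Golden Rule when the steady state capital-labor
ratio, k*, is such that the marginal product of capital is equal to the natural
growth rate, ƒk = n.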
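To make the condition concrete, here is a small sketch that assumes a Cobb-Douglas intensive production function ƒ(k) = k^α. The functional form and the parameter values α = 0.3 and n = 0.02 are illustrative assumptions only. It solves ƒk = n by bisection and compares the answer with the closed form k = (α/n)^(1/(1-α)) that this particular production function admits.

```python
def f(k, alpha):
    """Assumed intensive production function, f(k) = k**alpha."""
    return k ** alpha


def f_prime(k, alpha):
    """Marginal product of capital for the assumed f(k)."""
    return alpha * k ** (alpha - 1.0)


def golden_rule_k(alpha, n, lo=1e-9, hi=1e9, iters=200):
    """Bisection on f'(k) - n, which is strictly decreasing in k."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f_prime(mid, alpha) > n:
            lo = mid   # marginal product still above n: k is too low
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha, n = 0.3, 0.02                          # hypothetical parameters
k_gold = golden_rule_k(alpha, n)
k_closed = (alpha / n) ** (1.0 / (1.0 - alpha))
c_gold = f(k_gold, alpha) - n * k_gold        # steady-state consumption
s_gold = n * k_gold / f(k_gold, alpha)        # savings rate supporting k_gold
print(k_gold, k_closed, c_gold, s_gold)
```

Under this assumed production function, the savings rate that supports the Golden Rule capital-labor ratio works out to s = α, i.e. capital's share of output.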
Diagrammatically, we can see this immediately (Figure 1). Remember that our
choice variable is the savings rate, s, thus the actual investment function i = sƒ
(k) is not imposed upon us but can be chosen. In Figure 1, we see two savings
rates, s1 and s2, yielding two different steady-state capital-labor ratios, k1* and
k2*. Which is better? Our criterion is to maximize consumption per capita at the
steady-state, thus we seek to compare c1* and c2*. Diagrammatically, c1* > c2*
so obviously choosing the savings rate s1 is superior to choosing s2.
<Fig. 1 - Golden Rule Growth>
We know this is true because maximum consumption will be where the
difference between the intensive production function y = ƒ (k) and required
investment per capita line, ir = nk, is greatest. Thus, the Golden Rule exercise is
to choose s such that the steady-state k* will be such that these two curves are
at their greatest difference. This can be found simply by placing a line parallel
to the ir = nk line at a tangency with the y = ƒ (k) curve in Figure 1. In terms of
Figure 1, this is at k1*, the steady-state capital-labor ratio associated with the
savings rate s1. Any other savings rate, even those that yield higher output per
capita (like s2), nonetheless yield a lower consumption per capita. Notice that
as the slope of ir is equal to the slope of ƒ (k) at the Golden Rule capital-labor
ratio, k1*, then ƒ k = n.
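The comparison of savings rates in the diagram can be mimicked numerically. The sketch below again assumes ƒ(k) = k^α with hypothetical values α = 0.3 and n = 0.02, for which the steady state of sƒ(k) = nk is k* = (s/n)^(1/(1-α)). It scans a grid of savings rates and compares two of them, playing the roles of s1 and s2; the specific rates 0.30 and 0.50 are our own choices.

```python
def steady_state(s, alpha, n):
    """Steady state of s*f(k) = n*k for the assumed f(k) = k**alpha."""
    k = (s / n) ** (1.0 / (1.0 - alpha))
    y = k ** alpha          # output per capita
    c = (1.0 - s) * y       # consumption per capita
    return k, y, c


alpha, n = 0.3, 0.02                                # hypothetical parameters
rates = [i / 100 for i in range(1, 100)]            # candidate savings rates
best_s = max(rates, key=lambda s: steady_state(s, alpha, n)[2])

k1, y1, c1 = steady_state(0.30, alpha, n)           # plays the role of s1
k2, y2, c2 = steady_state(0.50, alpha, n)           # plays the role of s2
print(best_s)
print(y2 > y1, c1 > c2)
```

The higher savings rate yields more capital and more output per capita in steady state, yet less consumption per capita, which is the point of the diagram.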
If we interpret ƒ k as the rate of return on capital, then we see that the "Golden
Rule" condition ƒ k = n is quite familiar. We encountered it, for instance, in the
growth model of John von Neumann (1937). Joan Robinson (1962) referred to
this as the "Neo-neoclassical Theorem".
Although the Golden Rule of growth is simple to derive, it avoids some of the
more intricate questions of the Ramsey exercise. Firstly, note, we are
determining the optimal rate by choosing between Solowian steady-states, not
the optimal rate from any initial position. Secondly, it is not clear that the
Golden Rule of growth is "socially optimal" in a wider sense. We were
concerned with maximizing steady-state consumption per person in every
generation. But, in economics, a person's welfare is attached not to the
quantity that he consumes but rather the utility that he attains. How might
the solution be different if we attempted an explicit utilitarian exercise? This
was what the Ramsey exercise was aiming at.
The marriage of the Solow-Swan growth model and Benthamite utilitarianism
was accomplished by David Cass (1965) and Tjalling C. Koopmans (1965) --
what has become known as the Cass-Koopmans optimal growth model. But a
lot of philosophical groundwork on the meaning and construction of
intertemporal social welfare had to be done beforehand.
5.4. Intertemporal Social Welfare
geared to address Arthur Pigou's (1920: Part I, Ch. 2) concerns about the
implications for society's savings of the personal "irrationality" of discounting
future utility. Effectively, Ramsey demonstrated that an economy in which
people have positive time preference will save far below what is optimal --
exactly as Pigou predicted.
The practical implications are self-evident: if time preference does lead people
to "save too little", perhaps the government should step in and "force" them to
save. As Pigou concluded, "there is wide agreement that the State should
protect the interests of the future in some degree against the effects of our
irrational discounting and of our preferences for ourselves over our
descendants." (Pigou, 1920: p.29). Public pension systems such as the
American "Social Security" program were designed precisely with this goal in
mind.
But Ramsey's (1928) "proof" that a society composed of people with positive
time preference saved too little depended heavily on how he determined the
"optimal" level of savings. Ramsey determined this with perfectly Benthamite
instincts: he looked for the optimal allocation of consumption across
generations which maximized intertemporal "social welfare", defined as the
sum of individual utilities across generations. This has led to much debate and
what follows is a rather lengthy, but by no means deep, digression on
"intertemporal" social welfare functions, with particular attention paid to the
concept of "time preference".
$$S = \sum_{t=0}^{\infty} \left( \sum_{h=1}^{H} u_t^h(c_t^h) \right)$$
where $c_t^h$ is the consumption of person h of generation t and $u_t^h(\cdot)$ is his
utility function. Thus we are defining social utility as the (unweighted) sum of
utilities of all people, current and future.
Let us make the traditional Benthamite "equal capacity for pleasure"
assumption, so that all people, across generations and within generations,
possess the same utility function, i.e. $u(\cdot) = u_t^h(\cdot)$ for all h = 1, 2, ..., H and
t = 0, 1, 2, .... This then reduces the social welfare function to
$S = \sum_{t=0}^{\infty} \left( \sum_{h=1}^{H} u(c_t^h) \right)$.
Furthermore, to get rid of the problem of allocation within a generation (or,
given our assumption of "equality"), we can also assume that every household
at time t gets the same consumption, i.e. $c_t^h = c_t$ for all h = 1, 2, ..., H, so that
our social welfare function is further reduced to $S = \sum_{t=0}^{\infty} H \cdot u(c_t)$. Now, as
$c_t$ is consumption per person, then we can define $C_t = H \cdot c_t$ as the aggregate
consumption of the generation at t and define U(·) as the "aggregate" utility of
that generation, so $U(C_t) = H \cdot u(c_t)$. After all these maneuvers, we end up with a
new social welfare function:
$$S = \sum_{t=0}^{\infty} U(C_t)$$
consumption bundles, i.e. C = {C0, C1, C2, ...}. An example is shown in Figure 1.
The "social optimum" is the allocation or sequence of consumption bundles
that maximizes the social welfare function S.
$$\sum_{t=0}^{T} U(C_t) > \sum_{t=0}^{T} U(C_t')$$
In which case we will be tempted to argue that, at least up to T, the path C is
socially better than the path C′ . Now, if we can prove that this continues to
hold true if we extend the end-period T to T+1, T+2, T+3,... etc., then we can
actually come around to concluding that C is a socially better consumption
allocation than C′ , even though both are infinitely long. Even if we cannot
compare both infinite sums directly, we can compare their finite equivalents,
and then approximate the infinite case by gradually increasing the horizon.
Formally, by the "overtaking criterion", path C is said to be "better" than C′ if
there is a time period T* such that for all T ≥ T*,
$\sum_{t=0}^{T} U(C_t) > \sum_{t=0}^{T} U(C_t')$. We say a path C* is "socially optimal" if
there is a T* such that, for all T ≥ T*,
$\sum_{t=0}^{T} U(C_t^*) \geq \sum_{t=0}^{T} U(C_t)$ for all other feasible paths C.
However, the overtaking criterion does not solve all our problems. The issue of
possible non-comparability of paths continues to lurk: it is quite
possible that there is no T* such that the inequality holds for all T ≥ T*. For
instance, suppose consumption path C yields the utility stream {1, 0, 2, 0, 3, 0,
...} and path C′ yields utilities {0, 2, 0, 3, 0, 4, ...}. These are not comparable
by the overtaking criterion: if the horizon covers an odd number of periods
(T = 0, 2, 4, ...), then C is better than C′; but if it covers an even number of
periods (T = 1, 3, 5, ...), then C′ is better than C. Thus, there is no T* for
which one path will be consistently better than another for all T after T*
(cf. Koopmans, 1965).
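The alternation in this example is easy to check directly. The following sketch builds the two utility streams, with t = 0 as the first period, and prints the difference between their truncated welfare sums at each horizon T; the number of periods shown is an arbitrary choice.

```python
from itertools import accumulate


def stream_c(periods):
    """Utilities 1, 0, 2, 0, 3, 0, ... for t = 0, 1, 2, ... (path C)."""
    return [t // 2 + 1 if t % 2 == 0 else 0 for t in range(periods)]


def stream_c_prime(periods):
    """Utilities 0, 2, 0, 3, 0, 4, ... for t = 0, 1, 2, ... (path C')."""
    return [0 if t % 2 == 0 else t // 2 + 2 for t in range(periods)]


periods = 12
sums_c = list(accumulate(stream_c(periods)))          # truncated sums for C
sums_cp = list(accumulate(stream_c_prime(periods)))   # truncated sums for C'

# Difference of truncated welfare sums at each horizon T = 0, 1, ..., periods - 1.
# A positive entry means C is ahead at that horizon, a negative one that C' is.
gaps = [a - b for a, b in zip(sums_c, sums_cp)]
print(gaps)
```

The sign of the gap flips at every horizon, so neither path ever overtakes the other for good.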
The overtaking criterion only permits a partial ordering over consumption
paths. However, a partial ordering might be enough for most purposes. Put
more precisely, as our example has shown, the overtaking criterion may find us
an "optimal" path C* in the sense that it cannot be bettered by another path,
but that does not imply that C* is itself better than every other feasible path.
[Note: A variation on this theme is the "agreeable criterion" proposed by Peter
J. Hammond and James A. Mirrlees (1973). Loosely, given two infinite-horizon
paths, C and C′ , if we can agree that whatever happens in these paths after a
particular time period T is "inconsequential" to us, then we can order these
paths according to their truncated values. A survey of criteria can be found in
McKenzie (1986).]
Partial ordering only does a partial job -- and when considering issues like
"social optimality", that may not be good enough. We want a complete
ordering. Tjalling Koopmans's (1965) suggestion was to introduce the notion of
utility discounting -- what has become known as "time preference". By this he
meant that the sum of social utility should be weighted so that earlier
generations are "more socially valuable" than later generations.
Specifically, the social welfare function could be rewritten in the form first
suggested by Paul Samuelson (1937):
$$S = \sum_{t=0}^{\infty} \beta^t U(C_t)$$
where 0 < β < 1 is the time discount factor. The further away a generation is,
the less its utility matters for social utility. This maneuver yields the result that
all consumption allocations over an infinite horizon will yield finite sums for S
(at least as long as the period utilities are bounded), i.e. S < ∞ for all
possible paths C. With time discounting, then, all paths become
arithmetically comparable and we can thus find a social optimum simply by
comparing the sums.
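As a rough illustration, the sketch below applies the discounted sum to the two
alternating utility streams {1, 0, 2, 0, 3, 0, ...} and {0, 2, 0, 3, 0, 4, ...}
that the overtaking criterion could not rank. The two discount factors and the
400-period truncation are assumptions made purely for the example.

```python
def discounted_sum(utilities, beta):
    """Discounted sum of beta**t * U_t over the listed periods, t = 0, 1, 2, ..."""
    return sum(beta ** t * u for t, u in enumerate(utilities))

def alternating_streams(periods):
    """First `periods` utilities of C = 1, 0, 2, 0, ... and C' = 0, 2, 0, 3, ..."""
    c = [t // 2 + 1 if t % 2 == 0 else 0 for t in range(periods)]
    c_prime = [0 if t % 2 == 0 else t // 2 + 2 for t in range(periods)]
    return c, c_prime

# Truncating at 400 periods is an approximation; the neglected tail is tiny
# for the discount factors used here.
c, c_prime = alternating_streams(400)

results = {}
for beta in (0.5, 0.9):
    results[beta] = (discounted_sum(c, beta), discounted_sum(c_prime, beta))
    s_c, s_c_prime = results[beta]
    better = "C" if s_c > s_c_prime else "C'"
    print(f"beta = {beta}: S(C) = {s_c:.4f}, S(C') = {s_c_prime:.4f}, preferred: {better}")
```

Both sums are finite, so the paths can now be ranked. Note, however, that the
ranking itself depends on the discount factor chosen: with heavy discounting
the early unit of utility in C dominates, while with light discounting C′ comes
out ahead.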
However, following Pigou, Frank Ramsey (1928) considered time discounting as
"a practice which is ethically indefensible and arises merely from the weakness
of the imagination". But Ramsey recognized that by omitting
time discounting, the problem of non-comparability of infinite sums emerged.
In its stead, as we have seen, he introduced the reverse social welfare function
in the following manner:
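$$R = \sum_{t=0}^{\infty} (B - U(C_t))$$
where B is the "bliss" level of utility, so that path C is ranked above path C′
if:
$$\sum_{t=0}^{\infty} (B - U(C_t)) < \sum_{t=0}^{\infty} (B - U(C'_t))$$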
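A small numerical sketch may help show why summing shortfalls from B can work
where summing utilities cannot. The two utility paths below are hypothetical
(they approach B at rate 1/(t+1)² by assumption), as are the value of B and the
truncation horizon.

```python
import math

def total_shortfall(utilities, bliss):
    """Sum of (B - U_t) over the listed periods."""
    return sum(bliss - u for u in utilities)

bliss = 10.0        # hypothetical value of B
horizon = 100_000   # truncation used to approximate the infinite sum

# Hypothetical paths whose utility approaches B at different speeds.
fast = [bliss - 1.0 / (t + 1) ** 2 for t in range(horizon)]
slow = [bliss - 2.0 / (t + 1) ** 2 for t in range(horizon)]

r_fast = total_shortfall(fast, bliss)
r_slow = total_shortfall(slow, bliss)

print(f"undiscounted utility sums: {sum(fast):.1f} vs {sum(slow):.1f} (grow without bound)")
print(f"shortfall sums: {r_fast:.4f} vs {r_slow:.4f} (settle down)")
print(f"limit of the first shortfall sum: {math.pi ** 2 / 6:.4f}")
```

The raw utility sums keep growing with the horizon, but the shortfall sums
converge, and the path with the smaller total shortfall is the better one.
This only works for paths that approach B quickly enough.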
Table 1 - Intertemporal Social Welfare Functions
Including the time discount rate into our social welfare function would
certainly make things easier. But how are we to justify it? If we follow
Pigou, Ramsey and company in their reasoning, there seems to be no
"ethical" justification for putting a utility discount into the social welfare
function. How might this be disputed?
5.4.2.1. The Tastes Defense
But there are articulate defenses in the opposite direction as well (e.g. Ludwig
von Mises, 1949: Ch. 18). One could say that positive time preference is just
that: a preference and not a personal weakness or defect that ought to be
"corrected". It need not be justified, it just "is" and de gustibus non est
disputandum (we cannot quarrel over tastes). In this view, for someone to say
it is an "irrational preference", as Pigou (1920: p.25) did, is oxymoronic.
So the simplest defense for including a discount factor in the social welfare
function is that, well, people have positive time-preference -- and a preference
is a preference is a preference. It should be respected by the social planner.
Removing time preference from the social welfare function, far from being
"ethical", can in fact be deemed unduly authoritarian as it disregards people's
tastes. Arguments to this end were forwarded by Peter T. Bauer (1957) and
Otto Eckstein (1957).
Yet this is not a perfect argument, for if we are going to stick to the "preferences"
argument, then should not the preferences of future generations be taken into
account? If their opinion had any bearing on the present, then it would be
precisely to discount the utility of the earlier generations. Clearly, we are at an
impasse.
Of course, all this is wordplay. As Maurice Allais (1947: Ch. 4) and Jan de V.
Graaff (1957: p.103) note, the optimal level of savings is a political and ethical
question, for which the market's solution is only one among many. Bauer
(1957) would probably agree -- but would cast his vote in favor of non-
interference nonetheless.
But if we agree to this, then are we not conceding too much to posterity? If we
are to explicitly consider the question as a political one, one must wonder
whether future generations should have any claim at all! In effect, only living
members are involved in the political process and those are, effectively, the
only ones social welfare functions should respect. As Stephen Marglin explains:
Benthamite, game, in which individuals are assumed to have well-
defined preferences that are identical to their utilities, I want to
play the rest of the bourgeois-democratic game of philosophical
liberalism as well: in particular, I want the government's social
welfare function to reflect only the preferences of present
individuals. Whatever else a democratic society may or may not
imply, I consider it axiomatic that a democratic government
reflects only the preferences of the individuals who are presently
members of the body politic." (Marglin, 1963).
And why not? After all, we cannot second-guess the desires of people who are
not born yet and perhaps we should not even try. Who is to argue on their
behalf and should we believe them? Why should we make room in our polity for
current political representatives for future generations that do not exist? To do
so might be as undemocratic as, say, allowing clergymen a dominant position
in current political affairs because they are the "representatives" of
supernatural beings and human afterlife -- concepts which are no more vague
and speculative than "future generations". Of course, the clergy have had such
power in the past, but it is clearly not part of the modern "bourgeois-
democratic" conception of political life.
If we were to agree that only "present members of the body politic" should
count, this might seem to turn the balance towards the Eckstein-Bauer corner
of the debate, restoring the ethical legitimacy of positive time preference. But
perhaps future generations do have political representatives in the present --
namely, that the living individuals themselves are their advocates, however
imperfect. In other words, current people do have "social tastes" which
incorporate the interests of future generations. They actually want the social
welfare function to reflect these.
A strict behaviorist would contend that this is nonsense. If people's tastes
incorporate this advocacy for future generations, their behavior should reflect
this. People's high time preference rate demonstrates that they do not really
care much about them -- and that is the only accurate measure of their
concern.
But what if, contends the opposite camp, living individuals have a discrepancy
between their "personal tastes" and "social tastes"? Might people really want
zero (or at least low) discount rates for society, even while possessing a
personal high time preference rate? This does not necessarily dismiss the
behavioral argument, but rather sharpens it by dividing people's behavior into
two: people's political choices (e.g. voting for recycling and environmental
protection laws) reflect their "social tastes", but people's personal economic
choices (e.g. how much recycling they themselves do) reflect their "personal
tastes". Any behaviorist would be forced to admit that, indeed, people's
political behavior usually does not match their personal behavior.
But should not these two types of tastes be consistent with each other? Not
necessarily. As outlined by Stephen Marglin (1963), there are at least two ways
to argue this. The first argument, credited to Gerhard Colm, is simply that the
"frame of reference" is different in personal and social considerations. People
wear two hats on the relevant time discount rates. Individuals may have
defective telescopic faculties when making decisions about what they want to
save individually, but when asked what "society" should save, the individual
might recommend a different (i.e. a much lower) discount rate. Again, think of
the penchant for Westerners to individually generate enormous amounts of
waste while condemning Western wastefulness at the same time.
The second argument, articulated by William J. Baumol (1952) and Amartya
Sen (1961), is only subtly different in that it emphasizes the "free-rider"
aspects of the problem. Specifically, people will vote for policies with low
discount rates (e.g. municipal recycling programs) in the expectation that
others will comply with them, while personally they will perform actions which
reflect their personal high discount rate (e.g. not bother to recycle their own
garbage). They might not feel they are being "inconsistent" in their personal
and social tastes because they expect others to comply with the laws that they
have voted for.
Both these cases reinforce Pigou's arguments for disregarding personal
discounting. People's social tastes are for zero or very low discounting and this
is what should be included in the intertemporal social welfare function. Their
personal taste for high discounting, as revealed by their individual actions,
should not be considered sufficient justification for its inclusion. Viewed through
this prism, the Pigou-Ramsey social welfare function is not authoritarian at all,
but complies with what people "really" want when they are in a "social" frame
of mind or when they are voting.
However, this puts us right back in our dilemma. The "tastes" defense does not
seem, on its own, to be capable of justifying the inclusion of positive time
preference in our social welfare function. Pigou, Ramsey, Harrod and company
would be overjoyed.
$$S = \sum_{h=1}^{H} u_0^h(c_0^h)$$
where $u_0^h(\cdot)$ and $c_0^h$ are the utility function and consumption plan of
the hth living household, and H is the total number of households alive at the
initial time
period t = 0. Thus, backing away from Pigou and Ramsey and moving towards
Marglin, we now have "society" defined merely as the living individuals and not
future ones.
Although Marglin's argument would imply future generations have no
"legitimate claims" on the current generation, that does not mean that they
cannot have "emotional claims". The trick is to take this to the extreme and
argue that currently living individuals have "dynastic" utility functions. By this
we mean that a living individual is altruistic towards his "dynasty", i.e. his
utility takes into account the utility of his progeny. Thus, the future is
reintroduced into the story not because it is "ethical" to do so, but merely
because that is what current living individuals do anyway.
The implications of this become interesting. The utility of the current
generation depends upon not only their own consumption but also on the utility
of their children. But, by the same logic, the utility of their children depends, in
turn, on the utility of their children and so on ad infinitum through the ensuing
dynasty. To see this clearly, let h denote the "dynasty" stemming from the
living agent h and suppose that there is only one child per adult (no population
growth). Then we can stipulate that $u_0^h = u_0^h(c_0^h, u_1^h)$, so the utility
of the household h living at t = 0 depends on the utility of their direct
descendant, the household h that is living at t = 1. As $u_1^h$ itself is a
function of consumption at t = 1 and the utility of their progeny (generation h
at t = 2), then $u_1^h = u_1^h(c_1^h, u_2^h)$, which we can plug back into the
original generation's utility so that $u_0^h = u_0^h(c_0^h, u_1^h(c_1^h, u_2^h))$.
Iterating further, the utility of generation t = 2 of dynasty h is a function of
their consumption ($c_2^h$) and the utility of their progeny, $u_3^h$, i.e.
$u_2^h = u_2^h(c_2^h, u_3^h)$. We can proceed in this manner for all future
generations of dynasty h. Thus, recursing all the utilities of a dynasty into
themselves, the utility of household h at the initial time period t = 0 is the
stream of utilities achieved by the entire ensuing dynasty in the future, i.e.
$$u_0^h = u^h(c_0^h) + \beta u^h(c_1^h) + \beta^2 u^h(c_2^h) + \cdots$$
or simply:
$$u_0^h = \sum_{t=0}^{\infty} \beta^t u^h(c_t^h)$$
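The recursion can be checked numerically. The sketch below assumes that each
generation's utility takes the additive form u_t = u(c_t) + β·u_{t+1}, uses log
utility, and truncates the dynasty after five generations; all three are
illustrative assumptions of ours, as are the function names and numbers.

```python
import math

def dynastic_utility(consumption, beta, u=math.log):
    """Unroll u_t = u(c_t) + beta * u_{t+1} backwards from the last generation."""
    value = 0.0  # assumption: nothing is valued after the last listed generation
    for c in reversed(consumption):
        value = u(c) + beta * value
    return value

def discounted_sum(consumption, beta, u=math.log):
    """Direct evaluation of the sum over t of beta**t * u(c_t)."""
    return sum(beta ** t * u(c) for t, c in enumerate(consumption))

plan = [1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical consumption of successive generations
beta = 0.95                       # hypothetical weight placed on the child's utility

recursive = dynastic_utility(plan, beta)
direct = discounted_sum(plan, beta)
print(f"recursive: {recursive:.6f}, direct sum: {direct:.6f}")
```

The two numbers coincide (up to rounding): nesting each generation's utility
inside its parent's is the same thing as discounting the dynasty's stream of
period utilities at the factor β.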
The rest is simplicity itself. Taking our social welfare function, as Marglin (1963)
suggests, over currently living people alone, we see that:
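$$S = \sum_{h=1}^{H} u_0^h = \sum_{h=1}^{H} \sum_{t=0}^{\infty} \beta^t u^h(c_t^h)$$
or, reversing the order of summation:
$$S = \sum_{t=0}^{\infty} \sum_{h=1}^{H} \beta^t u^h(c_t^h)$$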
Finally, assuming that all households within a generation have the same
capacity for pleasure, then $u^h(\cdot) = u(\cdot)$ for all h = 1, 2, ..., H, and
therefore (for Benthamite fairness) they receive the same contemporaneous
consumption allocation, $c_t^h = c_t$ for all h = 1, 2, ..., H, so our social
welfare function becomes $S = \sum_{t=0}^{\infty} H \beta^t u(c_t)$. Letting
$H \cdot u(c_t) = U(C_t)$, where U and $C_t$ are the aggregate utility and
aggregate consumption of generation t, we obtain:
$$S = \sum_{t=0}^{\infty} \beta^t U(C_t)$$
or, in continuous time:
$$S = \int_0^{\infty} U(C_t) e^{-\rho t} \, dt$$
where ρ is the rate of time preference. These are exactly the Samuelson social
welfare functions we were hoping for earlier.
In sum, from the dynastic perspective, the burden of taking the utility of future
generations into account is shouldered by currently-living individuals rather
than the social planner. But since the social planner takes the utility of
currently-living individuals into account, and because these take the utility of
their descendants into account, we can regard the resulting social welfare
function with intertemporal discounting, $S = \int_0^{\infty} U(C_t) e^{-\rho t} dt$,
to be "ethically defensible".
Including time preference into the social welfare function does not imply that
our social planner is a moral desperado, but merely that our households have
"defective telescopic faculties".
Although the decentralization argument stretches credulity to an enormous
degree, it is the most widely accepted argument today for including time
preference. It has interesting implications for it has modified the nature and
significance of optimal growth theory.
Notice there was a second purpose to Koopmans's construction: namely,
the restoration of ordinality into the intertemporal utility construction.
Obviously, if we take the Samuelson function, $\sum_{t=0}^{\infty} \beta^t U(C_t)$,
as our social welfare function, then it is a cardinal function: the particular
numerical values we give our realized generational utilities, $U(C_t)$,
$U(C_{t+1})$, etc. matter very much as they will be subsequently added up. But
like von Neumann-Morgenstern, Koopmans was aiming at an "ordinal utility which
is cardinal". The real and only utility function in the Koopmans world is
S(·) and it is defined as a representation of social planner's preferences
over infinite-horizon consumption paths. This utility function S(·) is
ordinal, i.e. the numerical values we assign to S(C), S(C′ ), etc. do not
matter, as long as the preference ordering over paths is maintained. So,
in this way, Koopmans makes the whole intertemporal utility exercise
"acceptable" to radical Paretians.
Koopmans (1960) proceeded as follows. Let C denote an entire infinite-horizon
consumption path and $C_t$ denote the tth element of that path, so
$C = \{C_1, C_2, C_3, ..., C_t, ...\}$. We will let the term ${}_2C$ denote the
path C excluding the first entry, i.e. ${}_2C = \{C_2, C_3, C_4, ..., C_t, ...\}$.
Thus, the path C can be written as $C = \{C_1, {}_2C\}$.
Let ∆ denote the "commodity space", in this case the set of infinite-
horizon consumption paths, so C ∈ ∆ . It is assumed that the social
planner's preferences can be captured by a nice social utility function S:
∆ → R, where if S(C) ≥ S(C′ ), then the social planner prefers path C to
path C′ . Koopmans then imposes the following axioms (he calls them
"postulates") on S(·):
(P.5) Boundedness: There exist paths Ca and Cb such that
S(Ca) ≤ S(C) ≤ S(Cb) for all paths C ∈ ∆.
Note that axioms (P.1)-(P.5) are assumed directly on the social planner's
utility function S: ∆ → R -- which is assumed to exist and somehow
represent the planner's "preferences" over intertemporal allocations. Of
course, we should say something about whether the social utility
function S represents these preferences, and how these utility postulates
might be connected to other, more primitive axioms on preferences.
Koopmans (1972) forges this connection, but we shall skip it here.
The meaning of the axioms can be briefly explained. Postulate (P.1) is a
simple continuity axiom: any slight variation in the path does not lead to
drastic changes in the social planner's utility. It is expressed in this funny
way for technical reasons. Axiom (P.2) means that every generation
counts, i.e. if two paths are identical in every respect except for the
first period, where they differ drastically, then the social
welfare of those paths will be different. Postulate (P.3) is a kind of
independence axiom. Specifically, it argues that if two paths are
identical except in one period, then it does not matter what the identical
part looks like when comparing them. There is non-complementarity
between periods in the sense that the social planner's preference over
what generation t achieves is independent of his preferences over what
generation t+1 achieves.
Axiom (P.4), stationarity, is the most famous and most debatable one in the
Koopmans array. What it says explicitly is that preferences between two
events remain the same, even if we push these events forward. For
instance, consider the baseline consumption path (1, 1, 1, 1, ...), so
there is steady consumption of one unit of the consumption good every
period. Now suppose that the following adjustments are possible.
Namely, we can either add X to the first time period or add amount Y to
the second time period, so that the two alternative paths are now: C = (1
+ X, 1, 1, 1, ...) or C′ = (1, 1 + Y, 1, 1, ....). Suppose that the social
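planner's preferences are such that C is preferred to C′, i.e. S(C) > S(C′).
Then stationarity requires that the path (1, 1 + X, 1, 1, ...) be preferred to
the path (1, 1, 1 + Y, 1, ...),
and vice-versa. So, what the stationarity axiom effectively says is that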
intertemporal preferences between two periods remain the same even if
we shunt those two periods further ahead in time. Intuitively, if the
social planner prefers one apple today to two apples tomorrow then he
will prefer one apple in 30 days to two apples in 31 days. Interestingly,
Diamond (1965) attempted to redo this analysis without the stationarity
axiom.
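A quick check shows how stationarity operates when the planner happens to use a
discounted additive welfare function. This is only an illustrative special
case: the period utility u(·) and the discount factor β > 0 are assumed here
rather than derived from the postulates. We compare the two alternative paths C
and C′, and then the same two paths pushed one period ahead behind a common
first-period consumption of 1:

```latex
\begin{align*}
S(1+X, 1, 1, \ldots) - S(1, 1+Y, 1, \ldots)
  &= [u(1+X) - u(1)] - \beta\,[u(1+Y) - u(1)] \\
S(1, 1+X, 1, \ldots) - S(1, 1, 1+Y, \ldots)
  &= \beta\,[u(1+X) - u(1)] - \beta^2\,[u(1+Y) - u(1)] \\
  &= \beta\,\bigl[ S(1+X, 1, 1, \ldots) - S(1, 1+Y, 1, \ldots) \bigr]
\end{align*}
```

Since β > 0, the two differences always have the same sign, so the ranking of
the pair survives the shift. A constant discount factor is what delivers this;
with weights that are not a constant ratio apart, the sign could flip.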
Koopmans (1960) shows that a social welfare function which satisfies
these five axioms will necessarily exhibit positive time preference. We
shall not attempt to prove this here. But we can give an idea of why this
is so with an example. Suppose there are four utility paths A, B, C and D,
as depicted in Figure 1 (A and B are black, C and D are red). Let us
suppose that the consumption path A is superior to consumption path B
every step of the way, so, socially, S(A) > S(B). Now, suppose that paths
C and D are constructed in the following manner. For the first period (t =
1), path C maintains consumption level c0, but subsequently, for t = 2, 3,
4, .. it follows precisely the same path A did. Thus, C is merely A with a
one-period lag. Similarly, D keeps consumption c0 for the first period and
follows path B with a one-period lag thereafter. Thus, C = {c0, A} and D
= {c0, B}.
As far as the first period is concerned, paths C and D are identical and
only thereafter does the difference begin. Obviously, then, S(C) > S(D)
for the same reasons that S(A) > S(B). In fact, it is a straightforward
application of the stationarity axiom.
But what about the utility difference? Intuitively, the utility difference
between C and D should be less than the utility difference between A
and B because for a little while (i.e. during period t = 1), paths C and D
yield the same consumption and thus utility. Because paths C and D are
less different than paths A and B, their utility difference should be
smaller. Heuristically, then, S(A - B) > S(C - D). This is what Koopmans,
Diamond and Williamson (1964) labelled "time perspective" in the sense
that the difference between utility streams is smaller the further away in
time it is.
The necessity of time preference follows from this observation. Suppose,
for example, that A = {3, 3, 3, ...} and B = {2, 3, 3, 3, ...}, so (A - B) =
{1, 0, 0, 0, ...}. At the same time, C = {c0, 3, 3, 3, ...} and D = {c0, 2, 3,
3, ...}, so (C - D) = {0, 1, 0, 0, ...}. Then obviously the statement S(A - B)
> S(C - D) implies that S(1, 0, 0, ...) > S(0, 1, 0, 0, ...). This is time
preference.
To see this explicitly, notice that from the third period onwards, (A-B) and
(C-D) are identical. Let us just lop off all the periods except the
first two and plot the remainder in a simple
indifference map (Figure 2). Now, by the non-complementarity axiom,
S(A-B) > S(C - D) implies that {1, 0} is preferred to {0, 1}. In Figure 2,
this translates into saying that the social indifference curve S(A-B) which
passes through (1, 0) lies above the social indifference curve S(C - D)
which passes through (0, 1).
indifference curve V. Notice that V (dashed curve in Figure 2) passes
through both (1, 0) and (0, 1). We see then that V is an indifference
curve which has no time preference -- it is indifferent between (1, 0) and
(0, 1). Now, let us assume that S and V are additively separable. This
means that we can decompose the utility of consuming bundle (C1, C2)
into two utility components, i.e.
$$S(C_1, C_2) = S_1(C_1) + S_2(C_2), \qquad V(C_1, C_2) = V_1(C_1) + V_2(C_2)$$
But notice, in Figure 2, that S at the 45° line is steeper than V at the 45°
line. Thus:
$$MRS_S(s) = S_1'(s)/S_2'(s) > MRS_V(v) = V_1'(v)/V_2'(v)$$
But as $MRS_V(v) = 1$, then it must be that $MRS_S(s) > 1$, or
$S_1'(s) > S_2'(s)$. In
English, if we are consuming the same amount in both periods, we would
make a gain in utility by reallocating a unit of consumption from period 2
to period 1, i.e. the present is "preferred", in pure utility terms, to the
future. Notice that the arguments in the indifference curve S are
identical because we are evaluating it on the 45° line. So the difference
between S1′ (s) and S2′ (s) must arise purely from the difference in the
utility parts, S1(·) and S2(·), and not in the amounts consumed. There is
pure time preference. By the Archimedean property of numbers, there is
some factor β ∈ (0, 1) such that $\beta S_1'(\cdot) = S_2'(\cdot)$. So, our social
utility function can be rewritten as
$$S(C_1, C_2) = U(C_1) + \beta U(C_2)$$
i.e. our social planner's utility function defined over an infinite horizon is
identical to the Samuelson (1937) discounted intertemporal social
welfare function. The pure rate of time preference, call it ρ , can be
defined as ρ = (1-β )/β , implying that:
β = 1/(1+ρ )
so, as long as ρ > 0, then β < 1. If ρ = 0 (no time preference), then β
= 1 (no discounting).
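The mapping between the rate ρ and the factor β is easy to tabulate. The sketch
below is a minimal illustration; the function names and the sample rates are
our own, and the 50-period weight is included only to show how quickly distant
generations fade from the sum.

```python
def discount_factor(rho):
    """beta = 1 / (1 + rho)."""
    return 1.0 / (1.0 + rho)

def time_preference_rate(beta):
    """rho = (1 - beta) / beta, the inverse of discount_factor."""
    return (1.0 - beta) / beta

for rho in (0.0, 0.02, 0.05, 0.10):  # illustrative rates per period
    beta = discount_factor(rho)
    weight_50 = beta ** 50           # weight on utility 50 periods ahead
    print(f"rho = {rho:.2f} -> beta = {beta:.4f}, weight after 50 periods = {weight_50:.4f}")
```

With ρ = 0 every period carries a weight of one, which is exactly the
undiscounted case where the infinite sum may fail to converge.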
Let us step back and think about what we have just done. A positive
time preference rate on the social planner's utility has been deduced
(with the assistance of additive separability) from the simple observation
of the property of "time perspective", i.e. that S(A - B) > S(C - D). This
assertion was purely intuitive: C and D share the first time period, so C
should be "less different" from D than A is from B. This, in a nutshell, is
the crux of the argument. What exactly makes it work?
Critical in this intuition is the sensitivity axiom (P.2). The implications of
this axiom can be thought through as follows. Suppose that a
consumption path can be permanently improved if we merely
starve the first generation to death. In the grand scheme of things -- i.e.
in an infinite horizon -- the utility loss of the generation in the first period
seems to be negligible when compared to the increased utility of all the
remaining generations. What Koopmans was seeking with his sensitivity
axiom was to prevent the social planner from making such a calculation.
He should not ignore the utility loss of one generation just because it is
one period out of infinity. Sensitivity is thus not merely a mathematical
condition, it is also an "ethical" condition.
We begin to see how "sensitivity" is related to the time perspective
property. In our simple example, if we could ignore the first period, then
the difference between paths A and B could be considered identical to
the difference between paths C and D. But, by sensitivity, we cannot
ignore the first period. The fact that C and D share the same
consumption in the first period must be accounted for in the social
planner's preferences. It is from this assertion that we can conclude that
S(A - B) > S(C - D) and, as we have seen, thereafter go all the way to
positive time preference and representation of the planner's social utility
by a Samuelson function. For details, consult Koopmans, Diamond and
Williamson (1964).
There is an observation worth making at this point. Koopmans's analysis
suggests that if we attempt to derive a social welfare function without
time preference, then we are implicitly violating sensitivity somehow.
This makes sense. Without time preference, the utilities achieved by any
finite number of generations can be "ignored" as the remaining infinity
overwhelms them completely. Indeed, in continuous time, such
negligibility is automatic.
But it is the ethical implications of this that are interesting. What
Koopmans has highlighted with his exercise is the interlinkage between
sensitivity and time preference. On the one hand, discounting is
unethical because, say, it could be used to justify current
environmentally-destructive activities whose effects would only be felt
millions of years in the future. On the other hand, discounting is ethical
because without it, the infinite future overwhelms the finite present.
More acutely, without discounting, we could justify savagely destroying
one generation now if it yielded a series of minuscule gains (no matter
how small) that would accrue forever after. An ethical balance must be
struck: either we count every generation equally (in which case, no
single generation counts at all), or we allow discounting (in which case
every generation counts, but some less than others). Which is more
ethical? See Kenneth Arrow (1979) for some reflections on this.
$$L(t) = L_0 e^{nt}$$
where $L_0$ is the initial population. The growth rate of labor can also be
written as $g_L = (dL/dt)/L = n$.
As the population is now increasing, we must make appropriate adjustments to
our social welfare function. Let us move away from Ramsey (1928) by
supposing that labor is supplied inelastically, so that the labor supply at time t
is L(t), which is also the number of living individuals in the economy. Thus, let
us begin with the discounted social welfare function:
$$S = \int_0^{\infty} \left[ \sum_{h=1}^{L(t)} u_t^h(c_t^h) \right] e^{-\rho t} \, dt$$
We make the Benthamite "equal capacity for pleasure" assumption, so that
$u_t^h(\cdot) = u(\cdot)$ for all people h in every time period t, and suppose
that every person in the same generation receives the same consumption, i.e.
$c_t^h = c_t$ for each h = 1, 2, ..., L(t). Then this social welfare function
reduces to:
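$$S = \int_0^{\infty} [L(t) \cdot u(c_t)] e^{-\rho t} \, dt$$
or, substituting $L(t) = L_0 e^{nt}$ and normalizing $L_0 = 1$:
$$S = \int_0^{\infty} [e^{nt} \cdot u(c_t)] e^{-\rho t} \, dt$$
or:
$$S = \int_0^{\infty} u(c_t) e^{-(\rho - n)t} \, dt$$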
thus social welfare is discounted by the time preference rate ρ adjusted by the
rate of population growth. Thus, we can think of (ρ -n) as the "actual" or "net"
discount rate.
The logic for this is a bit subtle, but can be understood as follows: if we did not
incorporate population growth into our discount factor then we would be
punishing a single individual in the future twice -- once because he is in the
future (and his forefathers were "myopic"), and again because he belongs to a
generation which is larger in number. In order to keep some sense of equal
treatment across individuals in this social welfare function, we must adjust the
discount rate for the population growth rate.
We should note that this form is not generally adhered to. David Cass (1965),
for instance, employed only ρ in the discount, i.e. he used:
$$S = \int_0^{\infty} u(c_t) e^{-\rho t} \, dt$$
as his social welfare function in a model with growing population. Although this
treats "individuals" unjustly (by punishing people for being part of large
generations), it treats "generations" justly. In contrast, the (ρ -n) discount rate
treats individuals justly, but generations unjustly (larger generations have a
relatively greater weight).
Which to choose? The choice of discount rate is not entirely inconsequential,
as the two yield different solutions for the optimal growth path. Still, we come
down heavily in favor of the (ρ -n) discount as this is more consistent with
Benthamite logic. In the construction of social welfare, we cannot really think of
a good reason to accept the "generation" as the fundamental unit. The use of
(ρ -n) as the discount rate is defended convincingly by Kenneth J. Arrow and
Mordecai Kurz (1970: p.11-14).
However, there is a downside to the Arrow-Kurz (ρ -n) formulation. Specifically,
for S < ∞ , we need ρ > n, i.e. the rate of time preference must exceed
the rate of population growth for the integral to converge. This is a necessary
assumption, but not necessarily a very reasonable or intuitive one. If, as it
turns out in the solution, ρ is equated with the rate of interest, then this
convergence condition says that we need the rate of interest to exceed the
natural rate of growth. Effectively, this implies that anyone who takes on debt
at some point but whose real income grows at the natural rate will necessarily
be in the quandary of never really being able to pay back his debt without loss
of income.
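The role of the condition ρ > n can be illustrated with the simplest possible
case. The sketch below assumes a constant per-capita utility u, an initial
population normalized to one, and hypothetical values for ρ and n; it evaluates
the welfare integral up to a finite horizon T in closed form and lets T grow.

```python
import math

def welfare_up_to(T, u, rho, n):
    """Integral from 0 to T of u * e^{n t} * e^{-rho t} dt for constant utility u."""
    net = rho - n  # the "net" discount rate
    if net == 0:
        return u * T
    return u * (1.0 - math.exp(-net * T)) / net

u, rho = 1.0, 0.03            # hypothetical utility level and time preference rate
horizons = (100, 1000, 10000)

table = {}
for n in (0.01, 0.03, 0.05):  # population growth below, equal to, and above rho
    table[n] = [welfare_up_to(T, u, rho, n) for T in horizons]
    print(f"n = {n:.2f}: " + ", ".join(f"T={T}: {s:.4g}" for T, s in zip(horizons, table[n])))
```

When n < ρ the totals settle at u/(ρ - n); when n ≥ ρ they keep growing with
the horizon, so the infinite-horizon integral is not finite.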
One of the critical and most debatable assumptions we have maintained thus
far in our arguments is the assumption of successive generations. In other
words, we have assumed that, every period, a new generation arises and the
old one dies off. Generations precede and follow each other, but they do not
overlap at any point. This is a very restrictive and unrealistic assumption but
one that, unfortunately, is difficult to dispose of.
Models which allow successive generations to overlap with each other were
first proposed by Maurice Allais (1947) and, independently, Paul Samuelson
(1958). They noticed immediately that such a structure has some intriguing
implications for intertemporal social welfare.
There are many ways of modeling overlapping generations. The simplest is the
"two-period-life" version. In this case, each generation lives for two periods --
call it "youth" and "old age". At any time period, one generation of youths
coexists with one generation of the elderly. At the beginning of the next period,
the elderly die off, the youths themselves become elderly and a new
generation of youths is born. Thus, there are two "overlapping" generations of
people living at any one time.
Although we cover this in more detail elsewhere, our interest is in the social
welfare implications of overlapping generations. To see this, let us attempt to
construct a social welfare function when generations overlap. We assume a
generation born at time period t (call it "generation t") lives for two periods: t
and t+1. Let $c_t^t$ and $c_{t+1}^t$ denote the consumption in periods t and t+1
respectively by generation t. Let us denote by $u^t(c_t^t, c_{t+1}^t)$ the
intertemporal (two-period) utility function of generation t. Allowing for
additively separable utility and personal myopia, we can write:
$$u^t(c_t^t, c_{t+1}^t) = u^t(c_t^t) + \beta u^t(c_{t+1}^t)$$
where β is the personal discount factor. Now, this is for a single generation that
is born at time t. As a new generation is born every time period t, then the
intertemporal social welfare function is:
$$S = u^0(0, c_1^0) + \sum_{t=1}^{\infty} u^t(c_t^t, c_{t+1}^t)$$
where $u^0(0, c_1^0)$ is the utility of the first generation of elderly people
(born at t = 0), who have had no "youth". Notice that this is intertemporal, so
every generation, present and future, is given equal weight in this social
welfare function (there was a small controversy between Abba Lerner (1959) and Paul
Samuelson (1959) over this). Thus, assuming the same personal discount rate
across generations, we can plug in our explicit form:
$$S = \beta u^0(c_1^0) + \sum_{t=1}^{\infty} [u^t(c_t^t) + \beta u^t(c_{t+1}^t)]$$
or, rearranging:
$$S = \sum_{t=1}^{\infty} u^t(c_t^t) + \beta \sum_{t=0}^{\infty} u^t(c_{t+1}^t)$$
By the Benthamite "equal capacity for pleasure" argument, let ut(·, ·) be the
same across generations. This permits us to drop the t superscripts and rewrite
the social welfare function simply as:
$$S = \sum_{t=1}^{\infty} u(c_t) + \beta \sum_{t=0}^{\infty} u(c_{t+1})$$
This is revealing. For any positive consumption path, this social welfare function
S is not a finite sum, i.e. S = ∞ for any {ct} > 0. Thus, not only are paths "non-
comparable", but we cannot find a "social optimum". The old problem re-
emerges.
The overlapping generations construction yields interesting implications. Firstly,
even when we incorporate personal myopia, we do not end up with finite social
welfare sums. We cannot appeal to the reality of individual discounting to solve
the incomparability problem. To make the sums finite, to make consumption
paths comparable, we require that the social planner start making evaluations
of the relative social worth of different generations. Personal discounting will
not do as a substitute. Thus, letting γ be the social planner's discount factor per
generation, we end up with:
$$S = \sum_{t=1}^{\infty} \gamma^{t-1} u(c_t) + \beta \sum_{t=0}^{\infty} \gamma^{t-1} u(c_{t+1})$$
where, assuming 0 < γ < 1, then S becomes finite and paths are now
comparable. But γ is an explicitly unethical discount. There is nothing obvious
we can pluck out of society that can justify it. We must simply accept that our
social planner is "morally challenged".
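To see how a planner's discount factor restores comparability, here is a small Python sketch under stated assumptions: a hypothetical utility function u(c) = log(1 + c), constant consumption paths, and illustrative values β = 0.9 and γ = 0.95. For a constant path the discounted sum above has the closed form u(c)(1 + β/γ)/(1 - γ), which the truncated sum should approach.

```python
import math

def u(c):
    # Hypothetical utility function (an assumption), bounded on a constant path.
    return math.log(1.0 + c)

def planner_welfare(path, beta, gamma):
    """Truncated planner sum for path = [c_1, ..., c_T].

    sum_{t=1}^{T} gamma^(t-1) u(c_t) + beta * sum_{t=0}^{T-1} gamma^(t-1) u(c_{t+1})
    """
    T = len(path)
    young = sum(gamma ** (t - 1) * u(path[t - 1]) for t in range(1, T + 1))
    old = beta * sum(gamma ** (t - 1) * u(path[t]) for t in range(0, T))
    return young + old

beta, gamma = 0.9, 0.95  # illustrative personal and social discount factors
low = planner_welfare([1.0] * 2000, beta, gamma)
high = planner_welfare([2.0] * 2000, beta, gamma)

# Closed form for the constant path c = 1 (geometric series in gamma).
closed_form_low = u(1.0) * (1.0 + beta / gamma) / (1.0 - gamma)

print(f"S(c=1) = {low:.6f}, closed form = {closed_form_low:.6f}")
print(f"S(c=2) = {high:.6f}")
```

Both sums are finite, so the two paths can now be ranked. The ranking depends on a γ that the planner has to supply.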
Secondly, the decentralization thesis does not hold in overlapping generations.
Specifically, it can be easily shown that in an overlapping generations model,
the competitive equilibrium is not Pareto-optimal. This means that a social
planner (or a government) can achieve an allocation superior to that yielded
by the market. The social planner's solution (if we can find one) will be different
from the market solution. The decentralization thesis breaks down.
However, there is a trick that is possible: namely, if we follow the "dynastic"
logic employed earlier. Including intergenerational altruism and "bequests" in
an overlapping generations model, as Robert Barro (1974) did, we can
effectively replicate the traditional Ramsey-style infinite-horizon problem with
successive generations and restore the decentralization thesis.
If we add time preference, why assume it is the same across dynasties and
constant across time?
Ramsey (1928) also argued that if a single person is discounting his future
utility, he must discount all of it at the same constant rate. Otherwise, what he
planned for the future will be changed when that future gets closer. This has
become known as dynamic inconsistency.
To see why, first we need to convince ourselves that a single discount rate and
Koopmans's stationarity axiom are effectively the same thing. Recall that the
stationarity axiom claims that preferences between two periods remain the
same even if we shunt those periods forward. So, suppose that we have a
baseline consumption path (1, 1, 1, ..., 1) and one of the following two
adjustments can happen: either we add X to consumption in time period t or
we add Y to consumption in time period t + h. Suppose we have
chosen adjustments X and Y in such a manner that the individual is indifferent
between the two alternative paths. Then, assuming a constant discount factor β:
$$\beta^t U(1 + X) + \beta^{t+h} U(1) = \beta^t U(1) + \beta^{t+h} U(1 + Y)$$
or, dividing through by $\beta^t$ and rearranging:
$$U(1 + X) - U(1) = \beta^h [U(1 + Y) - U(1)]$$
Notice that t disappears from this expression; only the absolute time difference,
h, remains. In other words, we remain indifferent between the two adjustments
X and Y as long as the two adjustments happen h periods apart.
It doesn't matter whether X happens in period 3 and Y in period
3 + h or whether X happens in period 45 and Y in period 45 + h.
But now suppose that we have two different discount factors. Namely, let $\beta^t$ be
the discount for period t and $\gamma^{t+h}$ be the discount for period t+h. Then, once
again, choose X and Y so that the agent is indifferent:
$$U(1 + X) - U(1) = (\gamma^{t+h} / \beta^t) [U(1 + Y) - U(1)]$$
Now, notice that t remains in the expression. Thus, our preferences over the
adjustments are dependent not only on the absolute time difference, h, but
also on the actual reference time t when the events happen. Stationarity is
broken.
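The contrast between the two cases can be checked numerically. The sketch below is illustrative only: it assumes U = log (so that the indifference condition can be inverted explicitly) and hypothetical values β = 0.95, γ = 0.90, X = 0.10 and h = 5. For a given X it solves for the Y that leaves the agent indifferent, once with the same discount factor at both dates and once with different ones.

```python
import math

def indifferent_Y(X, weight_t, weight_th):
    """Solve weight_t * [U(1+X) - U(1)] = weight_th * [U(1+Y) - U(1)] for Y.

    U = log is an assumption made only so that U can be inverted in closed form.
    """
    gain_needed = (weight_t / weight_th) * (math.log(1.0 + X) - math.log(1.0))
    return math.exp(gain_needed) - 1.0

beta, gamma, X, h = 0.95, 0.90, 0.10, 5  # hypothetical parameter values

# Same factor at both dates: the required Y is the same whatever t is.
constant = [indifferent_Y(X, beta ** t, beta ** (t + h)) for t in (3, 45)]

# Factor beta at date t but gamma at date t + h: the required Y moves with t.
mixed = [indifferent_Y(X, beta ** t, gamma ** (t + h)) for t in (3, 45)]

print("constant factor:", constant)
print("two factors:    ", mixed)
```

With a single factor only h enters the ratio of weights, so the two values of Y coincide. With two factors the ratio carries t, and the trade-off changes as the reference date shifts.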
The absence of stationarity means that we can have dynamic inconsistency,
i.e. plans that are made at one point in time are contradicted by later behavior.
The identification of this possibility is often credited to Robert Strotz (1956). Its
implications are teased out in Bezalel Peleg and Menachem Yaari (1973).
Intuitively, recall that if the stationarity axiom is violated, then we can have it
that we prefer one apple today to two apples tomorrow, but, at the same time,
prefer two apples in 31 days to one apple in 30 days. Why this leads to
inconsistency is obvious. If I make a consumption plan according to these
preferences, I will plan to receive two apples in 31 days, but then, as time
passes and that day approaches, I'll change my mind and choose to get the
one apple one day earlier. My initial plans are inconsistent with my subsequent
actions.
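The apple example can be reproduced with a short Python sketch. It makes assumptions that are not in the text: utility is linear in apples, the constant-rate agent uses an exponential discount function with factor 0.9 per day, and the non-stationary agent uses a hyperbolic discount function 1/(1 + k·delay) with k = 1.5. All parameter values are purely illustrative.

```python
def exponential(delay, beta=0.9):
    # Constant-rate discounting: weight beta^delay (beta is an illustrative value).
    return beta ** delay

def hyperbolic(delay, k=1.5):
    # Non-constant discounting: weight 1 / (1 + k * delay) (k is an illustrative value).
    return 1.0 / (1.0 + k * delay)

def prefers_later(discount, early_delay, early_amount=1, later_amount=2):
    """True if later_amount received one day after early_delay is valued above
    early_amount received at early_delay (utility assumed linear in apples)."""
    return later_amount * discount(early_delay + 1) > early_amount * discount(early_delay)

results = {}
for name, discount in [("exponential", exponential), ("hyperbolic", hyperbolic)]:
    planned = prefers_later(discount, 30)  # the choice as seen 30 days in advance
    actual = prefers_later(discount, 0)    # the same choice once day 30 has arrived
    results[name] = (planned, actual)
    print(f"{name}: plans to wait = {planned}, waits when the day comes = {actual}")
```

Under these assumptions the exponential agent makes the same choice at both dates, while the hyperbolic agent plans to wait for two apples and then takes the single apple when the day arrives.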
Interestingly, the old economists who came up with time preference allowed for
the discount rate to change over time. For instance, we find Eugen von
Böhm-Bawerk arguing that:
"I should like to call special attention, further, to the fact, that the
undervaluation which results from these causes is not at
all graduated harmoniously, in the subjective valuation of the
individuals, according to the length of the time that intervenes. I
mean, it is not graduated in this way, for example, that the man
who discounts a utility due in one year by 5%, must discount a
utility due in two years by 10%, or one due in three months by
1¼%. On the contrary, the original subjective undervaluations are,
in the highest degree, unequal and irregular. In particular, so far
as the undervaluation is caused by defects of the will, there may
be a strong difference between an enjoyment which offers itself at
the very moment and one which does not; while, on the other
hand, there may be a very small difference, or no difference at all,
between an enjoyment which is pretty far away, and one which is
further away." (Böhm-Bawerk, 1889: p.257-8).
different time preference rates, e.g. the generation of time 0 discounts at rate β
, but the generation of time 1 discounts at rate γ , where β ≠ γ . This means that
the "plans" that generation 0 sets for generation 1 (and all subsequent
generations) are not followed by generation 1 when the time arrives, who go on
to develop their own distinct plans instead. The ethical implications of this
"time inconsistency" in a dynasty are, however, a bit harder to disentangle.
These discussions have kicked off a series of experimental studies on time
discounting in recent years. Although the results are mixed, they suggest that
people often use non-constant "hyperbolic" discounting rather than constant
"exponential" discounting. See Shefrin (1998) and Rubinstein (2000) for
surveys and critical evaluations of this literature.
However, we should note that for normative purposes, changing time
preferences are not necessarily a deep challenge. Ethically, there is no case for
supporting hyperbolic discounting in the social planner's utility function and so,
in that case, we can ignore whether people personally discount hyperbolically
or not. However, if we derive the social welfare function via the "dynastic"
utility argument, then hyperbolic discounting matters very much indeed, even
if the ethical implications are unclear. Changing time preference matters most
if we were to adopt the positive "decentralization" thesis for our intertemporal
social planner.
Finally, there is no need to assume that time preference is completely
exogenous. Tjalling Koopmans, Peter Diamond and Richard Williamson (1964)
and Hirofumi Uzawa (1968a, 1968b), for instance, have argued quite
persuasively that the time preference factor should be dependent on the levels
of consumption. This is what we obtain with non-additively separable utility. But
the outcome of allowing this is unclear. Harl Ryder and Geoffrey Heal (1973)
have incorporated changing time preferences into an optimal growth model
and shown that the resulting dynamics can be quite complicated.
$$S = \min_t U_t(C_t)$$
$$\max S = \max_{\{C\}} \left\{ \min_t U_t(C_t) \right\}.$$
In other words, the social planner wants to improve the lot of the generation
that is worst off.
The Rawlsian social welfare function may be commendable for its highly
egalitarian structure within a generation, but it is trickier in its intertemporal
form. Recall that, in the Ramsey exercise, savings reduce the utility of the
current generation, but, via capital accumulation and growth, that implies
higher utility for future generations. Thus, a Benthamite social welfare function
will lose on one end but gain in the other. The Rawlsian social welfare function,
however, just loses. If the worst-off generation saves anything, then social
welfare as a whole is lower because the consequent gains in utility by other
better-off generations will not be counted. Thus, the main peculiarity of the
intertemporal Rawlsian social welfare function is that it cannot balance the
utilities of current and future generations. As a result, in the absence of
population growth or technical progress, it predicts that the optimal -- or "just"
-- rate of savings will be zero.
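A toy calculation makes the contrast concrete. The Python sketch below is not the Ramsey model itself but a deliberately simplified stand-in, and every ingredient is an assumption: each generation has an endowment of 1, generation 1 may save an amount s, each unit saved raises the consumption of every later generation permanently by R = 0.2, utility is the square root of consumption, and the Benthamite planner discounts generations by a factor of 0.9. A grid search then compares the savings chosen under the two criteria.

```python
import math

def u(c):
    return math.sqrt(c)  # hypothetical concave utility function

Y, R, BETA = 1.0, 0.2, 0.9  # endowment, permanent return on saving, discount factor (all illustrative)

def utilities(s):
    """Utility of generation 1 (which saves s) and of each later generation."""
    return u(Y - s), u(Y + R * s)

def rawlsian(s):
    # Maximin: only the worst-off generation counts.
    return min(utilities(s))

def benthamite(s):
    first, later = utilities(s)
    # Generations t = 2, 3, ... carry weight BETA^(t-1); the geometric sum is BETA / (1 - BETA).
    return first + BETA / (1.0 - BETA) * later

grid = [i / 100 for i in range(0, 91)]  # candidate savings 0.00, 0.01, ..., 0.90
rawls_s = max(grid, key=rawlsian)
bentham_s = max(grid, key=benthamite)

print("Rawlsian savings:  ", rawls_s)
print("Benthamite savings:", bentham_s)
```

In this toy setting generation 1 is always the worst off, so any saving lowers the maximin objective and the Rawlsian choice is s = 0. The Benthamite planner trades generation 1's loss against the later gains and picks a positive s.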
[Note: this result was already alluded to by Tinbergen (1960) and Solow (1974).
Phelps and Riley (1978) create intergenerational externalities by considering a
Rawlsian overlapping generations model. In this case, the savings of one
generation when young can be partly compensated in the future (when they
are old), so net savings will not be zero necessarily.]
Now, Rawls (1971: p.284-93) was quite aware of this dilemma. Although he
refrains from applying his "maximin" principle in the simple manner given
above, he makes suggestive remarks to the effect that it might be applicable if
"dynastic" utilities were incorporated. If "[t]he parties are regarded as
representing family lines, say, with ties of sentiment across generations", then
"the characterization of justice remains the same. The criteria for justice
between generations are those that would be chosen in the original
position." (Rawls, 1971: p.292).
So, incorporating dynastic considerations once again, we now have a social
utility function:
$$S = \min_t \sum_{\tau=t}^{\infty} \beta^{\tau} U_{\tau}(C_{\tau})$$
So, maximizing this social welfare function means maximizing the smallest
discounted infinite stream of utility. Note that it is the starting point of the
stream, t, that matters.
Kenneth J. Arrow (1973) and Partha Dasgupta (1974), who started on this track,
claimed that the dynastic Rawlsian form would yield dynamic inconsistency and
"it is at least questionable that the sawtooth pattern [of dynastic inconsistency]
corresponds to any intuitive idea of justice" (Arrow, 1973). We should note here
that the Arrow-Dasgupta result relies on the faulty manner in which they
incorporate "dynastic" considerations. Their analysis was criticized by
Guillermo Calvo (1978), who provided the form and analysis we use here.
To see dynamic inconsistency à la Calvo, examine Figure 1, where we have two
consumption plans, C (in black) and C′ (in red). Obviously, the worst off
generation in both cases is the first.
Now, if we take the dynastic perspective, from the first generation's point of
view, it may very well be that C′ is better than C. This is because the higher
utility gained by generations 2 and 3 via C′ will, in generation 1's altruistic
calculation, be weighted heavier than the relatively lower utility generations 4,
5, 6, etc. will consequently get (if you think this is not obvious in Figure 1, you
know we can easily adjust the path so that it is so). So, by the maximin criterion,
C′ has greater social welfare than C.
But now examine the same paths from the perspective of generation 3
onwards. In generation 3's view, the utilities of generations 4 and 5 are given
much more weight than they had in generation 1's perspective. So, from
generation 3's perspective, even though they themselves get higher utility with
C′ than with C, the immediacy of the drop right after them means that C will be
better than C′. This is dynamic inconsistency. Generation 1 will plan for path C′,
but when generation 3 arrives it will drop that plan and adopt path C
instead.
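The reversal can be reproduced with made-up numbers. In the Python sketch below everything is hypothetical: the utility levels along the two plans are chosen by hand to mimic the shape described above (they are not read off Figure 1), each generation is assumed to weight the utility of generation τ by β^(τ - t) measured from its own position t, the discount factor is β = 0.75, and the infinite sum is truncated at 400 terms. The code compares the two plans from the vantage points of generations 1 and 3.

```python
def dynastic_utility(utility_of, start, beta, horizon=400):
    """Valuation by generation `start`: sum over tau >= start of
    beta^(tau - start) * U_tau, truncated after `horizon` terms.

    `utility_of` is a function from a generation index (1, 2, ...) to its own utility.
    """
    return sum(beta ** (tau - start) * utility_of(tau)
               for tau in range(start, start + horizon))

def plan_C(tau):
    # Hypothetical plan C: flat after the first generation.
    return 1.0 if tau == 1 else 2.0

def plan_C_prime(tau):
    # Hypothetical plan C': higher for generations 2 and 3, lower from 4 onward.
    if tau == 1:
        return 1.0
    return 3.0 if tau in (2, 3) else 1.5

beta = 0.75  # illustrative discount factor
view_from_1 = (dynastic_utility(plan_C, 1, beta), dynastic_utility(plan_C_prime, 1, beta))
view_from_3 = (dynastic_utility(plan_C, 3, beta), dynastic_utility(plan_C_prime, 3, beta))

print("generation 1 values (C, C'):", view_from_1)
print("generation 3 values (C, C'):", view_from_3)
```

With these numbers generation 1 values C′ above C, while generation 3 values C above C′, because the drop after generation 3 is heavily discounted from generation 1's position but immediate from generation 3's. The sketch only compares the two dynastic valuations and does not solve the full maximin problem.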
Now, Calvo (1978) shows that while dynamic inconsistency is possible it is not
necessarily the result. He proves that if we use the dynastic Rawlsian social
welfare function with a standard Neoclassical optimal growth model, we can
obtain a dynamically-consistent solution.
In this case, we actually get the result of infinite patience! Specifically, as 0 < β
< 1, then τ → ∞ implies $\beta^{\tau} U(C_{t+\tau}) \to 0$, so the utility of the father at time t (i.e.
the minimum of his dynasty) is the near-zero utility of the last descendant, far
into the infinite future. He will thus consume nothing himself and allocate all of
his income into the far future!
We see, then, that the time discount factor, in and of itself, does not mean the
father is "selfish". It is really only when we combine it with a Benthamite
altruism, i.e. an additively separable dynastic utility function, that we achieve
that "unjust" result. Time discounting can mean that the older generation is
selfish (Benthamite altruism) or that it acts like a "mother hen" towards its
progeny (Rawlsian altruism). Which case arises turns out to depend on the
functional form of the dynastic utility function.
One can counter, of course, that one should not incorporate time discounting
into the Rawlsian maximin function as it doesn't yield impatience. In other
words, when we place the discount factor β in the Rawlsian utility function we
are not really incorporating "time-preference" but something else. Perhaps. But
the original definition of time-preference, as posited by Böhm-Bawerk (1889),
was constant underestimation of future utility. That is enough justification to
include β , regardless of the functional form of the utility function. It yields
impatience in the additively separable utility function, but altruistic "patience"
in the Rawlsian maximin form.
5.7. Optimal Growth: Conclusion
Optimal growth theory has changed more than we have indicated here. The
"Real Business Cycle" research program, initiated in the early 1980s by Finn
Kydland, Edward Prescott, Robert King and Sergio Rebelo, took Neoclassical
growth theory -- and optimal growth theory in particular -- as its basic
underlying model. It heavily and unabashedly relies upon the "decentralization
thesis" for methodological justification. It prides itself in its tremendous efforts
to "calibrate" the ghostly parameters of the Cass-Koopmans model -- time
preference, utility functions, productivity, etc. -- so that the optimal solution
path of this normative model is matched to the actual empirical data of
economies around the world. Empirical accounts of growth and fluctuations of
output and employment are regarded as the optimal paths derived as
solutions to appropriately-calibrated optimal growth models.
Today, optimal growth theory and other variations on the Ramsey exercise are
marketed as actual representations of how the economy works -- despite the
fact that reality tells us the social planner does not exist. And if we insist on
interpreting him as a "representative agent", then we must keep in mind that
microeconomic theory (specifically, the Sonnenschein-Mantel-Debreu theorem)
tells us that he would misbehave. But, for some mystifying reason, modern
economics persists with this fiction, despite the overwhelming theory and
evidence against it. Appealing to "representative agents" is a deplorable, but
sadly common and ancient habit in economics.
There are more paradoxes that could be drawn out of this -- which ought to be
particularly delicious for Austrian School economists. For instance, the
decentralization thesis basically argues that the centralized economy of (a
benevolent) Stalin "represents" or "achieves the same solution as" the
decentralized economy of wild capitalist markets. The credibility of this
assertion is worth contemplating for a moment or two, particularly in light of
the Socialist Calculation debates of the 1930s.
One may also wish to reflect upon the meaning of "prices" in such an economy.
One of the remarkable justifications put forth for the use of an infinitely-lived,
perfect foresight, representative agent in modern modeling is that agent
heterogeneity, imperfect foresight, etc. would make the "optimal solutions"
indeterminate. Put another way, modern theorists can only derive equilibrium
prices when there are no incentives for exchange among agents; but when
such an incentive exists, they cannot obtain equilibrium prices! A naughty wag
could certainly get a lot of mileage out of that.
As this survey should make evident, this transformation in motivation for and
application of optimal growth theory is one which the original constructors
would find surprising, if not appalling. Frank Ramsey certainly conceived of his
contribution more as an exercise in Benthamite utilitarian philosophy than in
descriptive economics. At any rate, recall that his main point was to
demonstrate that the market solution would be suboptimal. The early builders
of the 1960s -- Tinbergen, Goodwin, Koopmans and others -- were eager to put
it to good use in development planning, perhaps naively believing that the
ought of optimal growth theory could be deliberately planned into becoming an
is, after all.
As late as the mid-1970s, Tjalling Koopmans was still arguing that the principal
clientèle of optimal growth theory should be "policy economists who may find it
useful to have the more abstract ideas of this field in the back of their mind
when coping with the day-to-day pressures for outcomes rather than criteria."
(Koopmans, 1977). One of the classics of optimal growth theory, the famous
treatise of Kenneth J. Arrow and Mordecai Kurz (1970) was written almost as a
handbook for policy economists. The interface between government policy and
optimal growth was also explored by other pioneering spirits, such as Hirofumi
Uzawa (1969) and Edmund S. Phelps (1974). The conclusion of Tjalling
Koopmans's Nobel lecture captures the spirit behind the original construction of
optimal growth theory:
"The economist as such does not advocate criteria of optimality.
He may invent them. He will discuss their pros and cons,
sometimes before, but preferably after trying out their
implications. He may also draw attention to situations where
allover objectives, such as productive efficiency, can be served in
a decentralized manner by particularized criteria such as profit
maximization. But the ultimate choice is made, usually only
implicitly and not always consistently, by the procedures of
decision making inherent in the institutions, laws and customs of
society. A wide range of professional competences enter into the
preparation and deliberation of these decisions. To the extent that
the economist takes part in this decisive phase, he does so in a
double role, as economist, and as a citizen of his polity: local
polity, national polity or world polity."
A. Abel and O.J. Blanchard (1983) "An Intertemporal Equilibrium Model of Saving
and Investment", Econometrica, Vol. 51 (3), p.675-92.
K.J. Arrow (1979) "The Trade-Off Between Growth and Equity", in H.I. Greenfield,
A.M. Levenson, W. Hamovitch and E. Rotwein, editors, Theory for Economic
Efficiency: Essays in honor of Abba P. Lerner. Cambridge, Mass: M.I.T. Press.
K.J. Arrow and M. Kurz (1970) Public Investment, the Rate of Return and
Optimal Fiscal Policy. Baltimore: The Johns Hopkins University Press.
R.J. Barro (1974) "Are Government Bonds Net Wealth?", Journal of Political
Economy, Vol. 82 (6), p.1095-1117.
W.J. Baumol (1952) Welfare Economics and the Theory of the State. Cambridge,
Mass: Harvard University Press.
R.A. Becker (1980) "On the Long-Run Steady State in a Simple Dynamic Model
of Equilibrium with Heterogeneous Households", Quarterly Journal of
Economics, Vol. 95, p.375-82.
D.J. Brown and L.M. Lewis (1981) "Myopic Economic Agents", Econometrica, Vol.
49 (2), p.359-68.
D. Cass (1965) "Optimum Growth in an Aggregative Model of Capital
Accumulation", Review of Economic Studies, Vol. 32, p.233-40.
F.H. Hahn (1985) Money, Growth and Stability. Cambridge, Mass: MIT Press.
F.H. Hahn and R.M. Solow (1995) A Critical Essay on Modern Macroeconomic
Theory. Cambridge, Mass: M.I.T. Press.
P.J. Hammond and J.A. Mirrlees (1973) "Agreeable Plans", in J.A. Mirrlees and N.
Stern, editors, Models of Economic Growth. New York: Wiley.
R.F. Harrod (1948) Towards a Dynamic Economics: Some recent developments
of economic theory and their application to policy. London: Macmillan.
J.M. Keynes (1930) "F. P. Ramsey: An obituary", Economic Journal, Vol. 40,
p.153-4. As reprinted in Keynes, 1933, Essays in Biography. London: Macmillan.
T.C. Koopmans, P.A. Diamond and R.E. Williamson (1964) "Stationary Utility and
Time Perspective", Econometrica, Vol. 32, p.82-100.
G. Loewenstein and J. Elster (1992), editors, Choice Over Time. New York:
Russell Sage Foundation.
R.E. Lucas and N. Stokey (1984) "Optimal Growth with Many Consumers",
Journal of Economic Theory, Vol. 32, p.139-71.
E. Malinvaud (1965) "Croissances optimales dans un modèle
macroéconomique", Pontificae Academiae Scientiarum Scripta Varia, Vol. 28,
p.301-384.
S.A. Marglin (1963) "The Social Rate of Discount and the Optimal Rate of
Investment", Quarterly Journal of Economics, Vol. 77 (1), p.95-111.
B. Peleg and M. Yaari (1973) "On the Existence of a Consistent Course of Action
when Tastes are Changing", Review of Economic Studies, Vol. 40 (3), p.391-
401.
E.S. Phelps (1961) "The Golden Rule of Accumulation: A Fable for Growthmen",
American Economic Review, Vol. 51, p.638-43.
E.S. Phelps (1966) Golden Rules of Economic Growth: Studies of efficient and
optimal investment. New York: Norton.
E.S. Phelps (1974) Fiscal Neutrality Toward Economic Growth. New York:
McGraw-Hill.
E.S. Phelps and R.A. Pollak (1968) "On Second-Best National Saving and
Game-Equilibrium Growth", Review of Economic Studies, Vol. 35, p.
E.S. Phelps and J.G. Riley (1978) "Rawlsian Growth: Dynamic programming of
capital and wealth for intergeneration "maximin" justice", Review of Economic
Studies, Vol. 45 (1), p.103-20.
A.C. Pigou (1920) The Economics of Welfare. 1952 (4th) edition, London:
Macmillan.
F.P. Ramsey (1928) "A Mathematical Theory of Saving", Economic Journal, Vol.
38, p.543-59.
A. Rubinstein (2000)
H.E. Ryder and G.M. Heal (1973) "Optimum Growth with Intertemporally
Dependent Preferences", Review of Economic Studies, Vol. 40 (1), p.1-32.
P.A. Samuelson and R.M. Solow (1956) "A Complete Capital Model Involving
Heterogeneous Capital Goods", Quarterly Journal of Economics, Vol. 70, p.537-
62.
A.K. Sen (1961) "On Optimising the Rate of Saving", Economic Journal, Vol. 71,
p.479-96.
T. W. Swan (1963) "Of Golden Ages and Production Functions", in K. Berrill,
editor, Economic Development with Special Reference to East Asia:
Proceedings of an International Economic Association conference. London:
Macmillan.
J. Tinbergen (1956) "The Optimum Rate of Saving", Economic Journal, Vol. 66,
p.603-9.
M.E. Yaari (1965) "Uncertain Lifetime, Life Insurance and the Theory of the
Consumer", Review of Economic Studies, Vol. 32, p.137-50.