Human capital and earnings distribution dynamics

2006, Journal of Monetary Economics

Mean earnings and measures of earnings dispersion and skewness all increase in US data over most of the working life-cycle for a typical cohort as the cohort ages. We show that a benchmark human capital model can replicate these properties from the right distribution of initial human capital and learning ability. These distributions have the property that learning ability must differ across agents and that learning ability and initial human capital are positively correlated.

NBER WORKING PAPER SERIES

HUMAN CAPITAL AND EARNINGS DISTRIBUTION DYNAMICS

Mark Huggett
Gustavo Ventura
Amir Yaron

Working Paper 9366
http://www.nber.org/papers/w9366

NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
December 2002

JEL No. D3, J24, J31

Earlier versions of this paper circulated under the title "Distributional Implications of a Benchmark Human Capital Model." We thank Jim Albrecht, Martin Browning, Eric French, Jonathan Heathcote, Krishna Kumar, Victor Rios-Rull, Thomas Sargent, Neil Wallace, Kenneth Wolpin and seminar participants at NBER Consumption Group, Rochester, PSU-Cornell Macro Theory Conference, Midwest Macro Conference, Tulane, Pennsylvania, NYU, Stanford, Wharton and VCU for comments. This work was initiated when the second author was affiliated with the University of Western Ontario. He thanks the Faculty of Social Sciences for financial support. The third author thanks the Rodney White Center for financial support. The views expressed herein are those of the authors and not necessarily those of the National Bureau of Economic Research.

© 2002 by Mark Huggett, Gustavo Ventura, and Amir Yaron. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
Mark Huggett
Department of Economics
Georgetown University
Washington, D.C. 20057-1036

Gustavo Ventura
Economics Department
Pennsylvania State University
601 Kern Building
University Park, PA 16802

Amir Yaron
Department of Finance
The Wharton School
University of Pennsylvania
3620 Locust Walk
Philadelphia, PA 19104-6367
and NBER

1 Introduction

A wide variety of theories have been advanced to explain the general shape of the earnings distribution and the dynamics of the earnings distribution for a cohort as the cohort ages. The list includes models highlighting stochastic earnings shocks, human capital accumulation, sorting of individuals across job types and public learning of individual productivity among others.[1]

In this paper we assess the degree to which a benchmark human capital model is able to replicate the quantitative properties of the dynamics of the US earnings distribution. The specific properties that we focus on relate to how average earnings and measures of earnings dispersion and skewness change for a typical cohort as the cohort ages. To characterize these age effects, we use earnings data for US males and employ a methodology, described later in the paper, for separating age, time and cohort effects in a consistent way for a variety of earnings statistics. Our findings, summarized in Figure 1, are that average earnings, earnings dispersion and earnings skewness increase with age over most of the working life cycle.

[Insert Figure 1 a-c Here]

We assess the ability of a benchmark human capital model to replicate the patterns in Figure 1. We list two reasons for why such an assessment is of interest. First, human capital models have been central in the earnings distribution literature and are widely used in the literatures on labor, growth, inequality and public finance among others.
However, at present it is not clear which earnings distribution facts can be replicated, why this is the case, and which facts pose a challenge and thus motivate additional theoretical structure.[2] This paper fills this void by providing a systematic, quantitative assessment of the degree to which a well-known and widely-used human capital model is able to replicate a wide variety of earnings distribution facts.

[1] Neal and Rosen (1999) review this literature.

[2] Earnings distribution facts have long been interpreted as being qualitatively consistent or inconsistent with specific human capital models. This is standard in the earnings and wage regression literature (e.g. Card (1999)) and in the many excellent reviews of human capital theory (e.g. Weiss (1986), Mincer (1997) and Neal and Rosen (1999)). In contrast, Heckman (1975, 1976), Haley (1976), Rosen (1976) and a number of related papers did provide a quantitative assessment. However, distributional implications were not addressed because model parameters were estimated so that the age-earnings profile produced by one agent in the model best matched the average earnings profile in the data. Our work is closest to the work by Heckman, Lochner and Taber (1998) and Andolfatto, Gomme and Ferrall (2001) who use human capital models with agent heterogeneity to analyze the time variation in the skill premium and average earnings, income and wealth profiles, respectively.

Second, quantitative models of the earnings distribution should arguably be central in the positive and normative analysis of distributional questions. Currently, modern versions of the life-cycle, permanent-income hypothesis, in which either earnings or wages are taken as exogenous random processes, have been dominant in much of this literature (e.g. the consumption, saving and wealth distribution literature and the literature on social security and income tax reform).
We expect that in the near future models with deeper foundations for individual earnings heterogeneity will dominate. This paper takes one step in this direction by highlighting the importance of initial conditions. Additional steps will be needed to resolve the question of the importance of initial conditions versus shocks over the life cycle.[3]

We assess the Ben-Porath (1967) model. This is a well-known and widely-used human capital model. In our version of this model, an agent is born with some immutable learning ability and some initial human capital. Each period an agent divides available time between market work and human capital production. Human capital production is increasing in learning ability, current human capital and time allocated to human capital production. An agent maximizes the present value of earnings, where earnings in any period is the product of a rental rate, human capital and time allocated to market work. Our assessment focuses on the dynamics of the cohort earnings distribution produced by the model from different initial joint distributions of human capital and learning ability across agents.

Our findings are striking. We establish that the earnings distribution dynamics documented in Figure 1 can be replicated quite well by the model from the right initial distribution. In addition, the model produces the key properties of the cross-sectional earnings distribution. These conclusions are not sensitive to the precise value of the elasticity parameter in the human capital production function, nor are they sensitive to the age at which human capital accumulation begins.

The initial distributions which replicate the patterns in Figure 1 rely crucially on differences in learning ability across agents. Age-earnings profiles for agents with high learning ability are steeper than the profiles for agents with low learning ability.
This is the key mechanism for how the model produces increases in earnings dispersion and skewness for a cohort as the cohort ages. Earnings profiles are steeper for high ability agents since early in life they allocate a relatively larger fraction of their time to human capital production and thus have low earnings, while their time allocation decisions and high learning ability imply that later in the life cycle they have higher levels of human capital and, hence, earnings. This mechanism is also consistent with regularities long discussed in the human capital literature such as the fact that time allocated to skill acquisition is concentrated at young ages, that age-earnings profiles are steeper for people with high amounts of schooling and measured learning ability and that the present value of earnings increases in a measure of learning ability.[4]

[3] Keane and Wolpin (1997) address this question in a model with an occupational choice decision.

We also contrast the implications of the model with evidence on persistence in individual earnings. The model implies that over time both individual earnings levels and earnings growth rates are strongly positively correlated. Evidence from US data shows that earnings levels are positively correlated but that earnings growth rates one year apart are negatively correlated. This and related evidence suggests that there is potentially an important role for idiosyncratic shocks that lead to mean reversion in earnings. These shocks are by construction absent from the benchmark model.

The paper is organized as follows. Section 2 describes the data and our empirical methodology. Section 3 presents the model. Section 4 discusses the parameter values. Section 5 presents the central findings of the paper. Section 6 concludes.

2 Data and Empirical Methodology

2.1 Data

The findings presented in the introduction are based on earnings data from the PSID 1969-1992 family files. We utilize earnings of males who are the head of the household.
We consider two samples. We define a broad sample to include all males who are currently working, temporarily laid off, unemployed but looking for work, or students; it does not include retirees. The narrow sample equals the broad sample less those unemployed or temporarily laid off. We note that the theoretical model we analyze is not a model of unemployment or layoffs. This would suggest that the narrow sample is more relevant. However, since the results are not sensitive to the choice of sample we present the results for the broad sample.

We consider males between the ages of 20 and 58. This is motivated by several considerations. First, the PSID has many observations in the middle but relatively fewer at the beginning or end of the working life cycle. By focusing on ages 20-58, we have at least 100 observations in each age-year bin with which to calculate age and year-specific earnings statistics. Second, near the traditional retirement age there is a substantial fall in labor force participation that occurs for reasons that are abstracted from in the model we analyze. This suggests the use of a terminal age that is earlier than the traditional retirement age.

We also restrict the sample to those with strictly positive earnings. This is not essential to our methodology but it does allow us to take logs as a convenient data transformation. This restriction almost never binds.[5] Finally, we exclude the Survey of Economic Opportunities (SEO) sample, which is a subsample of the PSID that oversamples the poor. Given all the above sample selection criteria, the average and standard deviation of the number of observations per panel-year are 2137 and 131 respectively.

[4] Lillard (1977) provides evidence on the last two points.

2.2 Construction of Age Profiles

We focus the analysis on cohort-specific earnings distributions. Let $e^p_{j,t}$ be the real earnings at percentile $p$ of the earnings distribution of agents who are age $j$ at time $t$.
These agents are from cohort $s = t - j$ (i.e., agents who were born in year $t - j$).[6] We assume that the percentiles of the earnings distribution $e^p_{j,t}$ are determined by cohort effects $\alpha^p_s$, age effects $\beta^p_j$ and shocks $\epsilon^p_{j,t}$. The relationship between these variables is given below both in levels and in logs, where the latter is denoted by a tilde.

$$e^p_{j,t} = \alpha^p_s \, \beta^p_j \, \epsilon^p_{j,t}$$

$$\tilde{e}^p_{j,t} = \tilde{\alpha}^p_s + \tilde{\beta}^p_j + \tilde{\epsilon}^p_{j,t}$$

This formulation is consistent with the theoretical model that we present in the next section. In particular, in a steady state of the model with a constant growth rate of the rental rate of human capital, $e^p_{j,t}$ is produced by a cohort effect $\alpha^p_s$ that is proportional to the rental rate in cohort year $s$, a time-invariant age effect $\beta^p_j$ and no shocks (i.e. $\epsilon^p_{j,t} \equiv 1$ and $\tilde{\epsilon}^p_{j,t} \equiv 0$). Expressed somewhat differently, in steady state the cross-sectional, age-earnings distribution just shifts up proportionally each period.

[5] Most of those who report being laid off, unemployed or students turn out to have some earnings during the year.

[6] Real values are calculated using the CPI. To calculate $e^p_{j,t}$ we use a 5 year bin centered at age $j$. For example, to calculate earnings percentiles of agents age $j = 30$ in year $t = 1980$ we use data on agents age 28-32 in 1980. We also use a 5 year bin centered at ages 20 and 58. To do this we use data on agents age 18-22 and 56-60.

We use ordinary least squares to estimate the coefficients $\tilde{\alpha}^p_s$ and $\tilde{\beta}^p_j$ for various percentiles $p$ of the earnings distribution.[7] In Figure 2 we graph the age effects of different percentiles of the levels of the earnings distribution by plotting $\beta^p_j$. The age effects $\beta^p_j$ are scaled so that each graph passes through the geometric average value at age $j = 40$ of $e^p_{j,t}$ across all cohorts.[8] The percentiles considered in Figure 2 range from a low of $p = .025$ (earnings such that 2.5 percent of the agents are below this value) to a high of $p = .99$ (earnings such that 99 percent of the agents are below this value).
We consider 23 different percentiles $p = .025, .05, .10, ..., .90, .925, .95, .975, .99$.

[Insert Figure 2 Here]

The findings in Figure 1a-c in the introduction are all calculated directly from the results graphed in Figure 2. Figure 1a shows that average earnings increase with age over most of the working life cycle. Early in the life cycle this follows because earnings at all percentiles in Figure 2 shift up with age. Later in the life cycle this follows from the strong increase with age at the highest percentiles of the earnings distribution despite the fact that earnings at the median and lower percentiles are already decreasing with age. The increase in earnings dispersion in Figure 1b, using the Gini coefficient as a measure of earnings dispersion, follows from the general fanning out of the distribution which is a striking feature of Figure 2. The increase in the skewness measure with age in Figure 1c is implied by the strong fanning out at the top of the distribution observed in Figure 2.

[7] Each regression has $J \times T$ dependent variables regressed on $J + T$ cohort dummies and $J$ age dummies. $T$ and $J$ denote the number of time periods in the panel and the number of distinct age groups, which in our case equal $J = 58 - 20$ and $T = 1992 - 1969$.

[8] More specifically, we plot $\beta^p_j e^p_{40} / \beta^p_{40}$, where $e^p_{40}$ is the geometric average real earnings at age 40 and percentile $p$ in the data.

2.3 Alternative Views of Age Effects

A more general specification of the regression equation used in the last subsection would allow the percentiles of the earnings distribution to be determined by time effects $\gamma^p_t$ in addition to age $\beta^p_j$ and cohort $\alpha^p_s$ effects as in the equation below. Once again, a logarithm of a variable is denoted by a tilde. Time effects can be viewed as effects that are common to all individuals alive at a point in time. An example would be a temporary rise in the rental rate of human capital that increases the earnings of all individuals in the period.
$$e^p_{j,t} = \alpha^p_s \, \beta^p_j \, \gamma^p_t \, \epsilon^p_{j,t}$$

$$\tilde{e}^p_{j,t} = \tilde{\alpha}^p_s + \tilde{\beta}^p_j + \tilde{\gamma}^p_t + \tilde{\epsilon}^p_{j,t}$$

The linear relationship between time $t$, age $j$, and birth cohort $s = t - j$ limits the applicability of the regression specification above. Specifically, without further restrictions the regressors in this system are co-linear and these effects cannot be estimated. This identification problem is well known in the econometrics literature.[9] In effect any trend in the data can be arbitrarily reinterpreted as a year (time) trend or alternatively as trends in ages and cohorts.

Given this problem, our approach is to determine how sensitive the age effects in Figure 1 and 2 are to alternative restrictions on the coefficients $(\tilde{\alpha}^p_s, \tilde{\beta}^p_j, \tilde{\gamma}^p_t)$. One view, which we label the cohort dummies view, comes from constructing Figure 2 by setting time effects to zero (i.e. $\tilde{\gamma}^p_t = 0$) as was done in the last subsection. A second view, which we label the time dummies view, comes from constructing Figure 2 by setting cohort effects to zero (i.e. $\tilde{\alpha}^p_s = 0$).[10] A third view, which is intermediate to both previous views, comes from constructing Figure 2 after allowing age, cohort and time effects but with the restriction that time effects are mean zero and are orthogonal to a time trend.[11] This restriction implies that time trends are attributed to cohort and age effects rather than time effects. We label this last view the restricted time dummies view.

[Insert Figure 3 (a-c) Here]

Figure 3 highlights the age effects on average earnings, earnings dispersion and earnings skewness using these three views. The results are that all three views lead to the same qualitative results. Quantitatively, the cohort dummies view is almost indistinguishable from the restricted time dummies view. The time dummies view produces a flatter profile of earnings dispersion as compared to the cohort dummies or restricted time dummies view. In the remainder of the paper we focus on the results from the cohort dummies view highlighted in Figure 1.
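The cohort dummies view and the identification problem described above can be illustrated with a small numerical sketch. It is not the authors' estimation code: the panel dimensions, the synthetic age and cohort effects, and the helper `dummies` are all hypothetical, chosen only to show that age effects are recovered when time effects are set to zero and that the design matrix loses rank once time dummies are added.

```python
import numpy as np

def dummies(index, n):
    """One-hot matrix: row i has a 1 in column index[i] (hypothetical helper)."""
    out = np.zeros((len(index), n))
    out[np.arange(len(index)), index] = 1.0
    return out

# Hypothetical panel dimensions, kept small for illustration.
J, T = 5, 4                                   # age groups, survey years
ages, years = np.meshgrid(np.arange(J), np.arange(T), indexing="ij")
ages, years = ages.ravel(), years.ravel()
cohorts = years - ages + (J - 1)              # s = t - j, shifted to start at 0
n_cohorts = J + T - 1

# Synthetic log earnings percentile built from assumed age and cohort effects.
rng = np.random.default_rng(0)
true_age = np.log(1.0 + 0.3 * np.arange(J))
true_cohort = 0.02 * np.arange(n_cohorts)
y = true_age[ages] + true_cohort[cohorts] + 0.001 * rng.standard_normal(J * T)

# Cohort dummies view: regress on age and cohort dummies, time effects = 0.
X = np.hstack([dummies(ages, J), dummies(cohorts, n_cohorts)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
age_effects = coef[:J] - coef[0]              # age effects relative to first age

# Adding time dummies: two columns are lost because each dummy set sums to
# one, and at least one more because t = s + j makes linear trends collinear.
X_apc = np.hstack([X, dummies(years, T)])
deficiency = X_apc.shape[1] - np.linalg.matrix_rank(X_apc)
print("age effects:", np.round(age_effects, 3))
print("rank deficiency with age, cohort and time dummies:", deficiency)
```

Only differences in age effects are pinned down in this sketch, which is why the estimates are reported relative to the first age group; the paper's own scaling of the age effects is described in its footnote 8.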
[9] See, for example, Hanoch and Honig (1985), Deaton and Paxson (1994) and Ameriks and Zeldes (2000).

[10] Each regression has $J \times T$ dependent variables regressed on $T$ time dummies and $J$ age dummies. This regression has $J$ fewer regressors than the regression incorporating cohort effects.

[11] Formally, this normalization requires that $\frac{1}{T}\sum_{t=1}^{T} \tilde{\gamma}_t = 0$ and $\frac{1}{T}\sum_{t=1}^{T} \tilde{\gamma}_t \, t = 0$. Appendix A provides more details on how we carry out this estimation.

2.4 Related Empirical Work

Our empirical work is related to previous work both at a substantive and a methodological level. At a substantive level, labor economists have examined patterns in mean earnings and measures of earnings dispersion and skewness at least since the work of Mincer (1958, 1974), where the focus was on cross-section data. A common finding from cross-section data is that mean earnings is hump-shaped with age and that measures of earnings dispersion tend to increase with age. A number of studies (e.g. Creedy and Hart (1979), Shorrocks (1980), Deaton and Paxson (1994), Storesletten et al. (2001)) have examined the pattern of earnings dispersion in cohort or repeated cross-section data and have found that dispersion tends to increase with age.[12] Schultz (1975), Smith and Welch (1979) and Dooley and Gottschalk (1984) present evidence that dispersion profiles are U-shaped in that a measure of dispersion decreases early in the life cycle and then later increases with age. We find a slight U-shape in the dispersion profile when dispersion is measured by the Gini coefficient.

At a methodological level, our work and a number of the studies cited above go beyond the early work based on a single cross-section. In particular, these studies separate age effects from cohort and/or time effects using panel data or repeated cross-sections. For example, Deaton and Paxson (1994) focus on how the variance of log earnings and the variance of log consumption in household-level data evolve over the life cycle.
Their main results are based on regressing the variance of log earnings of a cohort on age and cohort dummies. They use the estimated age coefficients to highlight the effect of aging. The methodology that we employ is broadly similar. However, since we are interested in several earnings statistics there is the issue that if we were to employ this procedure on each separate statistic of interest then age and cohort effects would be extracted in a different way for each statistic. Our proposed solution is to employ the same procedure directly on the percentiles of the age and cohort specific earnings distributions. This procedure produces the age effects graphed in Figure 2. Using Figure 2, one can calculate the resulting age effects for any statistic of interest, knowing that cohort and/or time effects have been extracted in a consistent way.

[12] Creedy and Hart (1979) and Shorrocks (1980) use individual-level data, whereas Deaton and Paxson (1994) and Storesletten et al. (2001) use household-level data.

3 Human Capital Theory

An agent maximizes the present value of earnings over the working lifetime by dividing available time between market work and human capital production.[13] This present value is given in the decision problem below, where $r$ is a real interest rate and earnings in a period equal the product of the rental rate of human capital $w_j$, the agent's human capital $h_j$ and the time spent in market work $(1 - l_j)$. The stock of human capital increases when human capital production offsets the depreciation of current human capital. Human capital production $f(h_j, l_j, a)$ depends on an agent's learning ability $a$, human capital $h_j$ and the fraction of available time $l_j$ put into human capital production. Learning ability is fixed at birth and thus does not change over time.

$$\max \sum_{j=1}^{J} \frac{w_j h_j (1 - l_j)}{(1 + r)^{j-1}}$$

subject to $l_j \in [0, 1]$, $h_{j+1} = h_j(1 - \delta) + f(h_j, l_j, a)$.

We formulate this decision problem in the language of dynamic programming.
The value function $V_j(h; a)$ gives the maximum present value of earnings at age $j$ from state $h$ when learning ability is $a$. The value function is set to zero after the last period of life (i.e. $V_{J+1}(h; a) = 0$). Solutions to this problem are given by optimal decision rules $h_j(h; a)$ and $l_j(h; a)$ which describe the optimal choice of human capital carried to the next period and the fraction of time spent in human capital production as functions of age $j$, human capital $h$ and learning ability $a$.

$$V_j(h; a) = \max_{l, h'} \; w_j h (1 - l) + (1 + r)^{-1} V_{j+1}(h'; a)$$

subject to $l \in [0, 1]$, $h' = h(1 - \delta) + f(h, l, a)$.

We focus on a specific version of the model described above that was first analyzed by Ben-Porath (1967). In this model, the human capital production function is given by $f(h, l, a) = a(hl)^{\alpha}$. Proposition 1 below presents key results for this model.

[13] We note that utility maximization implies present value earnings maximization in the absence of a labor-leisure decision and liquidity constraints. Hence, nothing is lost for the study of human capital accumulation and the implied earnings dynamics if one abstracts from consumption and asset choice over the life-cycle.

Proposition 1: Assume $f(h, l, a) = a(hl)^{\alpha}$, $\alpha \in (0, 1)$, the depreciation rate $\delta \in [0, 1)$, the rental rate equals $w_j = (1 + g)^{j-1}$ and the gross interest rate $(1 + r)$ is strictly positive. Then

(i) $V_j(h; a)$ is continuous and increasing in $h$ and $a$, is concave in $h$ and $h_j(h; a)$ is single-valued.

(ii) If in addition $a A_j(a)^{\alpha} + (1 - \delta) A_j(a) \geq A_{j+1}(a)$, then the optimal decision rules are as follows:

$$h_j(h; a) = \begin{cases} a A_j(a)^{\alpha} + (1 - \delta) h & \text{for } h \geq A_j(a) \\ a h^{\alpha} + (1 - \delta) h & \text{for } h \leq A_j(a) \end{cases}$$

$$l_j(h; a) = \begin{cases} A_j(a)/h & \text{for } h \geq A_j(a) \\ 1 & \text{for } h \leq A_j(a) \end{cases}$$

$$A_j(a) \equiv \left( \frac{a \alpha (1 + g)}{(1 + r)} \right)^{\frac{1}{1-\alpha}} \left( \sum_{k=0}^{J-j} \left[ \frac{(1 + g)(1 - \delta)}{1 + r} \right]^{k} \right)^{\frac{1}{1-\alpha}}$$

Proof: See the Appendix.

We now comment on the implications of Proposition 1.
First, the fact that $V_j(h; a)$ is concave in human capital means that each period the decision problem is a concave programming problem. Thus, standard techniques can be used to compute solutions regardless of any further restrictions on the parameters of the model.

Second, the optimal decision rule for human capital says that, holding $a$ fixed, human capital production in a period is the same for agents in a cohort as long as current human capital is above a cutoff level $A_j(a)$. Thus, time in human capital production is inversely related to the current level of human capital. Agents with human capital below the cutoff level $A_j(a)$ spend all available time in human capital production. The parameters of the model are restricted in Proposition 1(ii) to get a simple, closed-form solution. These restrictions amount to the assumption that once an agent stops full-time schooling (i.e. current human capital is above the cutoff level) then the agent never returns to full-time schooling (i.e. future human capital remains above future cutoff levels). The parameter values used in this paper satisfy these restrictions.

Third, the optimal decision rule puts a number of restrictions on human capital and earnings distribution. Notice that if all agents within an age group have the same learning ability and have positive earnings, then as agents age, human capital dispersion within the cohort must decrease. More precisely, the Lorenz curves for human capital can be ordered in the sense that the Lorenz curve for age $j$ lies strictly below the Lorenz curve for age $j + 1$ and so on. This is easy to see from plotting the decision rule on a 45 degree line diagram since agents with the lower human capital have higher human capital growth rates. Later on we will present a parallel argument that shows that the Lorenz curve for earnings can also be ordered.
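As an illustration of the remark that standard techniques can be used to compute solutions, the following sketch solves the dynamic program by backward induction on a grid, starting from $V_{J+1} = 0$. It is a minimal numerical sketch and not the authors' procedure (which is described in their Appendix): the function name `solve_ben_porath`, the grid bounds, the grid sizes and the ability value used in the example call are all assumptions made here, and the sketch does not use the closed form of Proposition 1.

```python
import numpy as np

def solve_ben_porath(a, alpha=0.7, delta=0.0114, g=0.0014, r=0.04, J=39,
                     h_grid=None, n_l=201):
    """Backward induction on V_j(h; a) with f(h, l, a) = a * (h * l)**alpha.

    Returns the human capital grid, the time-allocation policy l_j(h; a)
    (row j-1 holds age j) and the age-1 value function on the grid.
    """
    if h_grid is None:
        h_grid = np.linspace(1.0, 400.0, 400)    # hypothetical grid for h
    l_grid = np.linspace(0.0, 1.0, n_l)          # candidate time shares
    V_next = np.zeros_like(h_grid)               # terminal condition V_{J+1} = 0
    policy_l = np.zeros((J, h_grid.size))
    h = h_grid[:, None]
    l = l_grid[None, :]
    for j in range(J, 0, -1):
        w = (1.0 + g) ** (j - 1)                 # rental rate w_j
        h_next = h * (1.0 - delta) + a * (h * l) ** alpha
        # Linear interpolation of V_{j+1}; np.interp holds the end values
        # outside the grid, so the grid should cover the relevant range.
        cont = np.interp(h_next.ravel(), h_grid, V_next).reshape(h_next.shape)
        values = w * h * (1.0 - l) + cont / (1.0 + r)
        policy_l[j - 1] = l_grid[values.argmax(axis=1)]
        V_next = values.max(axis=1)
    return h_grid, policy_l, V_next

# Example call with an assumed learning ability a = 0.2.
h_grid, policy_l, V1 = solve_ben_porath(a=0.2)
print("time in human capital production at age 1, lowest h:", policy_l[0, 0])
print("time in human capital production at age 1, highest h:", policy_l[0, -1])
```

A grid search over the time share is crude but needs no concavity or interior-solution assumptions; a finer grid or a continuous optimizer would be a natural refinement.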
The fact that decision rule $h_j(h; a)$ is increasing in both human capital and learning ability has implications for the identity of high and low earners. In particular, at the end of the working life cycle the agents who are high earners are those who started off with high initial human capital and/or ability. This is true since at the end of the life cycle earnings are proportional to human capital. In addition, the fact that $h_j(h; a)$ increases in both components means that matching observed earnings dispersion at the end of the working life cycle requires sufficient dispersion in human capital and ability at the beginning of the life cycle. This is key for this paper as it focuses on characterizing the nature of initial agent heterogeneity that is critical for replicating observed earnings distribution dynamics.

4 Parameter Values

The findings of this paper are based on the parameter values indicated in Table 1. The time period in the model is a year. An agent's working lifetime is taken to be either 39 or 49 model periods, which corresponds to a real life age of 20 to 58 and 10 to 58 respectively. These two values allow us to explore different views about when the human capital accumulation mechanism highlighted by the model begins. The real interest rate is set to 4 percent. The rental rate of human capital equals $w_j = (1 + g)^{j-1}$ and the growth rate is set to $g = .0014$. This growth rate equals the average growth rate in average real earnings over the period 1968-92 in our PSID sample.[14] Within the model the growth rate of the rental rate equals the growth rate of average earnings, when rental growth and population growth are constant and when the initial distribution of human capital and ability is time invariant.

[14] The growth rate of average wages (e.g. total labor earnings divided by total work hours) over 1968-92 in our PSID sample equals .0017.

Given the growth in the rental rate, we set the depreciation rate to $\delta = 0.0114$ so that the model produces the rate of decrease of average real earnings at the end of the working life cycle documented in Figure 1.[15] The model implies that at the end of the life cycle negligible time is allocated to producing new human capital and, thus, the gross earnings growth rate approximately equals $(1 + g)(1 - \delta)$. With $g = .0014$ and $\delta = .0114$ this gives $(1.0014)(0.9886) \approx 0.990$, an earnings growth rate of about $-0.01$. When we choose the depreciation rate on this basis the value lies in the middle of the estimates in the literature surveyed by Browning, Hansen, and Heckman (1999).

Estimates of the elasticity parameter $\alpha$ of the human capital production function are surveyed by Browning et al. (1999). These estimates range from 0.5 to almost 1.0. We note that this literature estimates $\alpha$ so that the earnings profile produced by one agent in the model best fits the earnings data. Thus, the maintained assumption is that everyone is identical at birth so that the initial distribution of learning ability and human capital across agents is a point mass.[16] We note that this initial distribution is unrestricted by the theory and therefore treat it as a free parameter in our work. Thus, we remain agnostic about the value of $\alpha$ and assess the model for values between 0.5 and 1.0.

Table 1: Parameter Values

Model Periods          J = 39, 49
Interest Rate          r = .04
Rental Growth Rate     g = .0014
Depreciation Rate      δ = .0114
Production Function    α ∈ [0.5, 1.0)

5 Findings

5.1 Earnings Distribution Dynamics

Earnings distribution dynamics implied by the model are determined in two steps. First, we compute the optimal decision rule for human capital for the parameters described in Table 1. Second, we choose the initial distribution of the state variable to best replicate the properties of US data documented in Figure 1. The Appendix describes how these steps are carried out.

We consider both parametric and non-parametric approaches for choosing the initial distribution.
In the parametric approach this distribution is restricted to be jointly log-normally distributed. This class of distributions is characterized by 5 parameters. In the non-parametric approach, we allow the initial distribution to be any histogram on a rectangular grid in the space of human capital and learning ability. In practice, this grid is defined by 20 points in both the human capital and ability dimensions and thus, there are a total of 400 bins used to define the possible histograms. In both approaches we search over the vector of parameters that characterize these distributions so as to minimize the distance between the model and data statistics for mean earnings, dispersion and skewness.[17]

[15] We use a rate of growth in earnings at the end of the life cycle equal to -0.01. The growth rate in mean earnings at the end of the life-cycle from Fig. 1 is -0.0107 and -0.0078 for age groups 55-58 and 50-58 respectively.

[16] Heckman et al. (1998) allow for agent heterogeneity. They estimate model parameters so that earnings of one agent in the model best match earnings data for individuals sorted by a measure of ability and by whether or not they went to college.

The results are presented in Figures 4 and 5 for the parametric and non-parametric case under the assumption that human capital accumulation starts at a real life age of 10 and 20, respectively. Note that the model implications are very similar for these two different starting ages. For a better visual presentation, we graph in all cases results for only the central value of $\alpha = 0.7$. We emphasize that similar quantitative patterns emerge for all values of $\alpha$ between .5 and .9. These figures demonstrate that the model is able to replicate the qualitative properties of the US earnings distribution dynamics presented in Figure 1 both when the initial distribution is chosen parametrically and non-parametrically.
Moreover, the results for the nonparametric case are quite striking: the model replicates to a surprising degree the quantitative features of US earnings distribution dynamics.

[Insert Figure 4 (a-c) Here]

[Insert Figure 5 (a-c) Here]

As a measure of the goodness of fit, we present in Table 2 the average (percentage) deviation, in absolute terms, between the model implied statistics and the data.[18] By this measure, on average the model implied statistics differ from the data by 2.5% to 3.8% in the non-parametric case for different values of the elasticity parameter of the production function. In the parametric case, the fit is naturally not as good; in this case the model differs from the data by 5% to 7.5%. Graphically, the parametric case produces too much earnings skewness in each age group. Nonetheless, a parsimonious representation of the initial distribution can go a long way towards reproducing the dynamics of the US age-earnings distribution.[19]

[17] More precisely, we find the parameter vector $\gamma$ characterizing the initial distribution that solves the minimization problem below, where $m_j$, $d_j$, $s_j$ are the statistics of means, dispersion and inverse skewness constructed from the PSID data, and $m_j(\gamma)$, $d_j(\gamma)$, $s_j(\gamma)$ are the corresponding model statistics.

$$\min_{\gamma} \sum_{j=1}^{J} \left( [\log(m_j / m_j(\gamma))]^2 + [\log(d_j / d_j(\gamma))]^2 + [\log(s_j / s_j(\gamma))]^2 \right)$$

[18] The goodness of fit measure is $\left[ \sum_{j=1}^{J} |\log(m_j / m_j(\gamma))| + |\log(d_j / d_j(\gamma))| + |\log(s_j / s_j(\gamma))| \right] / (3J)$.

Table 2: Mean Absolute Deviation (%)

Case              α = 0.5   α = 0.6   α = 0.7   α = 0.8   α = 0.9
Panel A: Accumulation starts at Age 10
Non-Parametric      3.5       3.2       2.6       2.5       2.8
Parametric          7.5       6.4       5.9       5.2       6.2
Panel B: Accumulation starts at Age 20
Non-Parametric      3.1       3.5       2.8       3.9       3.8
Parametric          6.8       7.0       5.2       5.0       6.4

To close this section, we note that the benchmark human capital model is also successful in an alternative dimension.
Specifically, features of the cross-section earnings distribution implied by the model are roughly in line with the corresponding features in cross-section data. We construct the cross-section earnings distribution implied by the data using the cohort-specific earnings percentiles in Figure 2 together with the assumption that the population growth rate is 1%. The cross-sectional earnings distribution has a Gini coefficient of 0.33, a skewness measure of 1.16 and a fraction of earnings in the upper 20%, 10%, 5% and 1% of 40.2%, 25.1%, 15.5% and 4.7% respectively. The model for α = 0.7 in the non-parametric case implies a cross-sectional earnings distribution with a Gini coefficient of 0.327, a skewness measure of 1.18, with corresponding fractions of earnings in the upper tail of 40.9%, 27.0%, 17.5% and 6.1%.

19 A natural question is whether one can always exactly match the age-earnings dynamics in Figure 1 or 2, given that the theory does not restrict the initial distribution and thus effectively offers an infinite number of free parameters. The answer is no. In section 5.3 we show, for the case where all agents are born with the same learning ability but different human capital levels, that the model can never match even the qualitative properties of the age-earnings dynamics in the data, despite the fact that the class of initial distributions considered is infinite dimensional. Intuitively, one can match any distribution of earnings in the terminal period J with an unrestricted distribution of initial conditions. Furthermore, matching the terminal earnings distribution pins down a unique initial distribution of human capital, given the monotonicity of the optimal decision rule of human capital, since this initial distribution is over a single variable (i.e., human capital). However, to match the patterns in Figure 1 or 2 one needs to match the terminal distribution and the distribution in all previous periods.
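The cross-sectional statistics reported at the end of Section 5.1 can be illustrated with a short sketch. It is not the authors' code: the function name `cross_section_stats`, its input layout (one array of equally weighted earnings draws per age) and the interpolation of the Lorenz curve are assumptions made for illustration. The only ingredient taken from the text is that older cohorts are down-weighted by the 1% population growth rate.

```python
import numpy as np

def cross_section_stats(earnings_by_age, pop_growth=0.01):
    """Gini, mean/median skewness and top earnings shares of a cross-section.

    earnings_by_age: list of 1-D arrays, youngest age first (hypothetical
    layout); each array holds equally weighted earnings draws for that age.
    """
    values, weights = [], []
    for age_index, e in enumerate(earnings_by_age):
        e = np.asarray(e, dtype=float)
        # With population growth n, a cohort one year older is smaller
        # by the factor 1 / (1 + n).
        cohort_size = (1.0 + pop_growth) ** (-age_index)
        values.append(e)
        weights.append(np.full(e.size, cohort_size / e.size))
    x = np.concatenate(values)
    w = np.concatenate(weights)
    order = np.argsort(x)
    x, w = x[order], w[order]
    w = w / w.sum()

    # Lorenz curve: cumulative population share against cumulative earnings share.
    pop = np.concatenate(([0.0], np.cumsum(w)))
    earn = np.concatenate(([0.0], np.cumsum(w * x)))
    earn = earn / earn[-1]
    gini = 1.0 - np.sum((pop[1:] - pop[:-1]) * (earn[1:] + earn[:-1]))

    mean = np.sum(w * x)
    median = x[np.searchsorted(pop[1:], 0.5)]  # weighted median
    top_shares = {p: 1.0 - np.interp(1.0 - p, pop, earn)
                  for p in (0.20, 0.10, 0.05, 0.01)}
    return gini, mean / median, top_shares
```

Calling `cross_section_stats` on simulated or empirical age-specific earnings returns the Gini coefficient, the mean/median skewness measure and a dictionary of top-tail earnings shares, which are the three kinds of statistics compared in the text.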
5.2 Properties of Initial Distributions

Tables 3 and 4 characterize properties of the initial distributions that produce the earnings distribution implications highlighted in Figures 4 and 5. Several regularities are apparent. First, the properties of means, dispersion, skewness and correlation in Table 3 for the non-parametric case are similar to those in Table 4 for the parametric case. Thus, the economic content of what the model and the data in Figure 1 impose on the initial distribution appears not to be too sensitive to whether or not one restricts this initial distribution in a parsimonious way. Second, initial human capital and learning ability are positively correlated when the human capital accumulation process articulated by the model starts at age 10 but much more highly correlated when the process starts at age 20. This finding is implied by the dynamics of the model. In particular, distributions which at age 10 have low correlation induce more highly correlated distributions in each successive period as agents age. This occurs, according to Proposition 1, since in each period high ability agents produce more human capital than low ability agents, holding initial human capital at the beginning of life equal. Third, mean learning ability declines as the curvature parameter α increases, while the opposite is true for mean initial human capital. To gain intuition, note that for given learning ability and initial human capital a higher value of α lowers earnings early in life and raises earnings later in life, in effect rotating individual age-earnings profiles counter-clockwise. This follows (see Proposition 1) since as α increases time spent working early in life decreases whereas end-of-life human capital increases. Raising mean initial human capital and lowering mean learning ability serves to rotate the age-earnings profiles clockwise to counteract the effect of increasing α.
Note also that when accumulation starts at age 10, the model implies that net human capital accumulation for a cohort is positive over the life cycle.20

Finally, a prominent feature in Tables 3 and 4 is that dispersion in learning ability declines as α increases. To understand this result, focus on dispersion in earnings, and hence human capital, at the end of the life cycle. Terminal human capital equals initial human capital after depreciation plus an amount due to the production of human capital over the life cycle. One can show that dispersion in the second component increases as α increases. Thus, to replicate the pattern in Figure 1, a reduction in learning ability dispersion helps counteract this increase in terminal human capital dispersion.

20 To see this point note that mean earnings at age 58 equals 100 and the rental rate of human capital equals $w_j = 1.0014^{j-1}$. Thus, mean human capital must be slightly less than 100 at age 58 to match the earnings data at that age.

Table 3: Ability and Human Capital at Birth (Non-Parametric Case)

Statistic                   α = 0.5   α = 0.6   α = 0.7   α = 0.8   α = 0.9
Panel A: Accumulation starts at Age 10
Mean (a)                     0.466     0.319     0.209     0.139     0.087
Coef. of Variation (a)       0.601     0.463     0.358     0.243     0.212
Skewness (a)                 1.303     1.190     1.183     1.168     1.103
Mean (h_1)                   69.6      71.4      74.9      76.0      83.5
Coef. of Variation (h_1)     0.456     0.453     0.422     0.397     0.261
Skewness (h_1)               1.152     1.146     1.151     1.155     1.142
Correlation (a, h_1)         0.10      0.205     0.305     0.397     0.418
Panel B: Accumulation starts at Age 20
Mean (a)                     0.453     0.320     0.210     0.134     0.089
Coef. of Variation (a)       0.669     0.504     0.365     0.324     0.168
Skewness (a)                 1.251     1.188     1.147     1.131     1.111
Mean (h_1)                   86.8      88.1      93.4      94.5      99.6
Coef. of Variation (h_1)     0.475     0.486     0.510     0.457     0.501
Skewness (h_1)               1.148     1.163     1.167     1.135     1.124
Correlation (a, h_1)         0.621     0.689     0.781     0.792     0.741

Table 4: Ability and Human Capital at Birth (Parametric Case)

Statistic                   α = 0.5   α = 0.6   α = 0.7   α = 0.8   α = 0.9
Panel A: Accumulation starts at Age 10
Mean (a)                     0.499     0.322     0.207     0.139     0.089
Coef. of Variation (a)       0.514     0.436     0.353     0.235     0.198
Skewness (a)                 1.125     1.092     1.061     1.027     1.010
Mean (h_1)                   64.0      69.2      74.7      75.1      78.6
Coef. of Variation (h_1)     0.454     0.453     0.434     0.403     0.184
Skewness (h_1)               1.100     1.100     1.090     1.077     1.071
Correlation (a, h_1)         0.070     0.145     0.171     0.333     0.351
Panel B: Accumulation starts at Age 20
Mean (a)                     0.467     0.321     0.209     0.136     0.088
Coef. of Variation (a)       0.613     0.474     0.347     0.257     0.158
Skewness (a)                 1.191     1.109     1.058     1.033     1.012
Mean (h_1)                   86.7      89.5      92.3      96.6      100.1
Coef. of Variation (h_1)     0.427     0.439     0.481     0.468     0.459
Skewness (h_1)               1.088     1.092     1.109     1.105     1.100
Correlation (a, h_1)         0.600     0.621     0.781     0.792     0.796

5.3 Importance of Ability and Human Capital Differences

We now provide insights on the importance of ability differences versus initial human capital differences in producing the results in Figures 4 and 5. We concentrate on two extreme cases: agents differ initially only in human capital or only in ability. The analysis finds that learning ability differences are essential in reproducing the facts we focus on. However, differences in initial human capital implied by the joint distribution are also important; without them, the model cannot replicate the facts in a satisfactory way.

Human Capital Differences

We now argue that the model with differences only in human capital across agents produces quite generally a counterfactual implication. First, recall from section 3 that when all agents in a cohort have the same learning ability, human capital dispersion must decrease over time.
Essentially, the result was due to the fact that for interior solutions all agents within a cohort produce the same amount of human capital. Thus, human capital growth rates are largest for those with the smallest human capital levels. This then implies that any amount of human capital dispersion at the end of the life cycle had to be due to even greater dispersion in human capital at the beginning of the life cycle.

Since human capital is not directly observable, the implication above may seem of secondary importance. We stress now a related implication for earnings dispersion which is fundamentally at odds with the data displayed in Figure 1. Within the model, earnings in period j equal

$$e_j = w_j \left( h_j - A_j(a) \right) = w_j \left( h_{j-1}(1-\delta) + a A_{j-1}(a)^{\alpha} - A_j(a) \right).$$

This follows from the case of interior solutions in Proposition 1 after substituting for $h_j$ using the optimal decision rule for human capital accumulation. Using this result, we can write the growth rate of an individual's earnings as follows:

$$\frac{e_j}{e_{j-1}} = \frac{w_j}{w_{j-1}} \left[ \frac{h_{j-1}(1-\delta) + a A_{j-1}(a)^{\alpha} - A_j(a)}{h_{j-1} - A_{j-1}(a)} \right]$$

Differentiating this equation with respect to human capital, it is straightforward to establish that earnings growth falls as human capital, and thus earnings, increase. This then implies, as for the case of human capital, that the Lorenz curve for earnings within a cohort can be ordered: the one for a given age j lies below that for age j + 1 and so on. Thus, the model implies that any measure of earnings dispersion that is consistent with the Lorenz ordering (e.g. the Gini coefficient) must decrease monotonically with age.21 This prediction is contradicted by the earnings dispersion patterns documented in Figure 1. We then conclude that the model with only differences in initial human capital cannot replicate the facts. Thus, differences in learning ability must play a key role in generating the right dynamics of the earnings distribution.
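The mechanism behind this result can be checked numerically with a minimal sketch. It assumes interior solutions and a common learning ability, so that every agent adds the same amount to human capital each period. The initial human capital levels, the per-period production amounts and the depreciation rate below are hypothetical illustration values, not the paper's calibration.

```python
import numpy as np

def gini(x):
    """Gini coefficient of a sample of positive values."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n

def human_capital_path(h0, production, delta):
    """Iterate h' = (1 - delta) * h + c, with the same c for every agent."""
    path = [np.asarray(h0, dtype=float)]
    for c in production:
        path.append((1.0 - delta) * path[-1] + c)
    return path

# Hypothetical cohort: dispersed initial human capital, identical ability.
h0 = np.array([30.0, 50.0, 70.0, 90.0, 150.0])
production = [5.0] * 10   # common amount produced each period (assumed)
path = human_capital_path(h0, production, delta=0.02)
ginis = [gini(h) for h in path]
```

Because depreciation rescales everyone proportionally while production adds the same positive amount to each agent, the Gini coefficient in `ginis` falls from one period to the next, which is the human capital analogue of the earnings result stated above.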
Learning Ability Differences

We now consider the polar case in which all agents have the same initial human capital but differ in learning ability. To explore the model implications in this case, we place a grid on values of learning ability, and search for the distribution of learning ability and the common, fixed value of initial human capital that best reproduce the facts presented in Figure 1. Our findings are presented in Figure 6 where the model begins to operate when agents are at a real-life age of 10. Starting the model later produces even more strongly counterfactual implications.

The model in this case generates a much more pronounced U-pattern for earnings dispersion than is present in the data. To understand why this occurs recall that all individuals start life with the same level of human capital. Optimal accumulation then dictates that early in the life cycle agents with high learning ability devote most or all available time to accumulating human capital, and thus their earnings are lower than those of their low ability counterparts. The bottom of the U-shape occurs where earnings of high ability agents overtake those of lower ability agents. After this point, earnings dispersion increases as high ability agents have higher levels of human capital as they age, and devote more and more time to market work.

21 This result also holds for the case of non-interior solutions since in the model the fraction of agents with zero earnings declines monotonically over time.

[Insert Figure 6 a-c Here]

5.4 Persistence in Individual Earnings

So far we have looked at how the earnings distribution changes as agents age. However, it is possible that different theoretical models may all be able to replicate the patterns of means, dispersion and skewness in US cohort data, but differ in their implications for earnings persistence.
The latter is a topic that has attracted considerable attention in the labor, consumption, and income distribution literatures and for which the benchmark model has strong implications. In addition, it is of independent interest to investigate the performance of the benchmark model in terms of a number of facts that we did not force it to match. Such an exercise is a useful step in evaluating the benchmark model as a quantitative theory of the earnings distribution and its dynamics.

We now characterize the extent to which several measures of persistence in the model are consistent or not with the corresponding measures from US data. We consider three measures of persistence in cohort data: (1) the correlation of individual earnings levels across periods, (2) the correlation of individual earnings growth rates across periods and (3) the growth rates of group earnings across periods. The first two measures are standard in describing the persistence of individual earnings. One key motivation for focusing on the last measure is the following. If differences in learning ability are key in explaining the patterns of Fig. 1 as we argued earlier, then the model implies that in the middle of the life cycle individuals with high earnings will tend to be those with high learning ability, while the opposite would be true for individuals of relatively low earnings. Since this argument suggests that the slopes of earnings profiles are increasing in learning ability, the group of individuals with relatively high earnings may also display high earnings growth rates relative to the low earnings group. We argue below that on this last point the implications of the model are at odds with the data.

Table 5 shows the results for various age groups within the model, when the initial distribution of human capital and ability is selected using the non-parametric methodology. For ease of exposition, we report results only for the case when accumulation starts at age 10 and α = 0.7.
To report growth rates of group earnings, we divide individuals according to their labor earnings into three groups (bottom 20%, central 20% and upper 20%) at ages 40 and 45, and compute future growth rates over 5 years. Table 5 shows that the model implies high persistence in earnings levels as well as growth rates. Table 5 also clearly illustrates that individuals belonging to high earnings groups at ages 40 and 45 show higher future growth rates than their low earnings counterparts.

Table 5: Persistence in Individual Earnings (α = 0.7)

Statistic                        Age (j) = 45   Age (j) = 40
Panel A: Correlation - Levels
Correlation(E_j, E_{j-1})           0.9999         0.9997
Correlation(E_j, E_{j-5})           0.9966         0.9854
Correlation(E_j, E_{j-10})          0.9679         0.8671
Panel B: Correlation - Growth Rates
Correlation(z_j, z_{j-1})           0.9995         0.9994
Correlation(z_j, z_{j-5})           0.9960         0.9652
Correlation(z_j, z_{j-10})          0.9750         0.5229
Panel C: 5-year Growth Rates (%)
Top 20%                              2.71           8.02
Central 20%                         -1.59           0.45
Bottom 20%                          -2.34          -0.64

E_j and z_j = log(E_j / E_{j-1}) denote earnings and earnings growth rates, respectively.

We now compare the results in Table 5 with estimates from US data. The correlation of earnings levels has been examined in US data by Parsons (1978) and Hyslop (2001) among others. They find that earnings among US males are positively correlated for all horizons considered and that the correlation typically falls as the horizon increases. Hyslop finds that the average correlation is 0.83 for a one-year horizon and 0.59 for a six-year horizon. Parsons finds that correlations are typically higher for older age groups. These results are qualitatively consistent with those from the human capital model.

A different picture emerges for the correlation of growth rates. Abowd and Card (1989) estimate the correlation in earnings growth rates for US males. They find that the average correlation of earnings growth rates one year apart is negative and equal to about −0.34, and close to zero when the growth rates are more than one year apart.
Baker (1997) reports similar findings. Storesletten, Telmer and Yaron (2001) report high but stationary persistence in log-earnings, which implies slightly lower negative autocorrelations of growth rates one year apart. Processes with similar dynamics have also been estimated by MaCurdy (1982) and Hubbard, Skinner and Zeldes (1994). The results estimated from the data are thus clearly inconsistent with those implied by the model. These results are suggestive of a key ingredient present in stochastic models of the earnings distribution, namely, shocks that cause earnings to be mean reverting.

To study this issue in more detail we examine earnings growth rates at different percentiles of the earnings distribution. We are not aware of existing results that characterize empirically the future growth rates of high earnings groups vs. low earnings groups that could be compared to the results presented in Table 5. We therefore carry out such an analysis using US male earnings data from the PSID, and proceed in two conceptually different ways.22

Figure 7 displays 5-year growth rates in group means for a variety of initial ages. Figure 7a provides the growth rates for each group (those in the top, middle, and bottom 20% respectively) as of their age plus 5 years. Hence, this figure shows the growth rates of those who ended up in the top, middle, and bottom 20%; we label this case the “backward case”. Not surprisingly, those in the top 20% had large growth rates and conversely those at the bottom had the lowest growth rates. Figure 7b provides the 5-year growth rates for these groups constructed at the initial age. Hence, this figure shows the growth rates of those who started at these respective percentiles; we label this case the “forward case”. These growth rates display a reverse order relative to both Figure 7a and the results in Table 5.
That is, those who started at the bottom have the largest growth rates, while those at the top have the lowest growth rates. Although some of this evidence can be driven by measurement errors, the overall message is that there is some important component of mean reversion in individual earnings, even at prime-age earnings years. Put differently, these observations constitute another indication of the presence of sizeable and idiosyncratic shocks that lead to mean reversion in earnings. The model we analyze clearly abstracts from such shocks and as a result, is not consistent with this last evidence.

22 See the Appendix for precise details.

[Insert Figure 7 a-b Here]

6 Conclusion

We assess the degree to which a widely used human capital model is able to replicate the age dynamics of the US earnings distribution documented in Figure 1. We find that the model can account quite well for these age-earnings dynamics. In addition, we find that the model produces a cross-sectional earnings distribution closely resembling that implied by the age-earnings dynamics documented in Figure 2. Our findings indicate that differences in learning ability across agents are key. In particular, in the model high ability agents have more steeply sloped age-earnings profiles than low ability agents. These differences in earnings profiles in turn produce the increases in earnings dispersion and skewness with age that are documented in Figures 1 and 2. These findings are robust to the age at which the human capital accumulation mechanism described by the model begins and to different values of the elasticity parameter of the human capital production function. We also find that, despite its relative success in replicating the cohort facts, the model is inconsistent with evidence related to the persistence of individual earnings.

We mention two areas in which future work seems promising.
The first has to do with the fact that the distribution of agents by initial human capital and ability is unrestricted by the model. Models of the family can provide restrictions on this initial distribution. For this class of models, an assessment of the ability to replicate the facts of age-earnings dynamics and intergenerational earnings correlations is a natural next step.

The second area for future work deals with the fact that the model examined here abstracts from many seemingly important features. Three such features are the absence of a leisure decision, an occupational choice decision and shocks that make human capital risky. We comment on this last feature. First, allowing for risky human capital would be one way of integrating deeper foundations for earnings risk into the standard consumption-savings problem considered by the literature on the life-cycle, permanent-income hypothesis. This literature has examined in detail the determinants of consumption and financial asset holdings over the life cycle, but no comparable effort has been put into investigating the accumulation of human capital. Second, while there seems to be agreement that human capital is risky there is relatively little work that analyzes different sources of risk and then determines their quantitative importance.23 Two interesting questions for a theory with risky human capital are (i) can such a model account for both the distributional dynamics of earnings and consumption over the life cycle? and (ii) what fraction of the dispersion in lifetime earnings can be accounted for by initial conditions versus shocks? We plan to explore these questions in future work.

23 Within a human capital model, shocks can no longer be modeled as exogenous shocks to earnings. Instead, they must be modeled at a deeper level as shocks to the depreciation of human capital, to learning ability, to the employment match, to rental rates and so on.
Each one of these alternatives poses different modeling as well as empirical challenges.

References

Ameriks, J. and S. Zeldes (2000), How do Portfolio Shares Vary with Age?, manuscript.
Andolfatto, D., Gomme, P. and C. Ferrall (2000), Human Capital Theory and the Life-Cycle Pattern of Learning and Earning, Income and Wealth, mimeo.
Ben-Porath, Y. (1967), The Production of Human Capital and the Life Cycle of Earnings, Journal of Political Economy, 75, 352-65.
Browning, M., Hansen, L. and J. Heckman (1999), Micro Data and General Equilibrium Models, in Handbook of Macroeconomics, ed. J.B. Taylor and M. Woodford, (Elsevier Science B.V., Amsterdam).
Card, D. (1999), The Causal Effect of Education on Earnings, in Handbook of Labor Economics, Volume 3, ed. O. Ashenfelter and D. Card.
Creedy, J. and P. Hart (1979), Age and the Distribution of Earnings, Economic Journal, 89, 280-93.
Deaton, A. and C. Paxson (1994), Intertemporal Choice and Inequality, Journal of Political Economy, 102, 437-67.
Dooley and Gottschalk (1984), Earnings Inequality Among Males in the United States: Trends and the Effect of Labor Force Growth, Journal of Political Economy, 92, 59-89.
Haley, W. (1976), Estimation of the Earnings Profile from Optimal Human Capital Accumulation, Econometrica, 44, 1223-38.
Hanoch, G. and M. Honig (1985), “True” Age Profiles of Earnings: Adjusting for Censoring and for Period and Cohort Effects, Review of Economics and Statistics, 67, 383-94.
Heckman, J. (1975), Estimates of a Human Capital Production Function Embedded in a Life Cycle Model of Labor Supply, in Household Production and Consumption, ed. N. Terleckyj, Columbia University Press, New York.
Heckman, J. (1976), A Life-Cycle Model of Earning, Learning and Consumption, Journal of Political Economy, 84, S11-S44.
Heckman, J., Lochner, L. and C.
Taber (1998), Explaining Rising Wage Inequality: Explorations with a Dynamic General Equilibrium Model of Labor Earnings with Heterogeneous Agents, Review of Economic Dynamics, 1, 1-58.
Hyslop, D. (2001), Rising US Earnings Inequality and Family Labor Supply: The Covariance Structure of Intrafamily Earnings, American Economic Review, 91, 755-77.
Hubbard, G. R., J. Skinner, and S. Zeldes (1994), The Importance of Precautionary Motives in Explaining Individual and Aggregate Savings, Carnegie Rochester Conference Series on Public Policy, 40, 59-126.
Keane, M. and Wolpin, K., The Career Decisions of Young Men, Journal of Political Economy, 105(3), 473-522.
Lillard, L. (1977), Inequality: Earnings vs. Human Wealth, American Economic Review, 67, 42-53.
MaCurdy, T. (1982), The Use of Time Series Processes to Model the Error Structure of Earnings in Longitudinal Data Analysis, Journal of Econometrics, 18, 83-118.
Mincer, J. (1958), Investment in Human Capital and Personal Income Distribution, Journal of Political Economy, 66, 281-302.
Mincer, J. (1974), Schooling, Experience and Earnings, Columbia University Press, New York.
Mincer, J. (1997), The Production of Human Capital and the Life Cycle of Earnings: Variations on a Theme, Journal of Labor Economics, 15 (1), Part 2.
Neal, D. and S. Rosen (1999), Theories of the Distribution of Earnings, in Handbook of Income Distribution, ed. A. Atkinson and F. Bourguignon, North Holland Publishers.
Parsons, D. (1978), The Autocorrelation of Earnings, Human Wealth Inequality, and Income Contingent Loans, Quarterly Journal of Economics, 92, 551-69.
Press, W. et al. (1992), Numerical Recipes in FORTRAN, Second Edition, Cambridge University Press.
Rosen, S. (1976), A Theory of Life Earnings, Journal of Political Economy, 84, S45-S67.
Schultz, T. (1975), Long-Term Change in Personal Income Distribution: Theoretical Approaches, Evidence, and Explanations, in The “Inequality” Controversy: Schooling and Distributive Justice, ed. D.
Levine and M. Bane, (New York, Basic Books).
Shorrocks, A. (1980), Income Stability in the United States, in Statics and Dynamics of Income, ed. Klevmarken and Lybeck.
Smith and Welch (1979), Inequality: Race Differences in the Distribution of Earnings, International Economic Review, 20, 515-26.
Stokey, N. and R. Lucas with E. Prescott (1989), Recursive Methods in Economic Dynamics, (Harvard University Press, Cambridge).
Storesletten, K., Telmer, C. and A. Yaron (2001), Consumption and Risk Sharing Over the Life Cycle, NBER Working Paper # 7995.
Weiss, Y. (1986), The Determination of Life-Cycle Earnings, in Handbook of Labor Economics, Volume I, ed. O. Ashenfelter and R. Layard, Elsevier Science Publishers.

A Appendix

A.1 Proposition 1

To prove Proposition 1 it is useful to reformulate the dynamic programming problem by expressing earnings as a function of future human capital and ability. The resulting earnings function is denoted $G(h, h', a; j)$.

$$V_j(h; a) = \max_{h'} \; G(h, h', a; j) + (1+r)^{-1} V_{j+1}(h'; a)$$

$$h' \in \Gamma(h, a) \equiv \left[ h(1-\delta) + f(h, 0, a), \; h(1-\delta) + f(h, 1, a) \right]$$

Proof of Proposition 1:

(i) The existence of the value function follows by repeated application of the Theorem of the Maximum starting in the last period of life. To apply the Theorem of the Maximum, we make use of the continuity of $G(h, h', a; j)$ and the fact that the constraint set is a continuous and compact-valued correspondence. These are easily verified. To show that the value function increases in h and a, note that this holds in the last period since $V_J(h; a) = w_J h$. Backward induction establishes the result for earlier periods using the fact that $G(h, h', a; j)$ increases in h and a.

The concavity of the value function in human capital follows from backward induction by applying repeatedly the argument used in Stokey and Lucas (1989, Thm. 4.8). To apply this argument, we make use of three properties.
First, the graph of the constraint set $\{(h, h') : h \in R^1_+, \; h' \in \Gamma(h, a)\}$ is a convex set for any given ability level a. This follows from the fact that the human capital production function is concave in current human capital. Second, $G(h, h', a; j)$ is jointly concave in $(h, h')$. This can be easily verified. Third, the terminal value function $V_{J+1}(h; a) \equiv 0$ is concave in human capital.

The decision rule $h_j(h; a)$ is single-valued since the objective function is strictly concave and the constraint set, for given (h; a), is convex. The objective function is strictly concave because the value function is concave and because $G(h, h', a; j)$ is strictly concave in $h'$.

(ii) Define $V_j(h; a)$ recursively, given $V_J(h; a) = (1+g)^{J-1} h$, as follows:

$$V_j(h; a) = \left[ (1+g)^{j-1} \sum_{k=0}^{J-j} \left[ \frac{(1+g)(1-\delta)}{(1+r)} \right]^k \right] h + C_j(a) \quad \text{for } h \geq A_j(a)$$

$$V_j(h; a) = \frac{1}{(1+r)} V_{j+1}\left( h(1-\delta) + a h^{\alpha}; a \right) \quad \text{for } h \leq A_j(a)$$

$$C_j(a) \equiv (1+r)^{-1} \left( C_{j+1}(a) + D_{j+1} \, a A_j(a)^{\alpha} \right) - (1+g)^{j-1} A_j(a), \quad \text{where } C_J(a) = 0$$

$$D_j \equiv (1+g)^{j-1} \sum_{k=0}^{J-j} \left[ \frac{(1+g)(1-\delta)}{(1+r)} \right]^k$$

Now verify that the functions $(V_j(h; a), h_j(h; a))$ satisfy Bellman's equation. Verification amounts to checking that $h_j(h; a)$ satisfies Bellman's equation without the max operation and that it achieves the maximum in the right-hand side of Bellman's equation. Since the first part is routine, the proof focuses on the second part.

A sufficient condition for an interior solution is given in the first equation below. The second equation follows from the first after substituting the relevant functions evaluated at $h' = h_j(h; a)$. Here we make use of the assumption on the cutoff values $A_j(a)$ in Prop 1(ii) since we substitute for $V'_{j+1}(h'; a)$ assuming interior solutions obtain in future periods. Rearrangement of the second equation implies that $A_j(a)$ is defined as in Prop 1(ii).
$$-G_2(h, h', a; j) = (1+r)^{-1} V'_{j+1}(h'; a)$$

$$(1+g)^{j-1} \frac{1}{a\alpha} A_j(a)^{1-\alpha} = \frac{(1+g)^{j}}{(1+r)} \sum_{k=0}^{J-j-1} \left[ \frac{(1+g)(1-\delta)}{(1+r)} \right]^k$$

It remains to consider the possibility of a corner solution. The first equation below gives a sufficient condition for a corner solution. The second equation follows from the first after substitution. Since $V_{j+1}$ is concave in human capital, it is clear that $V'_{j+1}$ is bounded below by the derivative above the cutoff human capital level $A_{j+1}(a)$. Thus, from the interior solution case, the second equation holds whenever $h \leq A_j(a)$.

$$-G_2(h, h', a; j) \leq (1+r)^{-1} V'_{j+1}(h'; a)$$

$$(1+g)^{j-1} \frac{1}{a\alpha} h^{1-\alpha} \leq \frac{1}{(1+r)} V'_{j+1}(h_j(h; a); a)$$

A.2 Computation

We sketch the computation algorithm for the non-parametric case.

Step 1: Calculate the optimal decision rule $h_j(h; a)$.
Step 2: Put a grid on learning ability and initial human capital (h, a) and calculate life-cycle profiles of human capital, hours and earnings from these grid points.
Step 3: Find the initial distribution.

To calculate the optimal decision rule in step 1, for any value of learning ability a, we put a non-uniform grid on human capital of 300 points on $[0, h^*]$, where the choice of $h^*$ may be revised depending on the results of step 3. We calculate the optimal decision rule for human capital at gridpoints by solving the dynamic programming problem backward from period $j = J - 1$, given $V_J(h; a) = w_J h$. Since the value function is concave in human capital each period, the dynamic programming problem is a concave programming problem. Golden section search (see Press et al. (1992), ch. 10) is used to calculate $h_j(h; a)$ at gridpoints. To carry this out, we calculate the value function off gridpoints using linear interpolation. Backward recursion on Bellman's equation produces $h_j(h; a)$ for $j = 1, ..., J - 1$.

In step 2 we put a grid of 20 points on $[0, a^*]$ and 20 points on $[0, h^*]$. This implies a total of 400 points (h, a).
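As a cross-check on the numerical decision rule of step 1, the interior first-order condition in A.1 can be rearranged into a closed form for the cutoff, $A_j(a) = [a\alpha \frac{1+g}{1+r} \sum_{k=0}^{J-j-1} q^k]^{1/(1-\alpha)}$ with $q = (1+g)(1-\delta)/(1+r)$. The sketch below evaluates this expression and simulates one earnings profile from it. It is an illustration rather than the authors' algorithm: the function names are hypothetical, the rule "invest min(h, A_j(a))" relies on the interior-solution assumption of Prop 1(ii), and the values used for r and δ in the usage note are placeholders not given in this section.

```python
def investment_cutoff(a, j, J, alpha, g, r, delta):
    """Cutoff A_j(a) implied by the interior first-order condition.

    The geometric sum runs over k = 0, ..., J-j-1, so the cutoff is zero in
    the last period j = J (no reason to invest at the end of life).
    """
    q = (1.0 + g) * (1.0 - delta) / (1.0 + r)
    geometric_sum = sum(q ** k for k in range(J - j))
    return (a * alpha * (1.0 + g) / (1.0 + r) * geometric_sum) ** (1.0 / (1.0 - alpha))

def earnings_profile(h1, a, J, alpha, g, r, delta):
    """Earnings e_j = w_j * (h_j - investment_j) for j = 1, ..., J."""
    h, profile = h1, []
    for j in range(1, J + 1):
        # Interior case: invest A_j(a); corner case: invest all human capital.
        invest = min(h, investment_cutoff(a, j, J, alpha, g, r, delta))
        rental_rate = (1.0 + g) ** (j - 1)          # w_j = (1 + g)^(j-1)
        profile.append(rental_rate * (h - invest))
        h = h * (1.0 - delta) + a * invest ** alpha  # law of motion
    return profile
```

For example, `earnings_profile(70.0, 0.2, J=49, alpha=0.7, g=0.0014, r=0.04, delta=0.01)` produces a 49-period profile for an agent whose initial human capital and ability are close to the age-10 means in Table 3; the rental growth rate 0.0014 is the one stated in footnote 20, while r = 0.04 and δ = 0.01 are assumed purely for illustration.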
Using the decision rule from step 1, we simulate life-cycle profiles of labor earnings from any initial pair (h, a). Since decision rules are computed at gridpoints of human capital holdings, but their values are not restricted to lie on these gridpoints, we use linear interpolation to calculate values off gridpoints.

In step 3 we use the Simplex algorithm, as described by Press et al. (1992, ch. 10), to find the 400 values of the histogram over $[0, h^*] \times [0, a^*]$ that minimize the distance between model and data statistics. For any trial of the vector describing the initial distribution, we calculate the mean, dispersion and skewness statistics at each age using the calculated life-cycle profiles and the guessed initial distribution. The calculation of decision rules and the subsequent life-cycle simulation are independent of the initial distribution. This reduces computation time as life-cycle profiles are calculated only once and stored to be used later in the calculation of the relevant statistics in all the trials required by the simplex method. If the histogram that best matches the data puts strictly positive weight on (h, a) pairs where $a = a^*$ or $h = h^*$, then the upper bounds are increased and steps 1-3 are repeated.

A.3 Data

A.3.1 Restricted Time Effects

Below we provide details for implementing the restricted time effects discussed in section 2.3. Let $X = [\alpha_s, \beta_j, \gamma_t]$ be the matrix of cohort, age and time dummies with the number of rows equal to the number of {j, t} pairs available for earnings $e_{j,t}$, where for simplicity we omit the dependence on percentile p. Define the unrestricted regression $e = Xb + \varepsilon$. Note, here e is the vector of all possible $e_{j,t}$, and b is the vector of unrestricted dummy coefficients. The problem is that due to the linear relationship between time, age, and cohort, the matrix $X'X$ is singular, so this unrestricted version cannot be implemented.
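The source of this singularity can be seen in a small numerical sketch. The panel below (4 ages observed in 5 years) is hypothetical, as is the helper `dummy_block`; one reference category is dropped from each set of dummies and an intercept is added, so the remaining rank deficiency comes only from the identity cohort = year − age.

```python
import numpy as np

def dummy_block(codes):
    """One 0/1 column per distinct code, dropping the first as reference."""
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, 1:]).astype(float)

# Hypothetical balanced panel: every age 0..3 observed in every year 0..4.
ages, years = np.meshgrid(np.arange(4), np.arange(5), indexing="ij")
age, year = ages.ravel(), years.ravel()
cohort = year - age   # birth cohort is pinned down by age and year

X = np.hstack([
    np.ones((age.size, 1)),   # intercept
    dummy_block(cohort),      # 7 cohort dummies
    dummy_block(age),         # 3 age dummies
    dummy_block(year),        # 4 time dummies
])
rank = np.linalg.matrix_rank(X)
```

Here `X` has 15 columns but rank 14: a linear trend can be shifted among the cohort, age and time effects without changing the fitted values, so the cross-product matrix of `X` cannot be inverted without further restrictions such as those imposed on the time dummies.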
Let $ca$ be the number of age and cohort dummies available, and $T$ be the number of time dummies available. Define the matrices $R$ and $W$ as indicated below. The matrix $R$ has two rows corresponding to our two restrictions: that the time dummies have mean zero and that they are orthogonal to a time trend. Let $b^*$ be the vector of coefficients $b$ that obeys this normalization, that is, $Rb^* = 0_{2 \times 1}$. It turns out that, even though $X'X$ is singular, the matrix $W$ below is non-singular.

$$R = \begin{bmatrix} 0_{1 \times ca} & 1_{1 \times T} \\ 0_{1 \times ca} & 1, 2, \ldots, T \end{bmatrix}, \qquad W = \begin{bmatrix} X'X & -R' \\ R & 0_{2 \times 2} \end{bmatrix} \tag{1}$$

Define the vectors $V$ and $d$ as indicated below, where $\lambda$ is the Lagrange multiplier on the restricted least squares residual (or moment conditions). Then the moment conditions corresponding to this minimization are $V = Wd$. Since $W$ is invertible we have $d = W^{-1}V$, which gives the desired restricted set of estimates $b^*$.

$$V = \begin{bmatrix} X'e \\ 0_{2 \times 1} \end{bmatrix}, \qquad d = \begin{bmatrix} b^* \\ \lambda \end{bmatrix}$$

A.3.2 Group Growth Rates

Figure 7 is based on a subset of the data used in section 2. Specifically, for each panel year we track each agent's earnings throughout the next five years. Our selection criterion is identical to that described earlier, except that now we require agents to be in the sample for the next 5 consecutive years. We define age- and cohort-specific growth rates as follows. Let the average k-period growth rate of those persons $i$ who were of age $j$ during time period $t$ be defined as

$$\bar{e}^{\,j}_{t,t+k} = \log \left[ \frac{1}{N} \sum_{i=1}^{N} e^{j+k}_{i,t+k} \Big/ \frac{1}{N} \sum_{i=1}^{N} e^{j}_{i,t} \right],$$

where $i$ sums over the $N$ people who were age $j$ in year $t$. We compute this growth rate measure for $j = 36, 37, \ldots, 52$ and $t = 1, 2, \ldots, T$. The data $\bar{e}^{\,j}_{t,t+k}$ are then used to recover the 'age' effects by applying the same regression with cohort and age effects as described in section 2.2. Figure 7 provides the age effects corresponding to $\bar{e}^{\,j}_{t,t+k}$ for the bottom, middle and top 20 percentiles. In Figure 7a the sum is over those who are in the bottom (middle or top) 20 percent at time $t$.
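For a single age-year cell, the growth-rate measure and the grouping by earnings at time t can be sketched as follows. This is a minimal illustration on made-up numbers; the function name, its arguments, and the use of `np.quantile` to delimit a percentile group are assumptions, not the authors' procedure.

```python
import numpy as np

def group_growth_rate(e_t, e_tk, lower=0.0, upper=1.0, rank_on_end=False):
    """Log growth of mean earnings between t and t+k for one percentile group.

    e_t, e_tk    : earnings of the same N agents at t and at t+k.
    lower, upper : percentile bounds of the group, as fractions in [0, 1].
    rank_on_end  : rank agents by earnings at t+k instead of at t.
    """
    e_t = np.asarray(e_t, dtype=float)
    e_tk = np.asarray(e_tk, dtype=float)
    ref = e_tk if rank_on_end else e_t
    lo, hi = np.quantile(ref, [lower, upper])
    in_group = (ref >= lo) & (ref <= hi)
    return np.log(e_tk[in_group].mean() / e_t[in_group].mean())

# Made-up earnings for five agents observed at t and t+k.
e_t = [10.0, 20.0, 30.0, 40.0, 50.0]
e_tk = [20.0, 20.0, 30.0, 40.0, 100.0]
growth_all = group_growth_rate(e_t, e_tk)                # whole cell
growth_bottom = group_growth_rate(e_t, e_tk, 0.0, 0.2)   # bottom 20 percent at t
```

The hypothetical `rank_on_end` flag switches the ranking from earnings at t to earnings at t + k, holding the growth-rate formula fixed.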
In Figure 7b the sum is over those agents who were in the bottom (middle or top) 20 percent at time t + k. All the figures are based on 5-year growth rates, that is, k = 5.

Fig. 1-a. Mean Earnings: PSID Data (horizontal axis: age)
Fig. 1-b. Dispersion (Gini Coeff.): PSID Data (horizontal axis: age)
Fig. 1-c. Skewness (Mean/Median): PSID Data (horizontal axis: age)
Fig. 2. Earnings Percentiles (0.025-0.99): PSID Data (horizontal axis: age)
Fig. 3-a. Mean Earnings: PSID Data (series: Cohort Dummies, Time Dummies, Restricted Time Dummies; horizontal axis: age)
Fig. 3-b. Dispersion (Gini Coeff.): PSID Data (series: Cohort Dummies, Time Dummies, Restricted Time Dummies; horizontal axis: age)
Fig. 3-c. Skewness (Mean/Median): PSID Data (series: Cohort Dummies, Time Dummies, Restricted Time Dummies; horizontal axis: age)
Fig. 4-a. Mean Earnings: Non-Parametric Initial Distribution (alpha=0.7) (series: Data, Age 10, Age 20; horizontal axis: age)
Fig. 4-b. Dispersion: Non-Parametric Initial Distribution (alpha=0.7) (series: Data, Age 10, Age 20; horizontal axis: age)
Fig. 4-c. Skewness: Non-Parametric Initial Distribution (alpha=0.7) (series: Data, Age 10, Age 20; horizontal axis: age)
Fig. 5-a. Mean Earnings: Parametric Initial Distribution (alpha=0.7) (series: Data, Age 10, Age 20; horizontal axis: age)
Fig. 5-b. Dispersion: Parametric Initial Distribution (alpha=0.7) (series: Data, Age 10, Age 20; horizontal axis: age)
Fig. 5-c. Skewness: Parametric Initial Distribution (alpha=0.7) (series: Data, Age 10, Age 20; horizontal axis: age)
Fig. 6-a. Mean Earnings: Fixed Initial Human Capital (Accumulation Starts Age 10) (series: Data, alpha=0.7; horizontal axis: age)
Fig. 6-b. Dispersion: Fixed Initial Human Capital (Accumulation Starts Age 10) (series: Data, alpha=0.7; horizontal axis: age)
Fig. 6-c. Skewness: Fixed Initial Human Capital (Accumulation Starts Age 10) (series: Data, alpha=0.7; horizontal axis: age)
Fig. 7-a. 5-year Growth Rates (Backward Case) (series: Bottom 20%, Central 20%, Top 20%; vertical axis: percent; horizontal axis: age)
Fig. 7-b. 5-year Growth Rates (Forward Case) (series: Bottom 20%, Central 20%, Top 20%; vertical axis: percent; horizontal axis: age)