
Undergraduate Econometrics Instruction: Through Our Classes, Darkly

Joshua D. Angrist and Jörn-Steffen Pischke

Journal of Economic Perspectives—Volume 31, Number 2—Spring 2017—Pages 125–144


Joshua D. Angrist is Ford Professor of Economics, Massachusetts Institute of Technology, Cambridge, Massachusetts. Jörn-Steffen Pischke is Professor of Economics, London School of Economics, London, United Kingdom. Their email addresses are [email protected] and [email protected].

† For supplementary materials such as appendices, datasets, and author disclosure statements, see the article page at https://doi.org/10.1257/jep.31.2.125

As the Stones’ Age gave way to the computer age, applied econometrics was mostly concerned with estimating the parameters governing broadly targeted theoretical descriptions of the economy. Canonical examples include multiequation macro models describing economy-wide variables like unemployment and output, and micro models characterizing the choices of individual agents or market-level equilibria. The empirical framework of the 1960s and 1970s typically sought to explain economic outcomes with the aid of a long and diverse list of explanatory variables, but no single variable of special interest.

Much of the contemporary empirical agenda looks to answer specific questions, rather than provide a general understanding of, say, GDP growth. This agenda targets the causal effects of a single factor, such as the effects of immigration on wages or the effects of democracy on GDP growth, often focusing on policy questions like the employment effects of subsidies for small business or the effects of monetary policy. Applied researchers today look for credible strategies to answer such questions.

Empirical economics has changed markedly in recent decades, but, as we document below, econometric instruction has changed little. Market-leading econometrics texts still focus on assumptions and concerns motivated by a model-driven approach to regression, aimed at helping students produce a statistically precise account of the processes generating economic outcomes. Much of this material prioritizes technical concerns over conceptual matters. We still see, for example, extended textbook discussions of functional form, distributional assumptions, and how to correct for serial correlation and heteroskedasticity. Yet this instructional edifice is not of primary importance for the modern empirical agenda. At the same time, newer and widely-used tools for causal analysis, like differences-in-differences and regression discontinuity methods, get cursory textbook treatment if they’re mentioned at all.

How should changes in our use of econometrics change the way we teach econometrics? Our take on this is simple. We start with empirical strategies based on randomized trials and quasi-experimental methods because they provide a template that reveals the challenges of causal inference, and the manner in which econometric tools meet these challenges. We call this framework the design-based approach to econometrics because the skills and strategies required to use it successfully are related to research design. This viewpoint leads to our first concrete prescription for instructional change: a revision in the manner in which we teach regression. Regression should be taught the way it is now most often used: as a tool to control for confounding factors.
This approach abandons the traditional regression framework in which all regressors are treated equally. The pedagogical emphasis on statistical efficiency and functional form, along with the sophomoric narrative that sets students off in search of “true models” as defined by a seemingly precise statistical fit, is ready for retirement. Instead, the focus should be on the set of control variables needed to ensure that the regression-estimated effect of the economic variable of interest has a causal interpretation.

In addition to a radical revision of regression pedagogy, the exponential growth in economists’ use of quasi-experimental methods and randomized trials in pursuit of causal effects should move these tools to center stage in the classroom. The design-based approach emphasizes single-equation instrumental variables estimators, regression-discontinuity methods, and variations on differences-in-differences strategies, while focusing on specific threats to a causal interpretation of the estimates generated by these fundamental tools. Finally, real empirical work plays a central role in our classes. Econometrics is better taught by example than abstraction.

Causal questions and research design are not the only sort of econometric work that remains relevant. But our experience as teachers and researchers leads us to emphasize these skills in the classroom. For one thing, such skills are now much in demand: Google and Netflix post positions flagged by keywords like causal inference, experimental design, and advertising effectiveness; Facebook’s data science team focuses on randomized controlled trials and causal inference; Amazon offers prospective employees a reduced form/causal/program evaluation track.[1]

[1] See also the descriptions of modern private sector econometric work in Ayres (2007), Brynjolfsson and McAfee (2011), Christian (2012), and Kohavi (2015).

Of course, there’s econometrics to be done beyond the applied micro applications of interest to Silicon Valley and the empirical labor economics with which we’re personally most engaged. But the tools we favor are foundational for almost any empirical agenda. Professional discussions of signal economic events like the Great Recession and important telecommunications mergers are almost always arguments over causal effects. Likewise, Janet Yellen and the hundreds of researchers who support her at the Fed crave reliable evidence on whether X causes Y. Purely descriptive research remains important, and there’s a role for data-driven forecasting. Applied econometricians have long been engaged in these areas, but these valuable skills are the bread-and-butter of disciplines like statistics and, increasingly, computer science. These endeavors are not where our comparative advantage as economists lies. Econometrics at its best is distinguished from other data sciences by clear causal thinking. This sort of thinking is therefore what we emphasize in our classes.

Following a brief description of the shift toward design-based empirical work, we flesh out the argument for change by considering the foundations of econometric instruction, focusing on old and new approaches to regression. We then look at a collection of classic and contemporary textbooks, and a sample of contemporary reading lists and course outlines. Reading lists in our sample are more likely to cover modern empirical methods than are today’s market-leading books.
But most courses remain bogged down in boring and obsolete technical material.

Good Times, Bad Times

The exponential growth in economists’ use of quasi-experimental methods and randomized trials is documented in Panhans and Singleton (forthcoming). Angrist and Krueger (1999) described an earlier empirical trend for labor economics, but this trend is now seen in applied microeconomic fields more broadly. In an essay on changing empirical work (Angrist and Pischke 2010), we complained about the modern macro research agenda, so we’re happy to see recent design-based inroads even in empirical macroeconomics (as described in Fuchs-Schündeln and Hassan 2016). Bowen, Frésard, and Taillard (forthcoming) report on the accelerating adoption of quasi-experimental methods in empirical corporate finance. Design-based empirical analysis naturally focuses the analyst’s attention on the econometric tools featured in this work. A less obvious intellectual consequence of the shift towards design-driven research is a change in the way we use our linear regression workhorse.

Yesterday’s Papers (and Today’s)

The changed interpretation of regression estimates is exemplified in the contrast between two studies of education production, Summers and Wolfe (1977) and Dale and Krueger (2002). Both papers are concerned with the role of schools in generating human capital: Summers and Wolfe with the effects of elementary school characteristics on student achievement; Dale and Krueger with the effects of college characteristics on post-graduates’ earnings. These questions are similar in nature, but the analyses in the two papers differ sharply.

Summers and Wolfe (1977) interpret their mission to be one of modeling the complex process that generates student achievement. They begin with a general model of education production that includes unspecified student characteristics, teacher characteristics, school inputs, and peer composition. The model is loosely motivated by an appeal to the theory of human capital, but the authors acknowledge that the specifics of how achievement is produced remain mysterious. What stands out in this framework is lack of specificity: the Summers and Wolfe regression puts the change in test scores from 3rd to 6th grade on the left-hand side, with a list of 29 student and school characteristics on the right. This list includes family income, student IQ, sex, and race; the quality of the college attended by the teacher and teacher experience; class size and school enrollment; and measures of peer composition and behavior.

The Summers and Wolfe (1977) paper is true to the 1970s empirical mission, the search for a true model with a large number of explanatory variables:

    We are confident that the coefficients describe in a reasonable way the relationship between achieving and GSES [genetic endowment and socioeconomic status], TQ [teacher quality], SQ [non-teacher school quality], and PG [peer group characteristics], for this collection of 627 elementary school students.

In the spirit of the wide-ranging regression analyses of their times, Summers and Wolfe offer no pride of place to any particular set of variables. At the same time, their narrative interprets regression estimates as capturing causal effects. They draw policy conclusions from empirical results, suggesting, for example, that schools not use the National Teacher Exam score to guide hiring decisions.
This interpretation of regression is in the spirit of Stones’ Age econometrics, which typically begins with a linear regression equation meant to describe an economic process, what some would call a “structural relation.” Many authors of this Age go on to say that in order to obtain unbiased or consistent estimates, the analyst must assume that regression errors are mean-independent of regressors. But since all regressions produce a residual with this orthogonality property, for any regressor included in the model, it’s hard to see how this statement promotes clear thinking about causal effects.

The Dale and Krueger (2002) investigation likewise begins with a question about schools, asking whether students who attend a more selective college earn more as a result, and, like Summers and Wolfe (1977), uses ordinary least squares regression methods to construct an answer. Yet the analysis here differs in three important ways. The first is a focus on specific causal effects: there’s no effort to “explain wages.” The Dale and Krueger study compares students who attend more- and less-selective colleges. College quality (measured by schools’ average SAT score) is but one factor that might change wages, surely minor in an R² sense. This highly focused inquiry is justified by the fact that the analysis aspires to answer a causal question of concern to students, parents, and policymakers.

The second distinguishing feature is a research strategy meant to eliminate selection bias: Graduates of elite schools undoubtedly earn more (on average) than those who went elsewhere. Given that elite schools select their students carefully, however, it’s clear that this difference may reflect selection bias. The Dale and Krueger (2002) paper outlines a selection-on-observables research strategy meant to overcome this central problem. The Dale and Krueger (2002) research design compares individuals who sent applications to the same set of colleges and received the same admission decisions. Within groups defined by application and admission decisions, students who attend different sorts of schools are far more similar than they would be in an unrestricted sample. The Dale and Krueger study argues that any remaining within-group variation in the selectivity of the school attended is essentially serendipitous—as good as randomly assigned—and therefore unrelated to ability, motivation, family background, and other factors related to intrinsic earnings potential. This argument constitutes the most important econometric content of the Dale and Krueger paper.

A third important characteristic of the Dale and Krueger (2002) study is a clear distinction between causes and controls on the right-hand side of the regressions at the heart of their study. In the modern paradigm, regressors are not all created equal. Rather, only one variable at a time is seen as having causal effects. All others are controls included in service of this focused causal agenda.[2] In education production, for example, coefficients on demographic variables and other student characteristics are unlikely to have a clear economic interpretation. For example, what should we make of the coefficient on IQ in the earlier Summers–Wolfe regression? This coefficient reveals only that two measures of intellectual ability—IQ and the dependent variable—are positively correlated after regression-adjusting for other factors.
On the other hand, features of the school environment, like class sizes, can sometimes be changed by school administrators. We might indeed want to consider the implications of class size coefficients for education policy.

[2] We say “one variable at a time,” because some of the Dale and Krueger (2002) models replace college selectivity with tuition as the causal variable of interest.

The modern distinction between causal and control variables on the right-hand side of a regression equation requires more nuanced assumptions than the blanket statement of regressor-error orthogonality that’s emblematic of the traditional econometric presentation of regression. This difference in roles between right-hand variables that might be causal and those that are just controls should emerge clearly in the regression stories we tell our students.

Out of Control

The modern econometric paradigm exemplified by Dale and Krueger (2002) treats regression as an empirical control strategy designed to capture causal effects. Specifically, regression is an automated matchmaker that produces within-group comparisons: there’s a single causal variable of interest, while other regressors measure conditions and circumstances that we would like to hold fixed when studying the effects of this cause. By holding the control variables fixed—that is, by including them in a multivariate regression model—we hope to give the regression coefficient on the causal variable a ceteris paribus, apples-to-apples interpretation. We tell this story to undergraduates without elaborate mathematics, but the ideas are subtle and our students find them challenging. Detailed empirical examples showing how regression can be used to generate interesting, useful, and surprising causal conclusions help make these ideas clear.

Our instructional version of the Dale and Krueger (2002) application asks whether it pays to attend a private university, Duke, say, instead of a state school like the University of North Carolina. This converts college selectivity into a simpler, binary treatment, so that we can cast the effects of interest as generated by simple on/off comparisons. Specifically, we ask whether the money spent on private college tuition is justified by future earnings gains. This leads to the question of how to use regression to estimate the causal effect of private college attendance on earnings.

For starters, we use notation that distinguishes between cause and control. In this case, the causal regressor is Pi, a dummy variable that indicates attendance at a private college for individual i. Control variables are denoted by Xi, or given other names when specific controls are noteworthy, but in all cases distinct from the privileged causal variable, Pi. The outcome of interest, Yi, is a measure of earnings roughly 20 years post-enrollment. The causal relationship between private college attendance and earnings is described in terms of potential outcomes: Y1i, representing the earnings of individual i were he or she to go private (Pi = 1), and Y0i, representing i’s earnings after a public education (Pi = 0). The causal effect of attending a private college for individual i is the difference, Y1i − Y0i. This difference can never be seen; rather, we see only Y1i or Y0i, depending on the value of Pi. The analyst’s goal is therefore to measure an average causal effect, like E(Y1i − Y0i).
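In this notation, observed earnings are Yi = Y0i + (Y1i − Y0i)Pi, so a raw comparison of private and public alumni decomposes into a causal effect plus a selection term. The identity below is a standard potential-outcomes result, stated here for reference rather than taken from the paper:

$$
E[Y_i \mid P_i = 1] - E[Y_i \mid P_i = 0]
= \underbrace{E[Y_{1i} - Y_{0i} \mid P_i = 1]}_{\text{average causal effect on private attendees}}
\; + \;
\underbrace{E[Y_{0i} \mid P_i = 1] - E[Y_{0i} \mid P_i = 0]}_{\text{selection bias}}.
$$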
At MIT (where we have both taught), we ask our private-college econometrics students to consider their personal counterfactual had they made a public-school choice instead of coming to MIT. Some of our students are seniors who have lined up jobs with the likes of Google and Goldman. Many of the people they work with at these firms—perhaps the majority—will have gone to state schools. In view of this fact, we ask our students to consider whether MIT-style private colleges really make a difference when it comes to career success.

The first contribution of a causal framework based on potential outcomes is to explain why naive comparisons of public and private college graduates are likely to be misleading. The second is to explain how an appropriately constructed regression strategy leads us to something better. Naive comparisons between alumni of private and public universities will confound the average causal effect of private attendance with selection bias. The selection bias here reflects the fact that students who go to private colleges are, on average, from stronger family backgrounds and probably more motivated and better prepared for college. These characteristics are reflected in their potential earnings, that is, in how much they could earn without the benefit of a private college degree. If those who end up attending private schools had instead attended public schools, they probably would have had higher incomes anyway. This reflects the fact that public and private students have different Y0i’s, on average.

To us, the most natural and useful presentation of regression is as a model of potential outcomes. Write potential earnings in the public college scenario as

    Y0i = α + ηi,

where α is the mean of Y0i, and ηi is the difference between this potential outcome and its mean. Suppose further that the difference in potential outcomes is a constant, β, so we can write β = Y1i − Y0i. Putting the pieces together gives a causal model for observed earnings:

    Yi = α + βPi + ηi.

Selection bias amounts to the statement that Y0i (potential earnings after going to a public college), and hence ηi, depends (in a statistical sense) on Pi, that is, on the type of school one chooses.

The road to a regression-based solution to the problem of selection bias begins with the claim that the analyst has information that can be used to eliminate selection bias, that is, to purge Y0i of its correlation with Pi. In particular, the modern regression modeler postulates a control variable Xi (or perhaps a set of controls). Conditional on this control variable, the private and public earnings comparison is apples-to-apples, at least on average, so those being compared have the same average Y0i’s or ηi’s. This ceteris paribus-type claim is embodied in the conditional independence assumption that ultimately gives regression estimates a causal interpretation:

    E(ηi | Pi, Xi) = E(ηi | Xi).

Notice that this is a weaker and more focused assumption than the traditional presentation, which says that the error term is mean-independent of all regressors, that is, E(ηi | Pi, Xi) = 0.

In the Dale and Krueger (2002) study, the variable Xi identifies the schools to which the college graduates in the sample had applied and were admitted. The conditional independence assumption says that, having applied to Duke and UNC and having been admitted to both, those who chose to attend Duke have the same earnings potential as those who went to the state school.
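A short simulation makes the conditional independence logic tangible. The sketch below uses entirely made-up data and parameter values (it is not the Dale and Krueger sample): the true private-school effect is set to zero, but attendance depends on an ability/ambition index Xi, so the naive regression overstates the effect while the controlled regression recovers it.

```python
# Minimal selection-bias simulation: the true effect of "private" is zero,
# but attendance is selected on x, which also raises earnings.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)                           # ability/ambition index (the control)
p = (x + rng.normal(size=n) > 0).astype(float)   # private attendance, selected on x
y = 10.0 + 0.0 * p + 0.5 * x + rng.normal(scale=0.3, size=n)  # true beta = 0

naive = sm.OLS(y, sm.add_constant(p)).fit()
controlled = sm.OLS(y, sm.add_constant(np.column_stack([p, x]))).fit()
print(f"naive beta-hat:      {naive.params[1]: .3f}")       # roughly +0.56: pure selection bias
print(f"controlled beta-hat: {controlled.params[1]: .3f}")  # roughly 0: bias removed by holding x fixed
```

Holding Xi fixed works here because, conditional on Xi, the remaining variation in Pi is unrelated to potential earnings, which is exactly the conditional independence claim made for the application-and-admission groups above.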
Although such conditioning does not turn college attendance into a randomized trial, it provides a compelling source of control for the major forces confounding causal inference. Applicants target schools in view of their ambition and willingness to do the required work; admissions offices look carefully at applicant ability.

We close the loop linking causal inference with linear regression by introducing a functional form hypothesis, specifically that the conditional mean of potential earnings when attending a public school is a linear function of Xi. This can be written formally as

    E(ηi | Xi) = γXi.

Econometrics texts fret at length about linearity and its limitations, but we see such hand-wringing as misplaced. In the Dale and Krueger research design, the controls are a large set of dummies for all possible applicant groups. The key controls in this case come in the form of a saturated model, that is, an exhaustive set of dummies for all possible values of the conditioning variable. Such models are inherently linear. In other cases, we can come as close as we like to the underlying conditional mean function by adding polynomial terms and interactions. When samples are small, we happily use linearity to interpolate, thereby using the data at hand more efficiently. In some of the Dale and Krueger models, for example, dummies for groups of schools are replaced by a linear control for the schools’ average selectivity (that is, the average SAT scores of their students).

Combining these three ingredients, constant causal effects, conditional independence, and a linear model for potential outcomes conditional on controls, produces the regression model

    Yi = α + βPi + γXi + ei,

which can be used to construct unbiased and consistent estimates of the causal effect of private school attendance, β. The causal story that brings us to this point reveals what we mean by β and why we’re using regression to estimate it.

This final equation looks like many seen in market-leading texts. But this apparent similarity is less helpful than a source of confusion. In our experience, to present this equation and recite assumptions about the correlation of regressors and ei clouds more than clarifies the basis for causal inference. As far as the control variables go, regressor-residual orthogonality is assured rather than assumed; that is, regression algebra makes this happen. At the same time, while the controls are surely uncorrelated with the residuals, it’s unlikely that the regression coefficients multiplying the controls have a causal interpretation. We don’t imagine that the controls are as good as randomly assigned and we needn’t care whether they are. The controls have a job to do: they are the foundation for the conditional independence claim that’s central to the modern regression framework. Provided the controls make this claim plausible, the coefficient β can be seen as a causal effect.

The modern regression paradigm turns on the notion that the analyst has data on control variables that generate apples-to-apples comparisons for the variable of interest. Dale and Krueger (2002) explain what this means in their study:

    If, conditional on gaining admission, students choose to attend schools for reasons that are independent of [unobserved determinants of earnings] then students who were accepted and rejected by the same set of schools would have the same expected value of [these determinants, the error term in their model].
    Consequently, our proposed solution to the school selection problem is to include an unrestricted set of dummy variables indicating groups of students who received the same admissions decisions (i.e., the same combination of acceptances and rejections) from the same set of colleges.

In our analysis of the Dale and Krueger data (reported in Chapter 2 of Angrist and Pischke 2015), estimates from a regression with no controls show a large private school effect of 13.5 log points. This effect shrinks to 8.6 log points after controlling for the student’s own SAT scores, his or her family income, and a few more demographic variables. But controlling for the schools to which a student applied and was admitted (using many dummy variables) yields a small and statistically insignificant private school effect of less than 1 percent.

Comparing regression results with increasing numbers of controls in this way—that is, comparing uncontrolled results, results with crude controls, and results with a control variable that more plausibly addresses the issue of selection bias—offers powerful insights. These insights help students understand why the last model is more likely to have a causal interpretation than the first two.

First, we note in discussing these results that the large uncontrolled private differential in wages is apparently driven by selection bias. We learn this from the fact that the raw effect vanishes after controlling for students’ precollege attributes, in this case, ambition and ability as reflected in the set of schools a student applies to and qualifies for. Of course, there may still be selection bias in the private–public contrast conditional on these controls. But because the controls are coded from application and admissions decisions that predate college enrollment decisions, they cannot themselves be a consequence of private school attendance. They must be associated with differences in Y0i that generate selection bias. Eliminating these differences, that is, comparing students with similar Y0i’s, is therefore likely to generate private school effects that are less misleading than simpler models omitting these controls.

We also show our students that after conditioning on the application and admissions variables, ability and family background variables in the form of SAT scores and family income are uncorrelated with private school attendance. The finding of a zero private-school return is therefore remarkably insensitive to further control beyond a core set. This argument uses the omitted variables bias formula, which we see as a kind of golden rule for the modern regression practitioner. Our regression estimates reveal robustness to further control that we’d expect to see in a well-run randomized trial. Using a similar omitted-variables-type argument, we note that even if there are other confounders that we haven’t controlled for, those that are positively correlated with private school attendance are likely to be positively correlated with earnings as well. Even if these variables remain omitted, their omission leads the estimates computed with the variables at hand to overestimate the private school premium, small as it already is.
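The omitted variables bias formula invoked here can be stated compactly. With a generic omitted confounder Ai (a stand-in for illustration, not a variable in the study), the coefficient on Pi from the short regression that leaves Ai out relates to the long-regression coefficient as:

$$
\beta^{\,\text{short}} \;=\; \beta^{\,\text{long}} \;+\; \gamma_A \, \delta_{AP},
$$

where γA is the coefficient on Ai in the long regression and δAP is the slope from a regression of Ai on Pi and the included controls. If an omitted confounder is positively related to both private attendance and earnings, then γAδAP > 0 and the short regression overstates the premium, which is the sign argument made in the paragraph above.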
Empirical applications like this demonstrate the modern approach to regression, highlighting the nuanced assumptions needed for a causal interpretation of regression parameters.[3] If the conditional independence assumption is violated, regression methods fail to uncover causal effects and are likely to be misleading. Otherwise, there’s hope for causal inference. Alas, the regression topics that dominate econometrics teaching, including extensive discussions of classical regression assumptions, functional form, multicollinearity, and matters related to statistical inference and efficiency, pale in importance next to this live-or-die fact about regression-based research designs.

[3] In a recent publication, Arcidiacono, Aucejo, and Hotz (2016) use the Dale and Krueger conditioning strategy to estimate causal effects of enrolling at different University of California campuses on graduation and college major.

Which is not to say that causal inference using regression methods has now been made easy. The question of what makes a good control variable is one of the most challenging in empirical practice. Candidate control variables should be judged by whether they make the conditional independence assumption more plausible, and it’s often hard to tell. We therefore discuss many regression examples with our students, all interesting, but some more convincing than others. A particular worry is that not all controls are good controls, even if they’re related to both Pi and Yi. Specific examples and discussion questions—“Should you control for occupation in a wage equation meant to measure the economic returns to schooling?”—illuminate the bad-control issue and therefore warrant time in the classroom (and in our books, Angrist and Pischke 2009, 2015).

Take It or Leave It: Classical Regression Concerns

It is easiest to use the conditional independence assumption to derive a causal regression model when the causal effect is the same for everyone, as assumed above. While this is an attractive simplification for expository purposes, the key result is remarkably general. As long as the regression function is suitably flexible, the regression parameter capturing the causal effect of interest is a weighted average of underlying covariate-specific causal effects. In fact, with discrete controls, regression can be viewed as a matching estimator that automates the estimation of many possibly heterogeneous covariate-specific treatment effects, producing a single weighted average in one easy step. More generally, linearity of the regression function is best seen as a convenient approximation to possibly nonlinear functional forms. This claim is supported by pioneering theoretical studies such as White (1980a) and Chamberlain (1982). To the best of our knowledge, the first textbook to highlight these properties is Goldberger (1991), a graduate text never in wide use and one rarely seen in undergraduate courses. Angrist (1998), Angrist and Krueger (1999), and our graduate text (Angrist and Pischke 2009) develop the theoretical argument that regression is a matching estimator for average treatment effects (see also Yitzhaki 1996).
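One version of this weighted-average result (following Angrist 1998, restated here in the private-school notation under a saturated model in a discrete Xi) is:

$$
\beta \;=\; \frac{\sum_x \beta_x\, \sigma^2_P(x)\, \Pr(X_i = x)}{\sum_x \sigma^2_P(x)\, \Pr(X_i = x)},
\qquad
\sigma^2_P(x) = \Pr(P_i = 1 \mid X_i = x)\,\bigl[1 - \Pr(P_i = 1 \mid X_i = x)\bigr],
$$

where βx is the average causal effect within covariate cell x. Cells where treatment status is split most evenly get the most weight, which is also where the data offer the most overlap between treated and untreated observations.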
An important consequence of this approximation and matchmaking view of regression is that the assumptions behind the textbook linear regression model are both implausible and irrelevant. Heteroskedasticity arises naturally as a result of variation in the closeness between a regression fit and the underlying conditional mean function it approximates. But the fact that the quality of the fit may vary does not obviate the value of regression as a summarizer of economically meaningful causal relationships.

Classical regression assumptions are helpful for the derivation of regression standard errors. They simplify the math, and the resulting formula reveals the features of the data that determine statistical precision. This derivation takes little of our class time, however. We don’t dwell on statistical tests for the validity of classical assumptions or on generalized least squares fix-ups for their failures. It seems to us that most of what is usually taught on inference in an introductory undergraduate class can be replaced with the phrase “use robust standard errors.” With a caution about blind reliance on asymptotic approximations, we suggest our students follow current research practice. As noted by White (1980b) and others, the robust formula addresses the statistical consequences of heteroskedasticity and nonlinearity in cross-sectional data. Autocorrelation in time-series data can similarly be handled by Newey and West (1987) standard errors, while cluster methods address correlation across cross-sectional units or in panel data (Moulton 1986; Arellano 1987; Bertrand, Duflo, and Mullainathan 2004).
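In software, this advice amounts to a one-argument change when fitting the model. A sketch using Python's statsmodels (the cov_type options shown are the library's; the data, lag length, and cluster structure are purely illustrative):

```python
# Same OLS point estimates throughout; only the standard errors change.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1_000
x = rng.normal(size=n)
g = rng.integers(0, 50, size=n)                             # illustrative cluster ids (e.g., schools)
y = 1.0 + 0.5 * x + (1.0 + np.abs(x)) * rng.normal(size=n)  # heteroskedastic errors
X = sm.add_constant(x)

robust = sm.OLS(y, X).fit(cov_type="HC1")                                  # White (1980b) robust
hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})            # Newey-West, for time series
clustered = sm.OLS(y, X).fit(cov_type="cluster", cov_kwds={"groups": g})   # Moulton-style clustering
print(robust.bse[1], hac.bse[1], clustered.bse[1])
```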
In Another Land: Econometrics Texts and Teaching

Traditional econometrics textbooks are thin on empirical examples. In Johnston’s (1972) classic text, the first empirical application is a bivariate regression linking road casualties to the number of licensed vehicles. This example focuses on computation, an understandable concern at the time, but Johnston doesn’t explain why the relationship between casualties and licenses is interesting or what the estimates might mean. Gujarati’s (1978) first empirical example is more substantive, a Cobb–Douglas production function estimated with a few annual observations. Production functions, implicitly causal relationships, are a fundamental building block of economic theory. Gujarati’s discussion helpfully interprets magnitudes and considers whether the estimates might be consistent with constant returns to scale. But this application doesn’t appear until page 107.

Decades later, real empirical work was still sparse in the leading texts, and the presentation of empirical examples often remained focused on mathematical and statistical technicalities. In an essay published 16 years ago in this journal, Becker and Greene (2001) surveyed econometrics texts and teaching at the turn of the millennium:

    Econometrics and statistics are often taught as branches of mathematics, even when taught in business schools ... the focus in the textbooks and teaching materials is on presenting and explaining theory and technical details with secondary attention given to applications, which are often manufactured to fit the procedure at hand ... applications are rarely based on events reported in financial newspapers, business magazines or scholarly journals in economics.

Following a broader trend towards empiricism in economic research (documented in Hamermesh 2013 and Angrist, Azoulay, Ellison, Hill, and Lu forthcoming), today’s texts are more empirical than those they’ve replaced. In particular, modern econometrics texts are more likely than those described by Becker and Greene to integrate empirical examples throughout, and often come with access to websites where students can find real economic data for problem sets and practice.

But the news on the textbook front is not all good. Many of today’s textbook examples are still contrived or poorly motivated. More disappointing to us than the uneven quality of empirical applications in the contemporary econometrics library is the failure to discuss modern empirical tools. Other than Stock and Watson (2015), which comes closest to embracing the modern agenda, none of the modern undergraduate econometrics texts surveyed below mentions regression-discontinuity methods, for example. Likewise, we see little or no discussion of the threats to validity that might confound differences-in-differences–style policy analysis, even though empirical work of this sort is now ubiquitous. Econometrics texts remain focused on material that’s increasingly irrelevant to empirical practice.

To put these and other claims about textbook content on a firmer empirical foundation, we classified the content of 12 books (listed in online Appendix Table A1), six from the 1970s and six currently in wide use. Our list of classics was constructed by identifying 1970s-era editions of the volumes included in Table 1 of Becker and Greene (2001), which lists undergraduate textbooks in wide use when they wrote their essay. We bought copies of these older first or second edition books. Our list of classic texts contains Kmenta (1971), Johnston (1972), Pindyck and Rubinfeld (1976), Gujarati (1978), Intriligator (1978), and Kennedy (1979). The divide between graduate and undergraduate books was murkier in the 1970s: unlike today’s undergraduate books, some of these older texts use linear algebra. Intriligator (1978), Johnston (1972), and Kmenta (1971) are noticeably more advanced than the other three. We therefore summarize 1970s book content with and without these three included.

Our contemporary texts are the six most often listed books on reading lists found on the Open Syllabus Project website (http://opensyllabusproject.org/). Specifically, our modern market leaders are those found at the top of a list generated by filtering the Project’s “syllabus explorer” search engine for “Economics” and then searching for “Econometrics.” The resulting list consists of Kennedy (2008), Gujarati and Porter (2009), Stock and Watson (2015), Wooldridge (2016), Dougherty (2016), and Studenmund (2017).[4]

[4] These books are also ranked highly in Amazon’s econometrics category and (at one edition removed) are market leaders in sales data from Nielsen for 2013 and 2014. Dougherty (2016) is number eight on the list yielded by Open Syllabus, but the sixth book, Hayashi (2000), is clearly a graduate text, and the seventh, Maddala (1977), is not particularly recent.

Recognizing that such an endeavor will always be imperfect, we classified book content into the categories shown in Table 1. This scheme covers the vast majority of the material in the books on our list, as well as in many others we’ve used or read. Our classification scheme also covers three of the tools for which growth in usage appears most impressive in the bibliometric data tabulated by Panhans and Singleton (forthcoming), specifically, instrumental variables, regression-discontinuity methods, and differences-in-differences estimators.[5]

[5] Panhans and Singleton (forthcoming) also document growth in the number of articles using the terms “natural experiment” and “randomized control trial.”
Table 1
Topic Descriptions

Topic — Which includes …

Bivariate regression — Basic exposition of the bivariate regression model, interpretation of bivariate model parameters.

Regression properties — Derivation of estimators, classical linear regression assumptions, mathematical properties of regression estimators like unbiasedness and regression anatomy, the Gauss–Markov Theorem.

Regression inference — Derivation of standard errors for coefficients and predicted values, hypothesis testing and confidence intervals, R², analysis of variance, discussion and illustration of inferential reasoning.

Multivariate regression — General discussion of the multivariate regression model, interpretation of multivariate parameters.

Omitted variables bias — Omitted variables bias in regression models.

Assumption failures and fix-ups — Discussion of classical assumption failures including heteroskedasticity, serial correlation, non-normality, and stochastic regressors; multicollinearity, inclusion of irrelevant variables, generalized least squares (GLS) fix-ups.

Functional form — Discussion of functional form and model parametrization issues including the use of dummy variables, logs on the left and right, limited dependent variable models, other nonlinear regression models.

Instrumental variables — Instrumental variables (IV), two-stage least squares (2SLS), and other single-equation IV estimators like limited information maximum likelihood (LIML) and k-class estimators, the use of IV for omitted variables and errors-in-variables problems.

Simultaneous equations models — Discussion of multi-equation models and estimators, including identification of simultaneous equation systems and system estimators like seemingly unrelated regressions (SUR) and three-stage least squares (3SLS).

Panel data — Panel techniques and topics, including the definition and estimation of models with fixed and random effects, pooling time series and cross section data, and grouped data.

Time series — Time series issues, including distributed lag models, stochastic processes, autoregressive integrated moving average (ARIMA) modeling, vector autoregressions, and unit root tests. This category omits narrow discussions of serial correlation as a violation of classical assumptions.

Causal effects — Discussion of causal effects and the causal interpretation of econometric estimates, the purpose and interpretation of randomized experiments, and threats to a causal interpretation of econometric estimates including sample selection issues.

Differences-in-differences — Differences-in-differences assumptions and estimators.

Regression discontinuity methods — Sharp and fuzzy regression discontinuity designs and estimators.

Our classification strategy counts pages devoted to each topic, omitting material in appendices and exercises, and omitting remedial material on mathematics and statistics. Independently, we also counted pages devoted to real empirical examples, that is, presentations of econometric results computed using genuine economic data. This scheme for counting examples omits the many textbook illustrations that use made-up numbers.

Not Fade Away

For the most part, legacy texts have a uniform structure: they begin by introducing a linear model for an economic outcome variable, followed closely by stating that the error term is assumed to be either mean-independent of, or uncorrelated with, regressors.
The purpose of this model—whether it is a causal relationship in the sense of describing the consequences of regressor manipulation, a statistical forecasting tool, or a parameterized conditional expectation function—is usually unclear. The textbook introduction of a linear model with orthogonal or mean-independent errors is typically followed by a list of technical assumptions like homoskedasticity, variable (yet nonstochastic!) regressors, and lack of multicollinearity. These assumptions are used to derive the good statistical properties of the ordinary least squares estimator in the classical linear model: unbiasedness, simple formulas for standard errors, and the Gauss–Markov Theorem (in which ordinary least squares is shown to be a best linear unbiased estimator, or BLUE).

As we report in Table 2, this initial discussion of Regression properties consumes an average of 11 to 12 percent of the classic textbooks. Regression inference, which usually comes next, gets an average of roughly 13 percent of page space in these traditional books.

Table 2
Topics Coverage in Econometrics Texts, Classic and Contemporary
(page counts as percentage)

                                      1970s   1970s excluding       Contemporary
                                              more-advanced texts
Topic                                   (1)         (2)                  (3)
Bivariate regression                    2.5         3.6                  2.8
Regression properties                  10.9        11.9                  9.9
Regression inference                   13.2        13.3                 14.6
Multivariate regression                 3.7         3.7                  6.4
Omitted variables bias                  0.6         0.5                  1.8
Assumption failures and fix-ups        18.4        22.2                 16.0
Functional form                        10.2         9.3                 15.0
Instrumental variables                  7.4         5.1                  6.2
Simultaneous equations models          17.5        13.9                  3.6
Panel data                              2.7         0.7                  4.4
Time series                            12.3        15.2                 15.6
Causal effects                          0.7         0.7                  3.0
Differences-in-differences               —           —                   0.5
Regression discontinuity methods         —           —                   0.1
Empirical examples                     14.0        15.0                 24.4

Note: We classified the content of 12 econometrics texts, six from the 1970s and six currently in wide use (see text for details). Our classic texts are Kmenta (1971), Johnston (1972), Pindyck and Rubinfeld (1976), Gujarati (1978), Intriligator (1978), and Kennedy (1979). Our contemporary texts are Kennedy (2008), Gujarati and Porter (2009), Stock and Watson (2015), Wooldridge (2016), Dougherty (2016), and Studenmund (2017). We report percentages of page counts by topic. All topics sum to 100 percent. Empirical examples are as a percentage of the whole book. Column 2 excludes Kmenta (1971), Johnston (1972), and Intriligator (1978), the more advanced classic econometrics texts. Dashes indicate no coverage.

The most deeply covered topic in our taxonomy, accounting for about 20 percent of material in the classic textbooks, is Assumption failures and fix-ups. This includes diagnostics and first aid for problems like autocorrelation, heteroskedasticity, and multicollinearity. Relief for most of these maladies comes in the form of generalized least squares. Another important topic in legacy texts is Simultaneous equations models, consuming 14 percent of page space in the more elementary texts. The percentage given over to orthodox simultaneous equations models rises to 18 percent when the sample includes more advanced texts. Ironically, perhaps, Assumption failures and fix-ups claims an even larger share of the classics when more advanced books are excluded. These older books also devote considerable space to Time series, while Panel data get little attention across the board.

A striking feature of Table 2 is how similar the distribution of topic coverage in contemporary market-leading econometrics texts is to the distribution in the classics. As in the Stones’ Age, well over half of the material in contemporary texts is concerned with Regression properties, Regression inference, Functional form, and Assumption failures and fix-ups.
The clearest change across book generations is the reduced space allocated to Simultaneous equations models. This presumably reflects declining use of an orthodox multi-equation framework, especially in macroeconomics. The reduced coverage of Simultaneous equations has made space for modest attention to Panel data and Causal effects, but the biggest single expansion has been in the coverage of Functional form (mostly discrete choice and limited dependent variable models).

Some of the volumes on our current book list have been through many editions, with first editions published in the Stones’ Age. It’s perhaps unsurprising that the topic distribution in Gujarati and Porter (2009) looks a lot like that in Gujarati (1978). But more recent entrants to the textbook market also deviate little from the classic template. On the positive side, recent market entrants are more likely to at least mention modern topics.

The bottom row of Table 2 reveals the moderate use of empirical examples in the Stones’ Age: about 15 percent of pages in the classics are devoted to illustrations involving real data. This average conceals a fair bit of variation, ranging from zero (no examples at all) to more than one-third of page space covering applications. Remarkably, the most empirically oriented textbook in our 12-book sample remains Pindyck and Rubinfeld (1976), one of the classics. Although the field has moved to an average empirical content of over 24 percent, no contemporary text on this list quite matches their coverage of examples.[6]

[6] The average is pulled down by the fact that one book on the list has no empirical content. Our view of how a contemporary undergraduate econometrics text can be structured around empirical examples is reflected in our book, Angrist and Pischke (2015).

BLUE Turns to Grey: Econometrics Course Coverage

Many econometrics instructors rely heavily on their lecture notes, using textbooks only as a supplement or a source of exercises. We might therefore see more of the modern empirical paradigm in course outlines and reading lists than we see in textbooks. To explore this possibility, we collected syllabuses and lecture schedules for undergraduate econometrics courses from a wide variety of colleges and universities.[7]

[7] Our thanks to Enrico Moretti for suggesting a syllabus inquiry.

Our sampling frame for the syllabus study covers the ten largest campuses in each of eight types of institutions. The eight groups are research universities (very high activity), research universities (high activity), doctoral/research universities, and baccalaureate colleges, with each of these four split into public and private schools. The resulting sample includes diverse institutions like Ohio State University, New York University, Harvard University, East Carolina University, American University, US Military Academy, Texas Christian University, Calvin College, and Hope College. We managed to collect syllabuses from 38 of these 80 schools. Each of the eight types of schools we targeted is represented in the sample, but larger and more prestigious institutions are overrepresented. Most syllabuses are for courses taught since 2014, but the oldest is from 2009. A few schools contribute more than one syllabus, but these are averaged so each school contributes only one observation to our tabulations.
The appendix available with this paper at http://e-jep.org lists the 38 schools included in the syllabus dataset.

For each school contributing course information, we recorded whether the topics listed in Table 1 are covered. A subset of schools also provided detailed lecture-by-lecture schedules that show the time devoted to each topic. It’s worth noting that the amount of information that can be gleaned from reading lists and course schedules varies across courses. For example, most syllabuses cover material we’ve classified as Multivariate regression, but some don’t list Regression inference separately, presumably covering inference as part of the regression module without spelling this out on the reading list. As a result, broader topics appear to get more coverage.

With this caveat in mind, the first column of Table 3 suggests a distribution of econometric lecture time that has much in common with the topic distribution in textbooks. In particular, well over half of class time goes to lectures on Regression properties, Regression inference, Assumption failures and fix-ups, and Functional form. Consistent with this distribution, the second column in the table reveals that, except for Regression properties, these topics are covered by most reading lists. The Regression properties topic is very likely covered under other regression headings.

Table 3
Course Coverage

Topic                               Lecture time     Courses covering
                                      (percent)       topic (percent)
Bivariate regression                     11.7              100.0
Regression properties                     8.7               43.4
Regression inference                     12.4               92.1
Multivariate regression                  10.5               94.7
Omitted variables bias                    1.9               28.5
Assumption failures and fix-ups          20.2               73.7
Functional form                          15.7               92.1
Instrumental variables                    3.9               51.8
Simultaneous equations models             0.4               19.3
Panel data                                3.6               36.8
Time series                               5.0               45.6
Causal effects                            2.5               25.4
Differences-in-differences                2.0               27.2
Regression discontinuity methods          1.4               16.7
Number of institutions                     15                 38

Notes: The first column reports the percentage of class time devoted to each topic listed at left for the 15 schools for which we obtained a detailed schedule. This column sums to 100 percent. Column 2 reports the percentage of courses covering particular topics for the 38 schools for which we obtained a reading list.

Also paralleling the textbook material described in Table 2, our tabulation of lecture time shows that just under 6 percent of course schedules is devoted to coverage of topics related to Causal effects, Differences-in-differences, and Regression discontinuity methods. This is only a modest step beyond the modern textbook average of 3.6 percent for this set of topics. Single-equation Instrumental variables methods get only 3.9 percent of lecture time, less than we see in the average for textbooks, both old and new.

Always looking on the bright side of life, we happily note that Table 3 shows that over a quarter of our sampled instructors allocate at least some lecture time to Causal effects and Differences-in-differences. A healthy minority (nearly 17 percent) also find time for at least some discussion of Regression discontinuity methods.
This suggests that econometric instructors are ahead of the econometrics book market. Many younger instructors will have used modern empirical methods in their PhD work, so they probably want to share this material with their students. Textbook authors are probably older, on average, than instructors, and therefore less likely to have personal experience with tools emphasized by the modern causal agenda.

Out of Time

Undergraduate econometrics instruction is overdue for a paradigm shift in three directions. One is a focus on causal questions and empirical examples, rather than models and math. Another is a revision of the anachronistic classical regression framework away from the multivariate modeling of economic processes and towards controlled statistical comparisons. The third is an emphasis on modern quasi-experimental tools.

We recognize that change is hard. Our own reading lists of a decade or so ago look much like those we’ve summarized here. But our approach to instruction has evolved as we’ve confronted the disturbing gap between what we do and what we teach. The econometrics we use in our research is interesting, relevant, and satisfying. Why shouldn’t our students get some satisfaction too?

Our thanks to Jasper Clarkberg, Gina Li, Beata Shuster, and Carolyn Stein for expert research assistance, to the editors Mark Gertler, Gordon Hanson, Enrico Moretti, and Timothy Taylor, and to Alberto Abadie, Daron Acemoglu, David Autor, Dan Fetter, Jon Gruber, Bruce Hansen, Derek Neal, Parag Pathak, and Jeffrey Wooldridge for comments.

References

Angrist, Joshua D. 1998. “Estimating the Labor Market Impact of Voluntary Military Service Using Social Security Data on Military Applicants.” Econometrica 66(2): 249–88.

Angrist, Joshua D., Pierre Azoulay, Glenn Ellison, Ryan Hill, and Susan Lu. Forthcoming. “Economic Research Evolves: Citations Fields and Styles.” American Economic Review.

Angrist, Joshua D., and Alan B. Krueger. 1999. “Empirical Strategies in Labor Economics.” Chap. 23 in Handbook of Labor Economics, vol. 3, edited by Orley Ashenfelter and David Card, 1277–1366. Elsevier.

Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.

Angrist, Joshua D., and Jörn-Steffen Pischke. 2010. “The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con out of Econometrics.” Journal of Economic Perspectives 24(2): 3–30.

Angrist, Joshua D., and Jörn-Steffen Pischke. 2015. Mastering ’Metrics: The Path from Cause to Effect. Princeton University Press.

Arcidiacono, Peter, Esteban M. Aucejo, and V. Joseph Hotz. 2016. “University Differences in the Graduation of Minorities in STEM Fields: Evidence from California.” American Economic Review 106(3): 525–62.

Arellano, Manuel. 1987. “Computing Robust Standard Errors for Within-Groups Estimators.” Oxford Bulletin of Economics and Statistics 49(4): 431–34.

Ayres, Ian. 2007. Super Crunchers. Bantam Books.

Becker, William E., and William H. Greene. 2001. “Teaching Statistics and Econometrics to Undergraduates.” Journal of Economic Perspectives 15(4): 169–82.

Bertrand, Marianne, Esther Duflo, and Sendhil Mullainathan. 2004. “How Much Should We Trust Differences-in-Differences Estimates?” Quarterly Journal of Economics 119(1): 249–75.

Bowen, Donald E., III, Laurent Frésard, and Jérôme P. Taillard. Forthcoming. “What’s Your Identification Strategy? Innovation in Corporate Finance Research.” Management Science.
Brynjolfsson, Erik, and Andrew McAfee. 2011. “The Big Data Boom Is the Innovation Story of Our Time.” The Atlantic, November 21.

Chamberlain, Gary. 1982. “Multivariate Regression Models for Panel Data.” Journal of Econometrics 18(1): 5–46.

Christian, Brian. 2012. “The A/B Test: Inside the Technology That’s Changing the Rules of Business.” Wired, April 25.

Dale, Stacy Berg, and Alan B. Krueger. 2002. “Estimating the Payoff to Attending a More Selective College: An Application of Selection on Observables and Unobservables.” Quarterly Journal of Economics 117(4): 1491–1527.

Dougherty, Christopher. 2016. Introduction to Econometrics. 5th edition. Oxford University Press.

Fuchs-Schündeln, Nicola, and Tarek A. Hassan. 2016. “Natural Experiments in Macroeconomics.” Chap. 12 in Handbook of Macroeconomics, vol. 2, edited by John B. Taylor and Harald Uhlig, 923–1012. Elsevier.

Goldberger, Arthur S. 1991. A Course in Econometrics. Harvard University Press.

Gujarati, Damodar. 1978. Basic Econometrics. New York: McGraw-Hill.

Gujarati, Damodar N., and Dawn C. Porter. 2009. Basic Econometrics. 5th edition. Boston: McGraw-Hill.

Hamermesh, Daniel S. 2013. “Six Decades of Top Economics Publishing: Who and How?” Journal of Economic Literature 51(1): 162–72.

Hayashi, Fumio. 2000. Econometrics. Princeton University Press.

Intriligator, Michael D. 1978. Econometric Models, Techniques, and Applications. Englewood Cliffs, NJ: Prentice Hall.

Johnston, J. 1972. Econometric Methods. 2nd edition. New York: McGraw-Hill.

Kennedy, Peter. 1979. A Guide to Econometrics. Cambridge, MA: MIT Press.

Kennedy, Peter. 2008. A Guide to Econometrics. 6th edition. Malden, MA: Blackwell Publishing.

Kmenta, Jan. 1971. Elements of Econometrics. New York: Macmillan.

Kohavi, Ron. 2015. “Online Controlled Experiments: Lessons from Running A/B/n Tests for 12 Years.” Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM.

Maddala, G. S. 1977. Econometrics. McGraw-Hill.

Moulton, Brent R. 1986. “Random Group Effects and the Precision of Regression Estimates.” Journal of Econometrics 32(3): 385–97.

Newey, Whitney K., and Kenneth D. West. 1987. “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica 55(3): 703–08.

Panhans, Matthew T., and John D. Singleton. Forthcoming. “The Empirical Economist’s Toolkit: From Models to Methods.” History of Political Economy.

Pindyck, Robert S., and Daniel L. Rubinfeld. 1976. Econometric Models and Economic Forecasts. New York: McGraw-Hill.

Stock, James H., and Mark W. Watson. 2015. Introduction to Econometrics. 3rd edition. Boston: Pearson.

Studenmund, A. H. 2017. Using Econometrics: A Practical Guide. 7th edition. Boston: Pearson.

Summers, Anita A., and Barbara L. Wolfe. 1977. “Do Schools Make a Difference?” American Economic Review 67(4): 639–52.

White, Halbert. 1980a. “Using Least Squares to Approximate Unknown Regression Functions.” International Economic Review 21(1): 149–70.

White, Halbert. 1980b. “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity.” Econometrica 48(4): 817–38.

Wooldridge, Jeffrey M. 2016. Introductory Econometrics: A Modern Approach. 6th edition. Boston: Cengage Learning.

Yitzhaki, Shlomo. 1996. “On Using Linear Regressions in Welfare Economics.” Journal of Business & Economic Statistics 14(4): 478–86.