
Commodities are Not Industries! A Value Chain Example

2020


Regional Research Institute Working Paper Series
Regional Research Institute, West Virginia University

Randall W. Jackson*
Director, Regional Research Institute, West Virginia University

Patricio Aroca†
CEPR, Business School, Universidad Adolfo Ibáñez

Working Paper Number 2020-06
Date Submitted: October 12, 2020
Keywords: Input-Output, SNA, Value Chains, Primary and Secondary Products
JEL Classification: F10, F14, L16, L23, C67

Abstract

Wassily Leontief received the 1973 Nobel Prize in Economics for his 1936 introduction of input-output accounts, which laid the foundations for decades of studies of economic structure and analyses of systemwide impacts of economic shocks (Leontief, 1936). In 1961, Richard Stone published “Input-Output and National Accounts,” which recognized and dealt explicitly with the realities of secondary production (Stone, 1961), and for which he received the 1984 Nobel Prize in Economics. Despite these recognitions and widespread use and acceptance internationally, in other disciplines, and in public sector planning applications, the Leontief and Stone framework gained little traction in mainstream U.S. economics. However, input-output modeling frameworks have attracted new attention in a variety of problem domains, including environmental attribution, water use, life-cycle assessment, and supply and value chains. Yet many, if not most, contemporary economists continue to found their work directly on the Leontief framework, eschewing the power and versatility of Stone’s enhancements.
Moreover, many are failing to understand and appreciate the conceptual differences between the two frameworks and, as a result, are failing to match constructs to variables in their newly developed analytical metrics, even in top economics journals. In this paper, we use an increasingly common approach to value chain analysis as one of many possible examples that demonstrate such conceptual misunderstandings. By developing and implementing properly formulated value chain metrics, we demonstrate both the extent of the consequences of neglecting the Stone enhancements and the important role of reproducing results in advancing scientific knowledge.

*Director, Regional Research Institute, West Virginia University, 886 Chestnut Ridge Rd, Morgantown WV, USA 26506.
†CEPR, Business School, Universidad Adolfo Ibáñez, Av. Padre Alberto Hurtado 750, Of. 207-C, Viña del Mar, Chile.

In 1936, Wassily Leontief introduced the world to input-output accounting (Leontief, 1936). According to the press release announcing his selection as Nobel Prize awardee,

“This important innovation has given to economic sciences an empirically-useful method to highlight the general interdependence in the production system of a society. In particular, the method provides tools for a systematic analysis of the complicated interindustry transactions in an economy.” (Press Release: Nobel Media AB2020, 1973)

In building on Leontief’s foundations, Sir Richard Stone realized that secondary production by industries created consistency problems for national accounting, and he developed a critical refinement to the Leontief framework that treated industry use of commodities independently from the commodity composition of industry output. A set of explicit accounting identities and algebraic manipulations supports the construction of flows matrices that parallel the Leontief transactions matrices.
This system of accounting eventually formed the basis for the United Nations System of National Accounts (United Nations, 1968).[1] In addition to supporting the construction of interindustry flow matrices, Stone’s framework further enabled representations of intercommodity, commodity-by-industry, and industry-by-commodity accounts. “For having made fundamental contributions to the development of systems of national accounts and hence greatly improved the basis for empirical economic analysis,” Stone was awarded the 1984 Nobel Prize in Economics (Press Release: Nobel Media AB2020, 1984).

Despite these formal recognitions, widespread use in other disciplines and in public sector planning and investment decisions, and widespread international acceptance, neither the Leontief nor the Stone framework gained much traction in mainstream U.S. economics. Now, however, input-output models have attracted new attention in a variety of problem domains, such as life cycle accounting and inventorying, pollution emissions attribution, foreign content analysis, and value chains. Curiously, it is the Leontief interindustry framework that often serves as the foundation for work by many, if not most, contemporary economists, despite the added power and versatility of Stone’s enhancements. Moreover, many of those engaged in empirical applications fail to recognize, understand, or appreciate the conceptual differences between the two frameworks and their underlying data definitions and structures. Because nearly all nations now publish input-output data in the Stone framework, these researchers are failing to match constructs to variables in the development and implementation of their new analytical metrics, even in top economics journals.
In this paper, we use an approach to the analysis of value chains that has gained substantial momentum as one among many possible examples that demonstrate the consequences of such conceptual misunderstandings. By developing and implementing properly formulated value chain metrics, we demonstrate the ramifications of neglecting the Stone enhancements and their conceptual underpinnings. We discuss these issues in the context of value chain metrics, then take advantage of the Stone framework to develop conceptual and methodological consistency, and demonstrate the consequences empirically by comparing the results from correct formulations to those from the incorrect formulations being replicated in the literature. We show that incorrect formulations can have substantial empirical manifestations.

Our primary objective is to draw researchers’ attention back to Stone’s developments and to call for the effort necessary to understand properly the conceptual and theoretical foundations of modern systems of national accounts. Not only can neglect of the differences in the conceptual underpinnings of the two accounting systems lead to theoretical, conceptual, and empirical inconsistencies, but the additional information in the Stone supply-use framework can sometimes provide excellent avenues for problem solutions that are simply not available when analysis is limited to Leontief accounts.

Our secondary objective is to highlight and support the critical role of modern publication policies that require authors to submit with their publications the code and data that enable reproduction of results. These resources can be extremely effective in supporting the knowledge building enterprise and the kinds of course corrections that ensure the integrity of, and role for, science in modern society. These policies are vitally important, and scientists have a corresponding obligation to engage in reproducing published results and reporting issues encountered in the process.
[1] The UN SNA is not alone in adopting this framework. The OECD has adopted the Supply-Use framework (OECD, 2018), and the recently developed World Input-Output Database (Dietzenbacher et al., 2013) organizes its data similarly.

The remainder of the paper is organized as follows. In the next section we introduce recent developments in the value chain literature, and in so doing we begin to identify the confusion that can arise from a less than comprehensive understanding of the Stone framework and its underlying definitional and structural foundations. In section 2 we review the salient features of the Stone accounting framework, provide a corrected value chain measure that parallels the incorrectly implemented upstreamness measure in the literature, and demonstrate the consequences empirically. Section 3 returns to the basic definition of value chains and demonstrates one kind of analytical approach that can only be supported by the Stone framework, reinforcing its power, flexibility, and wider range of potential applications relative to what can be developed and implemented in a Leontief framework. Section 4 discusses implications for practical application and for science and knowledge accumulation, and the final section summarizes the paper’s contributions.

1 Value Chains

The characterization of global supply chains is a topic that has gained visibility and importance in recent literature. Various authors have approached supply and value chains from perspectives that include business transaction optimization, economic development policy, and property rights. An important case in point is a paper by Antràs and Chor (2013) (AC) that addressed the problem of measuring the degree to which industries are upstream or downstream in the global value chain, including an algorithm that they developed and used for these purposes. Since the 2013 AC publication and its companion, Antràs et al.
(2012) (ACFH), a growing number of publications have built upon the incorrect measurement formulation and accompanying computer code. Unfortunately, their algorithmic implementation fails to recognize and account for the distinctions between historical input-output (IO) accounting frameworks based on interindustry flows matrices on the one hand, and modern supply–use, or commodity-by-industry, accounting frameworks on the other. Further, the realities of modern accounting frameworks make explicit the opportunity and need for researchers to identify differences in goals and objectives that might lead to the selection of alternative flows matrix formulations for identifying value chain upstreamness and downstreamness.

These two papers have introduced to the field an approach that overlooks the fundamental accounting relationships and identities of the current generation of IO accounting frameworks. In so doing, they have laid a potentially problematic foundation for a set of upstream and downstream production system linkage measures that is based on Leontief foundations but misapplies data organized under the Stone accounting system. Whereas the availability of supporting code has facilitated the replication of the error by those who adopted it without scrutiny, it has also facilitated the critical assessment and corrective actions presented below.

1.1 Commodity Versus Industry: A Necessary Distinction

The statement “$d_{ij} Y_j$, is precisely the value of commodity $i$ used in $j$’s production” (ACFH, page 414) refers to their formulation of the coefficient $d_{ij}$ as an element of a Use table, and $Y_j$ as industry $j$ output. However, this definition is somewhat obscured by the fact that industries produce more than one commodity. Hence, $d_{ij}$ would have a very different interpretation, and a different value, if it were drawn from a commodity-by-commodity IO table, in which the denominator would be the value of commodity rather than industry output.
The choice of the denominator is critical to the interpretation and the value of $d_{ij}$, especially because the gross output of industry $j$ can be very different from the gross output of commodity $j$. For example, the most upstream industry in their analysis based on 2002 U.S. data is the Petrochemical sector. Petrochemical commodity gross output is 22% larger than petrochemical industry gross output. This is because other industries also produce petrochemical commodities as a secondary product. The assumption that commodity and industry gross output are the same is quite clearly invalid.

The definitions of measures in AC are equally imprecise. For example, their “first measure is the ratio of the aggregate direct use to the aggregate total use (DUse_TUse) of a particular industry $i$’s goods, where the direct use for a pair of industries $(i, j)$ is the value of goods from industry $i$ directly used by firms in industry $j$ to produce goods for final use, while the total use for $(i, j)$ is the value of goods from industry $i$ used either directly or indirectly (via purchases from upstream industries) in producing industry $j$’s output for final use” (p. 2131). To be accurate and correct, the direct use definition would need to refer to an industry-by-industry intermediate transaction matrix, and the total use definition would need to refer to an industry-by-industry Leontief inverse matrix post-multiplied by a diagonal matrix of the total output by industry. Instead, they draw their data from a commodity-by-industry Use matrix to implement their measures. Neither reference to “the value of goods from industry $i$” is accurate when the unadjusted Use matrix is the data source. The Use matrix is neither a flows matrix nor an industry-by-industry matrix.

To emphasize the importance of these distinctions, focus again on the Petrochemicals industry and commodity. Table 1 of Antràs et al.
(2012), reproduced below as Table 1, identifies Petrochemicals (325110) as the industry with the highest upstreamness measure. However, the 2002 Make matrix companion to the Use table used for that analysis indicates that petrochemical commodities (products) account for only 36% of this industry’s total gross output. The remainder of the petrochemical industry’s gross output is secondary production, i.e., production of other commodities. Likewise, the petrochemicals industry produces only 44% of petrochemicals commodity output. The rest of the petrochemicals commodity production is secondary production from other industries.

Table 1: Least and Most Upstream Industries (Manuf.)

| US IO2002 industry | Upstreamness |
|---|---|
| Automobile (336111) | 1.000 |
| Light truck and utility vehicle (336112) | 1.001 |
| Nonupholstered wood furniture (337112) | 1.005 |
| Upholstered household furniture (337121) | 1.007 |
| Footwear (316200) | 1.007 |
| Alumina refining (33131A) | 3.814 |
| Other basic organic chemical (325190) | 3.853 |
| Secondary smelting of aluminum (331314) | 4.064 |
| Primary smelting of copper (331411) | 4.355 |
| Petrochemical (325110) | 4.651 |

Source: Authors’ calculations, replicating Antràs et al. (2012, p. 415).

Secondary production is precisely the reason why the commodity-by-industry framework for IO accounting was developed. Ignoring these definitional differences can result in substantial errors, misattributions, and misinterpretations. Therefore, we clarify and make explicit the relevant definitions and describe the ways in which the variables should be used in studying value chains.

Analysts’ choices of model structure will depend on whether the goal of the analysis is to identify the commodities used to produce other commodities or the industries involved in the production of commodities. If it is the former, then a commodity-by-commodity IO specification would be appropriate, while if it is the latter, one would presumably use an industry-by-commodity IO table.
Likewise, an industry-by-industry model would be used when focusing explicitly on interactions among industries.[2] We review below the ACFH presentation so that we can assess whether the definition that is used in the paper relates to any of these alternatives, and if not, whether it is appropriate for their demonstration analysis.

1.2 The Upstreamness Measure

To develop their upstreamness measure, ACFH “begin by considering an N-industry closed economy with no inventories. For each industry $i \in \{1, 2, \ldots, N\}$, the value of gross output ($Y_i$) equals the sum of its use as a final good ($F_i$) and its use as an intermediate input to other industries ($Z_i$)” (p. 412).

$$Y_i = F_i + Z_i = F_i + \sum_{j=1}^{N} d_{ij} Y_j \tag{1}$$

[2] A post-processing option for some policy purposes might be to conduct the analysis in a commodity-by-commodity framework, and then convert to industry space using the standardized Make table to transform from commodity output to industry output, as described in Section 2, below.

They define $d_{ij}$ as “the dollar amount of sector $i$’s output needed to produce one dollar’s worth of industry $j$’s output.” This balance equation is correctly presented, provided that all variables are defined in industry space, i.e., $Y$, $F$, and $Z$ refer to industry rather than commodity sectors, and that the term “sector $i$” in the definition of $d_{ij}$ refers to industry sector $i$. They next derive their equation (2), which is the Leontief inverse matrix expressed as an infinite series of terms that can be reduced to $Y = (I - D)^{-1} F$, where $(I - D)^{-1}$ is the traditional Leontief inverse. Based on these relationships and definitions, they then present their upstreamness measure, and describe it as “the (weighted) average position of an industry’s output in the value chain, by multiplying each of the terms in” (p. 413) the infinite series by their distance from final use plus one and dividing by the gross output of the industry.
For industry $i$, this yields

$$U_{1i} = 1 \cdot \frac{F_i}{Y_i} + 2 \cdot \frac{\sum_{j=1}^{N} d_{ij} F_j}{Y_i} + 3 \cdot \frac{\sum_{j=1}^{N} \sum_{k=1}^{N} d_{ik} d_{kj} F_j}{Y_i} + 4 \cdot \frac{\sum_{j=1}^{N} \sum_{k=1}^{N} \sum_{l=1}^{N} d_{il} d_{lk} d_{kj} F_j}{Y_i} + \cdots \tag{2}$$

Note that the denominator has changed now from $Y_j$ to $Y_i$, a step that shifts the context from purchases to sales, and that they justify by noting that consideration should be taken of industries that sell to other upstream industries. They go on to provide a computational reduced form for this measure, after establishing its equivalence to Fally’s (2011) upstreamness measure, which has the following compact expression:

$$U_1 = (I - \Delta)^{-1} \cdot \mathbf{1} \tag{3}$$

“where $\Delta$ is the matrix with $d_{ij} Y_j / Y_i$ in entry $(i, j)$ and $\mathbf{1}$ is a column vector of ones.” Finally, they “take the reciprocal to obtain DownMeasure for each industry $i$” (Antràs and Chor, 2013, p. 2162). They go on to provide two economic interpretations for these upstreamness measures in which the emphasis appears to be linkages among industries and not products.[3],[4]

The conceptual inconsistency arises in the implementation of the upstreamness measure, which proceeds by replacing $d_{ij} Y_j$ in matrix $\Delta$ with $U_{ij}$, the $ij$th cell from the Use matrix. Contrary to ACFH footnote 3, which states that “the coefficient $d_{ij}$ is computed as the total purchases by industry $j$ of industry $i$’s output,” they draw coefficients from the Use table, which depicts the total purchases by industry $j$ of commodity $i$, and not simply purchases from industry $i$. More than one industry can produce any given commodity, and the difference between industry and commodity output, as we have seen, can be substantial. More importantly, because commodities are produced by multiple industries, the Use matrix is not a conventional flows matrix as were the historical interindustry transactions matrices. While the columns of the Use matrix can be conceived of as destinations, the rows identify the commodities that each industry uses, but not the producing industries.
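The reduced form in equation (3) can be checked numerically. What follows is a minimal sketch in Python with NumPy, assuming a hypothetical three-industry closed economy in which each industry makes only its own product, so that industry and commodity space coincide. The flow values and the names `Z`, `F`, `d`, `Delta`, `upstreamness`, and `series` are illustrative assumptions; they are not taken from ACFH or from their code. The sketch computes the measure once from the matrix inverse and once from a truncated version of the weighted series in equation (2).

```python
import numpy as np

# Hypothetical interindustry flows (illustrative values only).
# Z[i, j]: sales of industry i to industry j; F[i]: sales of i to final use.
Z = np.array([[10.0, 30.0, 20.0],
              [ 5.0, 10.0, 45.0],
              [ 0.0,  0.0, 10.0]])
F = np.array([40.0, 40.0, 90.0])
n = len(F)

# Output balance, equation (1): Y_i = F_i + sum_j d_ij Y_j.
Y = F + Z.sum(axis=1)
d = Z / Y[np.newaxis, :]          # d_ij = Z_ij / Y_j (column-standardized)

# Equation (3): Delta_ij = d_ij Y_j / Y_i, then U1 = (I - Delta)^{-1} 1.
Delta = d * Y[np.newaxis, :] / Y[:, np.newaxis]
upstreamness = np.linalg.solve(np.eye(n) - Delta, np.ones(n))

# Equation (2): sum over k of (k + 1) * (D^k F)_i / Y_i, truncated at 200 terms.
series = np.zeros(n)
term = F.copy()                   # D^0 F
for k in range(200):
    series += (k + 1) * term / Y
    term = d @ term               # D^(k+1) F

print(np.round(upstreamness, 4))
print(np.allclose(upstreamness, series))
```

In this toy economy the third industry sells almost entirely to final use and receives the lowest score, while the first industry, which sells mostly to other industries, receives the highest. The agreement between the two computations holds only because numerators and denominators are all in industry terms here.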
[3] We note that Dietzenbacher and Romero (2007) developed and reported measures that are virtually identical to these measures.

[4] The authors also make an open-economy adjustment, justified by noting that the data used to construct their matrix of US IO coefficients “do not distinguish between flows of domestic goods and international exchanges” (p. 414). The result is an adjustment factor for the IO coefficients matrix that transforms its interpretation from a technical relationship to a trade relationship. The adjustment factor is the ratio of domestic output of industry $i$ to domestic use (absorption) of industry $i$ output.

The inconsistency can be clarified further by returning to the derivation of the reduced form upstreamness expression from the power series expansion, which was derived from the accounting identity $Y_i = F_i + Z_i = F_i + \sum_{j=1}^{N} d_{ij} Y_j$. Consider the $d_{il} d_{lk} d_{kj}$ term of ACFH equation (2), where these $d$ coefficients are defined as ratios of industry input dollars per industry output dollar. If we assign values of 0.1, 0.2, and 0.3 to these coefficients, then every dollar of output from industry $j$ will require \$0.3 of input $k$, which will create a requirement for \$(0.2)(0.3) = \$0.06 of input $l$, which will then require \$(0.1)(0.2)(0.3) = \$0.006 of input $i$ for its production. The numerators and denominators have the same dimension (industry dollars), so the interpretation of the product is clear and consistent. If, however, the $d_{ij}$ coefficients have industry denominators but commodity numerators, then the product now reflects commodity $i$ required to produce industry $l$ output times commodity $l$ required to produce industry $k$ output times commodity $k$ required to produce industry $j$ output.
But because each of these industries produces secondary products, the one-to-one relationship is lost. The product of this multiplication can make sense dimensionally, and can therefore have an unambiguous and straightforward interpretation, only if these industries produce only their own commodities, which would mean that industry and commodity output would have to be identical. This kind of system would be reflected in a Make table with nonzero elements only on the diagonal. Were these coefficients defined with commodity terms in both numerator and denominator, they would be interpreted as commodity $i$ required in the production of commodity $l$ times commodity $l$ required in the production of commodity $k$ times commodity $k$ required to produce commodity $j$, and this would be dimensionally consistent. The crux of the problem is that the commodity required to satisfy industry demand results not only in the production of the industry’s primary commodity, but also in the production of secondary commodities. This happens at every term in the power series expansion, resulting in the loss of the ability to trace commodities unambiguously through the supply chain.

2 The Stone Framework and Upstreamness Reformulation

The values in the Use table are associated behaviorally with columns. They represent column industry requirements of row commodity inputs, so standardizing by row commodity output values is not particularly useful. This does not render the development of an ACFH-type upstreamness measure intractable, of course. The modern accounting system that is the commodity-by-industry framework was devised precisely to accommodate the need to work analytically with systems in which industries produce multiple commodities. Indeed, developing the linkage matrices in industry-by-industry, commodity-by-commodity, industry-by-commodity, and even commodity-by-industry format is possible using precisely the same database that ACFH used to implement their measure.
We provide below the fundamental accounting equations that support the construction of these requirements coefficients tables.

2.1 Conceptual Reformulation

The commodity-by-industry framework is presented below in Figure 1. In conventional IO notation (as in Miller and Blair, 1985, 2009), the matrix partition $U = [u_{ij}]$ is the Use matrix, $V = [v_{ij}]$ is the Make matrix, $e$ is commodity final demand expressed here as a single vector, $q$ is commodity gross output, and $g$ is industry gross output. Only in highly unusual cases will an industry produce no secondary commodities, so rarely will $q_i$ and $g_i$ be equal. The $va$ term denotes value added. The traditional industry output balance equation that ACFH write as $Y_i = F_i + Z_i$ actually has no simple and direct counterpart in the modern accounting framework (although one can be derived, it requires assumptions about secondary production technology and information contained in $V$).[5] We can, however, express a commodity output balance equation in this conventional notation as $q_i = \sum_{j=1}^{N} u_{ij} + e_i$, and we can further define $d_{ij} = u_{ij} / g_j$ and substitute to obtain $q_i = \sum_{j=1}^{N} d_{ij} g_j + e_i$, maintaining the output balance.

Figure 1: The Commodity-Industry Framework. Source: Adapted from United Nations (1968).

[5] The industry output balance equation is conventionally denoted X=Y+Z.

In matrix notation, we have the following identities:

$$U\mathbf{i} + e \equiv q \tag{4}$$

$$V\mathbf{i} \equiv g \tag{5}$$

$$V'\mathbf{i} \equiv q \tag{6}$$

where $\mathbf{i}$ is a summing vector, and $'$ signifies the transpose operation. We define behavioral relationships as follows:

$$B = U\hat{g}^{-1} \tag{7}$$

$$U = B\hat{g} \tag{8}$$

$$D = V\hat{q}^{-1} \tag{9}$$

$$V = D\hat{q} \tag{10}$$

where the hat ($\hat{\ }$) indicates diagonalization.
Equation 7 defines the production requirements of commodities per industry output dollar, and equation 9 is a statement of the industry-based technology assumption that commodities are produced by industries in fixed proportions.[6] Note that pre-multiplying a commodity vector or matrix by $D$ transforms it from commodity space to industry space, so $V\mathbf{i} = g = Dq$. This system allows us to formulate the following commodity balance equation:

$$q = Bg + e \tag{11}$$

$$q = BDq + e \tag{12}$$

$$q = (I - BD)^{-1} e \tag{13}$$

The $BD$ term is a commodity-by-commodity requirements matrix counterpart to the classical, column-standardized interindustry Leontief IO coefficients matrix. Equation 13 is the reduced form solution for $q$ as a function of commodity final demand, $e$. This expression, however, is consistent with the Leontief demand-driven formulation, in which the values in the Use matrix are divided by their respective column industry output values. In contrast, the upstreamness measure in AC (and ACFH) shown in equation 3 is formed by dividing the elements in the Use matrix by the respective row commodity output, although they refer to this as row industry output.[7] To convert the $BD$ matrix to an equivalent and correctly formulated (Ghoshian) matrix, we can return the $BD$ coefficients to their transactions values using $UD$, then standardize the rows by commodity output, $q$, yielding $\hat{q}^{-1} U D$. The difference between the incorrect $\hat{q}^{-1} U$ formulation and the correct reformulation is quite clearly the $D$ term, which is the essential mechanism that transforms the industry column dimension of the Use matrix to the commodity output space of vector $q$.
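The identities and behavioral relationships in equations 4 through 13 can be traced on a small example. The following is a sketch only, in Python with NumPy, assuming a hypothetical economy with two commodities and two industries in which both industries have secondary production. The numbers and the names `without_D`, `with_D`, and `same_when_diagonal` are illustrative assumptions. The expression $\hat{q}^{-1} U D$ is implemented exactly as it is written in the text.

```python
import numpy as np

# Hypothetical 2-commodity, 2-industry accounts (illustrative values only).
U = np.array([[20.0, 40.0],     # Use: commodity (row) by industry (column)
              [10.0, 30.0]])
V = np.array([[90.0, 10.0],     # Make: industry (row) by commodity (column)
              [30.0, 70.0]])

g = V.sum(axis=1)               # equation 5: V i = g, industry gross output
q = V.sum(axis=0)               # equation 6: V' i = q, commodity gross output
e = q - U.sum(axis=1)           # equation 4 rearranged: e = q - U i

B = U / g[np.newaxis, :]        # equation 7: B = U g^-1 (column j divided by g_j)
D = V / q[np.newaxis, :]        # equation 9: D = V q^-1 (column j divided by q_j)

# Equation 13: q = (I - BD)^-1 e recovers commodity output from final demand.
q_from_e = np.linalg.solve(np.eye(2) - B @ D, e)

# Row standardization by commodity output, without and with the D term.
without_D = U / q[:, np.newaxis]         # q^-1 U
with_D = (U @ D) / q[:, np.newaxis]      # q^-1 U D, as written in the text

# With a diagonal Make matrix there is no secondary production, D is the
# identity matrix, and the two row-standardized forms coincide.
V_diag = np.diag(g)
q_diag = V_diag.sum(axis=0)
D_diag = V_diag / q_diag[np.newaxis, :]
same_when_diagonal = np.allclose(U / q_diag[:, np.newaxis],
                                 (U @ D_diag) / q_diag[:, np.newaxis])

print(g, q)                     # industry and commodity output differ
print(np.allclose(D @ q, g))    # D maps commodity space to industry space
print(np.allclose(without_D, with_D), same_when_diagonal)
```

Here industry output is 100 for each industry while commodity output is 120 and 80, so the two standardizations give different matrices. They agree only in the special case of a diagonal Make matrix, which is the case in which commodities and industries are interchangeable.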
The other difference is that ACFH convert commodity output to commodity absorption by netting out trade and inventories, which can be implemented similarly in the new commodity-by-commodity requirements matrix reformulation by adjusting $q$ for net trade and inventory adjustment before using it as a standardizing vector.[8]

[6] The alternative is the commodity-based technology assumption, which, while not used here, could be developed in parallel fashion.

[7] We confirmed the authors’ stated intent via reference to the publications, and we confirmed their implementations by evaluating the code that accompanies those publications.

[8] The $B\tilde{D}$ matrix, where $\tilde{D}$ is the Make matrix standardized by $q$ adjusted for net trade and inventory, would be used in the computation of the counterpart, commodity-by-commodity downstreamness measure used in Antràs and Chor (2013). Although we have not addressed explicitly the derivation and use of the counterpart interindustry rather than inter-commodity measures, the development would follow a similar path but would be based on row- or column-standardization of the interindustry transactions matrix $\tilde{D}U$. This formulation based on the modification of the $D$ matrix is introduced in Jackson (1998).

2.2 Empirical Example

With the conceptual distinction established, we turn in this section to a demonstration of the empirical consequences for the computed measures reported in ACFH. Table 2 displays upstreamness scores replicated using the ACFH algorithm and data, alongside scores computed using the corrected algorithm presented in this paper and the ACFH data. The top seven rows of data report results for the commodity sectors whose scores rank among the top five using either method, and the last six rows correspond to the five lowest ranking commodity sectors using either method.
Each row shows a sector’s ACFH score in the first data column, its ACFH rank in the second data column, its score from the reformulated algorithm in the third data column, and the corresponding rank in the fourth.

Table 2: Upstreamness measure comparisons

| NAICS | Sector | ACFH Score | ACFH Rank | Corrected C×C Score | Corrected C×C Rank |
|---|---|---|---|---|---|
| 325110 | Petrochemical mfg | 4.6511 | 1 | 4.1785 | 4 |
| 331411 | Primary copper smelting and refining | 4.3547 | 2 | 6.4031 | 1 |
| 331314 | Secondary aluminum smelt & alloying | 4.0637 | 3 | 3.9991 | 6 |
| 325190 | Other basic organic chemical mfg | 3.8529 | 4 | 3.5391 | 15 |
| 33131A | Alum. primary prodn and refining | 3.8144 | 5 | 4.9632 | 3 |
| 331200 | Steel product from purchased steel | 3.45 | 16 | 4.0065 | 5 |
| 331419 | Prim. nonfer. metal smelt & refn | 3.4186 | 18 | 5.6283 | 2 |
| 336111 | Automobile mfg | 1.0003 | 279 | 1.0004 | 279 |
| 336112 | Light truck and utility vehicle mfg | 1.0005 | 278 | 1.0008 | 278 |
| 337122 | Wood HH furniture mfg. | 1.0052 | 277 | 1.0072 | 277 |
| 337121 | Upholstered HH furniture mfg | 1.0072 | 276 | 1.008 | 276 |
| 316200 | Footwear mfg | 1.0073 | 275 | 1.0454 | 267 |
| 336213 | Motor home mfg | 1.0123 | 274 | 1.0129 | 275 |

Source: Antràs et al. (2012) and authors’ calculations. Largest ranking differences in bold.

Note first that there is a good deal of commonality. The ranks in the lowest ranking sectors are quite similar, which would be expected a) because both formulations rank commodities and not industries, b) because of the general correspondence between industries and commodities, especially for industries producing consumer products, although even here, Footwear Manufacturing shows an 8-point difference in ranks, and c) because sectors with the lowest scores are virtually never used as intermediate inputs and are therefore much less distinguishable in their upstreamness scores. There is somewhat greater discrepancy in the highest ranked sectors, however, where rank differences are as great as 16 among only the five most highly ranked upstreamness sectors.
Indeed, the average difference in ranks is 7, due to the notable ranking shifts for Other Basic Organic Chemical Manufacturing, Steel Product Manufacturing from Purchased Steel, and Primary Smelting and Refining of Nonferrous Metals (except copper and aluminum). The reformulation reveals substantial differences. Because of the multi-commodity reality of industry production, the second and fifth highest ranking upstreamness sectors using the correct method are not even in the top 15 sectors using the earlier formulation. Secondary production strongly influences the upstreamness measures, and cannot be ignored in implementation.

To demonstrate the inaccuracies introduced in counterpart downstreamness measures, we replicate AC’s results for their DownMeasure (AC, page 2163) and provide in Table 3 corrected rankings for their highlighted industries. In Table 4 we provide the correctly ranked and highlighted industries that we obtain when using the correctly formulated commodity-by-commodity matrices along with the AC rankings. As with the upstreamness comparisons, there is some substantial agreement, but there also are some very substantial differences.

Table 3: DownMeasure Comparisons: AC Results

| NAICS | Industry | AC DownMeasure | AC Rank | Corrected C×C Rank |
|---|---|---|---|---|
| | Lowest 10 values | | | |
| 325110 | Petrochemical mfg | 0.2150 | 253 | 250 |
| 331411 | Primary copper smelt & refining | 0.2296 | 252 | 253 |
| 331314 | Secondary aluminum smelt & alloy | 0.2461 | 251 | 247 |
| 325190 | Other basic organic chemical mfg | 0.2595 | 250 | 241 |
| 33131A | Primary alumina refn and prodn | 0.2622 | 249 | 251 |
| 325310 | Fertilizer mfg | 0.2658 | 248 | 248 |
| 335991 | Carbon and graphite product mfg | 0.2668 | 247 | 246 |
| 325181 | Alkalies and chlorine mfg | 0.2769 | 246 | 224 |
| 331420 | Copper rolling, drawing, etc. | 0.2769 | 245 | 244 |
| 325211 | Plastics material and resin mfg | 0.2800 | 244 | 219 |
| | Highest 10 values | | | |
| 339930 | Doll, toy, and game mfg | 0.9705 | 10 | 20 |
| 311111 | Dog and cat food mfg | 0.9717 | 9 | 7 |
| 337910 | Mattress mfg | 0.9720 | 8 | 8 |
| 315230 | Women’s and girls’ apparel mfg | 0.9762 | 7 | 17 |
| 321991 | Manufactured home mfg | 0.9810 | 6 | 6 |
| 336212 | Truck trailer mfg | 0.9837 | 5 | 5 |
| 336213 | Motor home mfg | 0.9879 | 4 | 4 |
| 316200 | Footwear mfg | 0.9927 | 3 | 3 |
| 337121 | Upholstered HH furniture mfg. | 0.9928 | 2 | 2 |
| 336111 | Automobile mfg | 0.9997 | 1 | 1 |

Source: Antràs and Chor (2013) and authors’ calculations. Largest rank differences in bold.

Table 4: DownMeasure Comparisons: Corrected C×C Results

| NAICS | Industry | Corrected C×C DownMeasure | Corrected C×C Rank | AC Rank |
|---|---|---|---|---|
| | Lowest 10 values | | | |
| 331411 | Primary copper smelt & refining | 0.1597 | 253 | 252 |
| 331419 | Primary nonferrous smelt & refn | 0.1804 | 252 | 236 |
| 33131A | Primary alumina refn and prodn | 0.2049 | 251 | 249 |
| 325110 | Petrochemical mfg | 0.2467 | 250 | 253 |
| 331200 | Steel product mfg from purch steel | 0.2560 | 249 | 238 |
| 325310 | Fertilizer manufacturing | 0.2698 | 248 | 248 |
| 331314 | Secondary aluminum smelt & alloy | 0.2703 | 247 | 251 |
| 335991 | Carbon and graphite product mfg | 0.2735 | 246 | 247 |
| 333612 | Industrial drive and gear mfg | 0.2743 | 245 | 205 |
| 331420 | Copper rolling, drawing, etc. | 0.2750 | 244 | 245 |
| | Highest 10 values | | | |
| 311230 | Breakfast cereal manufacturing | 0.9629 | 10 | 15 |
| 336612 | Boat building | 0.9700 | 9 | 11 |
| 337910 | Mattress manufacturing | 0.9725 | 8 | 8 |
| 311111 | Dog and cat food manufacturing | 0.9786 | 7 | 9 |
| 321991 | Manufactured home mfg | 0.9818 | 6 | 6 |
| 336212 | Truck trailer manufacturing | 0.9855 | 5 | 5 |
| 336213 | Motor home manufacturing | 0.9874 | 4 | 4 |
| 316200 | Footwear manufacturing | 0.9917 | 3 | 3 |
| 337121 | Upholstered household furniture mfg | 0.9931 | 2 | 2 |
| 336111 | Automobile manufacturing | 0.9996 | 1 | 1 |

Source: Antràs and Chor (2013) and authors’ calculations. Largest rank differences in bold.
3 Consistency with the Value Chain Construct

Although we have provided a correct commodity-by-commodity upstreamness measure, the appropriateness of interpretation remains unaddressed. According to Kaplinsky and Morris (2001), a simple value chain “describes the full range of activities which are required to bring a product or service from conception, through the different phases of production (involving a combination of physical transformation and the input of various producer services), delivery to final consumers, and final disposal after use.” In our Stone accounting framework, industries are the activities and commodities are the products. Therefore, a formulation that more precisely captures the spirit of the value chain definition would be an industry-by-commodity framework rather than a commodity-by-commodity framework, because rather than simply identifying the commodity production required to bring a commodity to market, the industry-by-commodity framework quantifies the range of activities by industry.9 Further, adding other dimensions of the “range of activities,” such as employment and income, or ancillary variables like water use or emissions, requires that the activity levels be quantified by industry and not by commodity. For these reasons, the industry-by-commodity formulation is preferred when quantifying activity levels. Simply knowing what commodities are required in a value chain falls short of identifying required activity.

The power of the Stone framework as an analytical tool is demonstrated here yet again, as the transformation of a Ghoshian- or Leontief-type framework from commodity-by-commodity to industry-by-commodity space is straightforward and can be achieved simply by pre-multiplying the respective inverse matrices by D, or

$$D(I - \hat{q}^{-1} U D)^{-1} \tag{14}$$

for the Ghoshian-type formulation and

$$D(I - BD)^{-1} \tag{15}$$

for a Leontief-type inverse transformation.
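As a minimal numerical sketch of equations 14 and 15, the code below builds both industry-by-commodity inverses for a made-up two-industry, two-commodity economy. The Make and Use values are hypothetical, and the definitions D = V q̂⁻¹ (market shares) and B = U ĝ⁻¹ (input coefficients) are assumed here to be the usual commodity-by-industry ones (as in Miller and Blair, 2009), since they are set out earlier in the paper rather than in this section. The function names are illustrative, and equation 14 is transcribed exactly as printed.

```python
import numpy as np

# Hypothetical accounts (illustrative numbers only).
V = np.array([[90.0, 10.0],    # Make table: industries (rows) x commodities (cols)
              [0.0, 100.0]])
U = np.array([[10.0, 20.0],    # Use table: commodities (rows) x industries (cols)
              [30.0, 10.0]])

g = V.sum(axis=1)              # industry output
q = V.sum(axis=0)              # commodity output
e = q - U.sum(axis=1)          # commodity final demand implied by the accounts

D = V / q                      # assumed definition: V q^-1 (industry x commodity)
B = U / g                      # assumed definition: U g^-1 (commodity x industry)


def leontief_ixc(B, D):
    """Equation 15: D (I - B D)^-1, an industry-by-commodity inverse."""
    n = B.shape[0]
    return D @ np.linalg.inv(np.eye(n) - B @ D)


def ghosh_ixc(U, D, q):
    """Equation 14 as printed: D (I - q^-1 U D)^-1."""
    alloc = (U / q[:, None]) @ D   # q^-1 U D, commodity x commodity
    return D @ np.linalg.inv(np.eye(len(q)) - alloc)


L_ic = leontief_ixc(B, D)
G_ic = ghosh_ixc(U, D, q)

# Consistency check: commodity final demand should reproduce industry output.
print(L_ic @ e, g)
```

In this toy economy, pre-multiplying by D is what converts commodity requirements into industry activity: the Leontief-type matrix applied to commodity final demand returns industry output, which is the quantity needed when attaching employment, income, or emissions by industry.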
In Table 5, we replicate Table 3, replacing the corrected C×C ranks with the new and properly formulated industry rankings derived from the industry-by-commodity Ghoshian formulation of equation 14 (the Corrected I×C Rank column). As before, while there are some similarities, the differences are now stark. Of the sectors with the ten smallest values in AC's ranking (those with the highest upstreamness), only five are in the comparably defined new set. Further, the average absolute difference in ranks between the AC and new top ten (upstream) sectors is now 27.3. AC's most upstream sector becomes just the 71st most upstream sector (253 − 183 + 1 = 71), and their 8th most upstream sector is the 108th in the new rankings.

Table 6 replicates Table 4, showing the ten sectors ranked highest and lowest using the correct industry-by-commodity formulation, along with their corresponding AC rankings. Again, we see major discrepancies, including the new second lowest ranking (second most upstream) sector now corresponding to the 96th most upstream sector in the AC ranking, and an average absolute difference in ranks for the most upstream sectors of 30.3. Commodity-by-commodity rankings clearly misidentify the industry activity rankings, and any policy or programmatic actions based on the incorrect rankings will be poorly targeted.

9 Of course, no input-output formulations capture the entire value chain through distribution and post-use disposal.
Table 5: DownMeasure Comparisons: AC and I-by-C Results

Smallest 10 values

| Code | Industry | AC DownMeasure | AC Rank | Corrected I×C Rank |
|---|---|---|---|---|
| 325110 | Petrochemical manufacturing | 0.2150 | 253 | 183 |
| 331411 | Primary copper smelt & refining | 0.2296 | 252 | 225 |
| 331314 | Secondary aluminum smelt | 0.2461 | 251 | 204 |
| 325190 | Other basic organic chemical mfg | 0.2595 | 250 | 253 |
| 33131A | Primary aluminum refining & prod | 0.2622 | 249 | 250 |
| 325310 | Fertilizer manufacturing | 0.2658 | 248 | 237 |
| 335991 | Carbon and graphite product mfg | 0.2668 | 247 | 244 |
| 325181 | Alkalies and chlorine mfg | 0.2769 | 246 | 146 |
| 331420 | Copper rolling, drawing, extruding | 0.2769 | 245 | 249 |
| 325211 | Plastics material and resin mfg | 0.2800 | 244 | 251 |

Highest 10 values

| Code | Industry | AC DownMeasure | AC Rank | Corrected I×C Rank |
|---|---|---|---|---|
| 339930 | Doll, toy, and game manufacturing | 0.9705 | 10 | 2 |
| 311111 | Dog and cat food manufacturing | 0.9717 | 9 | 38 |
| 337910 | Mattress manufacturing | 0.9720 | 8 | 37 |
| 315230 | Women's cut and sew apparel | 0.9762 | 7 | 5 |
| 321991 | Manufactured home mfg | 0.9810 | 6 | 29 |
| 336212 | Truck trailer manufacturing | 0.9837 | 5 | 26 |
| 336213 | Motor home manufacturing | 0.9879 | 4 | 48 |
| 316200 | Footwear manufacturing | 0.9927 | 3 | 1 |
| 337121 | Upholstered hh furniture mfg | 0.9928 | 2 | 24 |
| 336111 | Automobile manufacturing | 0.9997 | 1 | 9 |

Source: Antràs and Chor (2013) and authors' calculations. Largest rank differences in bold.
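The rank arithmetic quoted for Table 5 can be reproduced from the printed ranks alone. The sketch below assumes, as the text's own calculation 253 − 183 + 1 = 71 implies, that rank 253 marks the most upstream sector; the names are illustrative and are not the authors' code.

```python
N_SECTORS = 253  # rank 253 = smallest DownMeasure = most upstream

# (NAICS code, AC rank, corrected IxC rank) for AC's ten smallest values.
table5_smallest = [
    ("325110", 253, 183),
    ("331411", 252, 225),
    ("331314", 251, 204),
    ("325190", 250, 253),
    ("33131A", 249, 250),
    ("325310", 248, 237),
    ("335991", 247, 244),
    ("325181", 246, 146),
    ("331420", 245, 249),
    ("325211", 244, 251),
]


def upstream_position(rank, n=N_SECTORS):
    """Convert a rank into a position counted from the most upstream sector."""
    return n - rank + 1


abs_diffs = [abs(ac - ixc) for _, ac, ixc in table5_smallest]
mean_abs_diff = sum(abs_diffs) / len(abs_diffs)

# Sectors from AC's ten most upstream that remain among the ten most upstream.
still_top_ten = [code for code, _, ixc in table5_smallest
                 if upstream_position(ixc) <= 10]

print(round(mean_abs_diff, 1), upstream_position(183), upstream_position(146))
print(still_top_ten)
```

The same conversion applies to the AC Rank column of any table here that uses the 253-sector ranking.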
Table 6: DownMeasure Comparisons: AC with I×C Results

Smallest 10 values

| Code | Industry | Corrected I×C DownMeasure | Corrected I×C Rank | AC Rank |
|---|---|---|---|---|
| 325190 | Other basic organic chemical mfg | 0.1921 | 253 | 250 |
| 324110 | Petroleum refineries | 0.1944 | 252 | 158 |
| 325211 | Plastics material and resin mfg | 0.2246 | 251 | 244 |
| 33131A | Alumina refining and primary prod | 0.2329 | 250 | 249 |
| 331420 | Copper rolling, drawing, extruding | 0.2330 | 249 | 245 |
| 331490 | Nonferrous metal rolling, drawing | 0.2592 | 248 | 235 |
| 322120 | Paper mills | 0.2617 | 247 | 192 |
| 323110 | Printing | 0.2628 | 246 | 174 |
| 332600 | Spring and wire product mfg | 0.2657 | 245 | 194 |
| 335991 | Carbon and graphite product mfg | 0.2786 | 244 | 247 |

Highest 10 values

| Code | Industry | Corrected I×C DownMeasure | Corrected I×C Rank | AC Rank |
|---|---|---|---|---|
| 316900 | Other leather and allied product mfg | 1.5720 | 10 | 88 |
| 336111 | Automobile manufacturing | 1.5849 | 9 | 1 |
| 315900 | Apparel accessories mfg | 1.7072 | 8 | 51 |
| 333315 | Photo and copying equipment mfg | 1.7746 | 7 | 28 |
| 339910 | Jewelry and silverware mfg | 1.9089 | 6 | 34 |
| 315230 | Women's cut and sew apparel mfg | 1.9662 | 5 | 7 |
| 315220 | Men's cut and sew apparel mfg | 2.3624 | 4 | 20 |
| 315290 | Other cut and sew apparel mfg | 3.1213 | 3 | 30 |
| 339930 | Doll, toy, and game mfg | 3.2026 | 2 | 10 |
| 316200 | Footwear manufacturing | 9.1246 | 1 | 3 |

Source: Antràs and Chor (2013) and authors' calculations. Largest rank differences in bold.

4 Discussion

There are two primary contributions of the work reported here. The first lies in the implications for those whose practical applications will be founded on the now-dominant Stone-type IO frameworks instead of the classical Leontief interindustry accounts. The second lies in the demonstration of the value to science of modern publication policies that support reproducible research. In this section, we elaborate on both areas.

4.1 Implications for Practical Application

The need for the correction in formulation arises from the existence of secondary commodity production by industries.
Hence, for an economy whose Make table is strongly diagonal (one with very little secondary production), empirical results might well differ little as a result of our correction, but the differences will become more substantial as the ratio of off-diagonal supply-table elements to diagonal elements increases. The degree of difference for a given set of accounts is an open empirical question, in that primary and secondary production structures vary geographically. Because these metrics will most often be used in practice to identify and prioritize industries for further attention, higher ranked industries will be of most interest and greatest value to anyone carrying out this kind of analysis. Therefore, the rank order correlations over the entire vector of ranks are not as relevant to the analyst as is the ability to correctly identify the top ranked industries.

Below we present two additional perspectives on the impacts of the correct formulation that underscore its importance in practical application. First, we compute and display graphically the differences in ranks over the entire distribution of industries for which the measure has been calculated. Although correlations between the entire corrected and uncorrected upstreamness or downstreamness vectors can be quite strong, the differences in individual ranks can be quite substantial. Figure 2 presents a plot of the simple differences in ranks for vectors of correct industry-by-commodity and uncorrected commodity-by-commodity values for the upstreamness measure derived from the same 2002 U.S. data used in ACFH, with sectors ordered according to their original industry classification scheme sequence.

[Figure 2: Downstreamness: Difference in Ranks. Horizontal axis: Sectors; vertical axis: Difference.]

Next, we set up the following analysis, again using the same data. We first rank order sectors using the corrected measure values.
We assign a value of 1 for n = 1 (where rank order correlations cannot be computed), and then sequentially increase n by 1. As n increases, we select from the incorrect vector the values for the top n corrected-ranked sectors, generate ranks for values within that set of n sectors, and then compute the Spearman rank correlations between vectors of sequentially increasing lengths. The result is a set of rank correlations for sets of vectors of incrementally larger dimension. The correlations are actually best-case comparisons: in most sets, the uncorrected ranks of the selected sectors, taken from the entire set of industries, will often exceed the value of n, but for the correlations to be valid, the ranks are indexed relative to the n values in each vector in the comparison set. The result of that exercise for n = 1, ..., 40 is shown below in Figure 3, which reveals that at n = 5 the correlation is 0.2; for the top 10 ranked correctly calculated values, the correlation rises to 0.6, and by n = 40, the correlation is still only 0.65. While there is no clear way of assessing the statistical significance of these sequential comparisons, the two rank vectors are clearly not correlated strongly enough to suggest that the qualitative nature of the results differs only insubstantially, and there is certainly no support for simply ignoring the effect that the correction has on outcomes.

[Figure 3: Downstreamness: Spearman Rank Comparison for Top n Ranks. Horizontal axis: Number of elements in correlated vectors (N); vertical axis: Spearman Rank Correlation.]

Irrespective of the empirical implications, of course, a published use table in isolation provides only a partial description of an IO system. This alone is reason enough to base empirical analyses on the correct formulations. The correction is straightforward and the necessary Make table data are virtually always published in tandem, so there is no reason not to use the correct formulation.
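The sequential top-n comparison described above can be sketched in a few lines. This is one reading of the procedure, not the authors' code: it assumes that "top" means largest corrected value first, that there are no tied values, and that the input vectors hold one score per sector in the same order; all function names are illustrative.

```python
def ranks(values):
    """Rank values from 1 (largest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


def spearman(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def top_n_correlations(corrected, uncorrected, max_n):
    """Correlations for the top n corrected sectors, n = 1..max_n.

    Ranks are recomputed within each subset of n sectors, as in the text.
    """
    top = sorted(range(len(corrected)), key=lambda i: corrected[i], reverse=True)
    out = [1.0]  # n = 1: no correlation can be computed, so 1 is assigned
    for n in range(2, max_n + 1):
        idx = top[:n]
        out.append(spearman([corrected[i] for i in idx],
                            [uncorrected[i] for i in idx]))
    return out


# Tiny made-up example: five sectors, with the top three reversed.
example = top_n_correlations([5, 4, 3, 2, 1], [3, 4, 5, 2, 1], 5)
print(example)
```

Because ranks are recomputed inside each subset, a sector whose uncorrected rank over all industries is far outside the top n still receives a rank between 1 and n, which is why the text calls these best-case comparisons.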
Yet the incorrect formulation continues to be used and is becoming increasingly entrenched in the literature, as noted in Section 1.

4.2 Implications for Science and Knowledge Accumulation

“Replication is the cornerstone of science. Research that cannot be replicated is not science, and cannot be trusted either as part of the profession’s accumulated body of knowledge or as a basis for policy. Authors may think they have written perfect code for their bug-free software package and correctly transcribed each data point, but readers cannot safely assume that these error-prone activities have been executed flawlessly until the authors’ efforts have been independently verified.” (McCullough and Vinod, 2003, p. 888)

In 2004, the American Economic Review joined a growing number of other academic journals and “began requiring ‘data and code sufficient to permit replication’ of a paper’s results, which is then posted on the journal’s website.”10 This paper aligns with the spirit of these publication policies11 by providing an important course correction for those who might otherwise continue to use errant code submitted with published contributions. We have presented the correct formulations and new code for the corrected algorithm, and we have identified data consistent with the new algorithmic formulation. Our corrections also identify important conceptual considerations that future analysts will overlook at peril of arriving at faulty and misleading conclusions at best, and ill-formulated policy and programmatic recommendations at worst.

10 https://en.wikipedia.org/wiki/The_American_Economic_Review#cite_note-5 Accessed 9/2/2020.

11 “Authors of accepted papers that contain empirical work, simulations, or experimental work must provide, prior to acceptance, information about the data, programs, and other details of the computations sufficient to permit replication, as well as information about access to data and programs.” https://www.aeaweb.org/journals/policies/data-code/ Accessed 9/2/2020.

Had the original two papers merely faded in importance and use, such corrections would be less critical. On the contrary, however, as of late September of 2020, these two papers had been cited more than 260 times, with more than half (120) of these citations in the most recent 20 months, many in top economics journals.12 Those directly citing articles, in turn, had been cited more than 2,500 times, according to the Web of Science Citation Index. Further, although we have not checked each and every citing article, we have not identified a single one that rectifies the issues we have identified, and every application that we have been able to assess appears to use precisely the same errant code provided by the authors. Without an explicit correction, these and related errors will continue to propagate. Conconi et al. (2018) and Alfaro et al. (2019), for example, reused the errant Antràs code in their applications, and further replaced row and column sector names with corresponding harmonized sector product names.
This not only created the false appearance of a product-by-product transactions matrix, but also deepened conceptual confusion, because the renaming step produced some ‘duplicate’ sectors, which they then dealt with by inexplicably replacing several product columns in coefficient tables with the averages of their corresponding coefficient values.13 Had the necessary attention been paid to the precise definitions and structure of the underlying supporting data upon first publication, such compounding errors could have been avoided.

There is also increasing evidence throughout the economics literature that industries and products are being treated as though they were equivalent. Kee and Tang (2016), for example, have adopted this practice in research on geographical sources of value added in Chinese exports, in which they note that “Industries are defined according to the industry classification by the United Nations,” and footnote details on which products from the Harmonized Commodity Description and Coding Systems (HS) were their foci. But by assigning products to individual producing industries, they are implicitly assuming a kind of sectoral equivalence that simply does not exist. If industry-specific implications are important for such studies, then the bias introduced might be quite substantial and potentially important. In this and some related contexts, there might not be particularly good, readily available alternatives, but the bias that is introduced via such implicit assumptions should be made explicit.

The power of IO frameworks is being newly and appropriately recognized in a growing number of problem domains, but equally important to the development of conceptualizations and analytical tools in new application areas is the need to understand how and even why the supporting data were generated. Matching variables to constructs is as important to IO analysts as understanding data generating processes is to econometricians.
Industries and commodities are not conceptual equivalents; organization frameworks like those of Leontief and Stone have different conceptual underpinnings; and data and classification systems themselves are designed for specific purposes, as data-providing agencies strive to make clear. The U.S. Census website, for example, notes explicitly that “NAICS is an industry classification system, not a product classification system, and therefore neither intended nor well suited for this purpose.” (US Census Bureau, 2020)

The examples in this section underscore the value of publication policies that emphasize transparency and replication. At a time when science is increasingly under attack, supplying readers with the tools needed for replication is vitally important, as is carrying out such research replications and reporting on those that reveal conceptual or empirical deficiencies. Those with expertise in data generation can join algorithmic and problem domain experts to ensure the integrity of the scientific enterprise. Further, when code accompanying publications is found to be in error, journals in which it is published must curtail its propagation without delay by removing it and, when possible, replacing it with corrected code or a reference to appropriate sources. If we fail to engage in replication, if we fail to acknowledge research shortcomings, and if we fail to communicate necessary course corrections, then we relinquish all rights to the defense of science and fall sadly short of our responsibilities as scientists.

12 A sample of direct citations from 2020 alone includes Antras and de Gortari (2020); Murakami and Otsuka (2020); Kostoska et al. (2020); Shen and Zheng (2020); Li et al. (2020); Choi (2020); Peng and Zhang (2020).

13 At best, averaging two or more coefficient columns implies that the transactions values all have equal weights, despite the fact that the values initially used to standardize the columns are clearly unequal.
5 Summary

In this paper, we identified an important inconsistency in the formulation and implementation of upstreamness and downstreamness measures developed and presented in AC and ACFH. The lack of correspondence between construct and data has carried through to numerous subsequent publications identified in the introduction, necessarily generating inaccurate results. The problem arises because important differences between historical interindustry and modern commodity-by-industry IO accounting frameworks are overlooked. We develop and implement a correctly formulated and conceptually consistent alternative based on the commodity-by-industry framework that meets the objectives of the upstreamness and downstreamness measures. Our empirical demonstration makes clear the need to use the correct formulation when studying production linkages.

The options available to those studying upstream and downstream linkages in the context of supply and value chains are also worth considering, because each of these options produces a different kind of information. First, one can formulate these measures to study either commodity or industry chains. Commodity chain analyses will reveal information about the production of selected products, while industry supply chains can reveal information about the linkages among activities, specifically industries that are engaged in the production of one or more products. Industry and commodity linkages are surely different, and depending on the goal of the analysis, one or the other classification may be preferred. Second, the coefficients that define the interactions among commodities or industries can be purged of trade as in the procedures discussed here, and this provides a focus on the within-region (in this case, nation) production chains.
However, the coefficients also can be based on technical requirements irrespective of origin, and this can provide information that can be useful in assessing the production structure of an economy relative to potential development alternatives.

Contrary to complicating the accounting framework, the format of modern systems of national accounts actually enriches the possibilities for meaningful analysis, and facilitates analyses of products (commodities) and activities (industries) in economic systems. Time and effort spent deepening awareness and understanding of the underlying accounting conventions and embedded relationships can yield substantial research and application dividends.

References

Alfaro, L., Chor, D., Antràs, P., and Conconi, P. (2019). Internalizing Global Value Chains: A Firm-Level Analysis. Journal of Political Economy, 127(2):508–559.

Antràs, P. and Chor, D. (2013). Organizing the global value chain. Econometrica, 81(6):2127–2204.

Antràs, P., Chor, D., Fally, T., and Hillberry, R. (2012). Measuring the upstreamness of production and trade flows. The American Economic Review: Papers and Proceedings, 102(3):412–416.

Antras, P. and de Gortari, A. (2020). On the Geography of Global Value Chains. Econometrica, 88(4):1553–1598.

Choi, J. (2020). The Global Value Chain Under Imperfect Capital Markets. World Economy, 43(2):484–505.

Conconi, P., García-Santana, M., Puccio, L., and Venturini, R. (2018). From Final Goods To Inputs: The Protectionist Effect Of Rules Of Origin. American Economic Review, 108(8):2335–2365.

Dietzenbacher, E., Los, B., Stehrer, R., Timmer, M., and de Vries, G. (2013). The construction of world input-output tables in the WIOD project. Economic Systems Research, 25(1):71–98.

Dietzenbacher, E. and Romero, I. (2007). Production chains in an interregional framework: Identification by means of average propagation lengths. International Regional Science Review, 30(4):362–383.

Jackson, R. (1998). Regionalizing national commodity-by-industry accounts. Economic Systems Research, 10(3):223–238.

Kaplinsky, R. and Morris, M. (2001). A Handbook For Value Chain Research (Vol. 113). Ottawa: IDRC.

Kee, H. L. and Tang, H. (2016). Domestic Value Added In Exports: Theory And Firm Evidence From China. American Economic Review, 106(6):1402–1436.

Kostoska, O., Stojkoski, V., and Kocarev, L. (2020). On the Structure of the World Economy: An Absorbing Markov Chain Approach. Entropy, 22(4).

Leontief, W. W. (1936). Quantitative input and output relations in the economic systems of the United States. The Review of Economic Statistics, pages 105–125.

Li, Y., Sun, H., Huang, J., and Huang, Q. (2020). Low-End Lock-In of Chinese Equipment Manufacturing Industry and the Global Value Chain. Sustainability, 12(7).

McCullough, B. D. and Vinod, H. D. (2003). Verifying the solution from a nonlinear solver: A case study. American Economic Review, 93(3):873–892.

Miller, R. E. and Blair, P. D. (1985). Input-Output Analysis: Foundations and Extensions. Prentice-Hall, Englewood Cliffs, New Jersey, USA.

Miller, R. E. and Blair, P. D. (2009). Input-Output Analysis: Foundations and Extensions. Cambridge University Press, Cambridge, UK.

Murakami, Y. and Otsuka, K. (2020). Governance, Information Spillovers, and Productivity of Local Firms: Toward an Integrated Approach to Foreign Direct Investment and Global Value Chains. Developing Economies, 58(2):134–174.

OECD (2018). Supply and Use Indicators. https://www.oecd-ilibrary.org/content/data/f5ff195e-en.

Peng, J. and Zhang, Y. (2020). Impact of Global Value Chains on Export Technology Content of China’s Manufacturing Industry. Sustainability, 12(1).

Press Release: Nobel Media AB2020 (1973). Prize in Economic Science in Memory of Alfred Nobel to Professor Wassily Leontief. https://www.nobelprize.org/prizes/economic-sciences/1973/press-release. Accessed 2020-02-27.

Press Release: Nobel Media AB2020 (1984). Richard Stone – Facts. https://www.nobelprize.org/prizes/economic-sciences/1984/stone/facts. Accessed 2020-02-27.

Shen, C. and Zheng, J. (2020). Does Global Value Chains Participation Really Promote Skill-Biased Technological Change? Theory And Evidence From China. Economic Modelling, 86:10–18.

United Nations (1968). A System of National Accounts, volume 2 of Series F. United Nations, New York, 3rd edition.

US Census Bureau (2020). North American Industry Classification System (NAICS): What is NAICS and how is it used? https://www.census.gov/eos/www/naics/faqs/faqs.html. Accessed 2020-10-01.