Regional Research Institute Working Papers
Regional Research Institute
10-12-2020
Commodities are Not Industries! A Value Chain Example
Randall W. Jackson
Patricio Aroca
Regional Research Institute
West Virginia University
Working Paper Series
Commodities are Not Industries! A Value Chain Example
Randall W. Jackson
Director, Regional Research Institute, West Virginia University
Patricio Aroca
CEPR, Business School, Universidad Adolfo Ibáñez
Working Paper Number 2020-06
Date Submitted: October 12, 2020
Keywords: Input-Output, SNA, Value Chains, Primary and Secondary
Products
JEL Classification: F10, F14, L16, L23, C67
Commodities are Not Industries! A Value Chain Example
Randall W. Jackson*
Patricio Aroca†
October 12, 2020
Abstract
Wassily Leontief received the 1973 Nobel Prize in Economics for his 1936 introduction of input-output
accounts and for laying the foundations for decades of studies of economic structure and analyses of system-wide impacts of economic shocks (Leontief, 1936). In 1961, Richard Stone published “Input-Output and
National Accounts,” which recognized and dealt explicitly with the realities of secondary production (Stone, 1961),
and for which he received the 1984 Nobel Prize in Economics. Despite these recognitions and widespread
use and acceptance internationally and in other disciplines and public sector planning applications, the
Leontief and Stone framework gained little traction in mainstream U.S. economics. However, input-output modeling frameworks have attracted new attention in a variety of problem domains, including
environmental attribution, water use, life-cycle assessment, and supply and value chains. Yet many –
if not most – contemporary economists continue founding their work directly on the Leontief framework,
eschewing the power and versatility of Stone’s enhancements. Moreover, many are failing to understand
and appreciate the conceptual differences between the two frameworks and as a result are failing to match
constructs to variables in their newly developed analytical metrics, even in top economics journals.
In this paper, we use an increasingly common approach to value chain analysis as one of many possible
examples that demonstrate such conceptual misunderstandings, and by developing and implementing
properly formulated value chain metrics, we demonstrate both the extent of the consequences of neglecting
the Stone enhancements and the important role of reproducing results in advancing scientific knowledge.
*Director, Regional Research Institute, West Virginia University, 886 Chestnut Ridge Rd, Morgantown WV,
USA 26506.
†CEPR, Business School, Universidad Adolfo Ibáñez, Av. Padre Alberto Hurtado 750, Of. 207-C, Viña del
Mar, Chile.
In 1936, Wassily Leontief introduced the world to input-output accounting (Leontief, 1936). According to
the press release announcing his selection as Nobel Prize awardee,
This important innovation has given to economic sciences an empirically-useful method to highlight the general interdependence in the production system of a society. In particular, the method
provides tools for a systematic analysis of the complicated interindustry transactions in an economy. (Press Release: Nobel Media AB2020, 1973)
In building on Leontief’s foundations, Sir Richard Stone realized that secondary production by industries
created consistency problems for national accounting and developed a critical refinement to the Leontief
framework that treated industry use of commodities independently from commodity composition of industry
output. A set of explicit accounting identities and algebraic manipulations support the construction of flows
matrices that parallel the Leontief transactions matrices. This system of accounting eventually formed the
basis for the United Nations Systems of National Accounts (United Nations, 1968).1 In addition to supporting
the construction of interindustry flow matrices, Stone’s framework further enabled representations of intercommodity, commodity-by-industry, and industry-by-commodity accounts. “For having made fundamental
contributions to the development of systems of national accounts and hence greatly improved the basis for
empirical economic analysis,” Stone was awarded the 1984 Nobel Prize in Economics (Press Release: Nobel
Media AB2020, 1984).
Despite these formal recognitions, widespread use in other disciplines and in public sector planning and
investment decisions, and broad international acceptance, neither the Leontief nor the Stone
framework gained much traction in mainstream U.S. economics. Now, however, input-output models have
attracted new attention in a variety of problem domains, such as life cycle accounting and inventorying,
pollution emissions attribution, foreign content analysis, and value chains. Curiously, it is the Leontief
inter-industry framework that often serves as the foundation for work by many – if not most – contemporary economists, despite the added power and versatility of Stone’s enhancements. Moreover, many of
those engaged in empirical applications fail to recognize the conceptual
differences between the two frameworks and their underlying data definitions and structures. Because nearly
all nations now publish input-output data in the Stone framework, these researchers fail to match constructs to
variables when developing and implementing their new analytical metrics, even in top economics
journals.
In this paper, we use an approach to the analysis of value chains that has gained substantial momentum
as one among many possible examples that demonstrate the consequences of such conceptual misunderstandings, and by developing and implementing properly formulated value chain metrics, we demonstrate
the ramifications of neglecting the Stone enhancements and their conceptual underpinnings. We discuss
these issues in the context of value chain metrics, then take advantage of the Stone framework to develop
conceptual and methodological consistency and demonstrate the consequences empirically by comparing the
results from correct formulations to those from the incorrect formulations being replicated in the literature.
We show that incorrect formulations can have substantial empirical manifestations.
Our primary objective is to draw researchers’ attention back to Stone’s developments and call for the effort
necessary to understand properly the conceptual and theoretical foundations of modern systems of national accounts. Not only can the neglect of the differences in conceptual underpinnings of the two accounting
systems lead to theoretical, conceptual, and empirical inconsistencies, but the additional information in the
Stone supply-use framework can sometimes provide excellent avenues for problem solutions that are simply
not available when analysis is limited only to Leontief accounts.
Our secondary objective is to highlight and support the critical role of modern publication policies that
require that authors submit with their publications the code and data that enable reproduction of results.
These resources can be extremely effective in supporting the knowledge-building enterprise and the kinds
of course corrections that ensure the integrity of and role for science in modern society. These policies are
vitally important and scientists have a corresponding obligation to engage in reproducing published results
and reporting issues encountered in the process.
1 The UN SNA is not alone in adopting this framework. The OECD has adopted the Supply-Use framework (OECD, 2018),
and the recently developed World Input-Output Database (Dietzenbacher et al., 2013) organizes its data similarly.
The remainder of the paper is organized as follows. In the next section we introduce recent developments in
value chain literature, and in so doing we begin to identify the confusion that can arise from a less than comprehensive understanding of the Stone framework and its underlying definitional and structural foundations.
In section 2 we review the salient features of the Stone accounting framework, provide a corrected value chain
measure that parallels the incorrectly implemented upstreamness measure in the literature, and demonstrate
the consequences empirically. Section 3 returns to the basic definition of value chains and demonstrates
one kind of analytical approach that can only be supported by the Stone framework, reinforcing its power,
flexibility, and wider range of potential applications relative to what can be developed and implemented in a Leontief framework. Section 4 discusses implications for practical application and for science and knowledge
accumulation, and the final section summarizes the paper’s contributions.
1 Value Chains
The characterization of global supply chains is a topic that has gained visibility and importance in recent
literature. Various authors have approached supply and value chains from perspectives that include business
transaction optimization, economic development policy, and property rights. An important case in point
is a paper by Antràs and Chor (2013) (AC) that addressed the problem of measuring the degree to which
industries are upstream or downstream in the global value chain, including an algorithm that they developed
and used for these purposes. Since the 2013 AC publication and its companion, Antràs et al. (2012) (ACFH),
a growing number of publications have built upon the incorrect measurement formulation and accompanying
computer code. Unfortunately, the AC and ACFH algorithmic implementation fails to recognize and account for the distinctions between historical input-output (IO) accounting frameworks based on interindustry
flows matrices on the one hand, and modern supply–use, or commodity-by-industry accounting frameworks
on the other. Further, the realities of modern accounting frameworks make explicit the opportunity and need
for researchers to identify differences in goals and objectives that might lead to the selection of alternative
flows matrix formulations for identifying value chain upstreamness and downstreamness.
These two papers have introduced to the field an approach that overlooks the fundamental accounting
relationships and identities of the current generation of IO accounting frameworks. In so doing, they have
laid a potentially problematic foundation for a set of upstream and downstream production system linkage
measures that is based on Leontief foundations but mis-applies data organized under the Stone accounting
system. Whereas the availability of supporting code has facilitated the replication of the error by those who
adopted it without scrutiny, it has also facilitated the critical assessment and corrective actions presented
below.
1.1 Commodity Versus Industry: A Necessary Distinction
The statement “\(d_{ij} Y_j\), is precisely the value of commodity i used in j’s production” (ACFH, page 414) refers
to their formulation of the coefficient \(d_{ij}\) as an element of a Use table, and \(Y_j\) as industry j output. However,
this definition is somewhat obscured by the fact that industries produce more than one commodity. Hence,
\(d_{ij}\) would have a very different interpretation, and a different value, if it were drawn from a commodity-by-commodity IO table, in which the denominator would be the value of commodity rather than industry
output. The choice of the denominator is critical to the interpretation and the value of \(d_{ij}\), especially
because the gross output of industry j can be very different from the gross output of commodity j. For
example, the most upstream industry in their analysis based on 2002 U.S. data is the Petrochemical sector.
Petrochemical commodity gross output is 22% larger than petrochemical industry gross output. This is
because other industries also produce petrochemical commodities as a secondary product. The assumption
that commodity and industry gross output are the same is quite clearly invalid.
The definitions of measures in AC are equally imprecise. For example, their “first measure is the ratio of
the aggregate direct use to the aggregate total use (DUse_TUse) of a particular industry i’s goods, where
the direct use for a pair of industries (i, j) is the value of goods from industry i directly used by firms in
industry j to produce goods for final use, while the total use for (i, j) is the value of goods from industry i
used either directly or indirectly (via purchases from upstream industries) in producing industry j’s output
for final use” (p. 2131). To be accurate, the direct use definition would need to refer to an
industry-by-industry intermediate transaction matrix, and the total use definition would need to refer to
an industry-by-industry Leontief inverse matrix post-multiplied by a diagonal matrix of the total output
by industry. Instead, they draw their data from a commodity-by-industry Use matrix to implement their
measures. Neither reference to “the value of goods from industry i” is accurate when the unadjusted Use
matrix is the data source. The Use matrix is neither a flows matrix nor an industry-by-industry matrix.
To emphasize the importance of these distinctions, focus again on the Petrochemicals industry and commodity. Table 1 of Antràs et al. (2012), reproduced below as Table 1, identifies Petrochemicals (325110) as
the industry with the highest upstreamness measure. However, the 2002 Make matrix companion to the Use
table used for that analysis indicates that petrochemical commodities (products) account for only 36% of
this industry’s total gross output. The remainder of the petrochemical industry’s gross output is secondary
production, i.e., production of other commodities. Likewise, the petrochemicals industry produces only 44%
of petrochemicals commodity output. The rest of the petrochemicals commodity production is secondary
production from other industries.
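The quantities discussed above can all be read off a Make matrix. The following is a minimal sketch using a hypothetical two-sector Make matrix; the numbers and the variable names are illustrative assumptions and are not the 2002 U.S. data. Row sums give industry gross output, column sums give commodity gross output, and the diagonal gives each industry's output of its own primary commodity.

```python
import numpy as np

# Hypothetical Make matrix V (illustrative values, not the 2002 U.S. data):
# rows = industries, columns = commodities.
V = np.array([
    [80.0, 20.0],   # industry 1: primary commodity 1, secondary commodity 2
    [40.0, 60.0],   # industry 2: secondary commodity 1, primary commodity 2
])

g = V.sum(axis=1)            # industry gross output (row sums)
q = V.sum(axis=0)            # commodity gross output (column sums)
primary = np.diag(V)         # each industry's output of its own primary commodity

# Share of each industry's gross output accounted for by its primary commodity
share_of_industry_output = primary / g
# Share of each commodity's gross output produced by its primary industry
share_of_commodity_output = primary / q

print("g =", g, "q =", q)
print("q / g =", q / g)
print("primary share of industry output:", share_of_industry_output)
print("primary producer share of commodity output:", share_of_commodity_output)
```

In this toy case the like-numbered industry and commodity totals differ by 20 percent in either direction, even though total output is the same in both spaces.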
Table 1: Least and Most Upstream Industries (Manuf.)

US IO2002 industry                          Upstreamness
---------------------------------------------------------
Automobile (336111)                         1.000
Light truck and utility vehicle (336112)    1.001
Nonupholstered wood furniture (337112)      1.005
Upholstered household furniture (337121)    1.007
Footwear (316200)                           1.007
---------------------------------------------------------
Alumina refining (33131A)                   3.814
Other basic organic chemical (325190)       3.853
Secondary smelting of aluminum (331314)     4.064
Primary smelting of copper (331411)         4.355
Petrochemical (325110)                      4.651

Source: Authors’ calculations, replicating Antràs et al. (2012, p. 415).
Secondary production is precisely the reason why the commodity-by-industry framework for IO accounting
was developed. Ignoring these definitional differences can result in substantial errors, mis-attributions, and
misinterpretations. Therefore, we clarify and make explicit the relevant definitions and describe the ways in
which the variables should be used in studying value chains. Analysts’ choices of model structure will depend
on whether the goal of the analysis is to identify the commodities used to produce other commodities or the
industries involved in the production of commodities. If it is the former, then a commodity-by-commodity IO specification would be appropriate, while if it is the latter, one would presumably use
an industry-by-commodity IO table. Likewise, an industry-by-industry model would be used when focusing
explicitly on interactions among industries.2 We review below the ACFH presentation so that we can assess
whether the definition that is used in the paper relates to any of these alternatives, and if not, whether it is
appropriate for their demonstration analysis.
1.2 The Upstreamness Measure
To develop their upstreamness measure, ACFH “begin by considering an N-industry closed economy with
no inventories. For each industry \(i \in \{1, 2, \ldots, N\}\), the value of gross output (\(Y_i\)) equals the sum of its use as
a final good (\(F_i\)) and its use as an intermediate input to other industries (\(Z_i\))” (p. 412).
\[ Y_i = F_i + Z_i = F_i + \sum_{j=1}^{N} d_{ij} Y_j \qquad (1) \]
2 A post-processing option for some policy purposes might be to conduct the analysis in a commodity-by-commodity framework, and then convert to industry space using the standardized Make table to transform from commodity output to industry
output, as described in Section 2, below.
They define \(d_{ij}\) as “the dollar amount of sector i’s output needed to produce one dollar’s worth of industry
j’s output.” This balance equation is correctly presented, provided that all variables are defined in industry
space, i.e., \(Y\), \(F\), and \(Z\) are industry rather than commodity sectors, and that the term “sector i” in the
definition of \(d_{ij}\) refers to industry sector i.
They next derive ACFH equation (2), which is the Leontief inverse matrix expressed as an infinite series of terms
that can be reduced to \(Y = (I - D)^{-1} F\), where \((I - D)^{-1}\) is the traditional Leontief inverse. Based on
these relationships and definitions, they then present their upstreamness measure, and describe it as “the
(weighted) average position of an industry’s output in the value chain, by multiplying each of the terms in”
(p. 413) the infinite series by their distance from final use plus one and dividing by the gross output of the
industry. For industry i, this yields
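\[ U_{1i} = 1 \cdot \frac{F_i}{Y_i} + 2 \cdot \frac{\sum_{j=1}^{N} d_{ij} F_j}{Y_i} + 3 \cdot \frac{\sum_{j=1}^{N} \sum_{k=1}^{N} d_{ik} d_{kj} F_j}{Y_i} + 4 \cdot \frac{\sum_{j=1}^{N} \sum_{k=1}^{N} \sum_{l=1}^{N} d_{il} d_{lk} d_{kj} F_j}{Y_i} + \cdots \qquad (2) \]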
Note that the denominator has now changed from \(Y_j\) to \(Y_i\), a step that shifts the context from purchases
to sales, and that they justify by noting that consideration should be taken of industries that sell to other
upstream industries. They go on to provide a computational reduced form for this measure, after establishing
its equivalence to Fally’s (2011) upstreamness measure, which has the following compact expression:
\[ U_1 = (I - \Delta)^{-1} \mathbf{1} \qquad (3) \]
“where \(\Delta\) is the matrix with \(d_{ij} Y_j / Y_i\) in entry (i, j) and \(\mathbf{1}\) is a column vector of ones.” Finally, they “take the
reciprocal to obtain DownMeasure for each industry i” (Antràs and Chor, 2013, p. 2162). They go on to
provide two economic interpretations for these upstreamness measures in which the emphasis appears to be
linkages among industries and not products.3,4
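The equivalence between the weighted power series in equation (2) and the compact form in equation (3) can be checked numerically. The sketch below assumes a hypothetical three-industry closed economy defined entirely in industry space; the coefficient matrix, the final use vector, and the variable names are illustrative assumptions, not values from ACFH.

```python
import numpy as np

# Hypothetical 3-industry closed economy, defined entirely in industry space.
# D[i, j] = dollars of industry i output needed per dollar of industry j output.
D = np.array([
    [0.10, 0.20, 0.00],
    [0.30, 0.10, 0.20],
    [0.00, 0.10, 0.10],
])
F = np.array([50.0, 30.0, 80.0])     # final use by industry
I = np.eye(3)

Y = np.linalg.solve(I - D, F)        # gross output: Y = (I - D)^{-1} F

# Delta[i, j] = d_ij * Y_j / Y_i, the matrix used in equation (3)
Delta = D * Y[np.newaxis, :] / Y[:, np.newaxis]
U_closed = np.linalg.solve(I - Delta, np.ones(3))

# Equation (2): weight the n-th round of the power series by (n + 1)
U_series = np.zeros(3)
term = F.copy()                      # round 0: D^0 F
for n in range(200):
    U_series += (n + 1) * term / Y
    term = D @ term                  # next round: D^(n+1) F

print(np.allclose(U_closed, U_series))   # prints True
```

The check succeeds only because numerators and denominators are both in industry space, which is the condition the balance equation (1) requires.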
The conceptual inconsistency arises in the implementation of the upstreamness measure, which proceeds by
replacing \(d_{ij} Y_j\) in matrix \(\Delta\) with \(U_{ij}\), the (i, j)th cell of the Use matrix. Contrary to ACFH footnote 3, which
states that “the coefficient \(d_{ij}\) is computed as the total purchases by industry j of industry i’s output,”
they draw coefficients from the Use table, which depicts the total purchases by industry j of commodity
i, and not simply purchases from industry i. More than one industry can produce any given commodity,
and the difference between industry and commodity output – as we have seen – can be substantial. More
importantly, because commodities are produced by multiple industries, the Use matrix is not a conventional
flows matrix as were the historical interindustry transactions matrices. While the columns of the Use matrix
can be conceived of as destinations, the rows identify the commodities that each industry uses, but not the
producing industries.
The inconsistency can be clarified further by returning to the derivation of the reduced form upstreamness
expression from the power series expansion, which was derived from the accounting identity
\(Y_i = F_i + Z_i = F_i + \sum_{j=1}^{N} d_{ij} Y_j\). Consider the \(d_{il} d_{lk} d_{kj}\) term of ACFH equation (2), where these d coefficients are defined
as ratios of industry input dollars per industry output dollar. If we assign values of .1, .2, and .3 to these
coefficients, then every dollar of output from industry j will require $0.3 of input k, which will create a
requirement for $(.2)(.3) = $0.06 of input l, which will then require $(.1)(.2)(.3) = $0.006 of input i for its
production. The numerators and denominators have the same dimension (industry $), so the interpretation
3 We note that Dietzenbacher and Romero (2007) developed and reported measures that are virtually identical to these
measures.
4 The authors also make an open-economy adjustment, justified by noting that the data used to construct their matrix of
US IO coefficients “do not distinguish between flows of domestic goods and international exchanges” (p. 414). The result is
an adjustment factor for the IO coefficients matrix that transforms its interpretation from a technical relationship to a trade
relationship. The adjustment factor is the ratio of domestic output of industry i to domestic use (absorption) of industry i
output.
of the product is clear and consistent. If, however, the \(d_{ij}\) coefficients have industry denominators but
commodity numerators, then the product now reflects commodity i required to produce industry l output
times commodity l required to produce industry k output times commodity k required to produce industry
j output. But because each of these industries produces secondary products, the one-to-one relationship
is lost; the product of this multiplication makes sense dimensionally, and therefore has
an unambiguous interpretation, only if these industries produce only their own
commodities, which would mean that industry and commodity output would have to be identical. This
kind of system would be reflected in a Make table with nonzero elements only on the diagonal. Were these
coefficients defined with commodity terms in both numerator and denominator, they would be interpreted
as commodity i required in the production of commodity l times commodity l required in the production
of commodity k times commodity k required to produce commodity j, and this would be dimensionally
consistent.
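The three-step chain in the example above is simple arithmetic, and it can be traced one round at a time. The following sketch uses the coefficient values .1, .2, and .3 given in the text; the variable names are illustrative.

```python
# Coefficients from the worked example: input dollars per output dollar.
d_il, d_lk, d_kj = 0.1, 0.2, 0.3

need_k = d_kj              # input k required per dollar of j output
need_l = d_lk * need_k     # input l required to produce that much k
need_i = d_il * need_l     # input i required to produce that much l

print(round(need_k, 3), round(need_l, 3), round(need_i, 3))   # 0.3 0.06 0.006
```

Each multiplication is meaningful only when the output in one coefficient's denominator is the same object as the input in the next coefficient's numerator, which is the dimensional condition discussed above.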
The crux of the problem is that commodity required to satisfy industry demand results not only in the
production of the industry’s primary commodity, but also the production of secondary commodities, and
this happens at every term in the power series expansion, resulting in the loss of ability to trace commodities
unambiguously through the supply chain.
2 The Stone Framework and Upstreamness Reformulation
The values in the Use table are associated behaviorally with columns. They represent column industry
requirements of row commodity inputs, so standardizing by row commodity output values is not particularly
useful. This does not render the development of an ACFH-type upstreamness measure intractable, of course.
The modern accounting system that is the commodity-by-industry framework was devised precisely to accommodate the need to work analytically with systems in which industries produce multiple commodities.
Indeed, developing the linkage matrices in industry-by-industry, commodity-by-commodity, industry-by-commodity, and even commodity-by-industry format is possible using precisely the same database that ACFH
used to implement their measure. We provide below the fundamental accounting equations that support the
construction of these requirements coefficients tables.
2.1 Conceptual Reformulation
The commodity-by-industry framework is presented below in Figure 1. In conventional IO notation (as in
Miller and Blair, 1985, 2009), the matrix partition \(U = [u_{ij}]\) is the Use matrix, \(V = [v_{ij}]\) is the Make matrix,
e is commodity final demand expressed here as a single vector, q is commodity gross output, and g is industry
gross output. Only in highly unusual cases will an industry produce no secondary commodities, so rarely
will \(q_i\) and \(g_i\) be equal. The va term denotes value added.
The traditional industry output balance equation that ACFH write as \(Y_i = F_i + Z_i\) actually has no simple and
direct counterpart in the modern accounting framework (although one can be derived, it requires assumptions
about secondary production technology and information contained in V).5 We can, however, express a
commodity output balance equation in this conventional notation as \(q_i = \sum_{j=1}^{N} u_{ij} + e_i\), and we can further
define \(d_{ij} = u_{ij} / g_j\) and substitute to obtain \(q_i = \sum_{j=1}^{N} d_{ij} g_j + e_i\), maintaining the output balance.
Figure 1: The Commodity-Industry Framework
Source: Adapted from United Nations (1968)
5 The industry output balance equation is conventionally denoted X = Y + Z.
In matrix notation, we have the following identities:
\[ U\mathbf{i} + e \equiv q \qquad (4) \]
\[ V\mathbf{i} \equiv g \qquad (5) \]
\[ V'\mathbf{i} \equiv q \qquad (6) \]
where i is a summing vector, and ′ signifies the transpose operation. We define behavioral relationships as
follows:
\[ B = U\hat{g}^{-1} \qquad (7) \]
\[ U = B\hat{g} \qquad (8) \]
\[ D = V\hat{q}^{-1} \qquad (9) \]
\[ V = D\hat{q} \qquad (10) \]
where the hat (ˆ) indicates diagonalization. Equation 7 defines the production requirements of commodities per industry
output dollar, and equation 9 is a statement of the industry-based technology assumption that commodities
are produced by industries in fixed proportions.6 Note that pre-multiplication of a commodity
vector or matrix by D results in a transformation from commodity-space to industry-space, so \(V\mathbf{i} = g = Dq\).
This system allows us to formulate the following commodity balance equation:
\[ q = Bg + e \qquad (11) \]
\[ q = BDq + e \qquad (12) \]
\[ q = (I - BD)^{-1} e \qquad (13) \]
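The identities (4) through (6), the coefficient definitions (7) and (9), and the solution in equation (13) can be verified on a small example. The sketch below assumes hypothetical two-commodity, two-industry Use and Make matrices; the numbers are illustrative, and final demand is backed out from identity (4) so that the accounts balance by construction.

```python
import numpy as np

# Hypothetical 2-commodity, 2-industry accounts (illustrative values only).
U = np.array([[10.0, 30.0],     # Use: rows = commodities, columns = industries
              [20.0, 10.0]])
V = np.array([[80.0, 20.0],     # Make: rows = industries, columns = commodities
              [40.0, 60.0]])

g = V.sum(axis=1)               # identity (5): V i = g
q = V.sum(axis=0)               # identity (6): V' i = q
e = q - U.sum(axis=1)           # identity (4) rearranged: e = q - U i

B = U / g                       # equation (7): column j of U divided by g_j
D = V / q                       # equation (9): column j of V divided by q_j

# D maps commodity space to industry space: D q = g
print(np.allclose(D @ q, g))

# Equation (13): solve for commodity output from commodity final demand
BD = B @ D                      # commodity-by-commodity requirements matrix
q_solved = np.linalg.solve(np.eye(2) - BD, e)
print(np.allclose(q_solved, q))
```

Because each column of D sums to one, D allocates every dollar of commodity output across the industries that produce it, which is what allows the industry column dimension of B to be carried into commodity space.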
The BD term is a commodity-by-commodity requirements matrix counterpart to the classical, column-standardized interindustry Leontief IO coefficients matrix. Equation 13 is the reduced form solution for q as
a function of commodity final demand, e. This expression, however, is consistent with the Leontief demand-driven
formulation, in which the values in the Use matrix are divided by their respective column industry
output values. In contrast, the upstreamness measure in AC (and ACFH) shown in equation 3 is formed by
dividing the elements in the Use matrix by respective row commodity output, although they refer to this
as row industry output.7 To convert the BD matrix to an equivalent and correctly formulated (Ghoshian)
matrix, we can return the BD coefficients to their transactions values using UD, then standardize the rows
by commodity output, q, yielding \(\hat{q}^{-1} U D\). The difference between the incorrect \(\hat{q}^{-1} U\) formulation and
the correct reformulation is quite clearly the D term, which is the essential mechanism that transforms the
industry column dimension of the Use matrix to the commodity output space of vector q. The other difference
is that ACFH convert commodity output to commodity absorption by netting out trade and inventories,
which can be implemented similarly in the new commodity-by-commodity requirements matrix reformulation
by adjusting q for net trade and inventory adjustment before using it as a standardizing vector.8
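The effect of the D term can be seen by computing the compact upstreamness expression of equation (3) under both row-standardized matrices. The sketch below is illustrative only: it uses hypothetical two-sector Use and Make matrices, follows the expressions \(\hat{q}^{-1} U\) and \(\hat{q}^{-1} U D\) as written in the text, and omits the trade and inventory adjustment. The function name `upstreamness` is an assumption introduced here and is not taken from the ACFH code.

```python
import numpy as np

def upstreamness(delta):
    """Solve (I - delta) u = 1, the compact form in equation (3)."""
    n = delta.shape[0]
    return np.linalg.solve(np.eye(n) - delta, np.ones(n))

# Hypothetical accounts (illustrative values only).
U = np.array([[10.0, 30.0],     # Use: commodities x industries
              [20.0, 10.0]])
V = np.array([[80.0, 20.0],     # Make: industries x commodities
              [40.0, 60.0]])

q = V.sum(axis=0)               # commodity gross output
D = V / q                       # equation (9)

delta_use_only = U / q[:, np.newaxis]             # q^-1 U: rows divided by q_i
delta_reformulated = (U @ D) / q[:, np.newaxis]   # q^-1 U D

u_use_only = upstreamness(delta_use_only)
u_reformulated = upstreamness(delta_reformulated)
print(u_use_only, u_reformulated)

# With no secondary production the Make matrix is diagonal, D is the
# identity, and the two formulations coincide.
V_diag = np.diag(q)
D_diag = V_diag / V_diag.sum(axis=0)
u_diag = upstreamness((U @ D_diag) / q[:, np.newaxis])
print(np.allclose(u_diag, u_use_only))
```

The two score vectors differ whenever the Make matrix has off-diagonal entries, which is the situation in published supply-use tables.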
6 The alternative is the commodity-based technology assumption, which while not used here, could be developed in parallel
fashion.
7 We confirm the authors’ stated intent via reference to the publications, but we confirmed their implementations by evaluating
the code that accompanies those publications.
8 The B D̃ matrix, where D̃ is the Make matrix standardized by q adjusted for net trade and inventory, would be used in the
computation of the counterpart, commodity-by-commodity downstreamness measure used in Antràs and Chor (2013). Although
we have not addressed explicitly the derivation and use of the counterpart interindustry rather than inter-commodity measures,
the development would follow a similar path but would be based on row- or column-standardization of the interindustry
transactions matrix D̃U . This formulation based on the modification of the D matrix is introduced in Jackson (1998).
2.2 Empirical Example
With the conceptual distinction established, we turn in this section to a demonstration of the empirical
consequences for the computed measures reported in ACFH. Table 2 displays upstreamness scores replicated
using the ACFH algorithm and data and using the corrected algorithm presented in this paper and the ACFH
data. The top seven rows of data report results for the commodity sectors whose scores rank among the top
five using either method, and the last six rows correspond to the five lowest ranking commodity sectors using
either method. For each sector, the first data column reports the ACFH score, the second its ACFH
rank, the third its score from the reformulated algorithm, and the fourth its
rank under the reformulated algorithm.
Table 2: Upstreamness measure comparisons

                                                 ACFH             Corrected C×C
NAICS    Sector                                  Score    Rank    Score    Rank
--------------------------------------------------------------------------------
325110   Petrochemical mfg                       4.6511      1    4.1785      4
331411   Primary copper smelting and refining    4.3547      2    6.4031      1
331314   Secondary aluminum smelt & alloying     4.0637      3    3.9991      6
325190   Other basic organic chemical mfg        3.8529      4    3.5391     15
33131A   Alum. primary prodn and refining        3.8144      5    4.9632      3
331200   Steel product from purchased steel      3.45       16    4.0065      5
331419   Prim. nonfer. metal smelt & refn        3.4186     18    5.6283      2
--------------------------------------------------------------------------------
336111   Automobile mfg                          1.0003    279    1.0004    279
336112   Light truck and utility vehicle mfg     1.0005    278    1.0008    278
337122   Wood HH furniture mfg.                  1.0052    277    1.0072    277
337121   Upholstered HH furniture mfg            1.0072    276    1.008     276
316200   Footwear mfg                            1.0073    275    1.0454    267
336213   Motor home mfg                          1.0123    274    1.0129    275

Source: Antràs et al. (2012) and authors’ calculations. Largest ranking differences in bold.
Note first that there is a good deal of commonality. The ranks in the lowest ranking sectors are quite
similar, which would be expected a) because both formulations rank commodities and not industries, b) because of the general correspondence between industries and commodities, especially for industries producing
consumer products, although even here, Footwear Manufacturing shows an 8-point difference in ranks, and
c) because sectors with the lowest scores are virtually never used as intermediate inputs and are therefore
much less distinguishable in their upstreamness scores. There is somewhat greater discrepancy in the highest
ranked sectors, however, where rank differences are as great as 16 among only the five most highly ranked
upstreamness sectors. Indeed, the average difference in ranks is 7, due to the notable ranking shifts for Other
Basic Organic Chemical Manufacturing, Steel Product Manufacturing from Purchased Steel, and Primary
Smelting and Refining of Nonferrous Metals (except copper and aluminum).
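The rank comparisons cited above can be recomputed directly from the rank columns of Table 2. The sketch below copies the ranks for the seven highest ranking sectors and for Footwear Manufacturing from the table; the variable names are illustrative.

```python
# Ranks copied from Table 2, top seven rows (ACFH rank, corrected C×C rank).
acfh_rank = [1, 2, 3, 4, 5, 16, 18]
corrected_rank = [4, 1, 6, 15, 3, 5, 2]

diffs = [abs(a - c) for a, c in zip(acfh_rank, corrected_rank)]
mean_diff = sum(diffs) / len(diffs)

print(diffs)                 # [3, 1, 3, 11, 2, 11, 16]
print(max(diffs))            # 16
print(round(mean_diff, 1))   # 6.7

# Footwear mfg, from the lower panel of Table 2
footwear_diff = abs(275 - 267)
print(footwear_diff)         # 8
```

The mean absolute difference of about 6.7 rounds to the average of 7 reported in the text.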
The reformulation reveals substantial differences. Because of the multi-commodity reality of industry production, the second and fifth highest ranking upstreamness sectors using the correct method are not even in
the top 15 sectors using the earlier formulation. Secondary production strongly influences the upstreamness
measures, and cannot be ignored in implementation.
To demonstrate the inaccuracies introduced in counterpart downstreamness measures, we replicate AC’s
results for their DownMeasure (AC, page 2163) and provide in Table 3 corrected rankings for their highlighted
industries. In Table 4 we provide the correctly ranked and highlighted industries that we obtain when using
the correctly formulated commodity-by-commodity matrices along with the AC rankings. As with the
upstreamness comparisons, there is some substantial agreement, but there also are some very substantial
differences.
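Because DownMeasure is defined as the reciprocal of upstreamness, the AC values for the five most upstream sectors can be cross-checked against the ACFH scores in Table 2. The sketch below copies the published four-decimal values from Tables 2 and 3; since both inputs are rounded, agreement is tested only to within a small tolerance, which is an assumption of this check.

```python
# ACFH upstreamness scores (Table 2) and AC DownMeasure values (Table 3)
# for the five sectors that head both tables.
sectors = ["325110", "331411", "331314", "325190", "33131A"]
upstreamness_acfh = [4.6511, 4.3547, 4.0637, 3.8529, 3.8144]
downmeasure_ac = [0.2150, 0.2296, 0.2461, 0.2595, 0.2622]

for code, u, d in zip(sectors, upstreamness_acfh, downmeasure_ac):
    reciprocal = 1.0 / u
    # Tolerance of 1e-4 allows for rounding in the published tables
    print(code, round(reciprocal, 4), d, abs(reciprocal - d) < 1e-4)
```

The same check is not applied to the corrected columns here, because the corrected DownMeasure also incorporates the adjusted standardizing vector described in footnote 8.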
Table 3: DownMeasure Comparisons: AC Results

                                             AC                      Corrected C×C
Industry                                     DownMeasure    Rank     Rank
-----------------------------------------------------------------------------------
Lowest 10 values
325110   Petrochemical mfg                   0.2150          253      250
331411   Primary copper smelt & refining     0.2296          252      253
331314   Secondary aluminum smelt & alloy    0.2461          251      247
325190   Other basic organic chemical mfg    0.2595          250      241
33131A   Primary alumina refn and prodn      0.2622          249      251
325310   Fertilizer mfg                      0.2658          248      248
335991   Carbon and graphite product mfg     0.2668          247      246
325181   Alkalies and chlorine mfg           0.2769          246      224
331420   Copper rolling, drawing, etc.       0.2769          245      244
325211   Plastics material and resin mfg     0.2800          244      219
-----------------------------------------------------------------------------------
Highest 10 values
339930   Doll, toy, and game mfg             0.9705           10       20
311111   Dog and cat food mfg                0.9717            9        7
337910   Mattress mfg                        0.9720            8        8
315230   Women’s and girls’ apparel mfg      0.9762            7       17
321991   Manufactured home mfg               0.9810            6        6
336212   Truck trailer mfg                   0.9837            5        5
336213   Motor home mfg                      0.9879            4        4
316200   Footwear mfg                        0.9927            3        3
337121   Upholstered HH furniture mfg.       0.9928            2        2
336111   Automobile mfg                      0.9997            1        1

Source: Antràs and Chor (2013) and authors’ calculations. Largest rank differences in bold.
Table 4: DownMeasure Comparisons: Corrected C×C Results

Code    Industry                              Corrected C×C DownMeasure  Corrected C×C Rank  AC Rank
Lowest 10 Values
331411  Primary copper smelt & refining       0.1597                     253                 252
331419  Primary nonferrous smelt & refn       0.1804                     252                 236
33131A  Primary alumina refn and prodn        0.2049                     251                 249
325110  Petrochemical mfg                     0.2467                     250                 253
331200  Steel product mfg from purch steel    0.2560                     249                 238
325310  Fertilizer manufacturing              0.2698                     248                 248
331314  Secondary aluminum smelt & alloy      0.2703                     247                 251
335991  Carbon and graphite product mfg       0.2735                     246                 247
333612  Industrial drive and gear mfg         0.2743                     245                 205
331420  Copper rolling, drawing, etc.         0.2750                     244                 245
Highest 10 Values
311230  Breakfast cereal manufacturing        0.9629                     10                  15
336612  Boat building                         0.9700                     9                   11
337910  Mattress manufacturing                0.9725                     8                   8
311111  Dog and cat food manufacturing        0.9786                     7                   9
321991  Manufactured home mfg                 0.9818                     6                   6
336212  Truck trailer manufacturing           0.9855                     5                   5
336213  Motor home manufacturing              0.9874                     4                   4
316200  Footwear manufacturing                0.9917                     3                   3
337121  Upholstered household furniture mfg   0.9931                     2                   2
336111  Automobile manufacturing              0.9996                     1                   1

Source: Antràs and Chor (2013) and authors’ calculations. Largest rank differences in bold.
3 Consistency with the Value Chain Construct
Although we have provided a correct commodity-by-commodity upstreamness measure, the appropriateness
of interpretation remains unaddressed. According to Kaplinsky and Morris (2001), a simple value chain
“describes the full range of activities which are required to bring a product or service from conception, through
the different phases of production (involving a combination of physical transformation and the input of
various producer services), delivery to final consumers, and final disposal after use.” In our Stone accounting
framework, industries are the activities and commodities are the products. Therefore, a formulation that
more precisely captures the spirit of the value chain definition would be an industry-by-commodity framework
rather than a commodity-by-commodity framework, because rather than simply identifying the commodity
production required to bring a commodity to market, the industry-by-commodity framework quantifies the
range of activities by industry.9 Further, adding other dimensions of the “range of activities,” such as
employment and income, or ancillary variables like water use or emissions, requires that the activity levels
be quantified by industry and not by commodity. For these reasons, the industry-by-commodity formulation
is preferred when quantifying activity levels. Simply knowing what commodities are required in a value chain
falls short of identifying required activity.
The power of the Stone framework as an analytical tool is demonstrated here yet again, as the transformation
of a Ghoshian- or Leontief-type framework from commodity-by-commodity to industry-by-commodity space
is straightforward and can be achieved simply by pre-multiplying the respective inverse matrices by D, or
$$D(I - \hat{q}^{-1} U D)^{-1} \tag{14}$$
for the Ghoshian-type formulation and
$$D(I - BD)^{-1} \tag{15}$$
for a Leontief-type inverse transformation.
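As a minimal numerical sketch of expressions (14) and (15), the pre-multiplication by D can be carried out on a small set of accounts. The two-industry, two-commodity Make and Use tables below are hypothetical, and the definitions B = U x̂⁻¹ and D = V q̂⁻¹ (with x the industry outputs and q the commodity outputs) are assumed here from common commodity-by-industry conventions rather than taken from the authors' code.

```python
import numpy as np

# Hypothetical 2-industry, 2-commodity accounts (illustrative numbers only).
V = np.array([[90.0, 10.0],    # Make: industries (rows) x commodities (columns)
              [0.0, 100.0]])
U = np.array([[10.0, 20.0],    # Use: commodities (rows) x industries (columns)
              [30.0, 10.0]])

x = V.sum(axis=1)              # industry output (Make row sums)
q = V.sum(axis=0)              # commodity output (Make column sums)

# Assumed definitions (standard commodity-by-industry conventions):
B = U / x                      # U x_hat^-1: commodity inputs per unit of industry output
D = V / q                      # V q_hat^-1: industry shares of each commodity's output
I = np.eye(len(q))

# Commodity-by-commodity inverses.
leontief_cxc = np.linalg.inv(I - B @ D)
ghosh_alloc = (U / q[:, None]) @ D          # q_hat^-1 U D
ghosh_cxc = np.linalg.inv(I - ghosh_alloc)

# Industry-by-commodity forms: pre-multiply by D.
ghosh_ixc = D @ ghosh_cxc                   # expression (14)
leontief_ixc = D @ leontief_cxc             # expression (15)

print(np.round(leontief_ixc, 4))
print(np.round(ghosh_ixc, 4))
```

Because the first industry in this toy example also produces some of the second commodity, D is not an identity matrix, and the industry-by-commodity results differ from their commodity-by-commodity counterparts.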
In Table 5, we replicate Table 3, replacing the corrected C×C rank column with the properly formulated
industry rankings derived from the industry-by-commodity Ghoshian formulation of equation 14 (the Corrected I×C column).
As before, while there are some similarities, the differences now are stark. Of the sectors with the ten smallest
values in AC’s ranking (those with the highest upstreamness), only five are in the comparably defined new
set. Further, the average absolute difference in ranks between the AC and new top ten (upstream) sectors
is now 27.3. AC’s most upstream sector becomes just the 71st most upstream sector (253 − 183 + 1 = 71),
and their 8th most upstream sector is the 108th in the new rankings.
Table 6 replicates Table 4, where the top ten sectors ranked highest and lowest using the correct
industry-by-commodity formulation are shown, along with their corresponding AC rankings. Again, we see major
discrepancies, including the new second lowest ranking (second most upstream) sector now corresponding
to the 96th most upstream sector in the AC ranking, and an average absolute difference in ranks for the
most upstream sectors of 30.3. Commodity-by-commodity rankings clearly misidentify the industry activity
rankings, and any policy or programmatic actions based on the incorrect rankings will be poorly targeted.
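The rank arithmetic quoted for Tables 5 and 6 can be checked with a short sketch. The rank lists are transcribed from the two tables; the helper names `mean_abs_rank_diff` and `upstream_position` are introduced here for illustration only.

```python
# Ranks transcribed from Tables 5 and 6 (253 sectors; rank 253 = smallest DownMeasure).
N_SECTORS = 253

# Table 5: AC's ten smallest DownMeasure values and their corrected IxC ranks.
table5_ac = [253, 252, 251, 250, 249, 248, 247, 246, 245, 244]
table5_ixc = [183, 225, 204, 253, 250, 237, 244, 146, 249, 251]

# Table 6: the ten smallest corrected IxC values and their AC ranks.
table6_ixc = [253, 252, 251, 250, 249, 248, 247, 246, 245, 244]
table6_ac = [250, 158, 244, 249, 245, 235, 192, 174, 194, 247]


def mean_abs_rank_diff(ranks_a, ranks_b):
    """Average absolute difference between two paired rank lists."""
    diffs = [abs(a - b) for a, b in zip(ranks_a, ranks_b)]
    return sum(diffs) / len(diffs)


def upstream_position(rank, n_sectors=N_SECTORS):
    """Convert a DownMeasure rank into an 'nth most upstream' position."""
    return n_sectors - rank + 1


table5_gap = mean_abs_rank_diff(table5_ac, table5_ixc)
table6_gap = mean_abs_rank_diff(table6_ixc, table6_ac)
still_in_bottom_ten = sum(rank >= 244 for rank in table5_ixc)

print(table5_gap, table6_gap, still_in_bottom_ten)
print(upstream_position(183), upstream_position(146), upstream_position(158))
```

The two averages come out at 27.3 and 30.3, five of AC's ten most upstream sectors remain in the corresponding I×C set, and the positions 71, 108, and 96 match those cited in the text.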
9 Of course, no input-output formulations capture the entire value chain through distribution and post-use disposal.
Table 5: DownMeasure Comparisons: AC and I-by-C Results

Code    Industry                              AC DownMeasure  AC Rank  Corrected I×C Rank
Smallest 10 Values
325110  Petrochemical manufacturing           0.2150          253      183
331411  Primary copper smelt & refining       0.2296          252      225
331314  Secondary aluminum smelt              0.2461          251      204
325190  Other basic organic chemical mfg      0.2595          250      253
33131A  Primary aluminum refining & prod      0.2622          249      250
325310  Fertilizer manufacturing              0.2658          248      237
335991  Carbon and graphite product mfg       0.2668          247      244
325181  Alkalies and chlorine mfg             0.2769          246      146
331420  Copper rolling, drawing, extruding    0.2769          245      249
325211  Plastics material and resin mfg       0.2800          244      251
Highest 10 Values
339930  Doll, toy, and game manufacturing     0.9705          10       2
311111  Dog and cat food manufacturing        0.9717          9        38
337910  Mattress manufacturing                0.9720          8        37
315230  Women’s cut and sew apparel           0.9762          7        5
321991  Manufactured home mfg                 0.9810          6        29
336212  Truck trailer manufacturing           0.9837          5        26
336213  Motor home manufacturing              0.9879          4        48
316200  Footwear manufacturing                0.9927          3        1
337121  Upholstered hh furniture mfg          0.9928          2        24
336111  Automobile manufacturing              0.9997          1        9

Source: Antràs and Chor (2013) from authors’ calculations. Largest rank differences in bold.
Table 6: DownMeasure Comparisons: AC with I×C Results

Code    Industry                              Corrected I×C DownMeasure  Corrected I×C Rank  AC Rank
Smallest 10 Values
325190  Other basic organic chemical mfg      0.1921                     253                 250
324110  Petroleum refineries                  0.1944                     252                 158
325211  Plastics material and resin mfg       0.2246                     251                 244
33131A  Alumina refining and primary prod     0.2329                     250                 249
331420  Copper rolling, drawing, extruding    0.2330                     249                 245
331490  Nonferrous metal rolling, drawing     0.2592                     248                 235
322120  Paper mills                           0.2617                     247                 192
323110  Printing                              0.2628                     246                 174
332600  Spring and wire product mfg           0.2657                     245                 194
335991  Carbon and graphite product mfg       0.2786                     244                 247
Highest 10 Values
316900  Other leather and allied product mfg  1.5720                     10                  88
336111  Automobile manufacturing              1.5849                     9                   1
315900  Apparel accessories mfg               1.7072                     8                   51
333315  Photo and copying equipment mfg       1.7746                     7                   28
339910  Jewelry and silverware mfg            1.9089                     6                   34
315230  Women’s cut and sew apparel mfg       1.9662                     5                   7
315220  Men’s cut and sew apparel mfg         2.3624                     4                   20
315290  Other cut and sew apparel mfg         3.1213                     3                   30
339930  Doll, toy, and game mfg               3.2026                     2                   10
316200  Footwear manufacturing                9.1246                     1                   3

Source: Antràs and Chor (2013) from authors’ calculations. Largest rank differences in bold.
4 Discussion
There are two primary contributions of the work reported here. First, there are the implications for those
whose practical applications will be founded on the now-dominant Stone-type IO frameworks instead of the
classical Leontief interindustry accounts. The second contribution lies in the demonstration of the value to
science of modern publication policies that support reproducible research. In this section, we elaborate on
both areas.
4.1 Implications for Practical Application
The need for the correction in formulation arises from the existence of secondary commodity production by
industries. Hence, for an economy whose Make table is strongly diagonal (one with very little secondary
production), empirical results might well differ little as a result of our correction, but the differences will
become more substantial as the ratio of off-diagonal supply-table elements to diagonal elements increases.
The degree of difference for a given set of accounts is an open empirical question, in that primary and
secondary production structures vary geographically.
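One rough way to gauge how far a given Make table departs from the strongly diagonal case is to compute the ratio of off-diagonal to diagonal elements directly. The sketch below assumes a square Make table with each industry's primary product on the diagonal; both example tables and the function name `secondary_production_ratio` are hypothetical.

```python
def secondary_production_ratio(make):
    """Ratio of off-diagonal to diagonal totals in a square Make (supply) table.

    Assumes rows are industries, columns are commodities, and that each
    industry's primary product sits on the diagonal.
    """
    diagonal = sum(make[i][i] for i in range(len(make)))
    total = sum(sum(row) for row in make)
    return (total - diagonal) / diagonal


# Hypothetical 3-industry x 3-commodity Make tables.
nearly_diagonal = [[100, 1, 0],
                   [0, 80, 2],
                   [1, 0, 120]]
more_secondary = [[100, 20, 5],
                  [10, 80, 15],
                  [5, 25, 120]]

low_ratio = secondary_production_ratio(nearly_diagonal)
high_ratio = secondary_production_ratio(more_secondary)
print(round(low_ratio, 4), round(high_ratio, 4))
```

A ratio near zero suggests that the correction will change results little, while larger values signal more secondary production and therefore larger expected differences.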
Because these metrics will most often be used in practice to identify and prioritize industries for further
attention, higher ranked industries will be of most interest and greatest value to anyone carrying out this
kind of analysis. Therefore, the rank order correlations over the entire vector of ranks are not as relevant to
the analyst as is the ability to correctly identify the top ranked industries. Below we present two additional
perspectives on the impacts of the correct formulation that underscore its importance in practical application.
First, we compute and display graphically the differences in ranks over the entire distribution of industries for
which the measure has been calculated. Although correlations between the entire corrected and uncorrected
upstreamness or downstreamness vectors can be quite strong, the differences in individual ranks can be quite
substantial. Figure 2 presents a plot of the simple differences in ranks for vectors of correct industry-by-commodity and uncorrected commodity-by-commodity values for the upstreamness measure derived from the
same 2002 U.S. data used in ACFH, with sectors ordered according to their original industry classification
scheme sequence.
Figure 2: Downstreamness: Difference in Ranks (vertical axis: Difference; horizontal axis: Sectors)
Next, we set up the following analysis, again using the same data. We first rank order sectors using the
corrected measure values. We assign a correlation value of 1 for n = 1 (where rank order correlations cannot
be computed), and then sequentially increase n by 1. As n increases, we select the top n corrected-ranked
sectors from the incorrect vector, generate ranks for values within that set of n sectors, and then compute
the Spearman rank correlations between vectors of sequentially increasing lengths. The result is a set of
rank correlations for sets of vectors of incrementally larger dimension. The correlations are actually best-case
comparisons, because in most sets, uncorrected sector rank values from the entire set of industries will most
often exceed the value of n, but for the correlations to be valid, their ranks are indexed relative to the n
values in each vector in the comparison set. The result of that exercise for n = 1, ..., 40 is shown below in
Figure 3, which reveals that at n = 5, the correlation is 0.2; for the top 10 ranked correctly calculated
values, the correlation rises to 0.6, and by n = 40, the correlation is still only 0.65. While there is no clear
way of assessing the statistical significance of these sequential comparisons, the two rank vectors are clearly
not correlated strongly enough to suggest that the differences in the qualitative nature of the results are
insubstantial, and there is certainly no support for simply ignoring the effect that the correction has on outcomes.
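The sequential comparison just described can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: it assumes that the largest measure value receives rank 1, that there are no tied values (so the classic Spearman formula applies), and the function names and toy vectors are hypothetical.

```python
def ranks_desc(values):
    """Rank values so that the largest gets rank 1 (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_no_ties(rank_a, rank_b):
    """Spearman rank correlation for two rank vectors without ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def top_n_spearman(corrected, uncorrected, max_n):
    """Correlations for the top n corrected-ranked sectors, n = 1..max_n."""
    corrected_ranks = ranks_desc(corrected)
    # Sector indices ordered from corrected rank 1 downward.
    by_corrected = sorted(range(len(corrected)), key=lambda i: corrected_ranks[i])
    results = [1.0]  # n = 1: correlation undefined, set to 1 by convention
    for n in range(2, max_n + 1):
        subset = by_corrected[:n]
        ranks_c = list(range(1, n + 1))
        # Re-rank the uncorrected values within the selected subset only.
        ranks_u = ranks_desc([uncorrected[i] for i in subset])
        results.append(spearman_no_ties(ranks_c, ranks_u))
    return results


# Hypothetical measure values for four sectors.
corrected = [0.9, 0.8, 0.7, 0.6]
uncorrected = [0.5, 0.9, 0.8, 0.1]
correlations = top_n_spearman(corrected, uncorrected, 4)
print([round(c, 3) for c in correlations])
```

Re-ranking the uncorrected values within each subset is what makes the comparison a best case: a sector whose full-sample uncorrected rank lies far outside the top n is still assigned a rank between 1 and n.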
Figure 3: Downstreamness: Spearman Rank Comparison for Top n Ranks (vertical axis: Spearman Rank Correlation; horizontal axis: Number of elements in correlated vectors (N))
Irrespective of the empirical implications, of course, a published use table in isolation provides only a partial
description of an IO system. This alone is reason enough to base empirical analyses on the correct formulations. The correction is straightforward and the necessary Make table data are virtually always published
in tandem, so there is no reason not to use the correct formulation.
Yet the incorrect formulation continues to be used and is becoming increasingly entrenched in the literature,
as noted in Section 1.
4.2 Implications for Science and Knowledge Accumulation
“Replication is the cornerstone of science. Research that cannot be replicated is not science, and
cannot be trusted either as part of the profession’s accumulated body of knowledge or as a basis
for policy. Authors may think they have written perfect code for their bug-free software package
and correctly transcribed each data point, but readers cannot safely assume that these error-prone activities have been executed flawlessly until the authors’ efforts have been independently
verified.” (McCullough and Vinod, 2003, p. 888)
In 2004, the American Economic Review joined a growing number of other academic journals and “began
requiring ‘data and code sufficient to permit replication’ of a paper’s results, which is then posted on the
journal’s website.”10 This paper aligns with the spirit of these publication policies11 by providing an important course correction for those who might otherwise continue to use errant code submitted with published
contributions. We have presented the correct formulations and new code for the corrected algorithm, and we
have identified data consistent with the new algorithmic formulation. Our corrections also identify important conceptual considerations that future analysts will overlook at peril of arriving at faulty and misleading
conclusions at best, and ill-formulated policy and programmatic recommendations at worst.
10 https://en.wikipedia.org/wiki/The_American_Economic_Review#cite_note-5 Accessed 9/2/2020.
11 “Authors of accepted papers that contain empirical work, simulations, or experimental work must provide, prior to acceptance, information about the data, programs, and other details of the computations sufficient to permit replication, as well as
information about access to data and programs.” https://www.aeaweb.org/journals/policies/data-code/ Accessed 9/2/2020
Had the original two papers merely faded in importance and use, such corrections would be less critical. On
the contrary, however, as of late September of 2020, these two papers had been cited more than 260 times,
with more than half (120) of these citations in the most recent 20 months, many in top economics journals.12
Those directly citing articles, in turn, had been cited more than 2,500 times, according to the Web of Science
Citation Index. Further, although we have not checked each and every citing article, we have not identified a
single one that rectifies the issues we have identified, and every application that we have been able to assess
appears to use precisely the same errant code provided by the authors. Without an explicit correction, these
and related errors will continue to propagate.
Conconi et al. (2018) and Alfaro et al. (2019), for example, reused the errant Antras code in their application,
and further replaced row and column sector names with corresponding harmonized sector product names.
This not only created the false appearance of a product-by-product transactions matrix, but this naming
replacement step also deepened conceptual confusion because it resulted in some ‘duplicate’ sectors, which
they then dealt with by inexplicably replacing several product columns in coefficient tables with the averages
of their corresponding coefficient values.13 Had the necessary attention been paid to the precise definitions
and structure of the underlying supporting data upon first publication, such compounding errors could have
been avoided.
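The weighting problem raised in footnote 13 can be seen in a small numerical sketch. All transaction and output values below are hypothetical; the sketch simply contrasts a simple average of two coefficient columns with the coefficients obtained by aggregating the underlying transactions first.

```python
# Hypothetical transactions for two 'duplicate' sectors buying from two supplying rows.
purchases_a = [10.0, 30.0]    # inputs purchased by sector A
purchases_b = [40.0, 20.0]    # inputs purchased by sector B
output_a, output_b = 100.0, 400.0   # totals used to standardize each column

# Coefficient columns: purchases divided by each sector's own output.
coef_a = [z / output_a for z in purchases_a]
coef_b = [z / output_b for z in purchases_b]

# Simple (equal-weight) average of the two coefficient columns.
simple_average = [(a + b) / 2 for a, b in zip(coef_a, coef_b)]

# Coefficients from aggregating transactions and outputs before standardizing.
aggregated = [
    (za + zb) / (output_a + output_b)
    for za, zb in zip(purchases_a, purchases_b)
]

print([round(c, 4) for c in simple_average])
print([round(c, 4) for c in aggregated])
```

With these numbers the second coefficient is 0.175 under simple averaging but 0.1 when the transactions are aggregated first, because sector B's output is four times that of sector A. The two approaches coincide only when the standardizing totals are equal.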
There is also increasing evidence throughout the economics literature that industries and products are being
treated as though they were equivalent. Kee and Tang (2016), for example, have adopted this practice in
research on geographical sources of value added in Chinese exports, in which they note that “Industries
are defined according to the industry classification by the United Nations,” and footnote details on which
products from the Harmonized Commodity Description and Coding Systems (HS) were their foci. But
by assigning products to individual producing industries, they are implicitly assuming a kind of sectoral
equivalence that simply does not exist. If industry-specific implications are important for such studies, then
introduced bias might be quite substantial and potentially important. In this and some related contexts,
there might not be particularly good ready alternatives, but the bias that is introduced via such implicit
assumptions should be made explicit.
The power of IO frameworks is being newly and appropriately recognized in a growing number of problem
domains, but equally important to the development of conceptualizations and analytical tools in new application areas is the need to understand how and even why the supporting data were generated. Matching
variables to constructs is as important to IO analysts as understanding data generating processes is to econometricians. Industries and commodities are not conceptual equivalents; organization frameworks like those of
Leontief and Stone have different conceptual underpinnings; and data and classification systems themselves
are designed for specific purposes, as data providing agencies strive to make clear. The U.S. Census website,
for example, notes explicitly that, “NAICS is an industry classification system, not a product classification
system, and therefore neither intended nor well suited for this purpose.” (US Census Bureau, 2020)
The examples in this section underscore the value of publication policies that emphasize transparency and
replication. At a time when science is increasingly under attack, supplying readers with the tools needed
for replication is vitally important, as is carrying out such research replications and reporting on those that
reveal conceptual or empirical deficiencies. Those with expertise in data generation can join algorithmic and
problem domain experts to ensure the integrity of the scientific enterprise. Further, when code accompanying
publications is found to be in error, journals in which it is published must curtail its propagation without delay
by removing and, when possible, replacing it with corrected code or a reference to appropriate sources. If we
fail to engage in replication, if we fail to acknowledge research shortcomings, and if we fail to communicate
necessary course corrections, then we relinquish all rights to the defense of science and fall sadly short of our
responsibilities as scientists.
12 A sample of direct citations from 2020 alone includes Antras and de Gortari (2020); Murakami and Otsuka (2020); Kostoska
et al. (2020); Shen and Zheng (2020); Li et al. (2020); Choi (2020); Peng and Zhang (2020).
13 At best, averaging two or more coefficient columns implies that the transactions values all have equal weights, despite the
fact that the values initially used to standardize the columns are clearly unequal.
5 Summary
In this paper, we identified an important inconsistency in the formulation and implementation of upstreamness and downstreamness measures developed and presented in AC and ACFH. The lack of correspondence
between construct and data has carried through to numerous subsequent publications identified in the introduction, necessarily generating inaccurate results. The problem arises because important differences
between historical interindustry and modern commodity-by-industry IO accounting frameworks are overlooked. We develop and implement a correctly formulated and conceptually consistent alternative based on
the commodity-by-industry framework that meets the objectives of the upstreamness and downstreamness
measures. Our empirical demonstration makes clear the need to use the correct formulation when studying
production linkages.
The options available to those studying upstream and downstream linkages in the context of supply and value
chains are also worth considering, because each of these options produces a different kind of information.
First, one can formulate these measures to study either commodity or industry chains. Commodity chain
analyses will reveal information about the production of selected products, while industry supply chains
can reveal information about the linkages among activities, specifically industries that are engaged in the
production of one or more products. Industry and commodity linkages are surely different, and depending on
the goal of the analysis, one or the other classification may be preferred. Second, the coefficients that define
the interactions among commodities or industries can be purged of trade as in the procedures discussed
here, and this provides a focus on the within-region (in this case, nation) production chains. However,
the coefficients also can be based on technical requirements irrespective of origin, and this can provide
information that can be useful in assessing the production structure of an economy relative to potential
development alternatives.
Contrary to complicating the accounting framework, the format of modern systems of national accounts
actually enriches the possibilities for meaningful analysis, and facilitates analyses of products (commodities)
and activities (industries) in economic systems. Time and effort spent deepening awareness and understanding of the underlying accounting conventions and embedded relationships can yield substantial research and
application dividends.
References
Alfaro, L., Chor, D., Antras, P., and Conconi, P. (2019). Internalizing Global Value Chains: A Firm-Level
Analysis. Journal of Political Economy, 127(2):508–559.
Antràs, P. and Chor, D. (2013). Organizing the global value chain. Econometrica, 81(6):2127–2204.
Antràs, P., Chor, D., Fally, T., and Hillberry, R. (2012). Measuring the upstreamness of production and
trade flows. The American Economic Review: Papers and Proceedings, 102(3):412–416.
Antras, P. and de Gortari, A. (2020). On the Geography of Global Value Chains. Econometrica, 88(4):1553–
1598.
Choi, J. (2020). The Global Value Chain Under Imperfect Capital Markets. World Economy, 43(2):484–505.
Conconi, P., Garcı́a-Santana, M., Puccio, L., and Venturini, R. (2018). From Final Goods To Inputs: The
Protectionist Effect Of Rules Of Origin. American Economic Review, 108(8):2335–2365.
Dietzenbacher, E., Los, B., Stehrer, R., Timmer, M., and de Vries, G. (2013). The construction of world
input-output tables in the WIOD project. Economic Systems Research, 25(1):71–98.
Dietzenbacher, E. and Romero, I. (2007). Production chains in an interregional framework: Identification
by means of average propagation lengths. International Regional Science Review, 30(4):362–383.
Jackson, R. (1998). Regionalizing national commodity-by-industry accounts. Economic Systems Research,
10(3):223–238.
Kaplinsky, R. and Morris, M. (2001). A Handbook For Value Chain Research (Vol. 113). Ottawa: IDRC.
Kee, H. L. and Tang, H. (2016). Domestic Value Added In Exports: Theory And Firm Evidence From
China. American Economic Review, 106(6):1402–1436.
Kostoska, O., Stojkoski, V., and Kocarev, L. (2020). On the Structure of the World Economy: An Absorbing
Markov Chain Approach. Entropy, 22(4).
Leontief, W. W. (1936). Quantitative input and output relations in the economic systems of the United
States. The Review of Economic Statistics, pages 105–125.
Li, Y., Sun, H., Huang, J., and Huang, Q. (2020). Low-End Lock-In of Chinese Equipment Manufacturing
Industry and the Global Value Chain. Sustainability, 12(7).
McCullough, B. D. and Vinod, H. D. (2003). Verifying the solution from a nonlinear solver: A case study.
American Economic Review, 93(3):873–892.
Miller, R. E. and Blair, P. D. (1985). Input-Output Analysis: Foundations and Extensions. Prentice-Hall,
Englewood Cliffs, New Jersey, USA.
Miller, R. E. and Blair, P. D. (2009). Input-Output Analysis: Foundations and Extensions. Cambridge
University Press, Cambridge, UK.
Murakami, Y. and Otsuka, K. (2020). Governance, Information Spillovers, and Productivity of Local Firms:
Toward an Integrated Approach to Foreign Direct Investment and Global Value Chains. Developing
Economies, 58(2):134–174.
OECD (2018). Supply and Use Indicators. https://www.oecd-ilibrary.org/content/data/f5ff195e-en.
Peng, J. and Zhang, Y. (2020). Impact of Global Value Chains on Export Technology Content of China’s
Manufacturing Industry. Sustainability, 12(1).
Press Release: Nobel Media AB 2020 (1973). Prize in Economic Science in Memory of Alfred Nobel to Professor Wassily Leontief. https://www.nobelprize.org/prizes/economic-sciences/1973/press-release. Accessed 2020-02-27.
Press Release: Nobel Media AB 2020 (1984). Richard Stone – Facts. https://www.nobelprize.org/prizes/economic-sciences/1984/stone/facts. Accessed 2020-02-27.
Shen, C. and Zheng, J. (2020). Does Global Value Chains Participation Really Promote Skill-Biased Technological Change? Theory And Evidence From China. Economic Modeling, 86:10–18.
United Nations (1968). A System of National Accounts, volume 2 of Series F. United Nations, New York, 3rd
edition.
US Census Bureau (2020). North American Industry Classification System (NAICS): What is NAICS and
how is it used? https://www.census.gov/eos/www/naics/faqs/faqs.html. Accessed 2020-10-01.