Causal Inference

MICHAEL E. SOBEL
Columbia University, USA

The Blackwell Encyclopedia of Sociology. Edited by George Ritzer and Chris Rojek.
© 2019 John Wiley & Sons, Ltd. Published 2019 by John Wiley & Sons, Ltd.
DOI: 10.1002/9781405165518.wbeos1196

Over the past 30–40 years, the way social scientists make inferences about causal relationships has been undergoing a transformation. Previously, it was common to list causes of an outcome (outcomes) of interest, then estimate a regression (structural equation model) and simply declare all coefficients effects.

Starting with Rubin (1974, 1977), who bridged the gap between randomized experiments and observational studies, statisticians developed a framework for causal inference that has now made its way into mainstream social science. In this framework, the estimand (effect) is first defined. Second, as a result, it becomes possible to ask if the effect of interest is identified. Estimation comes third. In the previous approach, the effect is never defined, so it is not even clear what one is talking about.

The key contribution originates with Neyman (1990 [1923]), who invented potential outcomes notation. Suppose a cause $Z$, for example, educational level, takes two values (0 for high school or less, 1 for some college or more). Let $Y$ be an outcome of interest, for example, income at age 40. The restriction to two values is for simplicity only. Although each unit $i$ takes only one value $Z_i$ of education, we nevertheless define the income $i$ would have under both levels; thus, $i$ has two potential outcomes $Y_i(0)$ and $Y_i(1)$, only one of which is observed. The unit effect of educational level 1 versus 0 can then be defined as a comparison of $Y_i(1)$ with $Y_i(0)$, for example, $Y_i(1) - Y_i(0)$. Although these effects cannot be ascertained, the average effect $E(Y(1) - Y(0))$ over some population $P$ is identified under certain conditions, and when these hold, can be estimated.

It is worth spelling out some of the ideas about causation implicit above. First, causation is singular: the unit effects may vary. Singular causation can be reconciled with more stringent regularity theories of causation (Sobel, 1995). Second, each unit can be exposed, at least in theory, to any level of the cause; were this not so, unit effects could not be defined as an intra-unit comparison of outcomes under different levels of the cause. Related, implicit in the notation is the stable unit treatment value assumption or SUTVA (Rubin, 1980). SUTVA consists of two components. If two alternative forms of a treatment, for example, a medicine administered as a pill or capsule, result in different outcome values, these should be regarded as different treatments. In randomized experiments, an experimenter designs and implements treatments that are presumably administered in the same way (or a way that does not affect outcome values) to all units. In observational studies, where this is not the case, further elaboration may be needed to avoid an ill-posed question. For example, an investigator who wants to study the effect of some college or more on age 40 income among persons who have graduated from high school, but not attended college needs to state how such persons will be (hypothetically) exposed to college, as the timing of their entry and whether or not they are to be given a scholarship will most likely affect the income they would receive were they to attend college. Next, there is no interference among units: $i$’s potential outcomes do not depend on assignments of other units. In the social sciences, interference may occur through social interactions and may be of great interest. Both types of SUTVA violations have received attention.

A key insight from the statistical literature is that the identification of causal parameters generally depends on the assignment mechanism, that is, the way units are allocated to levels of the cause. Suppose the average effect $E(Y(1) - Y(0)) = E(Y(1)) - E(Y(0))$ is the target estimand. In an observational study where units may self-select into levels, $\bar{Y}_1 - \bar{Y}_0$, the mean outcome difference between units selecting levels 1 and 0 will not generally be unbiased or consistent for the average effect, as $\bar{Y}_0$ ($\bar{Y}_1$) estimates $E(Y \mid Z = 0)$ ($E(Y \mid Z = 1)$), the average outcome among units choosing level 0 (1). However, if the
potential outcomes for the units receiving level 0 (1) were a random sample from the potential outcomes $Y_i(0)$ ($Y_i(1)$), $\bar{Y}_1 - \bar{Y}_0$ would be unbiased for the average effect, as the assignment mechanism is “unconfounded,” that is, $Z_i$ is independent of the potential outcomes $(Y_i(0), Y_i(1))$, which implies $E(Y \mid Z = 0) = E(Y(0))$ and $E(Y \mid Z = 1) = E(Y(1))$.

In a completely randomized experiment, $n_0$ and $n_1$ units are assigned to level 0 and 1, respectively, and all possible assignments are equally likely, implying assignment is unconfounded. In a block randomized experiment, strata $S(X)$ are formed using variables $X$ (covariates) prior to $Z$, and a completely randomized experiment is conducted within each stratum $b = 1, \ldots, B$. Let $n_b = n_{0b} + n_{1b}$, where $n_{0b}$ ($n_{1b}$) denotes the number of units in stratum $b$ assigned to level 0 (1) of the cause. Let $\bar{Y}_{0b}$ ($\bar{Y}_{1b}$) denote the sample average for units in block $b$ with level 0 (1); then $\bar{Y}_{1b} - \bar{Y}_{0b}$ is unbiased for the average effect $E(Y(1) - Y(0) \mid S(X) = b)$. In this type of experiment, both the average effect and the average effect within strata are typically of interest. The weighted average $n^{-1} \sum_{b=1}^{B} n_b (\bar{Y}_{1b} - \bar{Y}_{0b})$, where $n = \sum_{b=1}^{B} n_b$ is the total number of units, is a consistent estimator of the average effect.

In observational studies, the assignment mechanism is not under an investigator’s control, but assignment is unconfounded given the covariates $X$ that are associated with both $Z$ and the potential outcomes $Y(0)$, $Y(1)$. If the investigator can identify these confounders, the average effect is identified and in principle estimation is straightforward, similar to the case of the block randomized experiment.

In practice, implementation of the theory is not so straightforward. In randomized experiments, subjects do not always take the assigned treatment. This has led to the consideration of new estimands (complier average causal effect) and a literature on compliance (Angrist, Imbens, and Rubin, 1996), subsequently extended to principal stratification, a type of mediation.

In observational studies, the vector of confounders $X$ may be high-dimensional, resulting in sparsity. If the form of the regression functions $E(Y \mid X = x, Z = 0) = E(Y(0) \mid X = x)$ and $E(Y \mid X = x, Z = 1) = E(Y(1) \mid X = x)$ is known, estimation is straightforward, and in empirical work it was often assumed, without justification, that the relationship is linear; when that is not the case, biased estimates result. As an alternative, one might attempt to match level 0 and level 1 units on the confounders. Rosenbaum and Rubin (1983) showed that if assignment is unconfounded given $X$, it is unconfounded given the propensity score $\Pr(Z = 1 \mid X)$, thereby justifying the estimation of effects by matching and subclassification on this one-dimensional summary of $X$; even so, good matches for units with estimated propensity scores near 0 and 1 are often difficult to obtain, leading some to estimate effects only on intervals where there are sufficient numbers of units at both levels. Although improved procedures for matching directly on $X$ are preferable and now widely available, the propensity score remains central in some of the other approaches to estimation, for example inverse probability weighting and doubly robust estimation, and also in longitudinal causal inference. The use and adaptation of machine-learning procedures to the estimation of effects is another active area of research.

While the unconfoundedness assumption is not testable without further assumptions, it is important to try to ascertain whether or not it is credible. For example, if it is known that a particular outcome $W$ is unaffected by the cause $Z$, yet $W$ is associated with $Z$, this suggests the unconfoundedness assumption (with respect to $Y$) is less plausible. Sensitivity analyses, in which one tries to ask how big a departure from unconfoundedness is needed to undermine a study’s results, are also useful in observational studies.

This entry introduces the key ideas underpinning the approach to causal inference from statistics, and considers some very basic estimands. For a full-length introduction to these ideas that offers great insight with a minimum of technical detail, Rosenbaum (2017) is outstanding. Imbens and Rubin (2015) is also introductory and insightful, and accessible to readers patient enough to wade through algebraic details; readers who wish to use the procedures discussed in this entry should find this book especially useful. Morgan and Winship (2014) is geared primarily to sociologists. There are several books covering more advanced subjects and estimands not touched upon herein, or only considered in the briefest way, including mediation and interference (see Hong, 2015; Vanderweele, 2015),
and longitudinal causal inference (Hernan and Robins, 2019).

SEE ALSO: Methods; Statistics; Theory and Methods

References

Angrist, J., Imbens, G., and Rubin, D.B. (1996) Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91, 444–472.
Hernan, M.A. and Robins, J.M. (2019) Causal Inference, Chapman and Hall/CRC, Boca Raton, FL.
Hong, G. (2015) Causality in a Social World: Moderation, Mediation, and Spillover, Wiley Blackwell, Chichester.
Imbens, G.W. and Rubin, D.B. (2015) Causal Inference in Statistics, Social and Behavioral Sciences: An Introduction, Cambridge University Press, New York.
Morgan, S.L. and Winship, C. (2014) Counterfactuals and Causal Inference, Cambridge University Press, Cambridge.
Neyman, J.S. (1990 [1923]) On the application of probability theory to agricultural experiments. Essay on principles, section 9, trans. D.M. Dabrowska and T.P. Speed. Statistical Science, 5, 465–472.
Rosenbaum, P.R. (2017) Observation and Experiment: An Introduction to Causal Inference, Harvard University Press, Cambridge, MA.
Rosenbaum, P.R. and Rubin, D.B. (1983) The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.
Rubin, D.B. (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701.
Rubin, D.B. (1977) Assignment to treatment groups on the basis of a covariate. Journal of Educational Statistics, 2, 1–26.
Rubin, D.B. (1980) Comment on “Randomization analysis of experimental data: the Fisher randomization test,” by D. Basu. Journal of the American Statistical Association, 75, 591–593.
Sobel, M.E. (1995) Causal inference in the social and behavioral sciences, in Handbook of Statistical Modeling for the Social and Behavioral Sciences (ed. G. Arminger, C.C. Clogg, and M.E. Sobel), Plenum Press, New York, pp. 1–38.
Vanderweele, T. (2015) Explanation in Causal Inference, Oxford University Press, New York.
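
Illustrative sketch (not part of the original entry). The entry's contrast between self-selection and randomization, and its stratum-size-weighted estimator, can be checked numerically on simulated data. Everything in the block below is an assumption made for illustration: the data-generating values, the binary covariate `x` standing in for the strata S(X), and the helper names `diff_in_means` and `stratified_estimate`. Both potential outcomes are generated for every unit, which is possible only in a simulation.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical population: x is a binary covariate (the stratum) and
# (y0, y1) are both potential outcomes, which no real study observes jointly.
units = []
for _ in range(20000):
    x = rng.randint(0, 1)
    y0 = 30.0 + 10.0 * x + rng.gauss(0, 5)
    y1 = y0 + 5.0 + 5.0 * x  # unit effect is 5 when x == 0 and 10 when x == 1
    units.append((x, y0, y1))

# Average effect E(Y(1) - Y(0)) in this simulated population.
true_ate = mean(y1 - y0 for _, y0, y1 in units)

def diff_in_means(units, z):
    """Mean observed outcome at level 1 minus mean observed outcome at level 0."""
    treated = [y1 for (_, _, y1), zi in zip(units, z) if zi == 1]
    control = [y0 for (_, y0, _), zi in zip(units, z) if zi == 0]
    return mean(treated) - mean(control)

def stratified_estimate(units, z):
    """Stratum-size-weighted average of within-stratum mean differences."""
    total = 0.0
    for b in {x for x, _, _ in units}:
        idx = [i for i, (x, _, _) in enumerate(units) if x == b]
        total += len(idx) * diff_in_means([units[i] for i in idx],
                                          [z[i] for i in idx])
    return total / len(units)

# Self-selection: units with x == 1 choose level 1 far more often,
# so assignment depends on the potential outcomes through x.
z_self = [1 if rng.random() < (0.8 if x == 1 else 0.2) else 0
          for x, _, _ in units]

# Complete randomization: fixed group sizes, all assignments equally likely.
z_rand = [1] * 10000 + [0] * 10000
rng.shuffle(z_rand)

naive_self = diff_in_means(units, z_self)        # confounded comparison
naive_rand = diff_in_means(units, z_rand)        # randomized comparison
strat_self = stratified_estimate(units, z_self)  # adjusts for x

print(f"average effect in the simulated population: {true_ate:.2f}")
print(f"difference in means, self-selected:         {naive_self:.2f}")
print(f"difference in means, randomized:            {naive_rand:.2f}")
print(f"stratified estimate, self-selected:         {strat_self:.2f}")
```

Under these assumed values, the self-selected difference in means should sit well above the average effect, while the randomized difference and the stratified estimate should both land close to it. The stratified estimate works here only because the simulation makes assignment unconfounded given `x` by construction.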

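Illustrative sketch (not part of the original entry). The entry names inverse probability weighting as an approach built on the propensity score Pr(Z = 1 | X) but gives no formula. The block below assumes one common version, a normalized weighting estimator, with the propensity score estimated as the share of units at level 1 within each value of a discrete covariate. The simulated data, the three-valued covariate `x`, the constant unit effect of 4, and all variable names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean

rng = random.Random(1)

# Simulated observational data: only (x, z, observed y) is kept per unit.
# The assignment probabilities below play the role of Pr(Z = 1 | X = x)
# and are treated as unknown to the analyst.
assumed_prob = {0: 0.2, 1: 0.5, 2: 0.8}
data = []
for _ in range(20000):
    x = rng.choice([0, 1, 2])
    z = 1 if rng.random() < assumed_prob[x] else 0
    y0 = 20.0 + 10.0 * x + rng.gauss(0, 5)
    y1 = y0 + 4.0  # constant unit effect, so the average effect is 4
    data.append((x, z, y1 if z == 1 else y0))

# Estimated propensity score: share of units at level 1 within each x.
counts = defaultdict(lambda: [0, 0])  # x -> [number of units, number at level 1]
for x, z, _ in data:
    counts[x][0] += 1
    counts[x][1] += z
e_hat = {x: n1 / n for x, (n, n1) in counts.items()}

# Normalized inverse probability weighting: weight level-1 units by
# 1 / e_hat(x) and level-0 units by 1 / (1 - e_hat(x)).
w1 = sum(z / e_hat[x] for x, z, _ in data)
w0 = sum((1 - z) / (1 - e_hat[x]) for x, z, _ in data)
mu1 = sum(z * y / e_hat[x] for x, z, y in data) / w1
mu0 = sum((1 - z) * y / (1 - e_hat[x]) for x, z, y in data) / w0
ipw = mu1 - mu0

# Unadjusted comparison of observed group means, for contrast.
naive = (mean(y for _, z, y in data if z == 1)
         - mean(y for _, z, y in data if z == 0))

print(f"unadjusted difference in means: {naive:.2f}")
print(f"weighted estimate:              {ipw:.2f}")
```

With a discrete covariate and propensities estimated by within-cell shares, this weighted estimate coincides algebraically with the stratum-size-weighted average of within-cell mean differences. With continuous or high-dimensional covariates the propensity score would have to be modeled, and the entry's caution about estimated scores near 0 and 1 would apply.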