Correlation, Causation, and Confusion
Nick Barrowman
Causation has long been something of a mystery, bedeviling philosophers
and scientists down through the ages. What exactly is it? How can it be
measured — that is, can we assess the strength of the relationship between a
cause and its effect? What does an observed association between factors — a
correlation — tell us about a possible causal relationship? How do multiple
factors or causes jointly influence outcomes? And does causation even exist “in
the world,” as it were, or is it merely a habit of our minds, a connection we
draw between two events we have observed in succession many times, as
Hume famously argued? The rich philosophical literature on causation is a
testament to the struggle of thinkers throughout history to develop
satisfactory answers to these questions. Likewise, scientists have long wrestled
with problems of causation in the face of numerous practical and theoretical
impediments.
Yet when speaking of causation, we usually take for granted some notion of
what it is and how we are able to assess it. We do this whenever we consider
the consequences of our actions or those of others, the effects of government
interventions, the impacts of new technologies, the consequences of global
warming, the effectiveness of medical treatments, the harms of street drugs, or
the influence of popular movies. Some causal statements sound strong, such as
when we say that a treatment cured someone or that an announcement by the
government caused a riot. Others give a weaker impression, such as when we
say that the detention of an opposition leader affected international
perceptions. Finally, some statements only hint at causation, such as when we
say that the chemical bisphenol A has been linked to diabetes.
In recent years, it has become widely accepted in a host of diverse fields, such
as business management, economics, education, and medicine, that decisions
should be “evidence-based” — that knowledge of outcomes, gathered from
scientific studies and other empirical sources, should inform our choices, and
we expect that these choices will cause the desired results. We invest large
sums in studies, hoping to find causal links between events. Consequently,
statistics have become increasingly important, as they give insight into the
relationships between factors in a given analysis. However, the industry of
science journalism tends to distort what studies and statistics show us, often
exaggerating causal links and overlooking important nuances.
Causation is rarely as simple as we tend to assume and, perhaps for this reason,
its complexities are often glossed over or even ignored. This is no trifling
matter. Misunderstanding causal links can result in ineffective actions being
chosen, harmful practices perpetuated, and beneficial alternatives overlooked.
Unfortunately, the recent hype about “big data” has encouraged fanciful
notions that such problems can be erased thanks to colossal computing power
and enormous databases. The presumption is that sheer volume of
information, with the help of data-analysis tools, will reveal correlations so
strong that questions about causation need no longer concern us. If two events
occur together often enough, so the thinking goes, we may assume they are in
fact causally linked, even if we don’t know how or why.
Puzzles of Causation
Let us begin with a familiar example. We know that smoking causes lung
cancer. But not everyone who smokes will develop it; smoking is not a
sufficient cause of lung cancer. Nor is smoking a necessary cause; people who do
not smoke can still develop lung cancer. The verb “to cause” often brings to
mind unrealistic notions of sufficient causation. But it is rare that an event has
just one cause, as John Stuart Mill noted in A System of Logic (1843).
Ironically, a leading opponent of the claim that smoking causes lung cancer
was geneticist Ronald A. Fisher, one of the foremost pioneers of modern
statistical theory. A number of studies showed an association between smoking
and lung cancer, but Fisher questioned whether there was enough evidence to
suggest causation. (Although technical distinctions between correlation and
association are sometimes made, these terms will be used synonymously in this
essay.) Fisher pointed out, for instance, that there was a correlation between
apple imports and the divorce rate, which was surely not causal. Fisher thereby
launched a cottage industry of pointing out spurious correlations.
The fact that Fisher was himself a smoker and a consultant to tobacco firms
has at times been used to suggest a conflict of interest. But even if he was
wildly off base regarding the link between smoking and lung cancer, his
general concern was valid. The point is often summed up in the maxim,
“Correlation is not causation.” Just because two factors are correlated does not
necessarily mean that one causes the other. Still, as Randall Munroe, author of
the webcomic xkcd, put it: “Correlation doesn’t imply causation, but it does
waggle its eyebrows suggestively and gesture furtively while mouthing ‘look
over there.’” We are tempted to think of correlation and causation as somehow
related, and sometimes they are — but when and how?
The modern debate over correlation and causation goes back to at least the
mid-eighteenth century, when Hume argued that we can never directly
observe causation, only “the constant conjunction of two objects.” It is perhaps
not surprising that scientists and philosophers have had mixed feelings about
causation: on the one hand it appears to be central to the scientific enterprise,
but on the other hand it seems disconcertingly intangible. To this day, debate
continues about whether causation is a feature of the physical world or simply
a convenient way to think about relationships between events. During the
eighteenth and nineteenth centuries, statistical theory and methods enjoyed
tremendous growth but for the most part turned a blind eye to causation. In
1911, Karl Pearson, inventor of the correlation coefficient, dismissed causation
as “another fetish amidst the inscrutable arcana of even modern science.” But
developments in the 1920s began to disentangle correlation and causation, and
paved the way for the modern methods for inferring causes from observed
effects. Before turning to these sophisticated techniques, it is useful to explore
the ways in which correlation and causation can come apart.
How can factors be correlated but not causally related? One reason is pure
chance: Fisher’s association between apple imports and the divorce rate was
just a coincidence. Today it is easy to generate such spurious correlations. With
the emergence of big data — enormous data sets collected automatically,
combed for patterns by powerful computing systems — correlations can be
generated in vast numbers, many of them nothing more than coincidences.
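To see how readily chance alone can manufacture such findings, consider a small simulation sketch in Python (the figures are invented and illustrate the general point, not any particular data set): generate a modest number of observations on a few hundred mutually independent variables and screen every pair for correlation.

```python
# Screening many unrelated variables for correlation: a toy illustration.
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 50, 200                      # 50 observations of 200 independent variables
data = rng.normal(size=(n_obs, n_vars))      # no causal structure whatsoever

corr = np.corrcoef(data, rowvar=False)       # 200 x 200 matrix of pairwise correlations
pairs = corr[np.triu_indices(n_vars, k=1)]   # each distinct pair counted once

print("pairs examined:", pairs.size)                                # 19,900
print("pairs with |r| > 0.35:", int(np.sum(np.abs(pairs) > 0.35)))  # typically a few hundred
```

Every one of those "strong" correlations is a coincidence of exactly the sort Fisher had in mind.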
Another reason why two factors may be correlated even though there is no
cause-and-effect relationship is that they have a common cause. Examples of
such “confounding,” as it is known, are all too common in the scientific
literature. For example, a 1999 study published in Nature showed that children
under the age of two who slept with night lights were more likely to have
myopia. Other researchers later showed that myopic parents were more likely
to keep their lights on at night. It may be that the parents were a common
cause of both the use of night lights and, by virtue of genetic inheritance, the
myopia passed on to their children.
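The structure of this kind of confounding can be made concrete with a toy simulation (the probabilities below are invented for illustration): parental myopia influences both the lighting and the child's eyesight, night lights are given no effect at all, and yet an association between night lights and myopia appears.

```python
# Confounding by a common cause: night lights have no causal effect here.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
parent_myopic = rng.random(n) < 0.3                               # the common cause
night_light = rng.random(n) < np.where(parent_myopic, 0.6, 0.2)   # myopic parents favor night lights
child_myopic = rng.random(n) < np.where(parent_myopic, 0.4, 0.1)  # risk depends only on the parents

def myopia_rate(group):
    return round(child_myopic[group].mean(), 3)

print("night light:   ", myopia_rate(night_light))      # noticeably higher
print("no night light:", myopia_rate(~night_light))
# Within levels of the common cause, the association disappears:
print("myopic parents:    ", myopia_rate(parent_myopic & night_light),
      myopia_rate(parent_myopic & ~night_light))
print("non-myopic parents:", myopia_rate(~parent_myopic & night_light),
      myopia_rate(~parent_myopic & ~night_light))
```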
Or consider a patient with a chronic illness whose severity naturally waxes and
wanes. When his symptoms are particularly bad he visits a quack healer and his
symptoms usually improve within a week or two. The trouble is, the
improvement is simply a result of the natural fluctuation of the illness. The
flare-up of symptoms prompts the patient to visit the quack, but due to the
natural course of illness, the flare-up is followed by improvement within a
week or two. Confounding makes the visits to the quack healer appear
effective.
Misleading correlations may also arise due to the way subjects are selected to
be part of a study. For example, there is evidence that certain studies of an
association between breast implants and connective tissue disease may have
suffered from selection bias. Suppose participation in a study was greater for
women with implants and also for women with connective tissue disease
(perhaps these two groups were more likely to respond to a questionnaire than
women from neither group). The study would then include a
disproportionately large number of women with both implants and connective
tissue disease, leading to an association even if there were no causation at all.
Whenever the selection of subjects into a study is a common effect of both the
exposure variable and the outcome, there is a risk of selection bias. It has been
suggested that bias due to a common effect (selection bias) may be more
difficult to understand than bias due to a common cause (confounding). This
makes selection bias particularly problematic.
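A sketch in code can make the mechanism visible. In the hypothetical simulation below (all rates invented, with women who have both implants and the disease assumed to be especially likely to respond), implants and disease are independent in the population, yet an association appears among the respondents.

```python
# Selection bias: participation depends on both the exposure and the outcome.
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
implants = rng.random(n) < 0.05
disease = rng.random(n) < 0.03                # independent of implants by construction

respond_p = np.select([implants & disease, implants, disease],
                      [0.9, 0.3, 0.3], default=0.15)
responded = rng.random(n) < respond_p

def odds_ratio(exposed, outcome):
    a = np.sum(exposed & outcome);  b = np.sum(exposed & ~outcome)
    c = np.sum(~exposed & outcome); d = np.sum(~exposed & ~outcome)
    return (a * d) / (b * c)

print("odds ratio, whole population:", round(odds_ratio(implants, disease), 2))  # about 1.0
print("odds ratio, respondents only:",
      round(odds_ratio(implants[responded], disease[responded]), 2))             # noticeably above 1.0
```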
In the analysis of big data, selection bias may be especially pernicious because
the processes that affect which individuals are included in or excluded from a
database are not always apparent. Additionally, such databases are often
spotty: for a variety of reasons, many records may be missing some data
elements. In some cases, records that have missing values are automatically
omitted from analyses, leading to another form of selection bias. In these cases,
the associations detected may be nothing more than artifacts of the data
collection and analysis.
Suppose, for example, that a particular model of car is found to be associated with
traffic accidents. Sometimes, there is confusion around the term “risk factor”: on the
one hand it may simply refer to a marker of risk (a model of car favored by risk
takers), while on the other hand it may refer to a factor that causes risk (a car
that is unsafe at any speed).
Finally, even if there is indeed a causal relationship between two factors, there
is still the question of which is the cause and which is the effect. In other
words, what is the direction of causation? By itself, a correlation tells us
nothing about this. Of course the effect cannot come before the cause —
except in science fiction novels and some arcane philosophical arguments. But
depending on the type of study, the timing of cause and effect may not be
obvious. For example, it has been claimed that active lifestyles may protect
older people’s cognitive functioning. But some evidence suggests that the
causal direction is the opposite: higher cognitive functioning may result in a
more active lifestyle. Misidentification of the direction of causation is often
referred to as “reverse causation” — although it’s the understanding that’s
reversed, not the causation. When one event follows another, we are often
tempted to conclude that the first event caused the second (referred to by the
Latin phrase post hoc ergo propter hoc). But such an association may in fact be
due to chance, confounding, or selection bias.
Causal claims should be subjected to scrutiny and debunked when they do not
hold up. But in many cases there may not be definitive evidence one way or the
other. Suppose a correlation (for example between exposure to a certain
chemical and some disease) is used to support a claim of causation in a lawsuit
against a corporation or government. The defendant may be able to avoid
liability by raising questions about whether the correlation in fact provides
evidence of causation, and by suggesting plausible alternative explanations. In
such situations, the assertion that correlation does not imply causation can
become a general-purpose tool for neutralizing causal claims. Ultimately, this
raises questions about where the burden of proof in a causal controversy
should lie. As we will see, the important point is that this is a discussion worth
having.
Some people are tempted to sidestep the problems of distinguishing
correlation from causation by asking what is so important about
causation. If two factors are correlated, isn’t that enough? Chris Anderson,
author of the bestseller The Long Tail (2006) and former editor-in-chief of
Wired magazine, apparently thinks so. In his 2008 article “The End of Theory:
The Data Deluge Makes the Scientific Method Obsolete,” Anderson argued
that in the age of big data, we can dispense with causation.
Suppose a study finds that, on average, coffee drinkers live longer than people
who don’t drink coffee. The ensuing headlines proclaim that “coffee drinkers
live longer,” which would be a true statement. But someone who hears about
this study might say, “I should start drinking coffee so that I’ll live longer.” This
reasoning is shaky for several reasons.
First, there is an implicit assumption that you only have to start drinking
coffee to be just like the coffee drinkers in the study. The coffee drinkers in the
study were likely different from the people who were not coffee drinkers in
various ways (diet, exercise, wealth, etc.). Some of these characteristics may
indeed be consequences of drinking coffee, but some may be pre-existing
characteristics. Simply starting to drink coffee may not make you similar to the
coffee drinkers in the study.
In everyday life, people routinely make causal claims that would require a
counterfactual analysis to confirm. Thanks to a new diet, your neighbor lost
thirty pounds. A coworker was promoted because she is related to the boss.
Your favorite team performed poorly this year because of the inept manager.
But did your neighbor not also take up jogging? Is that coworker not a top
performer who genuinely deserved a promotion? Were the players on that
team not some of the worst in the league? To assess the claim that A caused B
we need to consider a counterfactual: What would have happened if A had
been different? To evaluate whether your neighbor’s dieting caused his weight
loss, we need to consider what would have happened had he not dieted, and so
on. Hume put it this way: “We may define a cause to be an object, followed by
another …, where, if the first object had not been, the second never had existed.”
While we can never directly observe the causal effect that we suspect to be
responsible for an association, we are able to observe the association itself. But
in the presence of confounding or selection bias, the association may be quite
misleading. To answer a causal question, counterfactual reasoning — asking
“what if?” — is indispensable. No amount of data or brute computing power
can replace this.
The threats of confounding and selection bias and the complexities of
causal reasoning would seem to be formidable obstacles to science. Of
course, scientists have a powerful tool to circumvent these difficulties: the
experiment. In an experiment, scientists manipulate conditions — holding
some factors constant and varying the factor of interest over the course of
many repetitions — and measure the resulting outcomes. When it is possible
to do this, valid inferences can be obtained about a cause and its effect. But as
scientific methods were extended into the social sciences in the nineteenth
century, experiments came to be conducted in settings so complex that it was
often not possible to control all relevant factors.
Randomization offered a way around this difficulty: Ronald Fisher showed in the
1920s and 1930s how randomly assigning experimental conditions in agricultural
trials could support causal conclusions even when not every factor could be held
fixed. But it was not until the late 1940s that the randomized controlled trial (RCT)
was introduced in medicine by the English epidemiologist and statistician Austin
Bradford Hill, in a study of streptomycin treatment of pulmonary tuberculosis.
The RCT was not only a significant innovation in medicine; it also helped
usher in the current era of evidence-based practice and policy in a wide range
of other fields, such as education, psychology, criminology, and economics.
In medicine, an RCT typically works as follows: eligible patients who consent to
participate in a study are randomly assigned to one of two (or sometimes more)
treatment groups. Consider an RCT comparing an experimental drug with a
conventional one. All patients meet the same criteria for inclusion in the
study — for instance, having the disease in question and being aged 50 or older — and end
up in one group or the other purely by chance. The outcomes of patients who
received the conventional drug can therefore be used as substitute
counterfactual outcomes for patients who, by chance, received the
experimental drug — that is, the outcomes of group A can be thought of as
what would have happened to group B if group B had received group A’s
treatment. This is because the known factors, such as sex and age, are
comparable between the two groups (at least on average with a large enough
sample). But so are any unknown factors, such as the amount of exercise or
sleep the patients get. None of the known or unknown factors
influenced whether a patient received the conventional or the experimental
drug. RCTs thus provide an opportunity to draw causal conclusions in complex
settings with many unknown variables, with only limited assumptions
required.
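A brief simulation (my own sketch, with invented numbers) illustrates why randomization earns this trust: when assignment is a coin flip, a measured characteristic such as age and an unmeasured one such as exercise end up equally well balanced between the arms.

```python
# Randomization balances measured and unmeasured characteristics alike.
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
age = rng.normal(65, 10, n)          # a measured covariate
exercise = rng.normal(3, 1, n)       # imagine this was never recorded

experimental = rng.random(n) < 0.5   # coin-flip assignment
for name, x in [("age", age), ("exercise", exercise)]:
    print(f"{name:8s} mean, experimental: {x[experimental].mean():5.2f}, "
          f"conventional: {x[~experimental].mean():5.2f}")
# The two arms agree closely on both factors, so a difference in outcomes can be
# attributed to the treatment rather than to who happened to receive it.
```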
However, RCTs are not always an option. For one thing, they can only be used
to evaluate interventions, such as a drug, but many medical questions concern
diagnosis, prognosis, and other issues that do not involve a comparison of
interventions. Also, RCTs of rare diseases may not be feasible because it would
simply take too long to enroll a sufficient number of patients, even across
multiple medical centers. Finally, it would be unethical to investigate certain
questions using an RCT, such as the effects of administering a virus to a
healthy person. So in medicine and other fields, it is not always possible to
perform an experiment, much less a randomized one.
When an experiment is not possible, researchers must rely on observational
studies, in which exposures and outcomes are simply observed rather than
assigned. A landmark example is the British Doctors Study of smoking, whose
early results were published in 1956. Over 34,000 British doctors were surveyed
about their smoking habits and followed over time, and the results clearly showed
lung-cancer mortality rising with the amount of tobacco smoked, and falling the
earlier smokers quit. Some other examples of observational
studies are surveys of job satisfaction, epidemiological studies of occupational
exposure to hazardous substances, certain studies of the effects of global
warming, and comparisons of consumer spending before and after a tax
increase.
One way to address confounding in observational data is to adjust for the
confounder in the analysis. Consider a classic example: the incidence of Down
syndrome is associated with birth order, but maternal age may be a confounder since
maternal age increases with birth order. By examining the relationship
between Down syndrome and birth order separately within maternal age groups,
known as a “stratified analysis,” the confounding effect of maternal age may be
removed. This type of approach has its challenges; even if it is successful, the
possibility remains that some confounders have not been included in the
adjustment. This problem, known as “unmeasured confounding,”
fundamentally limits the degree of certainty with which conclusions can be
drawn from observational data.
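A simulated version of the example (the rates below are invented, not the historical figures) shows what stratification accomplishes: maternal age drives both birth order and risk, the crude comparison suggests a birth-order effect, and the comparisons within maternal-age strata do not.

```python
# Stratified analysis: adjusting for a confounder by comparing within strata.
import numpy as np

rng = np.random.default_rng(4)
n = 500_000
older_mother = rng.random(n) < 0.3
high_birth_order = rng.random(n) < np.where(older_mother, 0.7, 0.2)
affected = rng.random(n) < np.where(older_mother, 0.004, 0.0008)   # risk depends on age only

def rate(group):
    return round(affected[group].mean(), 5)

print("crude: high vs. low birth order:", rate(high_birth_order), rate(~high_birth_order))
for label, stratum in [("older mothers:  ", older_mother), ("younger mothers:", ~older_mother)]:
    print(label, rate(stratum & high_birth_order), rate(stratum & ~high_birth_order))
```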
Over the course of the last several centuries, increasingly sophisticated
statistical methods have been devised for drawing quantitative
conclusions from observations. However, the distinction between correlation
and causation was not always clearly made, and it was only in the twentieth
century that rigorous attempts to draw causal conclusions from observed data
began to develop in earnest. Various models and methods have been created to
make causal inferences possible — to infer, based on observed effects, a
probable cause for an event.
One early approach was path analysis, developed in the 1920s by the geneticist
Sewall Wright, who used diagrams of assumed causal relationships to analyze
observational data. A second approach to causal inference had its origins in 1923, with a paper by
the Polish statistician Jerzy Neyman introducing an early counterfactual
account of causality in agricultural experiments. His methods were limited to
experiments but were extended by Harvard statistician Donald Rubin in the
1970s to observational studies. Rubin’s causal model was based on the idea of
“potential outcomes” — essentially counterfactuals.
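The idea can be stated in a few lines of code (a toy illustration; the outcome values and the effect size are invented): every patient has one potential outcome under treatment and another under control, but only the one corresponding to the treatment actually received is ever observed.

```python
# Potential outcomes: for each patient, one of the two outcomes is forever missing.
import numpy as np

rng = np.random.default_rng(5)
n = 5
y_if_treated = rng.normal(10, 2, n)       # potential outcome under treatment
y_if_control = y_if_treated - 1.5         # potential outcome under control (true effect = 1.5)
treated = rng.random(n) < 0.5

for i in range(n):
    observed = y_if_treated[i] if treated[i] else y_if_control[i]
    missing = y_if_control[i] if treated[i] else y_if_treated[i]
    print(f"patient {i}: treated={bool(treated[i])}, observed={observed:.1f}, "
          f"counterfactual={missing:.1f} (never seen)")
```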
But suppose that the patients were not randomly assigned to treatment
groups, and that this is instead an observational study. Unlike in an RCT,
where patients in the two groups are likely to be very similar, in an
observational study there may be substantial imbalances (in age, sex, wealth,
etc.) between groups. There are a number of ways to address this problem
using Rubin’s framework. Sometimes imbalances between groups can be dealt
with using matching techniques that ensure the two groups are roughly
similar. A related and more complex method is to estimate, for each patient,
the probability that the patient would receive a particular treatment (say, treatment A), given
the patient’s characteristics. This estimate is known as a “propensity score,”
first discussed in a 1983 paper that Rubin coauthored. Patients who received
treatment B can then be matched with patients who received treatment A but
who had similar propensity scores. This provides a general scheme for
obtaining substitute counterfactuals that make causal inferences possible. An
important caveat, however, is that this only works if all relevant variables —
any of which could be confounders — are available. For example, the
relationship between alcohol advertising and youth drinking behavior may be
confounded by unmeasured factors such as family history and peer influence.
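Here is a rough sketch of the scheme using simulated data and the scikit-learn library (the covariates, coefficients, and matching rule are all of my own invention; real analyses are considerably more careful): estimate each patient's propensity score with logistic regression, then pair treated patients with untreated patients whose scores are closest.

```python
# Propensity-score matching, in miniature.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
n = 2_000
age = rng.normal(60, 10, n)
wealth = rng.normal(0, 1, n)
X = np.column_stack([age, wealth])

# Non-randomized setting: younger, wealthier patients are more likely to get treatment A.
p_true = 1 / (1 + np.exp(0.05 * (age - 60) - 0.8 * wealth))
got_A = rng.random(n) < p_true

ps = LogisticRegression().fit(X, got_A).predict_proba(X)[:, 1]   # estimated propensity scores

# Greedy nearest-neighbor matching on the score (with replacement), the simplest scheme.
a_idx = np.flatnonzero(got_A)
b_idx = np.flatnonzero(~got_A)
partners = b_idx[np.argmin(np.abs(ps[b_idx][None, :] - ps[a_idx][:, None]), axis=1)]

print("mean age:    A =", round(age[a_idx].mean(), 1),
      " all B =", round(age[b_idx].mean(), 1),
      " matched B =", round(age[partners].mean(), 1))
print("mean wealth: A =", round(wealth[a_idx].mean(), 2),
      " all B =", round(wealth[b_idx].mean(), 2),
      " matched B =", round(wealth[partners].mean(), 2))
```

The matched controls resemble the treated group much more closely than the untreated group as a whole, but only with respect to the variables that went into the score.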
Consider, for example, a well-known study of elderly patients hospitalized after
heart attacks, some of whom received aggressive treatment while others received
more conservative care. The patients who received aggressive treatment differed
from the others in numerous ways — for instance, they were younger. And they may have also
differed in ways that were not measured, such as the severity of their heart
attacks. The risk is that — once the measured variables such as age are adjusted
for, using a technique like matching — the unmeasured variables could still
substantially bias results. Had the patients been randomized to receive
different treatments, it would have been much easier to estimate the causal
effect of aggressive treatment. But suppose a variable could be identified that
was correlated with the type of treatment received (aggressive or not
aggressive), did not directly affect the outcome, and was not likely to be
correlated with any confounding variables. Such an “instrumental variable” can
be used to form groups of patients such that patient characteristics are similar
between groups, except that the likelihood of receiving the treatment in
question varies between groups. In this way, an instrumental variable can be
considered to be a sort of natural randomizer. In the heart attack study,
patients who lived closer to hospitals that offered aggressive treatment were
more likely to receive such treatment. The authors of the study realized that an
instrumental variable could be based on a patient’s distance to such a hospital
compared to the distance to their nearest hospital. This variable would not be
expected to affect mortality except through the type of treatment received, nor
would it be expected to affect other possible confounding variables. Provided
these assumptions were valid, the instrumental variable approach could
overcome unmeasured confounding to allow causal conclusions to be drawn.
In this case, the instrumental variable analysis showed that aggressive
treatment had the effect of lowering mortality only to a very small degree, in
striking contrast to estimates using more conventional statistical methods. Far
more important for lowering mortality, the study explained, was that patients
received care within twenty-four hours of admission to the hospital.
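The logic of the approach can be sketched with simulated data (the numbers are invented, and the variables "near," "aggressive," and "severity" merely echo the study's setup): a naive comparison is distorted by unmeasured severity, while the instrument-based estimate (here the simple Wald ratio, to which two-stage least squares reduces for a binary instrument) approximately recovers the small true effect.

```python
# An instrumental variable as a "natural randomizer," in simulation.
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000
severity = rng.normal(size=n)                    # unmeasured confounder
near = rng.random(n) < 0.5                       # instrument: lives near an "aggressive" hospital
aggressive = rng.random(n) < 0.2 + 0.4 * near + 0.2 * (severity > 0)
p_death = np.clip(0.25 + 0.10 * severity - 0.02 * aggressive, 0, 1)   # true effect: -0.02
died = rng.random(n) < p_death

naive = died[aggressive].mean() - died[~aggressive].mean()

# With a binary instrument, two-stage least squares reduces to the Wald ratio:
iv = (died[near].mean() - died[~near].mean()) / (aggressive[near].mean() - aggressive[~near].mean())

print("naive mortality difference:    ", round(naive, 3))   # pushed the wrong way by severity
print("instrumental-variable estimate:", round(iv, 3))      # roughly the true -0.02
```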
The late 1980s saw a resurgence of interest in refining methods of causal inference
with the help of diagrams like those used in path analysis and structural equations
modeling. These newer diagrams are known as “directed acyclic graphs” (DAGs) and
have been widely used in computer science and epidemiology. The graphs are made
up of nodes (commonly shown as circles) representing variables, connected by
one-way arrows, such that no path leads from a node back to itself, which would
represent a causal feedback loop (hence “acyclic”). (See Figure 2.) Powerful theorems
about DAGs are available thanks to a branch of mathematics known as graph theory,
used for modeling and analyzing relations within biological, physical, social, and
information systems.
With the help of DAGs, the conditions that give rise to selection bias and
confounding have been pinpointed, thereby settling an important question in
the analysis of observational data.
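As a minimal illustration (a toy encoding of the night-light example from earlier, not a figure from this essay), the assumptions behind that story can be written down as a DAG using the networkx library:

```python
# Encoding causal assumptions as a directed acyclic graph.
import networkx as nx

dag = nx.DiGraph()
dag.add_edges_from([
    ("parental myopia", "night light"),     # parents choose the lighting
    ("parental myopia", "child myopia"),    # and pass on the genes
])

print("acyclic:", nx.is_directed_acyclic_graph(dag))                    # True: no feedback loops
print("direct arrow, night light -> child myopia:",
      dag.has_edge("night light", "child myopia"))                      # False
# Any association between "night light" and "child myopia" must flow through the
# common cause, which is exactly the structure identified above as confounding.
```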
Simpson’s Paradox
Directed acyclic graphs have helped to solve other longstanding puzzles.
Consider observational data on the relationship between a certain
treatment and recovery from an illness. Suppose that patients who are treated
are more likely to recover than those who are not. But when we examine the
data on male and female patients separately, it turns out that among the males,
those who are treated are less likely to recover; similarly, females who are
treated are also less likely to recover. This reversal — known as Simpson’s
paradox after the statistician Edward H. Simpson — may seem surprising, but
it is a real phenomenon. This kind of situation can arise if the patients who
receive treatment are disproportionately male, and the recovery rate for
females is much lower than for males. Sex is thus a confounder of the
relationship between treatment and recovery in this case, and the sex-specific
results should be used for decision-making about the treatment’s effectiveness:
the treatment is not helpful.
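Concrete numbers (invented for illustration, not taken from any study) make the reversal easy to verify:

```python
# Simpson's paradox with explicit counts: (recovered, total) for each group.
counts = {("male", "treated"): (60, 80),
          ("male", "untreated"): (16, 20),
          ("female", "treated"): (4, 20),
          ("female", "untreated"): (24, 80)}

def recovery_rate(groups):
    recovered = sum(counts[g][0] for g in groups)
    total = sum(counts[g][1] for g in groups)
    return recovered / total

print("overall, treated:  ", recovery_rate([("male", "treated"), ("female", "treated")]))      # 0.64
print("overall, untreated:", recovery_rate([("male", "untreated"), ("female", "untreated")]))  # 0.40
for sex in ("male", "female"):
    print(f"{sex}: treated {recovery_rate([(sex, 'treated')])}, "
          f"untreated {recovery_rate([(sex, 'untreated')])}")
```

Treatment looks beneficial in the aggregate (64 percent versus 40 percent recovery) only because it was given mostly to men, who recover more often whether treated or not.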
But Simpson’s paradox has another surprising aspect. Suppose that the
treatment is suspected of having an effect on blood pressure, and instead of
breaking the data down by sex, the breakdown is by high versus low blood
pressure one week into treatment (or at roughly the same time in the
untreated group). Imagine that the data of the two groups — high and low
blood pressure — are like the data of the two groups broken down by sex in the
earlier scenario. As before, the patients who are treated are more likely to
recover than those who are not, yet within both of the subgroups (high and low
blood pressure) the patients who are treated are less likely to recover. But in
this case blood pressure, unlike sex, is not a confounder of the relationship
between treatment and recovery, since it is not a common cause of treatment
and recovery. In this scenario, the overall results rather than the subdivided
results should be used for decision-making.
The paradox that today carries Simpson’s name was first identified at the
beginning of the twentieth century, but Simpson examined it in detail in a 1951
paper and noted that the “sensible” interpretation of the data should depend on the
context in which the data arose.
As applications of causal inference are becoming increasingly common in a
variety of fields — not only in computer science and medicine but also in
sociology, economics, public health, and political science — it is appropriate to
consider the achievements and limitations in this field over the course of the
near-century since Sewall Wright’s groundbreaking contributions to causal
inference, his path analysis. The advances since the 1920s have truly been
transformative, with the development of ever more sophisticated methods for
solving complex problems, especially in fields such as epidemiology that rely
largely on observational data rather than experiments. Much progress has been
made in untangling the difficulties surrounding counterfactuals — of finding
ways to know what would have happened if a given intervention, such as a
medical treatment, had not occurred. Tools like the randomized controlled
trial have become so widely accepted that it is hard to imagine our world
without them.
One of the greatest challenges is the intricacy of the causal relationships that
underlie so many phenomena: What causes today’s weather? What are the
effects of violent video games? What will be the results of a tax increase?
Causal diagrams have made a substantial contribution to our ability to analyze
such complex situations — but they can yield unreliable conclusions if the
causal structure is incorrectly specified.
Hume’s point stands: correlation can be directly observed, but never the causal
link between one event and another. Causal inference depends on more than
just the data at hand; the validity of the conclusions always hinges on
assumptions — whether they are based on external evidence, expert
background knowledge, theory, or guesswork. Curiously, the current
excitement about big data has encouraged in some people the opposite notion:
the hope, voiced by Chris Anderson, that with enough data the numbers can speak
for themselves and theory becomes unnecessary.
Nick Barrowman, “Correlation, Causation, and Confusion,” The New Atlantis, Number 43, Summer/Fall
2014, pp. 23–44.