Uses and Abuses of The Analysis of Covariance
Abstract
The analysis of covariance (ANCOVA) is a powerful analytic tool, but there
continue to be abuses of the method. We review assumptions and illustrate
legitimate uses of ANCOVA, and summarize statistical packages’ approach to the
method. Finally, we consider how ANCOVA is used in contemporary nursing
research.
Uses and abuses of ANCOVA 2
As many statistics books point out, the analysis of covariance (ANCOVA) has
two primary purposes: (a) to improve the power of a statistical analysis by
reducing error variance, and (b) to statistically "equate" comparison groups. The
first purpose operates well when participants are randomly assigned to their
groups. But using ANCOVA with intact or pre-existing groups can have the
opposite effect, a reduction in statistical power. The second purpose usually
accompanies non-random group comparisons, and analysts apply ANCOVA to
make the group comparisons more “fair.”
In this paper, we review the merits and demerits of these claims for
ANCOVA. More specifically, we explore various ANCOVA pitfalls that can
deliver misleading results for the unwary analyst, and review appropriate uses of
ANCOVA. We also show how statistical packages (BMDP, SPSS, SAS, and
SYSTAT) differ in their approach to ANCOVA. Though our focus is on the
conventional ANOVA formulation, for researchers who subscribe to Cohen’s
(1968) idea that regression analysis can do (just about) anything, our remarks
apply to regression models as well. In fact, regression models may be more
vulnerable to ANCOVA problems because independent variables often serve as
covariates whether or not the researcher intended them to take that role.
When Sir Ronald Fisher invented the ANCOVA model in the 1930s, he took
random assignment and experimental control for granted. Fisher had been studying
agricultural methods, and random assignment was easy to arrange. The point of his
invention was to enhance the precision of the statistical analysis. Today, ANCOVA
is used routinely with quasi-experimental data where treatments cannot—because
of expense, ethical concerns, or general disruptiveness—be randomly assigned to
participants. The inability to assign participants to treatments is particularly
evident in health care research. For example, in comparing lung vital capacity in
smokers and nonsmokers, participants self-select into the two
comparison groups. If the researcher thinks that age might be a confounding
variable, age might be assigned to a covariate role. Whether that decision is a good
or bad one depends largely on two ANCOVA assumptions.
The first statistical assumption is that each covariate is uncorrelated with the
other independent variables. In the smoking example, is age correlated with the
independent variable, smoking group? If the correlation is non-zero, then removing the
variance associated with age will also remove some of the variance associated with
the grouping variable. This in effect leaves less of the dependent variable’s (lung
vital capacity) variance to be accounted for by the independent variable (smoking).
Figure 1 illustrates the situation. Notice that the covariate, age, overlaps with
smoking status (arrowed portion), absorbing some of smoking’s relationship with
lung vital capacity.
[Figure 1. Diagram labels: IV: SMOKING; COV: AGE]
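The overlap described above can be sketched numerically. The following is a minimal illustration on simulated data, not the data behind Figure 1: the group sizes, ages, and vital capacity values are all hypothetical, and smokers are simply made older on average so that the covariate correlates with the grouping variable. The helper names (sums, group_sums_of_squares) are ours. The adjusted sum of squares is computed with the classical within-group ANCOVA formulas.

```python
import random
from math import sqrt
from statistics import mean


def sums(x, y):
    """Return (SS_x, SS_y, SP_xy): sums of squares and cross-products about the means."""
    mx, my = mean(x), mean(y)
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    sp_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return ss_x, ss_y, sp_xy


def group_sums_of_squares(cov_by_group, dv_by_group):
    """Return (unadjusted, covariate-adjusted) sums of squares for the grouping variable."""
    all_x = [v for g in cov_by_group for v in g]
    all_y = [v for g in dv_by_group for v in g]
    ss_x_t, ss_y_t, sp_t = sums(all_x, all_y)
    within = [sums(x, y) for x, y in zip(cov_by_group, dv_by_group)]
    ss_x_w = sum(w[0] for w in within)
    ss_y_w = sum(w[1] for w in within)
    sp_w = sum(w[2] for w in within)
    unadjusted = ss_y_t - ss_y_w                     # between-group SS, no covariate
    resid_cov_only = ss_y_t - sp_t ** 2 / ss_x_t     # residual SS of DV ~ covariate
    resid_cov_group = ss_y_w - sp_w ** 2 / ss_x_w    # residual SS of DV ~ covariate + group
    return unadjusted, resid_cov_only - resid_cov_group


# Hypothetical data: smokers are older on average (assumption, for illustration only).
random.seed(0)
n = 200
age_non = [random.gauss(40, 8) for _ in range(n)]
age_smk = [random.gauss(50, 8) for _ in range(n)]
cap_non = [random.gauss(5.0, 0.6) for _ in range(n)]   # vital capacity, nonsmokers
cap_smk = [random.gauss(4.4, 0.6) for _ in range(n)]   # vital capacity, smokers

# Correlation between the covariate (age) and the 0/1 grouping variable.
ss_age, ss_grp, sp_age_grp = sums(age_non + age_smk, [0.0] * n + [1.0] * n)
r_age_group = sp_age_grp / sqrt(ss_age * ss_grp)

unadjusted, adjusted = group_sums_of_squares([age_non, age_smk], [cap_non, cap_smk])
print(f"r(age, group) = {r_age_group:.2f}")
print(f"SS for smoking, unadjusted = {unadjusted:.1f}, adjusted for age = {adjusted:.1f}")
```

In this sketch vital capacity depends only on smoking status, yet the sum of squares credited to smoking shrinks once age is entered, because age and group membership share variance.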
Sex F(1,36) = 37.33, p < .001, effect size (partial η²) = .51; Smoking F(1,36) =
6.56, p < .01, partial η² = .15. The interaction term is nonsignificant.
Things change when the data are rerun, using baseline pulse as a covariate.
Once again, the main effects are significant: Sex F(1,35) = 21.81, p < .001, partial
η² = .38; Smoking F(1,35) = 5.71, p < .05, partial η² = .14. Notice that the effect
size for Sex has dropped substantially. That is because baseline pulse was
correlated with Sex, violating assumption #1.
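The effect sizes quoted above can be checked from the F ratios alone. Partial eta squared is SS_effect / (SS_effect + SS_error), which can be rewritten as F x df_effect / (F x df_effect + df_error). The short sketch below applies that identity to the reported F values; the function name is ours.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios as reported in the text.
sex_without_cov = partial_eta_squared(37.33, 1, 36)
smoking_without_cov = partial_eta_squared(6.56, 1, 36)
sex_with_cov = partial_eta_squared(21.81, 1, 35)
smoking_with_cov = partial_eta_squared(5.71, 1, 35)

for label, value in [
    ("Sex, no covariate", sex_without_cov),
    ("Smoking, no covariate", smoking_without_cov),
    ("Sex, baseline pulse as covariate", sex_with_cov),
    ("Smoking, baseline pulse as covariate", smoking_with_cov),
]:
    print(f"{label}: partial eta squared = {value:.2f}")
```

The recomputed values round to .51, .15, .38, and .14, matching the figures in the text and confirming that the drop for Sex is much larger than the drop for Smoking.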
What about the substantive problem of interpretation? This involves the
well-known problem of variance partitioning. Darlington (1968) and other early
analysts described the problem clearly, and admitted that they had no answers. Even today,
Pedhazur (1997) devotes an entire chapter to the issue, because “variance
partitioning is widely used, mostly abused, in the social sciences for determining
the relative importance of independent variables….[In the last 15 years,] abuses of
variance partitioning have not abated but rather increased” (p. 243). When
variables covary, there is no satisfactory way to assign unique explanatory power
to them individually. One can make odd-sounding statements that reveal how
confusing the situation is. For example, “Adjusting for initial pulse rate, sex is
associated with post-exercise pulse rate.” What does it mean to hold pulse rate
constant, as though everyone had the same initial pulse rate? What value is the
hypothetical constant pulse rate? Could one choose a different pulse rate to hold
constant? Back to Pedhazur: “Unfortunately, applications of ANCOVA in quasi-
experimental and non-experimental research are by and large not valid” (1997, p.
654).
creates an expected correlation of zero between the pretest and the grouping
variable; and pretest scores are theoretically and statistically related to the outcome
measure.
Figure 2 depicts this case. Because of random assignment of treatment groups,
the grouping variable is not related to the covariate, Hopelessness pretest scores.
But the covariate is related to the dependent variable, and boosts the independent
variable’s power by removing some of what otherwise would be error variance.
Once the dependent variable’s variance associated with the covariate is
removed, the portion of the remaining variance in the dependent variable shared
with the independent variable (treatment) becomes larger.
[Figure 2. Diagram labels: IV: GROUP; COV: PRE-HOPELESSNESS; DV: POST-HOPELESSNESS]
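The gain in power under random assignment can be sketched with simulated data. Everything below is hypothetical (sample size, score distributions, and a treatment effect of one scale point); it is not the study summarized in Figure 2. Group labels are shuffled to mimic random assignment, and the F ratio for the grouping variable is computed with and without the pretest as a covariate, using the classical within-group ANCOVA formulas.

```python
import random
from statistics import mean


def sums(x, y):
    """Return (SS_x, SS_y, SP_xy): sums of squares and cross-products about the means."""
    mx, my = mean(x), mean(y)
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    sp_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return ss_x, ss_y, sp_xy


def anova_and_ancova(cov_by_group, dv_by_group):
    """Return F and error mean square for the group effect, without and with the covariate."""
    k = len(dv_by_group)
    n_total = sum(len(g) for g in dv_by_group)
    all_x = [v for g in cov_by_group for v in g]
    all_y = [v for g in dv_by_group for v in g]
    ss_x_t, ss_y_t, sp_t = sums(all_x, all_y)
    within = [sums(x, y) for x, y in zip(cov_by_group, dv_by_group)]
    ss_x_w = sum(w[0] for w in within)
    ss_y_w = sum(w[1] for w in within)
    sp_w = sum(w[2] for w in within)
    # ANOVA: between-group SS tested against within-group SS.
    ms_error_anova = ss_y_w / (n_total - k)
    f_anova = ((ss_y_t - ss_y_w) / (k - 1)) / ms_error_anova
    # ANCOVA: both SS are adjusted for the pooled within-group regression on the covariate.
    resid_cov_only = ss_y_t - sp_t ** 2 / ss_x_t
    resid_cov_group = ss_y_w - sp_w ** 2 / ss_x_w
    ms_error_ancova = resid_cov_group / (n_total - k - 1)
    f_ancova = ((resid_cov_only - resid_cov_group) / (k - 1)) / ms_error_ancova
    return f_anova, ms_error_anova, f_ancova, ms_error_ancova


# Hypothetical randomized pretest-posttest design.
random.seed(1)
n = 400
pre = [random.gauss(10, 3) for _ in range(n)]
group = [0] * (n // 2) + [1] * (n // 2)
random.shuffle(group)                                   # random assignment to treatment
post = [p - 1.0 * g + random.gauss(0, 1) for p, g in zip(pre, group)]

cov_by_group = [[p for p, g in zip(pre, group) if g == j] for j in (0, 1)]
dv_by_group = [[y for y, g in zip(post, group) if g == j] for j in (0, 1)]
f_anova, ms_error_anova, f_ancova, ms_error_ancova = anova_and_ancova(cov_by_group, dv_by_group)
print(f"ANOVA:  F = {f_anova:.2f}, error mean square = {ms_error_anova:.2f}")
print(f"ANCOVA: F = {f_ancova:.2f}, error mean square = {ms_error_ancova:.2f}")
```

Because the pretest is unrelated to group membership by design but strongly related to the posttest, the error mean square falls and the F ratio for the treatment rises when the covariate is included.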
one reported F-ratios for covariates, and one other study gave the simple
correlations between covariates and dependent variables.
arranging the model. With a hierarchical analysis (the preferred approach), the
homogeneity term (interaction between the covariate and the independent variable)
is entered last, and the test is a version of SPSS’s “experimental” approach, where
each successive term is adjusted for previous terms. If a direct or simultaneous
regression is used, the homogeneity term is tested with SPSS’s “regression”
approach.
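As a rough illustration of the hierarchical version of this test, the sketch below enters the homogeneity term last: it compares the residual sum of squares from a model with one pooled within-group slope against a model that gives each group its own slope. This is a minimal hand computation on hypothetical simulated data, not the procedure of any particular package, and the function names are ours.

```python
import random
from statistics import mean


def sums(x, y):
    """Return (SS_x, SS_y, SP_xy): sums of squares and cross-products about the means."""
    mx, my = mean(x), mean(y)
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    sp_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return ss_x, ss_y, sp_xy


def homogeneity_of_slopes(cov_by_group, dv_by_group):
    """Return (F, df_effect, df_error) for the covariate-by-group interaction, entered last."""
    k = len(dv_by_group)
    n_total = sum(len(g) for g in dv_by_group)
    within = [sums(x, y) for x, y in zip(cov_by_group, dv_by_group)]
    ss_x_w = sum(w[0] for w in within)
    ss_y_w = sum(w[1] for w in within)
    sp_w = sum(w[2] for w in within)
    sse_common = ss_y_w - sp_w ** 2 / ss_x_w                       # one pooled slope
    sse_separate = sum(ss_y - sp ** 2 / ss_x for ss_x, ss_y, sp in within)  # a slope per group
    df_effect, df_error = k - 1, n_total - 2 * k
    f_value = ((sse_common - sse_separate) / df_effect) / (sse_separate / df_error)
    return f_value, df_effect, df_error


def simulate_group(slope, n):
    """Hypothetical covariate and outcome scores for one group with the given slope."""
    x = [random.gauss(10, 3) for _ in range(n)]
    y = [slope * xi + random.gauss(0, 1) for xi in x]
    return x, y


random.seed(2)
x0, y0 = simulate_group(0.8, 100)
x1, y1 = simulate_group(0.8, 100)    # same slope as group 0: assumption holds
x2, y2 = simulate_group(0.2, 100)    # much flatter slope: assumption violated

f_equal, df1, df2 = homogeneity_of_slopes([x0, x1], [y0, y1])
f_unequal, _, _ = homogeneity_of_slopes([x0, x2], [y0, y2])
print(f"Equal slopes:   F({df1},{df2}) = {f_equal:.2f}")
print(f"Unequal slopes: F({df1},{df2}) = {f_unequal:.2f}")
```

A large F for the interaction signals that the regression slopes differ across groups, in which case a single adjusted group comparison is hard to interpret.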
Conclusions
In 1969, Janet Elashoff called the analysis of covariance (ANCOVA) "a
delicate instrument." It still is. Carefully handled, though, it is an excellent device
for the analyst’s toolkit. To improve the quality of future ANCOVA studies, we
recommend that the method be limited primarily to randomized designs. When the
analyst wants to use ANCOVA with an intact group or other nonrandom
assignment, the correlation between the covariate(s) and the independent
variable(s) should be reported. The further those correlations depart from zero,
the more suspect the conclusions drawn about the independent variables become.
ANCOVA is an interesting and useful tool, but it is not a fix-all to be applied
indiscriminately to equate groups. As mentioned above, the Johnson-Neyman
method can be used as an alternative (or as a complement) to ANCOVA. Myers and
Well (1995) offer a brief comparison of ANCOVA with other approaches
(blocking, analysis of gain scores) to improving statistical power in non-random
groups. Kirk (1995, Chapter 15) gives a short but excellent review of ANCOVA
applications, and Huitema’s (1980) text remains the definitive work on
ANCOVA.
We also recommend that researchers report tests of ANCOVA assumptions.
That statistical packages make assumption tests challenging is not a good reason to
avoid them entirely. And it is easy, not challenging, to report the simple
correlations between covariates and dependent variables. When those
correlations are tiny, there is no gain whatsoever from using ANCOVA.
References
Beck, A.T., & Steer, R.A. (1988). Beck Hopelessness Scale manual. San Antonio:
Psychological Corporation.
Bryk, A.S., & Weisberg, H.I. (1977). Use of the nonequivalent control group
design when subjects are growing. Psychological Bulletin, 84, 950-962.
Cohen, J. (1968). Multiple regression as a general data analytic system.
Psychological Bulletin, 70, 426-443.
Darlington, R.B. (1968). Multiple regression in psychological research and
practice. Psychological Bulletin, 69, 161-182.
Dixon, W.J. (1992). BMDP statistical software manual, Vol. 1. Berkeley, CA:
University of California Press.
Dorsey, S.G., & Soeken, K.L. (1996). Use of the Johnson-Neyman technique as an
alternative to analysis of covariance. Nursing Research, 45, 363-366.
Elashoff, J.D. (1969). Analysis of covariance: A delicate instrument. American
Educational Research Journal, 6, 383-401.