
ASSESSMENT
10.1177/1073191105283427
Van der Elst et al. / THE STROOP COLOR-WORD TEST

The Stroop Color-Word Test


Influence of Age, Sex, and Education; and Normative
Data for a Large Sample Across the Adult Age Range

Wim Van der Elst


Martin P. J. Van Boxtel
Gerard J. P. Van Breukelen
Jelle Jolles
Maastricht University

The Stroop Color-Word Test was administered to 1,856 cognitively screened, healthy Dutch-speaking
participants aged 24 to 81 years. The effects of age, gender, and education on Stroop test
performance were investigated to adequately stratify the normative data. The results showed that
especially the speed-dependent Stroop scores (time to complete a subtest), rather than the accuracy
measures (the errors made per Stroop subtask), were profoundly affected by the demographic
variables. In addition to the main effects of the demographic variables, an Age × Low Level of
Education interaction was found for the Error III and the Stroop Interference scores. This suggests
that executive function, as measured by the Stroop test, declines with age and that the decline is
more pronounced in people with a low level of education. This is consistent with the reserve
hypothesis of brain aging (i.e., that education generates reserve capacity against the damaging
effects of aging on brain functions). Normative Stroop data were established using both a
regression-based and traditional approach, and the appropriateness of both methods for generating
normative data is discussed.

Keywords: Stroop Color-Word Test; executive function; normative data; education; sex; brain reserve

The Stroop Color-Word Test (Stroop, 1935) is a useful and reliable assessment tool in psychology
(Lezak, Howieson, & Loring, 2004). Numerous different Stroop test versions have been developed
(e.g., Comalli, Wapner, & Werner, 1962; Golden, 1978; Trenerry, Crosson, DeBoe, & Leber, 1989),
with variations in the color and number of the test items, the number of subtests, and the
administration procedure. Despite these variations, the basic paradigm of the Stroop test has
remained the same: An individual’s performance on a basic task (e.g., reading names of colors) is
compared with his or her performance on an analogous task in which a habitual response needs to be
suppressed in support of an unusual one (i.e., naming the ink color that incongruously named color
words are printed in). The increase in time taken to perform the latter task compared with the
basic task is referred to as “the Stroop interference effect” (e.g., Davidson, Zacks, & Williams,
2003; Moering, Schinka, Mortimer, & Graves, 2003) and is considered a general measure of cognitive
flexibility and control (Uttl & Graf, 1997) or executive functioning (Moering et al., 2003). These
abilities decline with age (Ivnik, Malec, Smith, Tangalos, & Petersen, 1996) and in dementia (Houx,
Jolles, & Vreeling, 1993), which has made the Stroop test a popular test to evaluate various groups
of patients with borderline or established brain pathology (Bohnen, Twijnstra, & Jolles,

Address correspondence to Jelle Jolles, Faculty of Medicine, Department of Psychiatry and Neuropsychology, Maastricht University,
P.O. Box 616, 6200 MD, Maastricht, The Netherlands; phone: +31 43 38 81041; e-mail: [email protected].
Assessment, Volume 13, No. 1, March 2006 62-79
DOI: 10.1177/1073191105283427
© 2006 Sage Publications

1992; Davidson et al., 2003; Dulaney & Rogers, 1994; Graf, Uttl, & Tuokko, 1995; Houx et al., 1993;
MacLeod, 1991; Uttl & Graf, 1997).

The popularity of the Stroop test in clinical and research settings (Lezak et al., 2004) means that
it is important to determine the influence of age and age-extrinsic factors on test performance.
Previous studies have provided inconclusive data about the effects of age, sex, and education on
Stroop test performance. Although most authors reported age-related decrements in Stroop test
performance (Daigneault, Braun, & Whitaker, 1992; Feinstein, Brown, & Ron, 1994; Hameleers et al.,
2000; Houx et al., 1993; Ivnik et al., 1996; Klein, Ponds, Houx, & Jolles, 1997; Libon et al., 1994;
Moering et al., 2003; Spreen & Strauss, 1998; Swerdlow, Filion, Geyer, & Braff, 1995; Van Boxtel,
ten Tusscher, Metsemakers, Willems, & Jolles, 2001), Graf et al. (1995) did not find age to
influence test performance. Sex differences in Stroop performance have been reported by some
(Hameleers et al., 2000; Martin & Franzen, 1989; Moering et al., 2003; Van Boxtel et al., 2001) but
not all (Houx et al., 1993; Klein et al., 1997; Swerdlow et al., 1995; Trenerry et al., 1989)
authors. Again, education was found to be positively related to Stroop test performance by some
authors (Hameleers et al., 2000; Houx et al., 1993; Moering et al., 2003; Van Boxtel et al., 2001)
but not by others (Trenerry et al., 1989).

In this study, we administered the Stroop test to a sample of 1,856 cognitively intact men and
women aged 24 to 81 years with different levels of educational attainment in order to establish
normative data. In the past decade, there has been considerable debate about which methodological
approach should be used to derive normative data, that is, a traditional approach or a
regression-based approach (see Fastenau, 1998; Fastenau & Adams, 1996; Heaton, Avitable, Grant, &
Matthews, 1999; Heaton, Matthews, Grant, & Avitable, 1996; Van Breukelen & Vlaeyen, in press). The
main difference between the two approaches is that with the traditional approach, normative data
are derived from raw test scores by splitting the sample into relevant demographic subgroups,
whereas with the regression-based approach, normative data are derived from test scores as
predicted from the relevant demographic variables. Because both methods have their advantages,
either methodological or in terms of ease of use, we established normative data using both methods.

METHOD

Participants

The data were derived from the Maastricht Aging Study (MAAS), a prospective study into the
determinants of cognitive aging (Jolles, Houx, Van Boxtel, & Ponds, 1995). Participants in the MAAS
study were recruited from the Registration Network Family Practices (RegistratieNet
Huisartspraktijken [RNH]), which includes 80,000 people who live in the province of Limburg in the
Netherlands (Metsemakers, Höppener, Knottnerus, Kocken, & Limonard, 1992). RNH physicians classify
health problems according to the International Classification of Health Problems in Primary Care
(ICPC; Lamberts & Wood, 1987). The use of the RNH as a sample frame rather than a general
population sample has the advantage that an eligible study sample could be selected beforehand.
Thus, individuals with medical conditions known to interfere with cognition (i.e., cerebrovascular
pathology, tumors of the nervous system, multiple sclerosis, epilepsy, Parkinsonism, dementia,
organic psychosis, schizophrenia, affective psychosis, and mental retardation) were identified
using the RNH records and excluded from the sample frame. In total, 10,396 individuals between age
24 and 81 were then randomly drawn from the RNH. They were informed about the study by their
general practitioner rather than by the MAAS project staff, which was expected to have a
facilitative effect on compliance, and were asked to return a prepaid postcard to indicate their
willingness to participate in the study. In total, 4,490 people (43.2%) agreed to participate
(3,531 individuals [34%] refused participation and 2,375 [22.8%] people did not return the
postcard). Potential participants were then screened in a semistructured interview to check for
additional exclusion criteria that were not coded in the RNH (i.e., a history of transient ischemic
attacks, brain surgery, hemodialysis for renal failure, electroconvulsive therapy, and chronic
psychotropic drug use), which led to the exclusion of 301 individuals. Of the remaining 4,189
participants, 1,856 were randomly selected from 12 discontinuous age categories (25 ± 1 years,
30 ± 1 years . . . 80 ± 1 years) for participation in the study.

Not all data for the 1,856 participants administered the Stroop test were included in the analyses.
The following exclusion criteria were used: a score below 24 on the Mini-Mental State Examination
(Folstein, Folstein, & McHugh, 1975), the occurrence of technical problems during test assessment,
and more than 20 errors on the third Stroop subtask (indicative of possible cognitive problems).
This led to the exclusion of data for 68 participants (data for 13, 52, and 3 participants were
excluded for the above-mentioned reasons, respectively). Table 1 provides basic descriptive data
for the sample. Level of Education (LE) was assessed by classifying formal schooling in a system
often used in the Netherlands (De Bie, 1987), which is comparable to the International Standard
Classification of Education (United Nations Educational, Scientific and Cultural Organisation,
1976). The participants were grouped as follows: those with at most primary education

TABLE 1
Descriptive Characteristics of the Sample (N = 1,788)

                               Age           Level of Education (frequency)
Age Group (years)      n       M      SD     Low    Average    High     Male:Female Ratio
25 ± 1               159   25.40     .90      21       77        61     81:78
30 ± 1               154   30.60     .87      22       79        53     75:79
35 ± 1               157   35.55     .90      33       77        47     79:78
40 ± 1               155   40.49     .87      37       72        46     78:77
45 ± 1               162   45.54     .92      49       73        40     84:78
50 ± 1               160   50.29     .88      52       71        37     81:79
55 ± 1               157   55.46     .90      81       49        27     76:81
60 ± 1               158   60.59     .84      73       65        19     76:82
65 ± 1               153   65.45     .87      83       50        20     77:76
70 ± 1               155   70.31     .82      80       57        18     74:81
75 ± 1               158   75.16     .81      80       53        24     81:77
80 ± 1                60   79.81     .82      38        9        12     29:31
Total              1,788   51.41   16.37     649      732       404     891:897

NOTE: Data on level of education were missing from 3 participants.

(LE low), those with junior vocational training (LE average), and those with senior vocational or
academic training (LE high). The ethnic background of all participants was Caucasian, and all
participants were native Dutch speakers.

Procedure and Instruments

All participants were tested individually at the neuropsychological laboratory of the Brain &
Behaviour Institute in Maastricht, the Netherlands, using the Stroop test version most commonly
used in this country (Hammes, 1973). This Stroop test version consists of three subtasks. The
stimulus material for each of these subtasks is shown on a white sheet of paper that is landscape
oriented (A4 letter format, 11.69 in × 8.26 in [29.70 cm × 20.99 cm]). The 100 stimuli for each
subtask are distributed evenly in a 10 × 10 matrix on each sheet of paper with a margin of about
1.97 in (5 cm) at the top, 0.59 in (1.5 cm) on the left and on the right, and 1.57 in (4 cm) at the
bottom. The first subtask shows color words in random order (red, blue, yellow, green) printed in
black ink (noncapital letters, 0.157 in [0.4 cm] high). Subtask 2 displays solid color patches of
0.276 in × 0.787 in (0.7 cm × 2.0 cm) in one of these four basic colors. The third subtask contains
color words printed in an incongruous ink color (noncapital letters, 0.157 in [0.4 cm] high), for
example, the word yellow printed in red ink. The Dutch words for red, blue, yellow, and green are
similar to their English equivalents in length and pronunciation time (i.e., rood, blauw, geel, and
groen, respectively).

The participants were instructed to read the words, name the colors, and finally, name the ink
color of the printed words as quickly and as accurately as possible in the three subsequent
subtasks. There was no time limit to complete a subtask. The times needed to complete each Stroop
subtask served as dependent measures (Stroop I, Stroop II, and Stroop III, respectively). An
interference measure was calculated by subtracting the average time needed to complete the first
two subtasks from the time needed to complete the third subtask (Interference = Stroop III –
[(Stroop I + Stroop II) / 2]; Valentijn et al., 2005). The examiners did not point out errors made
during the test. Although many participants spontaneously corrected themselves when they noticed
an error (which requires a certain amount of time, so that the Stroop I, Stroop II, and Stroop III
scores were, to some extent, indirectly corrected for poor accuracy), this was not always the case.
Therefore, the number of errors that were not self-corrected was also recorded for each Stroop
subtask (Error I, Error II, and Error III, respectively).

The data were collected by five examiners who had been intensively trained in test administration.
Regular training sessions were scheduled to ensure uniform administration of the Stroop.

Data Analysis

We first established which demographic variables were and were not predictive for the different
Stroop scores so that the normative data could be appropriately corrected for these variables.
Thus, the Stroop scores were regressed on age, age², sex, level of education, and all two-way
interactions. Age was centered (calendar age – 50) before computing the quadratic terms and
interactions to avoid multicollinearity (Marquardt, 1980). Sex was dummy coded with male = 1 and
female = 0. LE was dummy coded with

two dummies (LE low and LE high), with LE average as the reference category. The full models were
then reduced in a stepwise way by eliminating the least significant predictor if its two-tailed p
value was higher than .005 (a relatively small alpha was taken to avoid Type I errors). Note that a
predictor was never removed from the model as long as it was also included in a higher order term
in the model. The reason for this is that the p value of any predictor is arbitrary (depending on
the coding used for the predictors) if that predictor is part of any higher order predictor in the
model (Aiken & West, 1991). The dummies LE low and LE high were always either both in or both out
of the model because they belong together and represent the effect of the categorical predictor LE.
Similarly, their interactions with another predictor were always either both in or both out of the
model. The assumptions of regression analysis (homoscedasticity, normal distribution of the
residuals, absence of multicollinearity, and absence of influential cases) were tested for each
model. The normal distribution of the residuals was investigated by inspection of the normal
probability plots. The occurrence of multicollinearity was checked by calculating the Variance
Inflation Factors (VIFs), which should not exceed 10 (Belsley, Kuh, & Welsch, 1980). Cook’s
distances were calculated to identify possible influential cases. Homoscedasticity was evaluated by
grouping participants into quartiles of the predicted scores and applying the Levene test to the
residuals.

Next, normative data were calculated using the traditional and the regression-based approach. For
the traditional approach, Stroop normative statistics (M, SD) were determined based on the observed
data per relevant subgroup (as identified by the significant predictors in the regression models).
These statistics can be used to convert the raw scores into Z scores (Z = –[observed score – mean
score] / SD),¹ but this conversion is appropriate only if the raw scores are normally distributed.
If this was not the case, cumulative percentages for the observed values of the scores are provided.

Using the regression-based approach, the raw Stroop scores of a person are converted into
standardized residuals in three steps. First, the predicted scores of a person are calculated;
second, the residuals are calculated (e_i = –[observed score – predicted score]);² and third, the
residuals are standardized (Z_i = e_i / SD[residual]). These standardized residuals can be
interpreted via a Z distribution table with cumulative probabilities, but this interpretation is
appropriate only if the distribution of the standardized residuals of the sample is normal. More
details regarding the regression-based normative method can be found elsewhere (Van Breukelen &
Vlaeyen, in press).

All analyses were performed using the SPSS 11.5 for Windows software package.

RESULTS

The final models are presented in Table 2. A model for the Error I score is not presented because
none of the independent variables contributed to the prediction of this score. No significant
influence of outliers was observed (maximum Cook’s distance = .066). The VIFs of the predictors
were at most 2.567, well below the cutoff value. Table 3 shows the zero-order correlations between
the raw Stroop scores and the demographic variables used in the regression models. These regression
models (see Table 2) showed that the Stroop I, Stroop II, Stroop III, and interference scores were
significantly affected by linear and quadratic age components. Females had better Stroop II, Stroop
III, and interference scores than did males, but no sex differences were found for the Stroop I and
the error scores. Participants with a lower educational attainment scored worse than their higher
educated counterparts on all the Stroop scores with the exception of the Error I score. The Error
II score was influenced by level of education only. Significant Age × LE low interactions were
found for the interference and Error III scores. The predicted interference and Error III scores
are shown as a function of age and LE for male participants in Figures 1 and 2 (for females, these
plots were identical for the Error III score and similar for the Interference score, with a
constant value subtracted from the predicted scores; see Table 2). As shown, the difference in the
predicted interference and Error III scores between participants with a low educational attainment
and participants with a higher educational attainment increased significantly as a function of age.

Traditional Normative Data

Table 4 provides the raw Stroop score descriptive statistics (M, SD) and sample sizes per relevant
subgroup (as identified by the regression models presented in Table 2; the continuous predictor age
was categorized to age groups of 25 ± 1 years, 30 ± 1 years . . . 80 ± 1 years). These statistics
can be used to transform the raw scores of a person into Z scores, but this transformation is only
appropriate if the raw scores per relevant subgroup are normally distributed. This was the case for
the Stroop I, Stroop II, Stroop III, and interference raw scores (all ps of Kolmogorov-Smirnov
Z > .005; see Table 4), but the Z-score transformation was not appropriate to evaluate a person’s
Error I, Error II, and Error III scores because of highly skewed distributions of the raw scores in
the majority of the subgroups. Thus, for the error scores, cumulative percentages representing the
proportion of people who scored at or above a certain raw score were derived from the observed
distribution in the sample (see the appendix, Table A1).
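
The interference measure and the predictor coding described in the Method section can be sketched
in a few lines of Python. This is only an illustrative sketch: the function names, argument names,
and string labels ("male", "low", and so on) are assumptions made for the example and are not part
of the original study, which used SPSS.

```python
def interference_score(stroop_1: float, stroop_2: float, stroop_3: float) -> float:
    """Interference = Stroop III - (Stroop I + Stroop II) / 2 (completion times)."""
    return stroop_3 - (stroop_1 + stroop_2) / 2


def code_predictors(age: float, sex: str, le: str) -> dict:
    """Code demographics as described in the Data Analysis section.

    The labels accepted for `sex` and `le` are hypothetical choices for this sketch.
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    if le not in ("low", "average", "high"):
        raise ValueError("le must be 'low', 'average', or 'high'")
    age_c = age - 50  # age is centered at 50 years
    return {
        "age": age_c,
        "age2": age_c ** 2,                   # quadratic term of the centered age
        "sex": 1 if sex == "male" else 0,     # male = 1, female = 0
        "le_low": 1 if le == "low" else 0,    # LE average is the reference category
        "le_high": 1 if le == "high" else 0,
    }


# A 40-year-old man with an average level of education who needed
# 40, 60, and 85 time units for the three subtasks.
print(interference_score(40, 60, 85))          # 35.0
print(code_predictors(40, "male", "average"))
```

Centering age before squaring keeps the linear and quadratic age terms from being strongly
correlated, which is the reason given in the text for subtracting 50.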

TABLE 2
Final Regression Models for the Stroop Scores

Measure        Variable              B       SE B            T    Standardized B     R²
Stroop I       Constant         41.517       .312    133.199**
               Age                .131       .011     12.150**         .275
               Age²               .003       .001      4.454**         .096
               LE low            3.595       .387      9.292**         .222
               LE high          –1.507       .430     –3.502**        –.081         .211
Stroop II      Constant         52.468       .468    112.150**
               Age                .209       .014     14.567**         .320
               Age²               .007       .001      7.633**         .159
               Sex               2.390       .439      5.442**         .112
               LE low            4.235       .514      8.237**         .191
               LE high          –2.346       .573     –4.092**        –.092         .261
Stroop III     Constant         82.601       .996     82.898**
               Age                .714       .031     23.387**         .447
               Age²               .023       .002     11.799**         .214
               Sex               4.470       .934      4.784**         .086
               LE low           13.285      1.093     12.153**         .245
               LE high          –3.873      1.221     –3.171*         –.062         .442
Interference   Constant         36.066       .829     43.509**
               Age                .500       .038     13.318**         .396
               Age²               .016       .002      9.810**         .193
               Sex               3.010       .776      3.880**         .073
               LE low            8.505       .949      8.961**         .198
               LE high          –2.092      1.033     –2.024          –.042
               Age × LE low       .167       .060      2.806*          .078
               Age × LE high     –.019       .063      –.305          –.007         .391
Error II       Constant           .325       .030     10.708**
               LE low             .135       .044      3.035*          .078
               LE high           –.110       .051     –2.168          –.056         .012
Error III      Constant           .382       .078      4.932**
               Age                .004       .004      1.024           .038
               Age²               .0007    < .001      4.134**         .100
               LE low             .476       .100      4.756**         .130
               LE high           –.054       .109      –.498          –.013
               Age × LE low       .019       .006      3.055*          .104
               Age × LE high      .005       .007       .796           .023         .068

NOTE: The full models included age, age², sex, LE low, LE high, and all two-way interactions as
predictors. LE = level of education. Coding of the predictors: Age = calendar age – 50;
Age² = (calendar age – 50)²; Sex: male = 1, female = 0; LE low: low education = 1, average or high
education = 0; LE high: high education = 1, low or average education = 0.
*p < .005. **p < .001.
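
As an illustration of how the Interference model in Table 2 can be turned into a standardized
residual, the following Python sketch applies the three steps of the regression-based approach. The
coefficients are copied from Table 2 and the SD (residual) values from Table 5; the function and
variable names are hypothetical, and the cumulative probability is computed with the standard
normal distribution via `math.erf`, which is an assumption about how the Z table lookup would be
reproduced in code.

```python
import math

# Unstandardized coefficients (B) of the Interference model, copied from Table 2.
B = {
    "constant": 36.066, "age": 0.500, "age2": 0.016, "sex": 3.010,
    "le_low": 8.505, "le_high": -2.092,
    "age_x_le_low": 0.167, "age_x_le_high": -0.019,
}

# SD(residual) per quartile of the predicted Interference score, copied from
# Table 5 as (upper bound of the predicted score, SD); the last band is open-ended.
SD_BANDS = [(34.845, 11.037), (41.636, 12.667), (54.849, 15.856), (math.inf, 22.472)]


def predicted_interference(age: float, male: bool, le: str) -> float:
    """Step 1: predicted score from the coded predictors (age centered at 50)."""
    age_c = age - 50
    sex = 1 if male else 0
    le_low = 1 if le == "low" else 0
    le_high = 1 if le == "high" else 0
    return (B["constant"] + B["age"] * age_c + B["age2"] * age_c ** 2
            + B["sex"] * sex + B["le_low"] * le_low + B["le_high"] * le_high
            + B["age_x_le_low"] * age_c * le_low
            + B["age_x_le_high"] * age_c * le_high)


def sd_residual(predicted: float) -> float:
    """Look up the SD(residual) band that contains the predicted score."""
    return next(sd for upper, sd in SD_BANDS if predicted <= upper)


def standardized_residual(observed: float, predicted: float) -> float:
    """Steps 2 and 3: e_i = -(observed - predicted); Z_i = e_i / SD(residual)."""
    return -(observed - predicted) / sd_residual(predicted)


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# 40-year-old man, average level of education, observed interference score of 35.
pred = predicted_interference(40, male=True, le="average")
z = standardized_residual(35, pred)
print(round(pred, 3), round(z, 3), round(normal_cdf(z), 2))  # 35.676 0.053 0.52
```

The sketch covers only the Interference score; the other speed-dependent scores would need their
own coefficients and SD bands from Tables 2 and 5.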

As an example of the traditional method, a 40-year-old man with an average level of education
scored 40 on Stroop I, 60 on Stroop II, and 85 on Stroop III. The interference score was 35
(= 85 – [(40 + 60) / 2]), which corresponds to a Z score of .22 (= [– (35 – 37.36)] / 10.55) and a
p value of .58, which can be considered normal. This person scored 0 on Error I, 1 on Error II, and
1 on Error III. The cumulative percentage tables in the appendix (Table A1) should be used to norm
these scores rather than Table 4 for the reason described. Thus, the person’s Error I score was
equal to or better than 100% of the Error I scores in the normative sample, the Error II score was
equal to or better than 21% of the Error II scores in the normative sample, and the Error III score
was equal to or better than 22.2% of the Error III scores in the normative sample. These scores can
all be considered normal.

Regression-Based Normative Data

The regression models, combined with the standard deviations of the residuals, provide normative
data: After calculation of a person’s predicted scores by means of the regression models in Table
2, the residuals of each score are calculated (e_i = –[observed score – predicted score]) and
standardized (Z_i = e_i / SD [residual]) using Table 5. The SD (residuals) were calculated per
quartile of the predicted scores to account for heteroscedasticity, because the Levene test
rejected the assumption that variance was homogeneous in all the models (all ps < .01). The normal
probability plots showed normally distributed standardized residuals for the Stroop I, Stroop II,
Stroop III, and interference models when these SD (residuals) as a function of predicted scores
were used. However, the use of SD (re-

TABLE 3
Zero-Order Correlations Between the Predictors Used in the
Regression Analyses and the Stroop Measures

               Age        Sex         LE Low      LE High    Stroop I   Stroop II  Stroop III  Interference  Error I    Error II   Error III
Age            1
Sex            0.008      1
LE low         0.313**    –0.098**ᵃ   1
LE high        –0.193**   0.128**ᵃ    –0.409**ᵃ   1
Stroop I       0.379**    0.017       0.343**     –0.223**   1
Stroop II      0.429**    0.085**     0.322**     –0.214**   0.756**    1
Stroop III     0.578**    0.061       0.407**     –0.232**   0.602**    0.739**    1
Interference   0.547**    0.051       0.368**     –0.197**   0.377**    0.532**    0.958**     1
Error I        0.044      –0.002      0.077**     –0.058     0.046      0.074*     0.078**     0.071*        1
Error II       0.084**    0.008       0.101**     –0.088**   0.072*     0.143**    0.137**     0.125**       0.112**    1
Error III      0.176**    0.026       0.194**     –0.094**   0.176**    0.188**    0.321**     0.323**       0.107**    0.167**    1

a. These correlations (between two dummy variables) correspond to φ coefficients (Hays & Winkler, 1971).
*p < .005. **p < .001.

FIGURE 1
Predicted Stroop Interference Scores for Male Participants by
Level of Education as a Function of Age

[Line graph. Y-axis: predicted Stroop Interference score (20 to 90). X-axis: age (25 to 80 years,
in 5-year steps). Lines: LE low, LE average, LE high.]

FIGURE 2
Predicted Stroop Error III Scores for Males and Females, by Level of Education as a Function of Age

[Line graph. Y-axis: predicted Stroop Error III score (0 to 2.5). X-axis: age (25 to 80 years,
in 5-year steps). Lines: LE low, LE average, LE high.]

siduals) as a function of predicted scores did not solve the problem of skewed Error II and Error
III standardized residual scores, which makes the regression-based normative approach inappropriate
for these scores. For this reason, there are no SD (residual) values for these scores presented in
Table 5.

As an example of the regression-based method, the interference score of 35 of the 40-year-old man
from the previous example is considered. The predicted interference score for this person was
35.676 [= 36.066 + (.500 * –10) + (.016 * [–10]²) + (3.010 * 1) + (8.505 * 0) + (–2.092 * 0) +
(.167 * 0) + (–.019 * 0)]. So the residual was .676 (= –[35 – 35.676]) and the standardized
residual was .053 (= .676 / 12.667). This standardized residual corresponds to an exact p value of
.52 and can be considered normal. Because the steps that are required in this regression-based
approach require some active calculation from the user of the normative data, we also calculated
simplified normative tables to increase the user-friendliness of the regression-based norms (see
Tables B1 to B4 in the appendix). If an individual is not exactly 25, 30, . . . , 80 years old,
then the person’s age should be rounded up to the closest age given. If the performance of such an
individual is borderline normal (e.g., Z value ≈ –1.64), the exact Z values can be determined using
Tables 2 and 5.

DISCUSSION

In this study, we first determined which variables affect Stroop performance, i.e., age, sex, LE,
and all possible two-way interactions between these predictors.

All the speed-dependent Stroop test scores (Stroop I, Stroop II, Stroop III, and interference) were
profoundly affected by linear and quadratic age effects and LE (see Table 2). Although MacLeod
(1991) suggested that sex only has a minor influence on Stroop test performance at any age, we
found clear sex differences on the Stroop II, Stroop III, and the interference scores, with women
outperforming men over the entire age range studied (there were no Sex × Age interactions). With
respect to the accuracy measures (Error I, Error II, and Error III), the influence of the
demographic variables was less profound: The Error I score was not affected by any of the
demographic
TABLE 4
Descriptive Statistics for the Stroop Measures Stratified by Their Relevant Predictors
Age Group
Measure Sex LE All 25 ± 1 30 ± 1 35 ± 1 40 ± 1 45 ± 1 50 ± 1 55 ± 1 60 ± 1 65 ± 1 70 ± 1 75 ± 1 80 ± 1

Stroop I Males and females Low


M 43.23 43.61 44.47 45.94 43.30 44.82 45.43 48.61 47.64 47.16 51.52 51.46
SD 6.03 6.78 6.71 9.97 7.19 6.66 7.14 8.91 7.80 7.14 8.91 7.54
n 21 22 33 37 49 52 81 73 83 80 80 38
Average
M 39.49 41.22 38.59 40.78 41.29 42.68 42.76 42.13 44.51 44.95 47.91 45.78
SD 4.76 6.20 5.28 5.55 6.94 6.98 6.46 6.85 7.55 7.07 7.40 5.16
n 77 79 77 72 73 71 49 65 50 57 53 9
High
M 38.61 39.56 38.96 38.16 39.48 40.47 38.59 43.50 42.92 44.04 44.00 47.98
SD 6.44 6.25 5.57 4.77 7.09 6.90 5.47 5.88 7.64 7.67 4.67 8.02
n 61 53 47 46 40 37 27 19 20 18 24 12
Stroop II Males Low
M 61.37 56.90 57.38 59.48 58.91 57.33 59.97 62.05 63.65 63.77 69.81 71.97
SD 5.24 7.35 10.05 8.46 8.45 9.41 12.07 11.24 9.69 10.66 14.17 12.80
n 8 9 22 21 23 17 33 31 32 35 32 20
Average
M 55.36 54.16 50.96 51.35 54.34 54.94 58.59 57.61 62.74 61.83 65.83 63.85
SD 7.46 8.11 6.49 8.97 10.35 9.43 7.92 8.51 14.06 10.36 10.82 7.73
n 37 41 25 24 32 43 27 38 29 34 27 4
High
M 50.36 51.55 50.49 51.73 51.37 53.77 52.10 56.77 56.61 60.07 59.36 60.14
SD 7.43 7.03 7.92 7.45 8.23 8.45 7.94 5.67 9.72 8.63 8.70 5.74
n 33 29 31 32 23 19 21 12 15 12 18 6

Females Low
M 58.70 54.45 57.67 57.20 51.70 56.14 57.74 62.50 61.36 60.80 67.90 71.38
SD 8.47 7.74 10.06 13.84 7.58 9.19 8.57 8.82 10.11 8.13 11.30 11.82
n 13 13 11 16 26 35 48 42 51 45 48 18
Average
M 49.68 51.17 48.84 52.13 51.53 52.39 56.34 53.89 57.08 56.72 63.80 65.71
SD 8.27 6.31 6.84 8.30 7.84 7.23 7.10 7.52 8.44 10.01 8.43 11.06
n 40 38 52 48 41 28 22 27 21 23 26 5
High
M 49.66 49.41 50.86 44.91 49.88 55.69 48.62 56.55 58.41 54.58 56.34 61.96
SD 8.72 9.47 4.52 5.62 6.51 14.29 11.04 7.71 10.75 11.80 9.02 7.63
n 28 24 16 14 17 18 6 7 5 6 6 6


Stroop III Males Low


M 99.03 86.97 89.03 94.84 106.02 106.18 108.51 105.32 116.99 121.11 137.33 150.96
SD 9.48 12.02 15.63 18.36 22.69 22.94 25.61 19.61 30.14 32.68 26.73 27.85
n 8 9 22 21 23 17 33 31 32 35 32 20
Average
M 84.44 84.22 78.57 82.80 86.74 88.07 95.94 93.78 100.81 106.64 117.64 129.68
SD 14.24 15.15 12.73 13.12 20.62 20.47 15.20 16.26 20.07 23.76 21.31 27.30
n 37 41 25 24 32 43 27 38 29 34 27 4
High
M 73.92 81.85 77.74 82.86 83.26 81.83 83.81 94.40 99.20 102.40 109.46 114.57
SD 11.47 13.66 15.23 17.23 16.94 12.13 17.38 19.48 18.90 15.11 24.25 16.91
n 33 29 31 32 23 19 21 12 15 12 18 6

Females Low
M 92.55 89.16 90.32 91.13 82.68 91.90 100.17 111.74 109.83 112.19 131.97 150.17
SD 16.74 17.67 13.99 24.64 15.74 13.43 18.32 21.16 22.15 20.24 30.48 36.24
n 13 13 11 16 26 35 48 42 51 45 48 18
Average
M 73.93 76.85 77.08 80.32 83.82 83.22 95.42 88.01 97.75 106.03 114.55 126.22
SD 11.00 12.66 14.15 11.75 13.89 17.12 14.58 13.77 25.37 26.45 19.39 21.86
n 40 38 52 48 41 28 22 27 21 23 26 5
High
M 75.75 74.52 80.68 69.63 74.76 86.23 79.72 88.11 91.76 95.68 108.54 125.61
SD 13.50 17.37 11.20 10.52 9.17 15.50 18.72 11.90 19.55 16.26 29.94 33.28
n 28 24 16 14 17 18 6 7 5 6 6 6
Interference Males Low
M 47.14 36.29 38.75 43.03 53.65 54.90 55.87 49.55 60.70 65.33 76.36 88.95
SD 8.34 8.53 12.86 16.59 18.59 19.37 18.49 17.17 25.11 28.49 22.91 29.22
n 8 9 22 21 23 17 33 31 32 35 32 20
Average
M 36.16 36.17 33.06 37.36 38.85 39.35 45.02 43.12 46.88 52.73 60.92 74.44
SD 11.19 10.95 9.56 10.55 14.94 15.94 14.58 13.50 14.03 18.94 20.06 24.52
n 37 41 25 24 32 43 27 38 29 34 27 4
High
M 29.52 36.47 33.33 37.68 38.01 34.04 38.77 44.35 49.86 49.95 57.90 61.24
SD 7.52 10.97 11.56 15.30 13.00 11.58 13.77 17.72 13.55 12.73 19.60 17.55
n 33 29 31 32 23 19 21 12 15 12 18 6
Females Low
M 41.33 40.42 37.97 38.55 36.30 41.52 49.09 56.51 55.73 58.59 72.47 89.06
SD 15.42 14.80 10.05 16.35 10.94 10.29 14.79 18.42 18.72 18.34 26.45 29.11
n 13 13 11 16 26 35 48 42 51 45 48 18
Average
M 30.13 31.04 33.72 33.76 37.48 35.56 46.16 41.10 47.07 55.91 58.54 70.81
SD 7.43 10.42 10.64 10.95 10.73 14.34 12.22 11.19 23.10 20.89 16.41 14.78
n 40 38 52 48 41 28 22 27 21 23 26 5
High
M 31.34 29.81 35.15 28.64 29.84 38.85 35.03 37.94 40.88 47.16 58.03 69.91
SD 9.82 12.95 10.29 8.56 9.60 11.53 11.81 7.35 7.85 7.68 25.88 27.93
n 28 24 16 14 17 18 6 7 5 6 6 6
Error I Males and females All LEs
M 0.21ᵃ
SD 0.593
n 1,788
Error II Males and females Low
M 0.46ᵃ
SD 0.94
n 649
Average
M 0.33ᵃ
SD 0.81
n 732
High
M 0.22ᵃ
SD 0.61
n 404
Error III Males and females Low
a a a a a a a a a
M 1.14 0.50 0.82 0.57 0.86 1.46 0.80 0.89 1.01 1.15 2.63 2.45
SD 1.65 0.67 1.18 1.28 1.61 2.04 1.44 1.45 1.87 2.03 4.46 3.45
n 21 22 33 37 49 52 80 73 83 80 79 38
Average
a a a a a a a a a a a
M 0.52 0.53 0.45 0.39 0.68 0.49 0.55 0.54 0.96 0.47 0.60 1.33
SD 0.84 1.02 0.91 0.85 1.26 0.89 1.50 0.99 2.37 0.89 1.10 1.58
n 77 79 77 72 73 71 49 65 49 57 53 9
High
a a a a a a a a a
M 0.33 0.60 0.32 0.57 0.35 0.27 0.30 0.26 0.42 0.72 1.33 0.92
SD 0.89 1.81 0.69 1.26 0.80 0.56 0.78 0.73 0.84 1.32 2.55 2.02
n 61 53 47 46 40 37 27 19 19 18 24 12

NOTE: LE = level of education. The normative statistics for the scores that are stratified by LE were based on a total sample of 1,785 instead of 1,788 because data on LE were missing from 3 participants. Data on
Error III scores were missing from 4 participants.
a. The mean and standard deviation for this subgroup cannot be used to convert the raw scores into Z scores because the raw scores in this subgroup were not normally distributed (p value of the Kolmogorov-
Smirnov Z < .005).
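
The Z-score conversion that these traditional norms support can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name and the `normally_distributed` flag are hypothetical, the example mean and SD are simply copied from one cell of Table 4, and the conventional formula is used without reversing the sign, so a positive Z here indicates a higher (worse) Stroop score than the subgroup mean.

```python
def traditional_z(raw_score: float, subgroup_mean: float, subgroup_sd: float,
                  normally_distributed: bool = True) -> float:
    """Convert a raw score to a Z score using a subgroup's M and SD.

    Set `normally_distributed` to False for subgroups flagged with
    footnote a, whose M and SD cannot be used for the Z conversion.
    """
    if not normally_distributed:
        raise ValueError("footnote a: M and SD cannot be used for Z conversion")
    if subgroup_sd <= 0:
        raise ValueError("SD must be positive")
    return (raw_score - subgroup_mean) / subgroup_sd


# M = 41.33 and SD = 15.42 are copied from one Table 4 cell;
# the raw score of 60 is made up for illustration.
z = traditional_z(60.0, 41.33, 15.42)
```

With these example values the result is roughly 1.21, that is, about 1.2 standard deviations above the subgroup mean.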

TABLE 5
Standard Deviations of Residuals for the Stroop I, Stroop II, Stroop III, and Interference Scores

Score          Predicted Score                  SD (Residual)
Stroop I       ≤ 40.209                         5.961
               Between 40.210 and 43.353        6.400
               Between 43.354 and 46.059        7.217
               ≥ 46.060                         7.921
Stroop II      ≤ 51.661                         7.988
               Between 51.662 and 55.861        8.459
               Between 55.862 and 60.713        9.419
               ≥ 60.714                         10.587
Stroop III     ≤ 79.988                         13.963
               Between 79.989 and 92.862        16.367
               Between 92.863 and 108.585       19.506
               ≥ 108.586                        25.936
Interference   ≤ 34.845                         11.037
               Between 34.846 and 41.636        12.667
               Between 41.637 and 54.849        15.856
               ≥ 54.850                         22.472

variables, and although the Error II and Error III scores were affected to some extent by demographic variables, the proportion of explained variance of the models predicting these scores was low (R² < .07).

The impact of a low LE as compared with an average LE was larger than the impact of a high LE on all Stroop measures, with the exception of the Error I score (see B values, Table 2). In addition to the main effect of LE on all the Stroop scores, there were significant Age × LE Low interactions on the interference and Error III scores. In other words, the influence of educational attainment was independent of the influence of age on the Stroop I, II, III, and Error II scores, but the difference in performance between the low and average/high-educated participants increased as a function of age on the interference and Error III scores. Because the interference score is generally regarded as a measure of executive functioning, this interaction suggests that executive functioning declines with age and that the magnitude of decline is influenced by a person’s educational attainment. This result is consistent with the cognitive reserve hypothesis, which claims that certain environmental factors make individuals less vulnerable to age-related cognitive decline and pathological brain processes (Stern et al., 2003). Examples of such factors are education (Dufouil, Alpérovitch, & Tzourio, 2003; Le Carret et al., 2003), mental stimulation at work (Bosma, Van Boxtel, Ponds, Houx, & Jolles, 2002), and leisure activity (Scarmeas, Levy, Tang, Manly, & Stern, 2001).

We derived normative Stroop data using a traditional and a regression-based approach because both methods have their advantages and disadvantages. The best approach for establishing normative data can only be determined by comparing the normative data obtained with the two methods with the “true” situation (when all scores of the entire population are known, e.g., by using a simulation study). The population distribution of scores is never known in normative studies; but based on the results of the present study, some considerations can be given regarding the issue of regression-based versus traditional norms.

With the traditional method, raw scores are often converted into Z scores to evaluate a person’s performance. This method is appropriate if the mean and standard deviation are accurately estimated (i.e., the M and SD of the sample approximate the true population M and SD) and if the raw test scores are normally distributed. A problem that is intrinsically related to the traditional approach is that calculation of these statistics usually requires the total sample to be divided into subgroups. Indeed, performance on many (neuro)psychological tests is influenced by independent variables such as age, sex, and education (Mitrushina, Boone, & D’Elia, 1999). As a result, the total sample has to be subdivided to provide norms corrected for these variables, which dramatically decreases the sample size on which the normative statistics are calculated. For example, splitting a sample by age group (12 levels), sex (2 levels), and level of education (3 levels) reduces the average sample size per subgroup to 1/72 of the total sample size. Consequently, the normative statistics are less accurate when estimated for subgroups than for the total sample (Van Breukelen & Vlaeyen, in press). The problem of dividing a sample into subgroups is especially pronounced if the subgroups are not balanced in size. This is usually the case in normative studies involving randomly selected participants from a broad age range. Indeed, if there is adequate random sampling, certain population trends should be reflected in the sample, that is, the decrease in level of education as a function of increasing age (many elderly people educated before 1960 have a low level of educational attainment because they left school early or did not go to a university or other institutes of higher education; Jolles et al., 1995). As a result, the sample sizes for certain subgroups are unusually small (e.g., for high-educated elderly and low-educated younger people), which yields very unreliable normative data for these subgroups, with the mean scores and standard deviations moving up, down, and up again across age groups due to chance trends (see Table 4). A possible solution for this problem is the use of broader age categories to increase the number of participants per subgroup, but then there is a problem with the boundary value. For example, if the age groups are as broad as 24 to 36 years, 39 to 51 years, and so on, the scores of people aged 36 and 39 (differing only 3 years) would be evaluated against different normative data, whereas the scores of people aged 24 to 36 (differing
12 years) would be evaluated against the same normative data, which is not acceptable (Capitani, 1997).

In the regression-based approach, the problem of unreliable normative data for certain subgroups, which arises when the sample is subdivided, does not occur. Indeed, regression-based norms provide more accurate estimates of population statistics because they are based on equations that are derived using the data for all demographic groups (Van Breukelen & Vlaeyen, in press; Zachary & Gorsuch, 1985). In fact, normative data can even be provided for people with certain demographic characteristics that were not in the sample. For example, in the present study, discontinuous age groups were used (24-26 years, 29-31 years, 34-36 years, etc.), but normative data for nontested people aged 28 years could be determined by using the regression model and the SD (residual) derived from the scores of the tested people. Also, the problem of unbalanced data does not occur in the regression-based approach because any imbalance in the sample does not bias the estimation or testing of the regression weights but only causes some loss of statistical power (because the variance of a regression weight is proportional to the variance inflation factor [VIF] of the predictor at hand; see, e.g., Kleinbaum, Kupper, Muller, & Nizam, 1998). In addition, the regression-based normative approach offers some interesting possibilities that cannot be achieved by using a traditional approach. For example, as an alternative for the interference score, a regression model can be constructed that predicts the Stroop III score based on the Stroop I and Stroop II scores in addition to the relevant demographic variables. Such an approach avoids the use of difference scores such as the interference score, which may reduce the problems that are associated with such scores, that is, their typically low reliability. We did not provide such measures here, because normative tables for each relevant combination of predictors (Stroop I score, Stroop II score, age, sex, and/or LE) would be too numerous. Thus, for such norms to be used, the three-step procedure to convert raw scores into standardized residuals would be required, which is not very user-friendly. A solution is to use a computer-based algorithm that performs these calculations automatically (which can easily be done with standard programs such as Microsoft Excel) or to use categorized values instead of continuous values.

Although the regression-based approach has some significant advantages over the traditional method, it requires some distributional assumptions that are not always satisfied. Indeed, in our study, the regression-based method was appropriate to derive norms for the speed-dependent scores (Stroop I, Stroop II, Stroop III, and interference scores), but there were problems with the accuracy measures (error scores). An important difference between the two types of scores is that the speed-dependent scores are continuous variables, whereas the accuracy measures are discrete. In general, linear regression analysis is primarily suitable to analyze continuous dependent variables, although in most cases discrete dependent variables can also be analyzed with regression analysis. For example, the regression-based normative approach was found to be appropriate in previous normative research with the Verbal Learning Test of Rey (Van der Elst, Van Boxtel, Van Breukelen, & Jolles, 2005), a test for declarative memory of which most of the scores are discrete variables (with a range of the scores of 0 to 15). The Stroop error scores are also discrete variables that cover about the same range of scores, but a problem with the error scores is that they are confined to a small part of the possible range of the scores and that they are highly skewed. When such data are analyzed with regression analysis, the assumption of normally distributed residuals is usually violated (Fox, 1997). This assumption is arbitrary for most purposes (because regression analysis is robust against violations of the normality of the residuals assumption; Fox, 1997), but in the context of the regression-based normative approach, this assumption is important because the standardized residuals can only be interpreted via a Z distribution table if they are normally distributed. However, heavily skewed scores that are confined to a small part of the possible range of scores can be normed by using a method similar to the regression-based approach, that is, by dichotomizing the scores and analyzing them by logistic regression (Fox, 1997), but this method is more complex and beyond the scope of this article. Rather than using logistic regression, we used cumulative percentages (see the appendix; Table A1) to evaluate the error scores of a testee. The use of cumulative percentages avoids the problem of non-normally distributed raw scores (normality is required to use the Z-score transformation in the traditional approach), but the problems associated with the division of the sample into subgroups remain. Indeed, for the Error III score, the sample had to be divided into many subgroups because of significant effects of age and LE on this score, which means that the cumulative percentages of the Error III scores for certain subgroups are derived from the data of small samples (see the ns in Table 4). Thus, caution is needed when using the normative cumulative percentages for the Error III score.

Although the normative data presented here are for Dutch-speaking participants, the Stroop test has been found to be culturally robust in a large-scale study that involved seven different languages (English, Danish, Dutch, French, German, Greek, and Spanish; Møller et al., 1998). This suggests that the normative data can be used for people living in different cultural areas or for people who speak other languages, but more research is needed. Houx et al. (2002) showed that the shortened Stroop test (with 40
items per subtest instead of 100) was a reliable test (test/retest correlation with 2-week test interval of .80 for the 40-item Stroop III score) in a large sample of elderly participants in Great Britain, Ireland, and the Netherlands (N = 5,804; age range = 70-82 years). The Pearson correlations varied by less than .05 across the three countries, which suggests that the psychometric properties of the Stroop test are similar across countries and languages. As a final remark, it should be kept in mind that we used the Hammes (1973) version of the Stroop test. Although the Stroop interference effect is a robust effect that has been observed with numerous different versions of the Stroop test, more research is needed regarding the comparability and interchangeability of the different Stroop test versions.
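
The computer-based calculation suggested in the discussion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' software: the predicted score must come from the regression model for the relevant Stroop measure (its coefficients are not reproduced here, so `predicted` is supplied by the caller), the SD (residual) values are copied from Table 5, and all function and variable names are hypothetical. Following Note 1, the sign of the residual is reversed so that a positive Z indicates a better performance than expected.

```python
INF = float("inf")

# SD (residual) bands from Table 5 as (upper bound of predicted score, SD).
# The last band of each measure has no upper bound.
SD_RESIDUAL_BANDS = {
    "stroop_1": [(40.209, 5.961), (43.353, 6.400), (46.059, 7.217), (INF, 7.921)],
    "stroop_2": [(51.661, 7.988), (55.861, 8.459), (60.713, 9.419), (INF, 10.587)],
    "stroop_3": [(79.988, 13.963), (92.862, 16.367), (108.585, 19.506), (INF, 25.936)],
    "interference": [(34.845, 11.037), (41.636, 12.667), (54.849, 15.856), (INF, 22.472)],
}


def sd_residual(measure: str, predicted: float) -> float:
    """Return the Table 5 SD (residual) for the band that contains `predicted`."""
    for upper_bound, sd in SD_RESIDUAL_BANDS[measure]:
        if predicted <= upper_bound:
            return sd
    raise ValueError("predicted score is not a number")


def standardized_residual(measure: str, observed: float, predicted: float) -> float:
    """Z = -(observed - predicted) / SD (residual); sign reversed as in Note 1."""
    return -(observed - predicted) / sd_residual(measure, predicted)
```

For instance, with a predicted Stroop I score of 43.7, an observed score of 50.0 falls in the band with SD (residual) = 7.217 and gives a Z of about -0.87, that is, a somewhat worse performance than expected.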

APPENDIX A1
Cumulative Percentages for the Raw Error I, Error II, and Error III Scores That Represent the Proportion of Individuals Who Score At or Above 0, 1, 2, 3, 4, and 5, Stratified by Their Relevant Predictors

Error I (male and female, all LEs, all age groups)
Raw score   Cumulative %
5           0.2
4           0.6
3           1.4
2           3.8
1           14.7
0           100.0

Error II (male and female, all age groups)
Raw score   LE low   LE average   LE high
5           0.6      0.5          0.2
4           1.5      0.8          1.5
3           4.0      2.2          1.5
2           11.6     7.1          5.4
1           27.2     21.0         14.1
0           100.0    100.0        100.0

Error III, LE low (male and female; columns are age groups in years)
Raw score 25±1 30±1 35±1 40±1 45±1 50±1 55±1 60±1 65±1 70±1 75±1 80±1
5 4.8 0.0 0.0 2.7 2.0 9.6 1.3 4.1 6.0 7.5 17.7 18.4
4 9.5 0.0 3.0 5.4 4.1 17.3 3.8 4.1 9.6 11.3 22.8 21.1
3 19.0 0.0 12.1 8.1 12.2 25.0 11.3 8.2 14.5 15.0 27.8 28.9
2 28.6 9.1 27.3 10.8 22.4 34.6 22.5 24.7 24.1 27.5 38.0 39.5
1 47.6 40.9 39.4 27.0 36.7 48.1 36.3 42.5 37.3 42.5 55.7 63.2
0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0

Error III, LE average (male and female; columns are age groups in years)
Raw score 25±1 30±1 35±1 40±1 45±1 50±1 55±1 60±1 65±1 70±1 75±1 80±1
5 0.0 0.0 0.0 0.0 1.4 0.0 4.1 1.5 2.0 0.0 1.9 0.0
4 0.0 2.5 2.6 1.4 4.1 1.4 4.1 1.5 2.0 1.8 3.8 22.0
3 5.2 8.9 5.2 4.2 13.7 5.6 8.2 3.1 8.2 5.3 5.7 22.0
2 11.7 13.9 10.4 11.1 21.9 11.3 16.3 10.8 18.4 10.5 17.0 22.0
1 35.1 27.8 27.3 22.2 27.4 31.0 16.3 35.4 42.9 29.8 32.1 66.7
0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0

Error III, LE high (male and female; columns are age groups in years)
Raw score 25±1 30±1 35±1 40±1 45±1 50±1 55±1 60±1 65±1 70±1 75±1 80±1
5 1.6 3.8 0.0 4.3 0.0 0.0 0.0 0.0 0.0 5.6 12.5 8.3
4 1.6 3.8 0.0 6.5 0.0 0.0 0.0 0.0 0.0 5.6 12.5 8.3
3 3.3 5.7 2.1 8.7 5.0 0.0 3.7 5.3 0.0 5.6 12.5 8.3
2 9.8 5.7 8.5 10.9 10.0 5.4 11.1 5.3 21.1 22.2 25.0 16.7
1 16.4 28.3 21.3 26.1 20.0 21.6 14.8 15.8 21.1 33.3 37.5 33.3
0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0

NOTE: LE = level of education.
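
A lookup against these cumulative percentages can be sketched as follows. The dictionary holds the Error I column of this appendix; the function name is hypothetical, and treating scores above 5 as at most the percentage listed for 5 is an assumption of this sketch, since the table stops at a raw score of 5.

```python
# Error I cumulative percentages from Appendix A1 (all LEs, all age groups):
# percentage of the normative sample scoring at or above each raw error score.
ERROR_I_CUM_PCT = {0: 100.0, 1: 14.7, 2: 3.8, 3: 1.4, 4: 0.6, 5: 0.2}


def percent_at_or_above(raw_errors: int, table: dict = ERROR_I_CUM_PCT) -> float:
    """Return the percentage of the norm group with at least `raw_errors` errors.

    For raw scores above the highest tabulated score, the value for the
    highest tabulated score is returned as an upper bound (assumption).
    """
    if raw_errors < 0:
        raise ValueError("error score cannot be negative")
    return table[min(raw_errors, max(table))]
```

Under these norms, a testee who makes 2 errors on the first subtask is matched or exceeded by only 3.8% of the normative sample.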



APPENDIX B1
Regression-based normative data for the Stroop I score stratified by Age (25, 30, ..., 80 years) and Level of Education. The
raw score leading to a particular Z-value is given for Z-values indicating the percentiles 5, 10, 20, 50, 80, 90 and 95

MALE & FEMALE

LE low (columns are ages in years)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 31.9 31.9 32.0 32.3 32.7 33.3 34.0 33.7 34.8 35.9 37.3 38.8
1.28 .90 34.5 34.5 34.6 34.9 35.3 35.9 36.6 36.6 37.6 38.8 40.1 41.6
0.84 .80 37.6 37.6 37.8 38.0 38.5 39.0 39.8 40.1 41.1 42.3 43.6 45.1
0 .50 43.7 43.7 43.8 44.1 44.5 45.1 45.8 46.7 47.8 48.9 50.3 51.7
-0.84 .20 49.8 49.8 49.9 50.2 50.6 51.2 51.9 53.4 54.4 55.6 56.9 58.4
-1.28 .10 52.9 52.9 53.1 53.3 53.8 54.3 55.1 56.9 57.9 59.1 60.4 61.9
-1.64 .05 55.5 55.5 55.7 55.9 56.4 56.9 57.7 59.7 60.7 61.9 63.3 64.7

LE average (columns are ages in years)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 30.3 30.3 29.7 30.0 30.4 31.0 31.8 32.6 32.3 33.5 33.7 35.2
1.28 .90 32.5 32.5 32.0 32.3 32.7 33.3 34.1 34.9 34.9 36.1 36.5 38.0
0.84 .80 35.1 35.1 34.9 35.1 35.6 36.1 36.9 37.8 38.1 39.3 40.0 41.5
0 .50 40.1 40.1 40.2 40.5 40.9 41.5 42.2 43.1 44.2 45.3 46.7 48.1
-0.84 .20 45.1 45.1 45.6 45.9 46.3 46.9 47.6 48.5 50.2 51.4 53.3 54.8
-1.28 .10 47.7 47.7 48.4 48.7 49.1 49.7 50.4 51.3 53.4 54.6 56.8 58.3
-1.64 .05 49.9 49.9 50.7 51.0 51.4 52.0 52.7 53.6 56.0 57.2 59.7 61.1

LE high (columns are ages in years)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 28.8 28.8 28.9 29.2 29.7 30.2 30.2 31.1 32.2 32.0 33.3 33.6
1.28 .90 31.0 31.0 31.1 31.4 31.8 32.4 32.5 33.4 34.5 34.6 35.9 36.5
0.84 .80 33.6 33.6 33.7 34.0 34.4 35.0 35.4 36.2 37.3 37.8 39.1 40.0
0 .50 38.6 38.6 38.7 39.0 39.4 40.0 40.7 41.6 42.7 43.8 45.2 46.6
-0.84 .20 43.6 43.6 43.7 44.0 44.4 45.0 46.1 47.0 48.0 49.9 51.2 53.3
-1.28 .10 46.2 46.2 46.4 46.6 47.1 47.6 48.9 49.8 50.8 53.1 54.4 56.8
-1.64 .05 48.4 48.4 48.5 48.8 49.2 49.8 51.2 52.1 53.1 55.7 57.0 59.6

NOTE: LE = level of education; Cum. prob. = cumulative probability.
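
As a check on how the tabulated raw scores relate to the regression-based procedure, the LE low, age 25 column can be reproduced from its median entry and Table 5. The worked example assumes that the Z = 0 entry (43.7) is the predicted score, that the matching Table 5 band (predicted Stroop I score between 43.354 and 46.059) supplies the SD (residual) of 7.217, and that the sign convention of Note 1 applies; the symbols are introduced here for illustration only.

```latex
\begin{align*}
x_Z &= \hat{y} - Z \cdot SD_{\text{residual}}, \qquad \hat{y} = 43.7,\quad SD_{\text{residual}} = 7.217 \\
x_{1.64} &= 43.7 - 1.64 \times 7.217 = 31.864 \approx 31.9 \\
x_{0.84} &= 43.7 - 0.84 \times 7.217 = 37.638 \approx 37.6 \\
x_{-1.64} &= 43.7 + 1.64 \times 7.217 = 55.536 \approx 55.5
\end{align*}
```

These values agree with the corresponding entries in the table (31.9, 37.6, and 55.5).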

APPENDIX B2
Regression-based normative data for the Stroop II score stratified by Age (25, 30, ..., 80 years), Sex, and Level of Education.
The raw score leading to a particular Z-value is given for Z-values indicating the percentiles 5, 10, 20, 50, 80, 90 and 95

LE low (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 42.8 42.3 42.1 42.3 42.8 43.6 44.9 44.5 46.4 48.7 51.3 54.3 42.0 41.5 41.3 41.4 42.0 41.3 42.5 44.0 44.1 46.3 48.9 51.9
1.28 .90 46.2 45.7 45.5 45.6 46.2 47.0 48.3 48.3 50.3 52.5 55.1 58.1 45.0 44.5 44.3 44.5 45.0 44.6 45.9 47.4 47.9 50.1 52.8 55.7
0.84 .80 50.3 49.8 49.6 49.8 50.3 51.2 52.4 53.0 54.9 57.2 59.8 62.8 48.7 48.2 48.0 48.2 48.7 48.8 50.0 51.6 52.5 54.8 57.4 60.4
0 .50 58.2 57.7 57.5 57.7 58.2 59.1 60.3 61.9 63.8 66.1 68.7 71.7 55.9 55.3 55.1 55.3 55.8 56.7 57.9 59.5 61.4 63.7 66.3 69.3
-0.84 .20 66.2 65.6 65.4 65.6 66.1 67.0 68.2 70.8 72.7 75.0 77.6 80.6 63.0 62.4 62.2 62.4 62.9 64.6 65.8 67.4 70.3 72.6 75.2 78.2
-1.28 .10 70.3 69.8 69.6 69.8 70.3 71.1 72.4 75.4 77.4 79.6 82.2 85.2 66.7 66.2 66.0 66.1 66.7 68.8 70.0 71.5 75.0 77.2 79.9 82.8
-1.64 .05 73.7 73.2 73.0 73.2 73.7 74.5 75.8 79.2 81.2 83.4 86.1 89.0 69.7 69.2 69.0 69.2 69.7 72.2 73.4 74.9 78.8 81.0 83.7 86.6

LE average (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 40.1 39.6 39.4 39.6 40.1 41.0 40.6 42.2 44.1 44.5 47.1 50.1 38.5 38.0 37.8 38.0 38.5 38.6 39.8 41.4 41.7 44.0 44.7 47.7
1.28 .90 43.2 42.7 42.5 42.6 43.2 44.0 44.0 45.6 47.5 48.3 50.9 53.9 41.4 40.9 40.7 40.9 41.4 41.6 42.9 44.4 45.1 47.4 48.5 51.5
0.84 .80 46.9 46.4 46.2 46.4 46.9 47.8 48.2 49.7 51.7 52.9 55.6 58.5 44.9 44.4 44.2 44.4 44.9 45.4 46.6 48.2 49.3 51.5 53.2 56.1
0 .50 54.0 53.5 53.3 53.5 54.0 54.9 56.1 57.6 59.6 61.8 64.5 67.4 51.6 51.1 50.9 51.1 51.6 52.5 53.7 55.3 57.2 59.4 62.1 65.0
-0.84 .20 61.1 60.6 60.4 60.6 61.1 62.0 64.0 65.6 67.5 70.7 73.4 76.3 58.3 57.8 57.6 57.8 58.3 59.6 60.8 62.4 65.1 67.4 71.0 73.9
-1.28 .10 64.8 64.3 64.1 64.3 64.8 65.7 68.1 69.7 71.6 75.4 78.0 81.0 61.8 61.3 61.1 61.3 61.8 63.3 64.5 66.1 69.2 71.5 75.6 78.6
-1.64 .05 67.9 67.4 67.2 67.3 67.9 68.7 71.5 73.1 75.0 79.2 81.8 84.8 64.7 64.2 64.0 64.2 64.7 66.3 67.6 69.1 72.6 74.9 79.4 82.4

LE high (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 37.8 38.0 37.9 38.0 38.5 38.6 39.9 41.4 41.8 44.0 44.7 47.7 36.2 35.6 35.5 35.6 36.2 37.0 38.2 39.0 41.0 41.7 44.3 45.3
1.28 .90 40.8 40.9 40.7 40.9 41.4 41.7 42.9 44.5 45.2 47.4 48.6 51.5 39.0 38.5 38.3 38.5 39.0 39.9 41.1 42.1 44.0 45.0 47.7 49.1
0.84 .80 44.6 44.4 44.2 44.4 44.9 45.4 46.6 48.2 49.3 51.6 53.2 56.2 42.6 42.0 41.9 42.0 42.5 43.4 44.6 45.8 47.7 49.2 51.8 53.8
0 .50 51.7 51.1 51.0 51.1 51.6 52.5 53.7 55.3 57.2 59.5 62.1 65.1 49.3 48.7 48.6 48.7 49.3 50.1 51.3 52.9 54.8 57.1 59.7 62.7
-0.84 .20 58.8 57.8 57.7 57.8 58.4 59.6 60.8 62.4 65.1 67.4 71.0 74.0 56.0 55.5 55.3 55.4 56.0 56.8 58.1 60.0 61.9 65.0 67.6 71.6
-1.28 .10 62.5 61.4 61.2 61.3 61.9 63.3 64.6 66.1 69.3 71.5 75.7 78.6 59.5 59.0 58.8 59.0 59.5 60.3 61.6 63.7 65.7 69.2 71.8 76.2
-1.64 .05 65.5 64.2 64.1 64.2 64.7 66.4 67.6 69.2 72.7 74.9 79.5 82.4 62.4 61.8 61.7 61.8 62.4 63.2 64.4 66.8 68.7 72.5 75.2 80.1

NOTE: LE = level of education; Cum. prob. = cumulative probability.



APPENDIX B3
Regression-based normative data for the Stroop III score stratified by Age (25, 30, ..., 80 years), Sex, and Level of Education.
The raw score leading to a particular Z-value is given for Z-values indicating the percentiles 5, 10, 20, 50, 80, 90 and 95

LE low (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 64.9 63.3 62.8 63.5 65.4 68.4 72.5 67.3 73.7 81.3 90.0 99.9 65.6 64.0 63.5 64.2 60.9 63.9 68.0 73.3 69.2 76.8 85.6 95.5
1.28 .90 71.9 70.3 69.9 70.5 72.4 75.4 79.5 76.6 83.0 90.6 99.4 109.3 71.5 69.9 69.4 70.1 67.9 70.9 75.1 80.4 78.6 86.2 94.9 104.8
0.84 .80 80.5 78.9 78.4 79.1 81.0 84.0 88.1 88.0 94.5 102.0 110.8 120.7 78.7 77.1 76.6 77.3 76.5 79.5 83.6 88.9 90.0 97.6 106.3 116.2
0 .50 96.9 95.3 94.8 95.5 97.4 100.4 104.5 109.8 116.2 123.8 132.6 142.5 92.4 90.8 90.4 91.0 92.9 95.9 100.0 105.3 111.8 119.4 128.1 138.0
-0.84 .20 113.3 111.7 111.2 111.9 113.7 116.7 120.9 131.6 138.0 145.6 154.4 164.3 106.2 104.6 104.1 104.8 109.3 112.3 116.4 121.7 133.6 141.2 149.9 159.8
-1.28 .10 121.8 120.2 119.8 120.5 122.3 125.3 129.5 143.0 149.4 157.0 165.8 175.7 113.4 111.8 111.3 112.0 117.9 120.9 125.0 130.3 145.0 152.6 161.3 171.2
-1.64 .05 128.9 127.3 126.8 127.5 129.4 132.3 136.5 152.3 158.8 166.4 175.1 185.0 119.3 117.6 117.2 117.9 124.9 127.9 132.0 137.3 154.3 161.9 170.6 180.5

LE average (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 56.8 55.1 54.7 55.4 57.2 60.2 64.4 64.5 71.0 68.0 76.8 86.7 56.2 54.6 54.2 54.9 56.7 55.8 59.9 65.2 66.5 74.1 72.3 82.2
1.28 .90 62.6 61.0 60.6 61.3 63.1 66.1 70.3 71.5 78.0 77.4 86.1 96.0 61.3 59.6 59.2 59.9 61.7 61.7 65.8 71.1 73.5 81.1 81.6 91.5
0.84 .80 69.8 68.2 67.8 68.5 70.3 73.3 77.5 80.1 86.6 88.8 97.5 107.4 67.4 65.8 65.3 66.0 67.9 68.9 73.0 78.3 82.1 89.7 93.0 102.9
0 .50 83.6 82.0 81.5 82.2 84.1 87.1 91.2 96.5 103.0 110.6 119.3 129.2 79.1 77.5 77.1 77.8 79.6 82.6 86.7 92.0 98.5 106.1 114.8 124.7
-0.84 .20 97.3 95.7 95.3 96.0 97.8 100.8 105.0 112.9 119.3 132.3 141.1 151.0 90.9 89.2 88.8 89.5 91.3 96.3 100.5 105.8 114.9 122.5 136.6 146.5
-1.28 .10 104.5 102.9 102.5 103.2 105.0 108.0 112.2 121.5 127.9 143.7 152.5 162.4 97.0 95.4 94.9 95.6 97.5 103.6 107.7 113.0 123.5 131.0 148.0 157.9
-1.64 .05 110.4 108.8 108.4 109.1 110.9 113.9 118.1 128.5 134.9 153.1 161.8 171.7 102.0 100.4 100.0 100.7 102.5 109.4 113.6 118.9 130.5 138.1 157.4 167.3

LE high (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 56.8 55.2 54.8 55.5 53.4 56.4 60.5 65.8 67.1 74.7 72.9 82.8 52.4 50.7 50.3 51.0 52.8 55.8 56.0 61.3 62.6 70.2 68.4 78.3
1.28 .90 61.9 60.2 59.8 60.5 59.3 62.2 66.4 71.7 74.1 81.7 82.2 92.1 57.4 55.8 55.3 56.0 57.9 60.9 61.9 67.2 69.6 77.2 77.8 87.6
0.84 .80 68.0 66.4 65.9 66.6 66.5 69.4 73.6 78.9 82.7 90.3 93.6 103.5 63.5 61.9 61.5 62.2 64.0 67.0 69.1 74.4 78.2 85.8 89.2 99.1
0 .50 79.7 78.1 77.7 78.4 80.2 83.2 87.3 92.6 99.1 106.7 115.4 125.3 75.3 73.6 73.2 73.9 75.7 78.7 82.9 88.2 94.6 102.2 111.0 120.8
-0.84 .20 91.5 89.8 89.4 90.1 94.0 96.9 101.1 106.4 115.5 123.1 137.2 147.1 87.0 85.4 84.9 85.6 87.5 90.5 96.6 101.9 111.0 118.6 132.7 142.6
-1.28 .10 97.6 96.0 95.5 96.2 101.2 104.1 108.3 113.6 124.1 131.6 148.6 158.5 93.1 91.5 91.1 91.8 93.6 96.6 103.8 109.1 119.6 127.2 144.2 154.0
-1.64 .05 102.6 101.0 100.6 101.3 107.0 110.0 114.2 119.5 131.1 138.7 158.0 167.9 98.2 96.5 96.1 96.8 98.6 101.6 109.7 115.0 126.6 134.2 153.5 163.4

NOTE: LE = level of education; Cum. prob. = cumulative probability.

APPENDIX B4
Regression-based normative data for the Stroop Interference score stratified by Age (25, 30, ..., 80 years),
Sex, and Level of Education. The raw score leading to a particular Z-value is given for Z-values indicating
the percentiles 5, 10, 20, 50, 80, 90 and 95

LE low (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 20.1 19.9 20.4 16.5 18.6 21.6 25.3 19.0 24.3 30.5 37.4 45.1 17.1 16.9 17.4 18.7 20.9 18.6 22.3 26.8 21.3 27.5 34.4 42.1
1.28 .90 24.7 24.4 25.0 22.2 24.4 27.3 31.0 27.1 32.4 38.6 45.5 53.2 21.7 21.4 22.0 23.3 25.4 24.3 28.0 32.5 29.4 35.5 42.5 50.2
0.84 .80 30.3 30.0 30.5 29.2 31.3 34.3 38.0 37.0 42.3 48.4 55.4 63.1 27.3 27.0 27.5 28.9 31.0 31.3 35.0 39.5 39.3 45.4 52.4 60.1
0 .50 40.9 40.6 41.2 42.5 44.6 47.6 51.3 55.9 61.2 67.3 74.3 82.0 37.9 37.6 38.2 39.5 41.6 44.6 48.3 52.8 58.2 64.3 71.2 79.0
-0.84 .20 51.5 51.3 51.8 55.8 58.0 60.9 64.6 74.7 80.1 86.2 93.1 100.9 48.5 48.3 48.8 50.1 52.3 57.9 61.6 66.2 77.1 83.2 90.1 97.9
-1.28 .10 57.1 56.9 57.4 62.8 64.9 67.9 71.6 84.6 90.0 96.1 103.0 110.8 54.1 53.8 54.4 55.7 57.8 64.9 68.6 73.1 86.9 93.1 100.0 107.7
-1.64 .05 61.7 61.4 61.9 68.5 70.6 73.6 77.3 92.7 98.0 104.2 111.1 118.8 58.7 58.4 58.9 60.3 62.4 70.6 74.3 78.8 95.0 101.2 108.1 115.8

LE average (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 15.8 14.7 14.4 14.9 16.2 18.3 16.0 19.7 24.2 18.6 24.7 31.6 15.5 14.4 14.1 14.6 15.9 15.3 18.2 16.7 21.2 26.5 21.7 28.6
1.28 .90 20.4 19.3 19.0 19.5 20.8 22.9 21.7 25.4 29.9 26.7 32.8 39.7 19.4 18.3 18.0 18.5 19.8 19.9 22.8 22.4 26.9 32.2 29.8 36.7
0.84 .80 25.9 24.8 24.5 25.0 26.3 28.4 28.7 32.4 36.9 36.6 42.7 49.6 24.3 23.2 22.9 23.4 24.7 25.4 28.3 29.3 33.8 39.1 39.7 46.6
0 .50 36.6 35.5 35.2 35.7 37.0 39.1 42.0 45.7 50.2 55.5 61.6 68.5 33.6 32.5 32.2 32.7 34.0 36.1 39.0 42.7 47.2 52.5 58.6 65.5
-0.84 .20 47.2 46.1 45.8 46.3 47.6 49.7 55.3 59.0 63.5 74.4 80.5 87.4 42.8 41.7 41.4 41.9 43.2 46.7 49.6 56.0 60.5 65.8 77.4 84.3
-1.28 .10 52.8 51.7 51.4 51.9 53.2 55.3 62.3 66.0 70.5 84.2 90.3 97.2 47.7 46.6 46.3 46.8 48.1 52.3 55.2 63.0 67.5 72.8 87.3 94.2
-1.64 .05 57.3 56.2 55.9 56.4 57.7 59.8 68.0 71.7 76.2 92.3 98.4 105.3 51.7 50.6 50.3 50.8 52.1 56.8 59.7 68.7 73.2 78.5 95.4 102.3

LE high (columns are ages in years: 25 to 80 for MALE, then 25 to 80 for FEMALE)
Z value  Cum. prob.  25 30 35 40 45 50 55 60 65 70 75 80  25 30 35 40 45 50 55 60 65 70 75 80
1.64 .95 14.2 15.7 15.3 15.7 14.2 16.2 19.0 17.4 21.8 27.0 22.2 29.0 13.8 12.7 12.3 12.7 13.9 15.9 16.0 19.6 18.8 24.0 19.1 25.9
1.28 .90 18.7 19.6 19.2 19.6 18.8 20.8 23.6 23.1 27.5 32.7 30.2 37.0 17.8 16.6 16.2 16.6 17.8 19.8 20.6 24.2 24.5 29.7 27.2 34.0
0.84 .80 24.3 24.5 24.1 24.5 24.3 26.3 29.1 30.1 34.5 39.7 40.1 46.9 22.7 21.5 21.1 21.5 22.7 24.7 26.1 29.7 31.5 36.7 37.1 43.9
0 .50 35.0 33.8 33.4 33.8 35.0 37.0 39.8 43.4 47.8 53.0 59.0 65.8 31.9 30.8 30.4 30.8 32.0 34.0 36.8 40.4 44.8 50.0 56.0 62.8
-0.84 .20 45.6 43.0 42.6 43.0 45.6 47.6 50.4 56.7 61.1 66.3 77.9 84.7 41.2 40.0 39.6 40.0 41.2 43.2 47.4 51.0 58.1 63.3 74.9 81.7
-1.28 .10 51.2 47.9 47.5 47.9 51.2 53.2 56.0 63.7 68.1 73.3 87.8 94.6 46.1 44.9 44.5 44.9 46.1 48.1 53.0 56.6 65.1 70.3 84.8 91.6
-1.64 .05 55.7 51.9 51.5 51.9 55.8 57.8 60.6 69.4 73.8 79.0 95.9 102.7 50.0 48.9 48.5 48.9 50.1 52.1 57.6 61.2 70.8 76.0 92.9 99.7

NOTE: LE = level of education; Cum. prob. = cumulative probability.



NOTES

1. A Z score is usually calculated as (observed score – mean score) / SD (so without reversing the positive/negative sign), because in general a higher/lower observed score compared to the mean score signifies a better/worse performance than expected, respectively. With regard to the Stroop test scores, however, a higher score means a worse performance. For this reason, the sign of the residual value is reversed.

2. The sign of the residual value is reversed for the reason given in Note 1.

REFERENCES

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression diagnostics: Identifying influential data and sources of collinearity. New York: John Wiley.
Bohnen, N., Twijnstra, A., & Jolles, J. (1992). Performance in the Stroop Color Word Test in relationship to the persistence of symptoms following mild head injury. Acta Neurologica Scandinavica, 85, 116-121.
Bosma, H., Van Boxtel, M. P. J., Ponds, R. W. H. M., Houx, P. J., & Jolles, J. (2002). Mental work demands protect against cognitive impairment: MAAS prospective cohort study. Experimental Aging Research, 29, 33-45.
Capitani, E. (1997). Normative data and neuropsychological assessment: Common problems in clinical practice and research. Neuropsychological Rehabilitation, 7, 295-309.
Comalli, P. E., Wapner, S., & Werner, H. (1962). Interference effects of Stroop Color-Word Test in childhood, adulthood, and aging. Journal of Genetic Psychology, 100, 47-53.
Daigneault, S., Braun, C. M., & Whitaker, H. A. (1992). Early effects of normal aging on perseverative and non-perseverative prefrontal measures. Developmental Neuropsychology, 8, 99-114.
Davidson, D. J., Zacks, R. T., & Williams, C. C. (2003). Stroop Interference, practice and aging. Aging Neuropsychology and Cognition, 10, 85-98.
De Bie, S. E. (1987). Standaardvragen 1987: Voorstellen voor uniformering van vraagstellingen naar achtergrondkenmerken en interviews [Standard questions 1987: Proposal for uniformization of questions regarding background variables and interviews]. Leiden, the Netherlands: Leiden University Press.
Dufouil, C., Alpérovitch, A., & Tzourio, C. (2003). Influence of education on the relationship between white matter lesions and cognition. Neurology, 60, 831-836.
Dulaney, C. L., & Rogers, W. A. (1994). Mechanisms underlying reduction in Stroop Interference with practice for young and old adults. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 470-484.
Fastenau, P. S. (1998). Validity of regression-based norms: An empirical test of the comprehensive norms with older adults. Journal of Clinical and Experimental Neuropsychology, 6, 906-916.
Fastenau, P. S., & Adams, K. M. (1996). Heaton, Grant, and Matthews’ Comprehensive Norms: An overzealous attempt. Journal of Clinical and Experimental Neuropsychology, 18, 444-448.
Feinstein, A., Brown, R., & Ron, M. (1994). Effects of practice of serial tests of attention in healthy subjects. Journal of Clinical and Experimental Neuropsychology, 16, 436-447.
Graf, P., Uttl, B., & Tuokko, H. (1995). Color- and Picture-word Stroop tests: Performance changes in old age. Journal of Clinical and Experimental Neuropsychology, 17, 390-415.
Hameleers, P. A. H. M., Van Boxtel, M. P. J., Hogervorst, E., Riedel, W. J., Houx, P. J., Buntinx, F., & Jolles, J. (2000). Habitual caffeine consumption and its relation to memory, planning capacity and psychomotor performance across multiple age groups. Human Psychopharmacology: Clinical and Experimental, 15, 573-581.
Hammes, J. (1973). De Stroop Kleur-Woord Test: Handleiding [The Stroop Color-Word Test: Manual]. Amsterdam: Swets & Zeitlinger.
Hays, W. L., & Winkler, R. L. (1971). Statistics: Probability, inference, and decision. New York: Holt, Rinehart & Winston.
Heaton, R. K., Avitable, N., Grant, I., & Matthews, C. G. (1999). Further crossvalidation of regression-based neuropsychological norms with an update for the Boston Naming Test. Journal of Clinical and Experimental Neuropsychology, 21, 572-582.
Heaton, R. K., Matthews, C. G., Grant, I., & Avitable, N. (1996). Demographic corrections with comprehensive norms: An overzealous attempt, or a good start? Journal of Clinical and Experimental Neuropsychology, 18, 449-458.
Houx, P. J., Jolles, J., & Vreeling, F. W. (1993). Stroop Interference: Aging effects associated with the Stroop Color-Word Test. Experimental Aging Research, 19, 209-224.
Houx, P. J., Shepherd, J., Blauw, G., Murphy, M. B., Ford, I., Bollen, E. L., et al. (2002). Testing cognitive function in elderly populations: The PROSPER study. Journal of Neurology, Neurosurgery & Psychiatry, 73, 385-389.
Ivnik, R. J., Malec, J. F., Smith, G. E., Tangalos, E. G., & Petersen, R. C. (1996). Neuropsychological tests’ norms above age 55: COWAT, BNT, MAE Token, WRAT-R Reading, AMNART, STROOP, TMT, and JLO. The Clinical Neuropsychologist, 10, 262-278.
Jolles, J., Houx, P. J., Van Boxtel, M. P. J., & Ponds, R. W. H. M. (1995). Maastricht aging study: Determinants of cognitive aging. Maastricht, the Netherlands: Neuropsych Publishers.
Klein, M., Ponds, R. W., Houx, P. J., & Jolles, J. (1997). Effect of test duration on age-related differences in Stroop interference. Journal of Clinical and Experimental Neuropsychology, 19, 77-82.
Kleinbaum, D. G., Kupper, L. L., Muller, K. E., & Nizam, A. (1998). Applied regression analysis and other multivariate methods (3rd ed.). New York: Duxbury Press.
Lamberts, H., & Wood, M. (1987). ICPC: International Classification of Primary Care. Oxford: Oxford University Press.
Le Carret, N., Lafont, S., Letenneur, L., Dartigues, J. F., Mayo, W., & Fabrigoule, C. (2003). The effect of education on cognitive performances and its implication for the constitution of the cognitive reserve. Developmental Neuropsychology, 23, 317-337.
Lezak, M. D., Howieson, D. B., & Loring, D. W. (2004). Neuropsychological assessment (4th ed.). New York: Oxford University Press.
Libon, D. J., Glosser, G., Malamut, B. L., Kaplan, E., Goldberg, E., Swenson, R., et al. (1994). Age, executive functions, and visuospatial functioning in healthy older adults. Neuropsychology, 8, 38-43.
MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163-203.
Marquardt, D. W. (1980). You should standardize the predictor variables in your regression models. Journal of the American Statistical Association, 75, 87-91.
Martin, N. J., & Franzen, M. D. (1989). The effect of anxiety on neuropsychological function. International Journal of Clinical Neuropsychology, 11, 1-8.
Metsemakers, J. F. M., Höppener, P., Knottnerus, J. A., Kocken, R. J. J., & Limonard, C. B. G. (1992). Computerized health information in the Netherlands: A registration network of family practices. British Journal of General Practice, 42, 102-106.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). Mini-Mental
State: a practical method for grading the cognitive state of patients for Mitrushina, M. N., Boone, K. B., & D’Elia, L. F. (1999). Handbook of
the clinician. Journal of Psychiatric Research, 12, 189-198. normative data for neuropsychological assessment. New York: Ox-
Fox, J. (1997). Applied regression analysis, linear models, and related ford University Press.
methods. Newbury Park, CA: Sage. Moering, R. G., Schinka, J. A., Mortimer, J. A., & Graves, A. B. (2003).
Golden, C. (1978). Stroop Color and Word Test: Manual for clinical and Normative data for elderly African Americans for the Stroop Color
experimental uses. Chicago: Stoelting. and Word Test. Archives of Clinical Neuropsychology, 607, 1-11.
Møller, J. T., Cluitmans, P., Rasmussen, L. S., Houx, P. J., Rasmussen, H., Canet, J., et al. (1998). Long-term postoperative cognitive dysfunction in the elderly: ISPOCD1 study. The Lancet, 351, 857-861.
Scarmeas, N., Levy, G., Tang, M. X., Manly, J., & Stern, Y. (2001). Influence of leisure activity on the incidence of Alzheimer’s disease. Neurology, 57, 2236-2242.
Spreen, O., & Strauss, E. (1998). A compendium of neuropsychological tests. New York: Oxford University Press.
Stern, Y., Zarahn, E., Hilton, H. J., Flynn, J., DeLaPaz, R., & Rakitin (2003). Exploring the neural basis of cognitive reserve. Journal of Clinical and Experimental Neuropsychology, 25, 691-701.
Stroop, J. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643-662.
Swerdlow, N. R., Filion, D., Geyer, M. A., & Braff, D. L. (1995). “Normal” personality correlates of sensorimotor, cognitive, and visuospatial gating. Biological Psychiatry, 37, 286-299.
Trenerry, M., Crosson, B., DeBoe, J., & Leber, W. (1989). Stroop Neuropsychological Screening Test manual. Odessa, FL: Psychological Assessment Resources (PAR).
United Nations Educational, Scientific and Cultural Organisation. (1976). International Standard Classification of Education (ISCED). Paris: UNESCO.
Uttl, B., & Graf, P. (1997). Color-Word Stroop test performance across the adult life span. Journal of Clinical and Experimental Neuropsychology, 19, 405-420.
Valentijn, S. A. M., van Boxtel, M. P. J., Van Hooren, S. A., Bosma, H., Beckers, H. J. M., Ponds, R. W. H. M., et al. (2005). Change in sensory functioning predicts change in cognitive functioning: Results from a 6-year follow-up in the Maastricht Aging Study. Journal of the American Geriatrics Society, 53, 374-380.
Van Boxtel, M. P. J., ten Tusscher, M. P. M., Metsemakers, J. F. M., Willems, B., & Jolles, J. (2001). Visual determinants of reduced performance on the Stroop Color-Word Test in normal aging individuals. Journal of Clinical and Experimental Neuropsychology, 23, 620-627.
Van Breukelen, G. J. P., & Vlaeyen, J. W. S. (in press). Norming Clinical Questionnaires with multiple regression: the pain cognition list. Psychological Assessment.
Van der Elst, W., Van Boxtel, M. P. J., Van Breukelen, G. J. P., & Jolles, J. (2005). Rey’s verbal learning test: Normative data for 1855 healthy participants aged 24-81 years and the influence of sex, education, and mode of presentation. Journal of the International Neuropsychological Society, 11, 290-302.
Zachary, R. A., & Gorsuch, R. L. (1985). Continuous norming: Implications for the WAIS-R. Journal of Clinical Psychology, 41, 86-97.

Wim Van der Elst is a PhD student in neuropsychology in the Department of Psychiatry and Neuropsychology at Maastricht University in the Netherlands. His research interests include psychometrics and models of cognitive aging. He has authored several articles on the development of normative data for commonly used neuropsychological tests.

Martin P. J. Van Boxtel, MD, PhD, is an associate professor in the Department of Psychiatry and Neuropsychology at Maastricht University. He is engaged in the Maastricht Aging Study, a longitudinal research program of the determinants of usual and pathological cognitive aging. His present research interests involve vascular risk factors and imaging techniques in cognitive aging studies.

Gerard J. P. Van Breukelen is an associate professor of statistics in the Department of Psychology at Maastricht University. His past research focused on psychometrics and response time models. His current research deals with the optimal design and analysis of field experiments with a nested design or repeated measures and is based on mixed (multilevel) regression modeling. He has coauthored numerous articles in this field and in applied psychology.

Jelle Jolles, PhD, has a degree in psychology (1977, specialization neuropsychology) and a degree in chemistry (1975, specialization neurochemistry). He is a full professor of neuropsychology and biological psychology at Maastricht University. He leads the Division of Cognitive Disorders, which is embedded in the Research Institute Brain & Behavior, and is the director of the Alzheimer Centre Limburg. Research activities are focused on the relationship between biological and psychological factors in their effect on behavioral, cognitive, and affective functioning in normal subjects and neuropsychiatric patients. His managerial activities relate to research, teaching, education, and patient care. He is head of the patient care unit specializing in clinical neuropsychology at the psychiatric hospital Vijverdal and the Academic Hospital Maastricht.
