This Content Downloaded From 114.10.22.171 On Fri, 03 Mar 2023 03:46:40 UTC


Improving Students' Learning With Effective Learning Techniques: Promising Directions From Cognitive and Educational Psychology


Author(s): John Dunlosky, Katherine A. Rawson, Elizabeth J. Marsh, Mitchell J. Nathan
and Daniel T. Willingham
Source: Psychological Science in the Public Interest , 2013, Vol. 14, No. 1 (2013), pp. 4-58
Published by: Sage Publications, Inc. on behalf of the Association for Psychological
Science

Stable URL: https://www.jstor.org/stable/23484712

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
https://about.jstor.org/terms

Sage Publications, Inc. and Association for Psychological Science are collaborating with JSTOR
to digitize, preserve and extend access to Psychological Science in the Public Interest

Association for Psychological Science

Psychological Science in the Public Interest
14(1) 4-58
© The Author(s) 2013
Reprints and permission: sagepub.com/journalsPermissions.nav
DOI: 10.1177/1529100612453266
http://pspi.sagepub.com

Improving Students' Learning With Effective Learning Techniques: Promising Directions From Cognitive and Educational Psychology

John Dunlosky1, Katherine A. Rawson1, Elizabeth J. Marsh2, Mitchell J. Nathan3, and Daniel T. Willingham4
1Department of Psychology, Kent State University; 2Department of Psychology and Neuroscience, Duke University;
3Department of Educational Psychology, Department of Curriculum & Instruction, and Department of Psychology,
University of Wisconsin-Madison; and 4Department of Psychology, University of Virginia

Summary
Many students are being left behind by an educational system that some people believe is in crisis. Improving educational
outcomes will require efforts on many fronts, but a central premise of this monograph is that one part of a solution involves
helping students to better regulate their learning through the use of effective learning techniques. Fortunately, cognitive and
educational psychologists have been developing and evaluating easy-to-use learning techniques that could help students achieve
their learning goals. In this monograph, we discuss 10 learning techniques in detail and offer recommendations about their
relative utility. We selected techniques that were expected to be relatively easy to use and hence could be adopted by many
students. Also, some techniques (e.g., highlighting and rereading) were selected because students report relying heavily on
them, which makes it especially important to examine how well they work. The techniques include elaborative interrogation,
self-explanation, summarization, highlighting (or underlining), the keyword mnemonic, imagery use for text learning, rereading,
practice testing, distributed practice, and interleaved practice.
To offer recommendations about the relative utility of these techniques, we evaluated whether their benefits generalize
across four categories of variables: learning conditions, student characteristics, materials, and criterion tasks. Learning conditions
include aspects of the learning environment in which the technique is implemented, such as whether a student studies alone
or with a group. Student characteristics include variables such as age, ability, and level of prior knowledge. Materials vary from
simple concepts to mathematical problems to complicated science texts. Criterion tasks include different outcome measures
that are relevant to student achievement, such as those tapping memory, problem solving, and comprehension.
We attempted to provide thorough reviews for each technique, so this monograph is rather lengthy. However, we also wrote
the monograph in a modular fashion, so it is easy to use. In particular, each review is divided into the following sections:

1. General description of the technique and why it should work
2. How general are the effects of this technique?
2a. Learning conditions
2b. Student characteristics
2c. Materials
2d. Criterion tasks
3. Effects in representative educational contexts
4. Issues for implementation
5. Overall assessment

Corresponding Author:
John Dunlosky, Psychology, Kent State University, Kent, OH 44242
E-mail: [email protected]

Improving Student Achievement

The review for each technique can be read independently of the others, and particular variables of interest can be easily
compared across techniques.
To foreshadow our final recommendations, the techniques vary widely with respect to their generalizability and promise
for improving student learning. Practice testing and distributed practice received high utility assessments because they benefit
learners of different ages and abilities and have been shown to boost students' performance across many criterion tasks and
even in educational contexts. Elaborative interrogation, self-explanation, and interleaved practice received moderate utility
assessments. The benefits of these techniques do generalize across some variables, yet despite their promise, they fell short
of a high utility assessment because the evidence for their efficacy is limited. For instance, elaborative interrogation and
self-explanation have not been adequately evaluated in educational contexts, and the benefits of interleaving have just begun to be
systematically explored, so the ultimate effectiveness of these techniques is currently unknown. Nevertheless, the techniques
that received moderate-utility ratings show enough promise for us to recommend their use in appropriate situations, which we
describe in detail within the review of each technique.
Five techniques received a low utility assessment: summarization, highlighting, the keyword mnemonic, imagery use for text
learning, and rereading. These techniques were rated as low utility for numerous reasons. Summarization and imagery use for
text learning have been shown to help some students on some criterion tasks, yet the conditions under which these techniques
produce benefits are limited, and much research is still needed to fully explore their overall effectiveness. The keyword mnemonic
is difficult to implement in some contexts, and it appears to benefit students for a limited number of materials and for short
retention intervals. Most students report rereading and highlighting, yet these techniques do not consistently boost students'
performance, so other techniques should be used in their place (e.g., practice testing instead of rereading).
Our hope is that this monograph will foster improvements in student learning, not only by showcasing which learning
techniques are likely to have the most generalizable effects but also by encouraging researchers to continue investigating the
most promising techniques. Accordingly, in our closing remarks, we discuss some issues for how these techniques could be
implemented by teachers and students, and we highlight directions for future research.

Introduction

If simple techniques were available that teachers and students could use to improve student learning and achievement, would
you be surprised if teachers were not being told about these techniques and if many students were not using them? What if
students were instead adopting ineffective learning techniques that undermined their achievement, or at least did not improve
it? Shouldn't they stop using these techniques and begin using ones that are effective? Psychologists have been developing
and evaluating the efficacy of techniques for study and instruction for more than 100 years. Nevertheless, some effective
techniques are underutilized: many teachers do not learn about them, and hence many students do not use them, despite
evidence suggesting that the techniques could benefit student achievement with little added effort. Also, some learning
techniques that are popular and often used by students are relatively ineffective. One potential reason for the disconnect
between research on the efficacy of learning techniques and their use in educational practice is that because so many
techniques are available, it would be challenging for educators to sift through the relevant research to decide which ones show
promise of efficacy and could feasibly be implemented by students (Pressley, Goodchild, Fleet, Zajchowski, & Evans, 1989).

Toward meeting this challenge, we explored the efficacy of 10 learning techniques (listed in Table 1) that students could
use to improve their success across a wide variety of content domains.1 The learning techniques we consider here were
chosen on the basis of the following criteria. We chose some techniques (e.g., self-testing, distributed practice) because an
initial survey of the literature indicated that they could improve student success across a wide range of co[...] techniques
(e.g., rereading and highlighting [...] because students report using them frequently. [...] students are responsible for
regulating a[...] their learning as they progress from elementary [...] middle school and high school to college. [...] also
need to continue regulating their own [...] it takes place in the context of postgraduate [...] workplace, the development of
new hobbies [...] activities.

Thus, we limited our choices to techniques [...] implemented by students without assistance [...] requiring advanced
technologies or extensive [...] would have to be prepared by a teacher [...] be required for students to learn how to use [...]
fidelity, but in principle, students should [...] techniques without supervision. We also c[...] which a sufficient amount of
empirical [...] to support at least a preliminary assessment [...] efficacy. Of course, we could not review all [...] meet these
criteria, given the in-depth n[...] and these criteria excluded some techniques [...] promise, such as techniques that are
driven by advanced technologies.

Because teachers are most likely [...] techniques in educational psycho[...] how some educational-psychology [...]
(Ormrod, 2008; Santrock, 200[...]

Dunlosky et al.

Table 1. Learning Techniques

Technique                      Description
1. Elaborative interrogation   Generating an e[...]
2. Self-explanation            Explaining how new in[...] during problem solving
3. Summarization               Writing summaries [...]
4. Highlighting/underlining    Marking poten[...]
5. Keyword mnemonic            Using keywords [...]
6. Imagery for text            Attempting to form [...]
7. Rereading                   Restudying text material [...]
8. Practice testing            Self-testing or taking [...]
9. Distributed practice        Implementing a sc[...]
10. Interleaved practice       Implementing a sch[...] study that mixes different kinds of [...]

Note. See text for a detailed description of each learning technique.

Table 2. Examples of the Four Categories of [...]

Materials                Learning conditions                   Student characteristics (a)  Criterion tasks
Vocabulary               Amount of practice (dosage)           Age                          Cued recall
Translation equivalents  Open- vs. closed-book practice        Prior domain knowledge       Free recall
Lecture content          Reading vs. listening                 Working memory capacity      Recognition
Science definitions      Incidental vs. intentional learning   Verbal ability               Problem solving
Narrative texts          Direct instruction                    Interests                    Argument development
Expository texts         Discovery learning                    Fluid intelligence           Essay writing
Mathematical concepts    Rereading lags (b)                    Motivation                   Creation of portfolios
Maps                     Kind of practice tests (c)            Prior achievement            Achievement tests
Diagrams                 Group vs. individual learning         Self-efficacy                Classroom quizzes

(a) Some of these characteristics are more state based (e.g., motivation) and some are more trait based (e.g., fluid intelligence); this distinction is
relevant to the malleability of each characteristic, but a discussion of this dimension is beyond the scope of this article.
(b) Learning condition is specific to rereading.
(c) Learning condition is specific to practice testing.

McCown, & Biehler, 2009; Sternberg & Williams, 2010; Woolfolk, 2007). Despite the promise of some of the techniques, many
of these textbooks did not provide sufficient coverage, which would include up-to-date reviews of their efficacy and analyses
of their generalizability and potential limitations. Accordingly, for all of the learning techniques listed in Table 1, we
reviewed the literature to identify the generalizability of their benefits across four categories of variables: materials,
learning conditions, student characteristics, and criterion tasks. The choice of these categories was inspired by Jenkins'
(1979) model (for an example of its use in educational contexts, see Marsh & Butler, in press), and examples of each category
are presented in Table 2. Materials pertain to the specific content that students are expected to learn, remember, or
comprehend. Learning conditions pertain to aspects of the context in which students are interacting with the to-be-learned
materials. These conditions include aspects of the learning environment itself (e.g., noisiness vs. quietness in a classroom),
but they largely pertain to the way in which a learning technique is implemented. For instance, a technique could be used only
once or many times (a variable referred to as dosage) when students are studying, or a technique could be used when students
are either reading or listening to the to-be-learned materials.

Any number of student characteristics could also influence the effectiveness of a given learning technique. For example,
in comparison to more advanced students, younger students in early grades may not benefit from a technique. Students' basic
cognitive abilities, such as working memory capacity or general fluid intelligence, may also influence the efficacy of a
given technique. In an educational context, domain knowledge refers to the valid, relevant knowledge a student brings to a
lesson. Domain knowledge may be required for students to use some of the learning techniques listed in Table 1. For instance,


the use of imagery while reading texts requires that students know the objects and ideas that the words refer to so that they
can produce internal images of them. Students with some domain knowledge about a topic may also find it easier to use
self-explanation and elaborative interrogation, which are two techniques that involve answering "why" questions about a
particular concept (e.g., "Why would particles of ice rise up within a cloud?"). Domain knowledge may enhance the benefits
of summarization and highlighting as well. Nevertheless, although some domain knowledge will benefit students as they begin
learning new content within a given domain, it is not a prerequisite for using most of the learning techniques.

The degree to which the efficacy of each learning technique obtains across long retention intervals and generalizes across
different criterion tasks is of critical importance. Our reviews and recommendations are based on evidence, which typically
pertains to students' objective performance on any number of criterion tasks. Criterion tasks (Table 2, rightmost column)
vary with respect to the specific kinds of knowledge that they tap. Some tasks are meant to tap students' memory for
information (e.g., "What is operant conditioning?"), others are largely meant to tap students' comprehension (e.g., "Explain
the difference between classical conditioning and operant conditioning"), and still others are meant to tap students'
application of knowledge (e.g., "How would you apply operant conditioning to train a dog to sit down?"). Indeed, Bloom and
colleagues divided learning objectives into six categories, from memory (or knowledge) and comprehension of facts to
their application, analysis, synthesis, and evaluation (B. S. Bloom, Engelhart, Furst, Hill, & Krathwohl, 1956; for an
updated taxonomy, see L. W. Anderson & Krathwohl, 2001). In discussing how the techniques influence criterion performance,
we emphasize investigations that have gone beyond demonstrating improved memory for target material by measuring students'
comprehension, application, and transfer of knowledge. Note, however, that although gaining factual knowledge is not
considered the only or ultimate objective of schooling, we unabashedly consider efforts to improve student retention of
knowledge as essential for reaching other instructional objectives; if one does not remember core ideas, facts, or concepts,
applying them may prove difficult, if not impossible. Students who have forgotten principles of algebra will be unable to
apply them to solve problems or use them as a foundation for learning calculus (or physics, economics, or other related
domains), and students who do not remember what operant conditioning is will likely have difficulties applying it to solve
behavioral problems. We are not advocating that students spend their time robotically memorizing facts; instead, we are
acknowledging the important interplay between memory for a concept on one hand and the ability to comprehend and apply it
on the other.

An aim of this monograph is to encourage students to use the appropriate learning technique (or techniques) to accomplish
a given instructional objective. Some learning techniques are largely focused on bolstering students' memory for facts
(e.g., the keyword mnemonic), others are focused more on improving comprehension (e.g., self-explanation), and yet others
may enhance both memory and comprehension (e.g., practice testing). Thus, our review of each learning technique describes
how it can be used, its effectiveness for producing long-term retention and comprehension, and its breadth of efficacy
across the categories of variables listed in Table 2.

Reviewing the Learning Techniques

In the following series of reviews, we consider the available evidence for the efficacy of each of the learning techniques.
Each review begins with a brief description of the technique and a discussion about why it is expected to improve student
learning. We then consider generalizability (with respect to learning conditions, materials, student characteristics, and
criterion tasks), highlight any research on the technique that has been conducted in representative educational contexts,
and address any identified issues for implementing the technique. Accordingly, the reviews are largely modular: Each of the
10 reviews is organized around these themes (with corresponding headers) so readers can easily identify the most relevant
information without necessarily having to read the monograph in its entirety.

At the end of each review, we provide an overall assessment for each technique in terms of its relative utility: low,
moderate, or high. Students and teachers who are not already doing so should consider using techniques designated as high
utility, because the effects of these techniques are robust and generalize widely. Techniques could have been designated as
low utility or moderate utility for any number of reasons. For instance, a technique could have been designated as low utility
because its effects are limited to a small subset of materials that students need to learn; the technique may be useful in
some cases and adopted in appropriate contexts, but, relative to the other techniques, it would be considered low in utility
because of its limited generalizability. A technique could also receive a low- or moderate-utility rating if it showed
promise, yet insufficient evidence was available to support confidence in assigning a higher utility assessment. In such
cases, we encourage researchers to further explore these techniques within educational settings, but students and teachers
may want to use caution before adopting them widely. Most important, given that each utility assessment could have been
assigned for a variety of reasons, we discuss the rationale for a given assessment at the end of each review.

Finally, our intent was to conduct exhaustive reviews of the literature on each learning technique. For techniques that
have been reviewed extensively (e.g., distributed practice), however, we relied on previous reviews and supplemented
them with any research that appeared after they had been published. For many of the learning techniques, too many articles
have been published to cite them all; therefore, in our discussion of most of the techniques, we cite a subset of relevant
articles.


1 Elaborative interrogation

Anyone who has spent time around young children knows that one of their most frequent utterances is "Why?" (perhaps coming
in a close second behind "No!"). Humans are inquisitive creatures by nature, attuned to seeking explanations for states,
actions, and events in the world around us. Fortunately, a sizable body of evidence suggests that the power of explanatory
questioning can be harnessed to promote learning. Specifically, research on both elaborative interrogation and
self-explanation has shown that prompting students to answer "Why?" questions can facilitate learning. These two literatures
are highly related but have mostly developed independently of one another. Additionally, they have overlapping but
nonidentical strengths and weaknesses. For these reasons, we consider the two literatures separately.

1.1 General description of elaborative interrogation and why it should work. In one of the earl[...] of elaborative
interrogation, Pressley [...] Wood, and Ahmad (1987) presented under[...] with a list of sentences, each describing [...]
particular man (e.g., "The hungry man got into [...] elaborative-interrogation group, for each sentence [...] were prompted
to explain "Why did that [...] that?" Another group of participants was [...] with an explanation for each sentence (e.g.,
[...] got into the car to go to the restaurant" [...] simply read each sentence. On a final test [...] were cued to recall
which man performed [...] "Who got in the car?"), the elaborative[...] substantially outperformed the other two [...] across
experiments, accuracy in this group [...] 72%, compared with approximately 37% in [...] two groups). From this and similar
studies [...] reported average effect sizes ranging f[...]

As illustrated above, the key to elaborative interrogation involves prompting learners to generate a[...] explicitly stated
fact. The particular form [...] prompt has differed somewhat across studies [...] include "Why does it make sense that...?"
[...] and simply "Why?" However, the majority [...] used prompts following the general form[...] fact be true of this [X]
and not som[...]

The prevailing theoretical account of elaborative-interrogation effects is that elaborative interrogation [...] by supporting
the integration of new information [...] prior knowledge. During elaborative interrogation, [...] presumably "activate
schemata . . . The[...] help to organize new information which [...] (Willoughby & Wood, 1994, p. 140). Although [...] of new
facts with prior knowledge may [...]tion (Hunt, 2006) of that information, or[...] sufficient: students must also be able to
[...] related facts to be accurate when ident[...]

[...] note that most elaborative-interrogation prompts explicitly or implicitly invite processing of both similarities and
differences between related entities (e.g., why a fact would be true of one province versus other provinces). As we highlight
below, processing of similarities and differences among to-be-learned facts also accounts for findings that
elaborative-interrogation effects are often larger when elaborations are precise rather than imprecise, when prior knowledge
is higher rather than lower (consistent with research showing that preexisting knowledge enhances memory by facilitating
distinctive processing; e.g., Rawson & Van Overschelde, [...]), and when elaborations are self-generated rather than provided
(a finding consistent with research showing that distinctiveness effects depend on self-generating item-specific [...] 1996).


cognitive disabilities (Scruggs, Mastropieri, Sullivan, & Hesser, 1993), although Wood, Willoughby, Bolger, Younger, and
Kaspar (1993) did not find effects with a sample of low-achieving students. On the other end of the continuum,
elaborative-interrogation effects have been shown for high-achieving fifth and sixth graders (Wood & Hewitt, 1993; Wood,
Willoughby, et al., 1993).

Another key dimension along which learners differ is level of prior knowledge, a factor that has been extensively
investigated within the literature on elaborative interrogation. Both correlational and experimental evidence suggest that
prior knowledge is an important moderator of elaborative-interrogation effects, such that effects generally increase as prior
knowledge increases. For example, Woloshyn, Pressley, and Schneider (1992) presented Canadian and German students with facts
about Canadian provinces and German states. Thus, both groups of students had more domain knowledge for one set of facts and
less domain knowledge for the other set. As shown in Figure 1, students showed larger effects of elaborative interrogation in
their high-knowledge domain (a 24% increase) than in their low-knowledge domain (a 12% increase). Other studies manipulating
the familiarity of to-be-learned materials have reported similar patterns, with significant effects for new facts about
familiar items but weaker or nonexistent effects for facts about unfamiliar items. Despite some exceptions (e.g., Ozgungor &
Guthrie, 2004), the overall conclusion that emerges from the literature is that high-knowledge learners will generally be
best equipped to profit from the elaborative-interrogation technique. The benefit for lower-knowledge learners is less
certain.

[Figure 1: bar chart comparing Elaborative Interrogation and Reading Control for High Knowledge and Low Knowledge learners.]
Fig. 1. Mean percentage of correct responses on a final test for learners with high or low domain knowledge who engaged in
elaborative interrogation or in reading only during learning (in Woloshyn, Pressley, & Schneider, 1992). Standard errors are
not available.

One intuitive explanation for why prior knowledge moderates the effects of elaborative interrogation is that higher
knowledge permits the generation of more appropriate explanations for why a fact is true. If so, one might expect final-test
performance to vary as a function of the quality of the explanations generated during study. However, the evidence is mixed.
Whereas some studies have found that test performance is better following adequate elaborative-interrogation responses
(i.e., those that include a precise, plausible, or accurate explanation for a fact) than for inadequate responses, the
differences have often been small, and other studies have failed to find differences (although the numerical trends are
usually in the anticipated direction). A somewhat more consistent finding is that performance is better following an adequate
response than no response, although in this case, too, the results are somewhat mixed. More generally, the available evidence
should be interpreted with caution, given that outcomes are based on conditional post hoc analyses that likely reflect
item-selection effects. Thus, the extent to which elaborative-interrogation effects depend on the quality of the elaborations
generated is still an open question.

1.2c Materials. Although several studies have replicated elaborative-interrogation effects using the relatively artificial
"man sentences" used by Pressley et al. (1987), the majority of subsequent research has extended these effects using materials
that better represent what students are actually expected to learn. The most commonly used materials involved sets of
facts about various familiar and unfamiliar animals (e.g., "The Western Spotted Skunk's hole is usually found on a sandy
piece of farmland near crops"), usually with an elaborative-interrogation prompt following the presentation of each fact.
Other studies have extended elaborative-interrogation effects to fact lists from other content domains, including facts
about U.S. states, German states, Canadian provinces, and universities; possible reasons for dinosaur extinction; and
gender-specific facts about men and women. Other studies have shown elaborative-interrogation effects for factual statements
about various topics (e.g., the solar system) that are normatively consistent or inconsistent with learners' prior beliefs
(e.g., Woloshyn, Paivio, & Pressley, 1994). Effects have also been shown for facts contained in longer connected discourse,
including expository texts on animals (e.g., Seifert, 1994); human digestion (B. L. Smith, Holliday, & Austin, 2010); the
neuropsychology of phantom pain (Ozgungor & Guthrie, 2004); retail, merchandising, and accounting (Dornisch & Sperling, 2006);
and various science concepts (McDaniel & Donnelly, 1996). Thus, elaborative-interrogation effects are relatively robust across
factual material of different kinds and with different contents. However, it is important to note that elaborative
interrogation has been applied (and may be applicable) only to discrete units of factual information.

1.2d Criterion tasks. Whereas elaborative-interrogation effects appear to be relatively robust across materials and
learners, the extensions of elaborative-interrogation effects across measures that tap different kinds or levels of learning
is somewhat more limited. With only a few exceptions, the majority of elaborative-interrogation studies have relied on the


following associative-m[...]ally involving the prese[...] entity for which the f[...] and matching (in which [...] facts and
entities and m[...] entity). Effects have also [...]ognition (B. L. Smith et al. [...] Woloshyn & Stockley, 1[...] measures,
a few studies [...]tion effects on free-rec[...] 1995; Woloshyn et al. [...] (Dornisch & Sperling, 2[...]

All of the aforementioned [...]ory for explicitly stated [...] used measures tapping c[...] factual information. Al[...]
interrogation effects o[...] tests that required infe[...] (Dornisch & Sperling, [...] Ozgungor & Guthrie, [...] also found
that elaborat[...]mance on a concept-rela[...] rated the pairwise relat[...]sage, and rating cohere[...]ses); however,
Dornisch [...] significant elaborative-[...]solving test. In sum, wh[...] on associative memory [...] extent to which
elabora[...] comprehension is less certain.

Of even greater concern than the limited array of measures that have been used is the fact that few studies have examined
performance after meaningful delays. Almost all prior studies have administered outcome measures either immediately or
within a few minutes of the learning phase. Results from the few studies that have used longer retention intervals are
promising. Elaborative-interrogation effects have been shown after delays of 1-2 weeks (Scruggs et al., 1994; Woloshyn et al.,
1994), 1-2 months (Kahl & Woloshyn, 1994; Willoughby, Waller, Wood, & MacKinnon, 1993; Woloshyn & Stockley, 1995), and even
75 and 180 days (Woloshyn et al., 1994). In almost all of these studies, however, the delayed test was preceded by one or more
criterion tests at shorter intervals, introducing the possibility that performance on the delayed test was contaminated by
the practice provided by the preceding tests. Thus, further work is needed before any definitive conclusions can be drawn
about the extent to which elaborative interrogation produces durable gains in learning.

1.3 Effects in representative educational contexts. Concerning the evidence that elaborative interrogation will enhance
learning in representative educational contexts, few studies have been conducted outside the laboratory. However, outcomes
from a recent study are suggestive (B. L. Smith et al., 2010). Participants were undergraduates enrolled in an [...]

[...] to time demands. Almost all studies set reasonable limits on the amount of time allotted for reading a fact and for
generating an elaboration (e.g., 15 seconds allotted for each [...] In one of the few studies permitting self-paced learning,
the time-on-task difference between the elaborative-interrogation and reading-only groups was relatively minimal (32 minutes
vs. 28 minutes; B. L. Smith et al., 2010). Finally, the consistency of the prompts used across studies allows for r[...]
straightforward recommendations to students about the n[...] of the questions they should use to elaborate on facts [...]
study.

With that said, one limitation noted above concerns the potentially narrow applicability of elaborative interrogation to
discrete factual statements. As Hamilton (1997) noted, "elaborative interrogation is fairly prescribed when focusing [...]
of factual sentences. However, when focusing on more complex outcomes, it is not as clear to what one should direct the
'why' questions" (p. 308). For example, when learning [...] complex causal process or system (e.g., the digestive system),
the appropriate grain size for elaborative interrogation is an open question (e.g., should a prompt focus on an entire [...]
or just a smaller part of it?). Furthermore, whereas the facts to be elaborated are clear when dealing with fact lists,
elaborating on facts embedded in lengthier texts will require students to identify their own target facts. Thus, students
m[...] some instruction about the kinds of content to


when
elaborative interrogation maythe logical
be rules w
fruitfully ap
problems text,
also of concern with lengthier presented
withdurin
som
self-explanation groups
gesting that elaborative-interrogation effects sa
diluted (Callender & group (see2007)
McDaniel, Fig. 2).
or In a
even
group
say, Sperling, & Dornisch, was explicitly
2010) told
when elaborat
the concrete
tion prompts are administered practice pr
infrequently (e.
every 1 or 2 pages). forthcoming abstract pro
As illustrated above, th
1.5 Elaborative interrogation: Overall
tion involves havingassess
stud
cessing
elaborative interrogation during
as having learnin
moderate
assumptions
tive-interrogation effects have been about
shown the r
acr
broad range of factualrogation, self-explanatio
topics, although some co
about the applicabilitying the integration
of elaborative of n
interroga
knowledge.
that is lengthier or more complex However,
than fact com
l
learner characteristics,
used
effects
in the
ofelaborative-i
elaborative
have been consistently documented
used for lear
to elicit self-explan
across age,
young as upper elementary studies.
but Depending
some evid
the particular
that the benefits of elaborative mechanism
interrogation
for learners with low may differ
levels somewhat.
of domain knowled T
explanation prompts effe
criterion tasks, elaborative-interrogation diff
firmly established on are content-free
measures versus
of associative
studies
istered after short delays, have used
but firm prompt
conclusions a
particular content
to which elaborative interrogation fro
benefits co
the extent to which elaborative-interrogation
"Explain what the senten
information
across longer delays await further does the sen
research. F
it relate
demonstrating the efficacy of to what youinte
elaborative alre
resentative educational contexts would
continuum, manyalso be
studie
the need for further research
more content-specific,
to establish the suc
ge
elaborative-interrogation effects is primarily
nique did not receive a high-utility rating.
Q Concurrent Self-Exp

□ Retrospective Self-Ex
2 Self-explanation
□ No Self-Explanation
2.1 General description of self-explanation
should work. In the seminal study on self-exp
(1983) explored its effects on logical reason
Wason card-selection task. In this task, a stud
four cards labeled "A," "4," "D," and "3" and b
cate which cards must be turned over to test the r
has A on one side, it has 3 on the other side" (
of the more general "if P, then Q" rule). Stud
asked to solve a concrete instantiation of the r
of jam on one side of a jar and the sale price
accuracy was near zero. They then were provid
mal explanation about how to solve the "if P, t
were given a set of concrete problems involvin
and other logical rules (e.g., "if P, then not Q"
concrete practice problems, one group of
Concrete Practice Abstract Transfer
prompted to self-explain while solving each p
Problems Problems
ing the reasons for choosing or not choos
Another group of students
Fig. 2. Meansolved
percentage all problem
of logical-reasoning
only then were asked rectly for concrete practice
to explain how problems
theyand hadsubs
stract transfer problems in Berry (1983). Durin
ing the problems. Students
in a control g
self-explained while solving each problem, self-
prompted to self-explain
problems,ator any
were notpoint.
prompted toAccura
engage in s
tice problems was 90% or
errors better
are not available. in all three gr


different items (e.g., "W


able outcomes by multi
the denominator 7 in t
limit our review to st
relatively content-free.
prompts do elicit explan
of these prompts would
specific prompts to put
more general technique
their own. Furthermo
ated in the self-explanat
is functionally more clo
testing.
Even within the set of studies selected for review here, considerable variability remains in the self-explanation prompts that have been used. Furthermore, the range of tasks and measures that have been used to explore self-explanation is quite large. Although we view this range as a strength of the literature, the variability in self-explanation prompts, tasks, and measures does not easily support a general summative statement about the mechanisms that underlie self-explanation effects.

2.2 How general are the effects of self-explanation?
2.2a Learning conditions. Several studies have manipulated other aspects of learning conditions in addition to self-explanation. For example, Rittle-Johnson (2006) found that self-explanation was effective when accompanied by either direct instruction or discovery learning. Concerning potential moderating factors, Berry (1983) included a group who self-explained after the completion of each problem rather than during problem solving. Retrospective self-explanation did enhance performance relative to no self-explanation, but the effects were not as pronounced as with concurrent self-explanation. Another moderating factor may concern the extent to which provided explanations are made available to learners. Schworm and Renkl (2006) found that self-explanation effects were significantly diminished when learners could access explanations, presumably because learners made minimal attempts to answer the explanatory prompts before consulting the provided information (see also Aleven & Koedinger, 2002).
2.2b Student characteristics. Self-explanation effects have been shown with both younger and older learners. Indeed, self-explanation research has relied much less heavily on samples of college students than most other literatures have, with at least as many studies involving younger learners as involving undergraduates. Several studies have reported self-explanation effects with kindergartners, and other studies have shown effects for elementary school students, middle school students, and high school students.
In contrast to the breadth of age groups examined, the extent to which the effects of self-explanation generalize across different levels of prior knowledge or ability has not been sufficiently explored. Concerning knowledge level,

and they found self-explanation effects only for lower-skill
students. Further work is ne
self-explanation effects acros
dimensions.
2.2c Materials. One of the str
literature is that effects h
ferent materials within a task
different task domains. In add
problems used by Berry (
shown to support the solving of other kinds of logic puzzles.
Self-explanation has also been shown to facilitate
of various kinds of math probl
problems for kindergartners, m
lems for elementary-age stude
geometric theorems for older le
ing problem solving, self-explan
ers' evaluation of the goodness
in classroom instruction.
younger learners overcome vari
improving children's understan
individuals can have a belief th
number conservation (i.e.
an array does not change wh
in the array change), and princ
all objects balance on a fulc
explanation has improved
adults' learning of endgame str
of the research on self-explana
problem-solving tasks, sev
explanation effects for learning from text, including both short narratives and lengthier expository texts. Thus, self-explanation appears to be broadly applicable.
2.2d Criterion tasks. Given the range of tasks and domains in which self-explanation has been investigated, it is perhaps not surprising that self-explanation effects have been shown on a wide range of criterion measures. Some studies have shown self-explanation effects on standard measures of memory, including free recall, cued recall, fill-in-the-blank tests, associative matching, and multiple-choice tests tapping explicitly stated information. Studies involving text learning have also shown effects on measures of comprehension, including diagram-drawing tasks, application-based questions, and tasks in which learners must make inferences on the basis of


information implied but not explicitly stated


those studies involving some form of problem
virtually every study has shown self-explanat
near-transfer tests in which students are asked
lems that have the same structure as, but are n
the practice problems. Additionally, self-expla
on far-transfer tests (in which students are as
lems that differ from practice problems not o
face features but also in one or more structur
been shown for the solving of math proble
learning. Thus, self-explanation facilitates an
of learning outcomes.
In contrast, the durability of self-explanation effects is woefully underexplored. Almost every study to date has administered criterion tests within minutes of completion of the learning phase. Only five studies have used longer retention intervals. Self-explanation effects persisted across 1-2 day delays for playing chess endgames (de Bruin, Rikers, & Schmidt, 2007) and for retention of short narratives (Magliano, Trabasso, & Graesser, 1999). Self-explanation effects persisted across a 1-week delay for the learning of geometric theorems (although an additional study session intervened between initial learning and the final test; R. M. F. Wong, Lawson, & Keeves, 2002) and for learning from a text on the circulatory system (although the final test was an open-book test; Chi et al., 1994). Finally, Rittle-Johnson (2006) reported significant effects on performance in solving math problems after a 2-week delay; however, the participants in this study also completed an immediate test, thus introducing the possibility that testing effects influenced performance on the delayed test. Taken together, the outcomes of these few studies are promising, but considerably more research is needed before confident conclusions can be made about the longevity of self-explanation effects.

2.3 Effects in representative educational contexts. Concerning the strength of the evidence that self-explanation will enhance learning in educational contexts, outcomes from two studies in which participants were asked to learn course-relevant content are at least suggestive. In a study by Schworm and Renkl (2006), students in a teacher-education program learned how to develop example problems to use in their classrooms by studying samples of well-designed and poorly designed example problems in a computer program. On each trial, students in a self-explanation group were prompted to explain why one of two examples was more effective than the other, whereas students in a control group were not prompted to self-explain. Half of the participants in each group were also given the option to examine experimenter-provided explanations on each trial. On an immediate test in which participants selected and developed example problems, the self-explanation group outperformed the control group. However, this effect was limited to students who had not been able to view provided explanations, presumably because students made minimal attempts to self-explain before consulting the provided information.

instructions that simply prompted them to think aloud during
study. The following week, all students received a
of the theorem and completed the final test the nex
explanation did not improve performance on ne
questions but did improve performance on f
questions,

2.4 Issues for implementation. As noted above, a
strength of the self-explanation strategy is its broad ap
ity across a range of tasks and content domains. Furth
in almost all of the studies reporting significant effec
explanation, participants were provided with minimal
tions and little to no practice with self-explanation
completing the experimental task. Thus, most stud
ently can profit from self-explanation with minimal
However, some students may require more instruc
successfully implement self-explanation. In a study
jean and Cauzinille-Marmèche (1997), ninth g
poor algebra skills received minimal training prior to
in self-explanation while solving algebra problem
of think-aloud protocols revealed that students produce
more paraphrases than explanations. Several
reported positive correlations between final-test performance and both the quantity and quality of explanations generated by students during learning, further suggesting that the benefit of self-explanation might be enhanced by teaching students how to effectively implement the self-explanation technique (for examples of training methods, see Ainsworth & Burcham, 2007; R. M. F. Wong et al., 2002). However, in at least some of these studies, students who produced more or better-quality self-explanations may have had greater domain knowledge; if so, then further training with the technique may not have benefited the more poorly performing students. Investigating the contribution of these factors (skill at self-explanation vs. domain knowledge) to the efficacy of self-explanation will have important implications for how and when to use this technique.
An outstanding issue concerns the time demands associated with self-explanation and the extent to which self-explanation effects may have been due to increased time on task. Unfortunately, few studies equated time on task when comparing self-explanation conditions to control conditions involving other strategies or activities, and most studies involving self-paced practice did not report participants' time on task. In the few


studies reporting time


ally yielded nontrivial i
time spent learning in t
to other conditions, a
given the high dosage
implemented. For examp
ers to self-explain after
text, which doubled the
ing the text relative to a
minutes, respectively). W
(2006) reported that ti
formance across groups
reported that controll
effects of self-explanation.
Within the small number of studies in which time on task was equated, results were somewhat mixed. Three studies equating time on task reported significant effects of self-explanation (de Bruin et al., 2007; de Koning, Tabbers, Rikers, & Paas, 2011; O'Reilly, Symons, & MacLatchy-Gaudet, 1998). In contrast, Matthews and Rittle-Johnson (2009) had one group of third through fifth graders practice solving math problems with self-explanation and a control group solve twice as many practice problems without self-explanation; the two groups performed similarly on a final test. Clearly, further research is needed to establish the bang for the buck provided by self-explanation before strong prescriptive conclusions can be made.

2.5 Self-explanation: Overall assessment. We rate self-explanation as having moderate utility. A major strength of this technique is that its effects have been shown across different content materials within task domains as well as across several different task domains. Self-explanation effects have also been shown across an impressive age range, although further work is needed to explore the extent to which these effects depend on learners' knowledge or ability level. Self-explanation effects have also been shown across an impressive range of learning outcomes, including various measures of memory, comprehension, and transfer. In contrast, further research is needed to establish the durability of these effects across educationally relevant delays and to establish the efficacy of self-explanation in representative educational contexts. Although most research has shown effects of self-explanation with minimal training, some results have suggested that effects may be enhanced if students are taught how to effectively implement the self-explanation strategy. One final concern has to do with the nontrivial time demands associated with self-explanation, at least at the dosages examined in most of the research that has shown effects of this strategy.

3 Summarization

Students often have to learn large amounts of information, which requires them to identify what is important and how different ideas connect to one another. One popular technique for

experiment. Bretzing and Kulhavy (1979) had high school
juniors and seniors study a 2,000-word text about a fi
tribe of people. Students were assigned to one of five
conditions and given up to 30 minutes to study the text.
reading each page, students in a summarization group
instructed to write three lines of text that summarized t
points from that page. Students in a note-taking group r
similar instructions, except that they were told to tak
three lines of notes on each page of text while read
dents in a verbatim-copying group were instructed to
and copy the three most important lines on each page.
in a letter-search group copied all the capitalized word
text, also filling up three lines. Finally, students in a
group simply read the text without recording anything. (A subset of students from the four conditions involving writing
allowed to review what they had written,
poses we will focus on the students who di
review before the final test.) Students wer
after learning or 1 week later, answer
required them to connect information fro
both the immediate and delayed tests, stud
zation and note-taking groups performed
students in the verbatim-copying and cont
worst performance in the letter-search gr
Bretzing and Kulhavy's (1979) result
claim that summarization boosts lea
because it involves attending to and extra
meaning and gist of the material. The con
ment were specifically designed to manipu
dents processed the texts for meaning, w
condition involving shallow processing of t
require learners to extract its meaning (
1972). Summarization was more benefici
task and yielded benefits similar to t
another task known to boost learning (e.g
havy, 1981; Crawford, 1925a, 1925b; Di Vesta
More than just facilitating the extraction of meaning, how
summarization should also boost organizational proce
given that extracting the gist of a text requires learner
connect disparate pieces of the text,
evaluating its individual components (
which note-taking affords organizational p


Morris, & Smith, 1985). One last point should be made about the results from Bretzing and Kulhavy (1979), namely, that summarization and note-taking were both more beneficial than was verbatim copying. Students in the verbatim-copying group still had to locate the most important information in the text, but they did not synthesize it into a summary or rephrase it in their notes. Thus, writing about the important points in one's own words produced a benefit over and above that of selecting important information; students benefited from the more active processing involved in summarization and note-taking (see Wittrock, 1990, and Chi, 2009, for reviews of active/generative learning). These explanations all suggest that summarization helps students identify and organize the main ideas within a text.
So how strong is the evidence that summarization is a beneficial learning strategy? One reason this question is difficult to answer is that the summarization strategy has been implemented in many different ways across studies, making it difficult to draw general conclusions about its efficacy. Pressley and colleagues described the situation well when they noted that "summarization is not one strategy but a family of strategies" (Pressley, Johnson, Symons, McGoldrick, & Kurita, 1989, p. 5). Depending on the particular instructions given, students' summaries might consist of single words, sentences, or longer paragraphs; be limited in length or not; capture an entire text or only a portion of it; be written or spoken aloud; or be produced from memory or with the text present.

Fig. 3. Mean number of correct responses on a test occurring shortly after study as a function of test type (immediate or delayed) and learning condition in Bretzing and Kulhavy (1979). Error bars represent standard errors. [Legend: Summarization, Note-Taking, Verbatim, Letter Search, Control. Horizontal axis: Immediate Test, Delayed Test.]

A lot of research has involved summarization in some form, yet whereas some evidence demonstrates that summarization works (e.g., L. W. Brooks, Dansereau, Holley, & Spurlin, 1983; Doctorow, Wittrock, & Marks, 1978), T. H. Anderson and Armbruster's (1984) conclusion that "research in support of summarizing as a studying activity is sparse indeed" (p. 670) is not outmoded. Instead of focusing on discovering when (and how) summarization works, by itself and without training, researchers have tended to explore how to train students to write better summaries (e.g., Friend, 2001; Hare & Borchardt, 1984) or to examine other benefits of training the skill of summarization. Still others have simply assumed that summarization works, including it as a component in larger interventions (e.g., Carr, Bigler, & Morningstar, 1991; Lee, Lim, & Grabowski, 2010; Palincsar & Brown, 1984; Spörer, Brunstein, & Kieschke, 2009). When collapsing across findings pertaining to all forms of summarization, summarization appears to benefit students, but the evidence for any one instantiation of the strategy is less compelling.
The focus on training students to summarize reflects the belief that the quality of summaries matters. If a summary does not emphasize the main points of a text, or if it includes incorrect information, why would it be expected to benefit learning and retention? Consider a study by Bednall and Kehoe (2011, Experiment 2), in which undergraduates studied six Web units that explained different logical fallacies and provided examples of each. Of interest for present purposes are two groups: a control group who simply read the units and a group in which students were asked to summarize the material as if they were explaining it to a friend. Both groups received the following tests: a multiple-choice quiz that tested information directly stated in the Web unit; a short-answer test in which, for each of a list of presented statements, students were required to name the specific fallacy that had been committed or write "not a fallacy" if one had not occurred; and, finally, an application test that required students to write explanations of logical fallacies in examples that had been studied (near transfer) as well as explanations of fallacies in novel examples (far transfer). Summarization did not benefit overall performance, but the researchers noticed that the summaries varied a lot in content; for one studied fallacy, only 64% of the summaries included the correct definition. Table 3 shows the relationships between summary content and later performance. Higher-quality summaries that contained more information and that were linked to prior knowledge were associated with better performance.
Several other studies have supported the claim that the quality of summaries has consequences for later performance. Most similar to the Bednall and Kehoe (2011) result is Ross and Di Vesta's (1976) finding that the length (in words) of an oral summary (a very rough indicator of quality) correlated with later performance on multiple-choice and short-answer questions. Similarly, Dyer, Riley, and Yekovich (1979) found that final-test questions were more likely to be answered correctly if the information needed to answer them had been included in an earlier summary. Garner (1982) used a different


Table 3. Correlations b
Bednall & Kehoe, 2011,

Test

Measure of summary quality       Multiple-choice test     Short-answer test     Application test
                                 (factual knowledge)      (identification)
Number of correct definitions    .42*                     .43*                  .52*
Amount of extra information      .31*                     .21*                  .40*

Note. Asterisks indicate correlations significantly greater than 0. "Amoun


number of summaries in which a student included information that had
rial (e.g., an extra example).

method to show that the quality of summaries matters: Undergraduates read a passage on Dutch elm disease and then wrote a summary at the bottom of the page. Five days later, the students took an old/new recognition test; critical items were new statements that captured the gist of the passage (as in Bransford & Franks, 1971). Students who wrote better summaries (i.e., summaries that captured more important information) were more likely to falsely recognize these gist statements, a pattern suggesting that the students had extracted a higher-level understanding of the main ideas of the text.

3.2 How general are the effects of summarization?
3.2a Learning conditions. As noted already, many different types of summaries can influence learning and retention; summarization can be simple, requiring the generation of only a heading (e.g., L. W. Brooks et al., 1983) or a single sentence per paragraph of a text (e.g., Doctorow et al., 1978), or it can be as complicated as an oral presentation on an entire set of studied material (e.g., Ross & Di Vesta, 1976). Whether it is better to summarize smaller pieces of a text (more frequent summarization) or to capture more of the text in a larger summary (less frequent summarization) has been debated (Foos, 1995; Spurlin, Dansereau, O'Donnell, & Brooks, 1988). The debate remains unresolved, perhaps because what constitutes the most effective summary for a text likely depends on many factors (including students' ability and the nature of the material).
One other open question involves whether studied material should be present during summarization. Hidi and Anderson (1986) pointed out that having the text present might help the reader to succeed at identifying its most important points as well as relating parts of the text to one another. However, summarizing a text without having it present involves retrieval, which is known to benefit memory (see the Practice Testing section of this monograph), and also prevents the learner from engaging in verbatim copying. The Dyer et al. (1979) study described earlier involved summarizing without the text present; in this study, no overall benefit from summarizing occurred, even though information that had been included in summaries was benefited (overall, this benefit was overshadowed by costs to the greater amount of information that had

n
ies
(
fo
answ
mar
on
3.2
prima
research on individual differences
students
You
wr
19
m
e
90 m
zation
ref
were
tice
t
cha
im
and
Sim
19
de
stu
gen
vi


using the strategy). First, both general writin


in a topic have been linked to summarization
graders (Head, Readence, & Buss, 1989). Wri
measured via performance on an unrelated ess
in the topic (American history) was measur
asked students how much they would like to
of 25 topics. Of course, interest may be confo
knowledge about a topic, and knowledge may
to summarization skill. Recht and Leslie (19
seventh- and eighth-grade students who knew
ball (as measured by a pretest) were better at
625-word passage about a baseball game than
who knew less about baseball. This finding
cated with different materials, but it seems pl
dents with more domain-relevant knowledge
able to identify the main points of a text and
The question is whether domain experts would
the summarization strategy or whether it wo
with the processing in which these students w
ously engage.
3.2c Materials. The majority of studies have used prose passages on such diverse topics as a fictitious primitive tribe, desert life, geology, the blue shark, an earthquake in Lisbon, the history of Switzerland, and fictional stories. These passages have ranged in length from a few hundred words to a few thousand words. Other materials have included Web modules and lectures. For the most part, characteristics of materials have not been systematically manipulated, which makes it difficult to draw strong conclusions about this factor, even though 15 years have passed since Hidi and Anderson (1986) made an argument for its probable importance. As discussed in Yu (2009), it makes sense that the length, readability, and organization of a text might all influence a reader's ability to summarize it, but these factors need to be investigated in studies that manipulate them while holding all other factors constant (as opposed to comparing texts that vary along multiple dimensions).
3.2d Criterion tasks. The majority of summarization studies have examined the effects of summarization on either retention of factual details or comprehension of a text (often requiring inferences) through performance on multiple-choice questions, cued recall questions, or free recall. Other benefits of summarization include enhanced metacognition (with text-absent summarization improving the extent to which readers can accurately evaluate what they do or do not know; M. C. M. Anderson & Thiede, 2008; Thiede & Anderson, 2003) and improved note-taking following training (A. King, 1992; Rinehart et al., 1986).
Whereas several studies have shown benefits of summarization (sometimes following training) on measures of application (e.g., B. Y. L. Wong, Wong, Perry, & Sawatsky, 1986), others have failed to find such benefits. For example, consider a study in which L. F. Annis (1985) had undergraduates read a passage on an earthquake and then examined the consequences of summarization for performance on questions designed to

a better match to generative tests than to tests that depend on recognition,
Unfortunately, the one study we fo
stakes test did not show a benefit from
(Brozo, Stahl, & Gordon, 1985). Of
poses were two groups in the study,
college students in a remedial read
training either in summarization or in
self-questioning condition, students
choice comprehension questions). Train
each week, students received approx
instruction and practice that involved app
to 1-page news articles. Of interest
mance on the Georgia State Rege
involves answering multiple-choice re
questions about passages; passing this
requirement for many college studen
tem of Georgia (see http://www2.gsu.edu/~wwwrtp/). Students
also took a practice test before taking th
Unfortunately, the mean scores for bo
below passing, for both the practice
ever, the self-questioning group performed
marization group on both the prac
Regents' examination. This study did n
scores and did not include a no-training c
caution is warranted in interpreting th
emphasizes the need to establish that ou
oratory work generalize to actual educa
gests that summarization may not have the same influence in both contexts,
Finally, concerning test delays, several studies h
cated that when summarization does boost perform
effects are relatively robust over delays of days or wee
Bretzing & Kulhavy, 1979; B. L. Stein & Kirby, 1992). S
larly, benefits of training programs have persisted se
weeks after the end of training (e.g., Hare & Borchardt


3.3 Effects in represe


eral of the large summ
conducted in regular cla
doing so. For example,
in the context of a rem
uates, and the study by
sixth-grade classrooms,
regular teachers. In the
from the classroom tra
more feasible to condu
classrooms
than in the l
commitment for studen
not involve training we
example, in the Bedna
about logical fallacies
the modules were actua
ment. Overall, benefits
the real constraint is w
cessfully summarize, no
lab or the classroom.

3.4 Issues for implementation. Summarization would be
feasible for undergraduates or other learners who already
know how to summarize. For these students, summarization
would constitute an easy-to-implement technique that would
not take a lot of time to complete or understand. The only
concern would be whether these students might be better
served by some other strategy, but certainly summarization
would be better than the study strategies students typically
favor, such as highlighting and rereading (as we discuss in the
sections on those strategies below). A trickier issue would
concern implementing the strategy with students who are not
skilled summarizers. Relatively intensive training programs
are required for middle school students or learners with learning
disabilities to benefit from summarization. Such efforts
are not misplaced; training has been shown to benefit performance
on a range of measures, although the training procedures
do raise practical issues (e.g., Gajria & Salvia, 1992:
6.5-11 hours of training used for sixth through ninth graders
with learning disabilities; Malone & Mastropieri, 1991: 2
days of training used for middle school students with learning
disabilities; Rinehart et al., 1986: 45-50 minutes of instruction
per day for 5 days used for sixth graders). Of course,
instructors may want students to summarize material because
summarization itself is a goal, not because they plan to use
summarization as a study technique, and that goal may merit
the efforts of training.
However, if the goal is to use summarization as a study
technique, our question is whether training students would be
worth the amount of time it would take, both in terms of the
time required on the part of the instructor and in terms of the
time taken away from students' other activities. For instance,
in terms of efficacy, summarization tends to fall in the middle
of the pack when compared to other techniques. In direct

studies have examined summarization training in the classroom,
what are lacking are classroom studies examinin
effectiveness of summarization as a techn
dents' learning, comprehension, and
content.

4 Highlighting and underlining
Any educator who has examined students
familiar with the sight of a marked-up, m
More systematic evaluations of actual t
dent materials have supported the claim tha
underlining are common behaviors (e.g
Lonka, Lindblom-Ylänne, & Maury,
1989). When students themselves are as
do when studying, they commonly repor
lighting, or otherwise marking material a
(e.g., Cioffi, 1986; Gurung, Weidert, & Je
these techniques as equivalent, given tha
should work the same way (and at least
differences between them; Fowler &
ment 2). The techniques typically appe
they are simple to use, do not entail traini
students to invest much time beyond what
for reading the material. The question w
technique that is so easy to use actually hel
understand any benefits specific to high
ing (for brevity, henceforth referred to as highlighting), we do
not consider studies in which active marking of text was paired
with other common techniques, such as note-taking (e.g.,
Arnold, 1942; L. B. Brown & Smiley, 1978; Mathews, 1938).
Although many students report combining multiple techniques
(e.g., L. Annis & Davis, 1978; Wade, Trathen, & Schraw,
1990), each technique must be evaluated independently to discover
which ones are crucial for success.


4.1 General description of highlighting and


and why they should work. As an introduction
vant issues, we begin with a description of
experiment. Fowler and Barker (1974, Exp. 1
uates read articles (totaling about 8,000 words)
and city life from Scientific American and Sci
were assigned to one of three groups: a contro
they only read the articles; an active-highlight
which they were free to highlight as much of
wanted; or a passive-highlighting group, in wh
marked texts that had been highlighted by yo
in the active-highlighting group. Everyone rec
study the texts (time on task was equated acro
dents in the active-highlighting condition wer
particularly important material. All subjects ret
1 week later and were allowed to review their
als for 10 minutes before taking a 54-item mu
test. Overall, the highlighting groups did not ou
control group on the final test, a result that h
been echoed in much of the literature (e.g., Ho
& Jenkins, 1972; Stordahl & Christensen, 1
However, results from more detailed analy
manee in the two highlighting groups are info
what effects highlighting might have on cogn
First, within the active-highlighting group, per
better on test items for which the relevant text
lighted (see Blanchard & Mikkelson, 1987; L
1988 for similar results). Second, this benefi
information was greater for the active high
selected what to highlight) than for passive hig
saw the same information highlighted, but did
Third, this benefit to highlighted information
nied by a small cost on test questions probing
had not been highlighted.
To explain such findings, researchers often point to a basic
cognitive phenomenon known as the isolation effect, whereby
a semantically or phonologically unique item in a list is much
better remembered than its less distinctive counterparts (see
Hunt, 1995, for a description of this work). For instance, if
students are studying a list of categorically related words (e.g.,
"desk," "bed," "chair," "table") and a word from a different
category (e.g., "cow") is presented, the students will later be
more likely to recall it than they would if it had been studied in
a list of categorically related words (e.g., "goat," "pig,"
"horse," "chicken"). The analogy to highlighting is that a
highlighted, underlined, or capitalized sentence will "pop out"
of the text in the same way that the word "cow" would if it
were isolated in a list of words for types of furniture. Consistent
with this expectation, a number of studies have shown that
reading marked text promotes later memory for the marked
material: Students are more likely to remember things that the
experimenter highlighted or underlined in the text (e.g.,
Cashen & Leicht, 1970; Crouse & Idstein, 1972; Hartley,
Bartlett, & Branthwaite, 1980; Klare, Mabry, & Gustafson,
1955; see Lorch, 1989 for a review).

quences. First, overmarking reduces the degree to which
marked text is distinguished from other tex
less likely to remember marked text if it is
(Lorch, Lorch, & Klusewitz, 1995). Second, it l
processing to mark a lot of text than to singl
important details. Consistent with this latter i
marking text may be more likely to be observ
menters impose explicit limits on the amount o
are allowed to mark. For example, Rickards and
found that students limited to underlining a s
paragraph later recalled more of a science te
underlining control group. Similarly, L. L.
found that marking one sentence per paragr
students in a reading class to remember the u
mation, although it did not translate into an o

4.2 How general are the effects of highlighti
lining? We have outlined hypothetical mec
highlighting might aid memory, and partic
highlighting that would be necessary for thes
be effective (e.g., highlighting only important m
ever, most studies have shown no benefit of highlight


is typically used) over an


and thus the question con
of highlighting is largel
lighting has not been pa
tions have systematically
moderate the effectiven
could not include a Lear
below, given the lack of
literature permits, we s
erate the effectiveness o
our conclusion about the
nique holds across a w
4.2b Student characteris
Air Force basic trainees
dren (e.g., Rickards & De
(i.e., students who score
section; Nist & Hogrebe,
graduates (e.g., Todd & K
groups struggled to high
other studies have sugge
mark text. Results from
prior knowledge might
lighting. In particular, t
engines that either was
key information underl
menters had access to p
mechanical-aptitude sco
experiment to those scor
to airmen who had recei
premarked texts and did
have underlined on their
with little knowledge o
which parts of a text w
would benefit less from
able students would).
One other interesting possibility has come from a study in
which experimenters extrinsically motivated participants by
promising them that the top scorers on an exam would receive
$5 (Fass & Schumacher, 1978). Participants read a text about
enzymes; half the participants were told to underline key
words and phrases. All participants then took a 15-item multiple-choice
test. A benefit from underlining was observed
among students who could earn the $5 bonus, but not among
students in a control group. Thus, although results from this
single study need to be replicated, it does appear that some
students may have the ability to highlight effectively, but do
not always do so.
4.2c Materials. Similar conclusions about marking text have
come from studies using a variety of different text materials on
topics as diverse as aerodynamics, ancient Greek schools,
aggression, and Tanzania, ranging in length from a few hundred
words to a few thousand. Todd and Kessler (1971)
manipulated text length (all of the materials were relatively
short, with lengths of 44, 140, or 256 words) and found that
underlining was ineffective regardless of the text length. Fass

underlining draws attention more to individual concepts (supporting
memory for facts) than to connection
cepts (as required by the inference quest
with this idea, in another study, underliners w
a final test would be in a multiple-choice format
on it than did underliners who expected it
answer format (Kulhavy, Dyer, & Silver, 1975
the actual format of the final-test questions. Unde
mation may naturally line up with the kinds o
students expect on multiple-choice tests (e.g.,
1988), but students may be less sure about what
when studying for a short-answer test,

4.3 Effects in representative educationa
alluded to at the beginning of this section, su
textbooks and other student materials have
frequency of highlighting and underlinin
contexts (e.g., Bell & Limber, 2010; Lonka et
clear are the consequences of such real-w
Classroom studies have examined whether inst
markings affect examination performance. F


Cashen and Leicht (1970) had psychology stude


entific American articles on animal learning,
group conflict, each of which contained five c
ments, which were underlined in red for half
The articles were related to course content bu
ered in lectures. Exam scores on items related
statements were higher when the statements
lined in red than when they had not. Interestin
the underlining condition also scored better on
about information that had been in sentences
critical statements (as opposed to scoring wors
about nonunderlined information). The benefi
items was replicated in another psychology cl
Cashen, 1972), although the effects were weak
is unclear whether the results from either of
would generalize to a situation in which stude
charge of their own highlighting, because the
mark many more than five statements in an
would show less discrimination between impor
information).

4.4 Issues for implementation. Students already are familiar
with and spontaneously adopt the technique of highlighting;
the problem is that the way the technique is typically implemented
is not effective. Whereas the technique as it is typically
used is not normally detrimental to learning (but see
Peterson, 1992, for a possible exception), it may be problematic
to the extent that it prevents students from engaging in
other, more productive strategies.
One possibility that should be explored is whether students
could be trained to highlight more effectively. We located
three studies focused on training students to highlight. In two
of these cases, training involved one or more sessions in which
students practiced reading texts to look for main ideas before
marking any text. Students received feedback about practice
texts before marking (and being tested on) the target text, and
training improved performance (e.g., Amer, 1994; Hayati &
Shariatifar, 2009). In the third case, students received feedback
on their ability to underline the most important content in
a text; critically, students were instructed to underline as little
as possible. In one condition, students even lost points for
underlining extraneous material (Glover, Zimmer, Filbeck, &
Plake, 1980). The training procedures in all three cases
involved feedback, and they all had some safeguard against
overuse of the technique. Given students' enthusiasm for highlighting
and underlining (or perhaps overenthusiasm, given
that students do not always use the technique correctly), discovering
fail-proof ways to ensure that this technique is used
effectively might be easier than convincing students to abandon
it entirely in favor of other techniques.

4.5 Highlighting and underlining: Overall assessment. On
the basis of the available evidence, we rate highlighting and
underlining as having low utility. In most situations that have
been examined and with most participants, highlighting does

5 The keyword mnemonic
one: mental imagery. The earliest systematic research on
imagery was begun in the late 1800s by Francis G
historical review, see Tho
debates have arisen about its n
shyn, 1981), such as whether
age of dual codes (one imagina
storage of a distinctive proposit
Hunt, 1989), and whether men
same brain mechanisms as vi
1998).
Few of these debates have been en
nately, their resolution is not essent
power of mental imagery. In parti
use of imagery can enhance learni
wide variety of materials and for
ties. A review of this entire literatur
single monograph or perhaps ev
imagery is one of the most highly i
ties and has inspired enough empir
own publication (i.e., the Journa
of an exhaustive review, we bri
of mental imagery for improving stu
been empirically scrutinized: t
monic for learning foreign-langua
of mental imagery for compr
materials,

5.1 General description of t
why it works. Imagine a student
vocabulary, including words such
(key), revenir (to come back), and mo
learning, the student uses the keyword mnemonic,
technique based on interactive im
Atkinson and Raugh (1975). To
dent would first find an English
the foreign cue word, such as den


"la clef." The studen


the English keyword
So, for la dent-tooth,
ing a large molar wit
(1975) had college stud
learn Spanish-English
students first learned
keyword with the app
associated with the ke
oped interactive imag
English translations. I
generate the English
Spanish cue (e.g., "gus
word mnemonic perfo
did a control group
equivalents without keywords.
Beyond this first demonstration, the potential benefits of
the keyword mnemonic have been extensively explored, and
its power partly resides in the use of interactive images. In
particular, the interactive image involves elaboration that integrates
the words meaningfully, and the images themselves
should help to distinguish the sought-after translation from
other candidates. For instance, in the example above, the
image of the "large molar" distinguishes "tooth" (the target)
from other candidates relevant to dentists (e.g., gums, drills,
floss). As we discuss next, the keyword mnemonic can be
effectively used by students of different ages and abilities for
a variety of materials. Nevertheless, our analysis of this literature
also uncovered limitations of the keyword mnemonic that
may constrain its utility for teachers and students. Given these
limitations, we did not separate our review of the literature
into separate sections that pertain to each variable category
(Table 2) but instead provide a brief overview of the most relevant
evidence concerning the generalizability of this
technique.

5.2a-d How general are the effects of the keyword mnemonic?
The benefits of the keyword mnemonic generalize to
many different kinds of material: (a) foreign-language vocabulary
from a variety of languages (French, German, Italian,
Latin, Russian, Spanish, and Tagalog); (b) the definitions of
obscure English vocabulary words and science terms; (c) state-capital
associations (e.g., Lincoln is the capital of Nebraska);
(d) medical terminology; (e) people's names and accomplishments
or occupations; and (f) minerals and their attributes (e.g.,
the mineral wolframite is soft, dark in color, and used in the
home). Equally impressive, the keyword mnemonic has also
been shown to benefit learners of different ages (from second
graders to college students) and students with learning disabilities
(for a review, see Jitendra, Edwards, Sacks, & Jacobson,
2004). Although the bulk of research on the keyword mnemonic
has focused on students' retention of target materials,
the technique has also been shown to improve students' performance
on a variety of transfer tasks: It helps them (a) to generate
appropriate sentences using newly learned English

"revenge" (e.g., one might need "to come back" to taste its
sweetness), but imaging this abstract term would be
and might even limit retention. Indeed, Hall (1988) fou
a control group (which received task practice but no sp
instructions on how to study) outperformed a keywo
in a test involving English definitions that did not easi
keyword generation, even when the keywords were p
Proponents of the keyword mnemonic do acknowledge th
benefits may be limited to keyword-friendly materia
concrete nouns), and in fact, the vast majority of th
on the keyword mnemonic has involved materials that af
its use.
Second, in most studies, the keywords have been p
by the experimenters, and in some cases, the interact
(in the form of pictures) were provided as well. Fe
have directly examined whether students can su
generate their own keywords, and those that have hav
mixed results: Sometimes students' self-generated k
facilitate retention as well as experimenter-provided k
do (Shapiro & Waters, 2005), and sometimes they do not
(Shriberg, Levin, McCormick, & Pressley, 1
Wang, 1996). For more
multiple attributes, as i
experimenter-provide
some students may h
extensive training. Finally
ties generating images a
mnemonic only if k
image (in the form of a p
(Pressley & Levin, 19
willing to construct ap
monic useful, even these t
to use the technique only
keyword friendly.
Third, and perhaps m
monic may not produce d
investigating the long-ter
included a test soon aft
delay of several days
& Miller, 1986; Raug


generally demonstrated a benefit of keyword
delay (for a review, see Wang, Thomas, & O
Unfortunately, these promising effects were c
the experimental designs. In particular, all ite
on both the immediate and delayed tests. Give
word mnemonic yielded better performance on
tests, this initial increase in successful reca
boosted performance on the delayed tests and t
ately disadvantaged the control groups. Put di
advantage in delayed test performance could ha
due to the effects of retrieval practice (i.e., from
test) and not to the use of keyword mnemonic
retrieval can slow forgetting; see the Practice
below).

[Figure 4 legend: Keyword, Rote Repetition]
This possibility was supported by data from Wang et al.
(1992; see also Wang & Thomas, 1995), who administered
immediate and delayed tests to different groups of students. As
shown in Figure 4 (top panel), for participants who received
the immediate test, the keyword-mnemonic group outperformed
a rote-repetition control group. By contrast, this benefit
vanished for participants who received only the delayed
test. Even more telling, as shown in the bottom panel of Figure
4, when the researchers equated the performance of the two
groups on the immediate test (by giving the rote-repetition
group more practice), performance on the delayed test was
significantly better for the rote-repetition group than for the
keyword-mnemonic group (Wang et al., 1992).
These data suggest that the keyword mnemonic leads to
accelerated forgetting. One explanation for this surprising outcome
concerns decoding at retrieval: Students must decode
each image to retrieve the appropriate target, and at longer
delays, such decoding may be particularly difficult. For
instance, when a student retrieves "a dentist holding a large
molar with a pair of pliers," he or she may have difficulty
deciding whether the target is "molar," "tooth," "pliers," or
"enamel."

[Figure 4; x-axis labels: Immediate Test, Delayed Test]
Fig. 4. Mean number of items correctly recalled on a cued-recall test occurring
soon after study (immediate test) or 1 week after study (delayed
test) in Wang, Thomas, and Ouellette (1992). Values in the top panel are
from Experiment 1, and those in the bottom panel are from Experiment 3.
Standard errors are not available.

5.3 Effects in representative educational contexts. The
keyword mnemonic has been implemented in classroom settings,
and the outcomes have been mixed. On the promising
side, Levin, Pressley, McCormick, Miller, and Shriberg (1979)
had fifth graders use the keyword mnemonic to learn Spanish
vocabulary words that were keyword friendly. Students were
trained to use the mnemonic in small groups or as an entire
class, and in both cases, the groups who used the keyword
mnemonic performed substantially better than did control
groups who were encouraged to use their own strategies while
studying. Less promising are results for high school students
who Levin et al. (1979) trained to use the keyword mnemonic.
These students were enrolled in a 1st-year or 2nd-year language
course, which is exactly the context in which one would
expect the keyword mnemonic to help. However, the keyword
mnemonic did not benefit recall, regardless of whether
students were trained individually or in groups. Likewise,
Willerman and Melvin (1979) did not find benefits of
keyword-mnemonic training for college students enrolled in
an elementary French course (cf. van Hell & Mahn, 1997; but
see Lawson & Hogben, 1998).

5.4 Issues for implementation. The majority of research on
the keyword mnemonic has involved at least some (and occasionally
extensive) training, largely aimed at helping students
develop interactive images and use them to subsequently
retrieve targets. Beyond training, implementation also requires
the development of keywords, whether by students, teachers,
or textbook designers. The effort involved in generating some
keywords may not be the most efficient use of time for students
(or teachers), particularly given that at least one easy-to-use
technique (i.e., retrieval practice, Fritz, Morris, Acton,
Voelkel, & Etkind, 2007) benefits retention as much as the
keyword mnemonic does.


5.5 The keyword mn
basis of the literature
mnemonic as low utilit
word mnemonic be wi
keyword-friendly mat
terms of time needed
and it may not produ
clear that students wil
mnemonic when they
research is needed to
keyword generation (a
an efficient use of stu
gies. In one head-to-h
language vocabulary w
keyword mnemonic (w
than after practice tes
tests 1 week later (Fri
that practice testing i
ble (as reviewed belo
seems superior to the

6 Imagery use fo
6.1 General descripti
work. In one demonst
enhancing text learn
(2009) gave tenth grad
text on the dipole char
were told to read the t
were told to read the t
of each paragraph us
Imagery instructions
students were instruct
content of each paragr
reading, the students
questions for which th
able from the text but needed to be inferred from it. As shown
in Figure 5, the instructions to mentally imagine the content of
each paragraph significantly boosted the comprehension-test
performance of students in the mental-imagery group, in comparison
to students in the control group (Cohen's d = 0.72).
This effect is impressive, especially given that (a) training was
not required, (b) the text involved complex science content,
and (c) the criterion test required learners to make inferences
about the content. Finally, drawing did not improve comprehension,
and it actually negated the benefits of imagery
instructions. The potential for another activity to interfere with
the potency of imagery is discussed further in the subsection
on learning conditions (6.2a) below.
A variety of mechanisms may contribute to the benefits of
imaging text material on later test performance. Developing
images can enhance one's mental organization or integration
of information in the text, and idiosyncratic images of particular
referents in the text could enhance learning as well (cf. distinctive
processing; Hunt, 2006). Moreover, using one's prior

[Figure 5 legend: Imagery, No Imagery; x-axis labels: No Drawing, Drawing]
Fig. 5. Accuracy on a multiple-choice exam in which answers
inferred from a text in Leutner, Leopold, and Sumfleth (2009). P
either did or did not receive instructions to use imagery while rea
either did or did not draw pictures to illustrate the content of
Error bars represent standard errors.

knowledge to generate a coherent representation of a
may enhance a student's general understanding of the
so, the influence of imagery use may be robust across
tasks that tap memory and comprehension. Despite t
sibilities and the dramatic effect of imagery demonst
Leutner et al. (2009), our review of the literature sugg
the effects of using mental imagery to learn from tex
rather limited and not robust.

6.2 How general are the effects of imagery use for text
learning? Investigations of imagery use for learning text
materials have focused on single sentences and longer text
materials. Evidence concerning the impact of imagery on sentence
learning largely comes from investigations of other mnemonic
techniques (e.g., elaborative interrogation) in which
imagery instructions have been included in a comparison condition.
This research has typically demonstrated that groups
who receive imagery instructions have better memory for sentences
than do no-instruction control groups (e.g., R. C.
Anderson & Hidde, 1971; Wood, Pressley, & Winne, 1990). In
the remainder of this section, we focus on the degree to which
imagery instructions improve learning for longer text
materials.
6.2a Learning conditions. Learning conditions play a potentially
important role in moderating the benefits of imagery, so
we briefly discuss two conditions here, namely, the modality
of text presentation and learners' actual use of imagery after
receiving imagery instructions. Modality pertains to whether
students are asked to use imagery as they read a text or as they
listen to a narration of a text. L. R. Brooks (1967, 1968)


reported that participants' visualization of a p


matrix was disrupted when they had to read a
by contrast, visualization was not disrupted w
listened to the description. Thus, it is possible
of imagery are not fully actualized when stude
would be most evident if they listened. Two o
relevant to this possibility. First, the majorit
research has involved students reading text
imagery benefits have sometimes been found
reading does not entirely undermine imaginai
ond, in experiments in which participants eit
tened to a text, the results have been mixed
imagery has benefited performance more amon
have listened to texts than among students wh
(De Beni & Moè, 2003; Levin & Divine-Hawk
in one case, imagery benefited performance s
modalities in a sample of fourth graders (Mah
1982).
The actual use of imagery as a learning technique should
also be considered when evaluating the imagery literature. In
particular, even if students are instructed to use imagery, they
may not necessarily use it. For instance, R. C. Anderson and
Kulhavy (1972) had high school seniors read a lengthy text
passage about a fictitious primitive tribe; some students were
told to generate images while reading, whereas others were
told to read carefully. Imagery instructions did not influence
performance, but reported use of imagery was significantly
correlated with performance (see also Denis, 1982). The problem
here is that some students who were instructed to use
imagery did not, whereas some uninstructed students spontaneously
used it. Both circumstances would reduce the observed
effect of imagery instructions, and students' spontaneous use
of imagery in control conditions may be partly responsible for
the failure of imagery to benefit performance in some cases.
Unfortunately, researchers have typically not measured imagery
use, so evaluation of these possibilities must await further
research.
6.2b Student characteristics. The efficacy of imagery instructions
has been evaluated across a wide range of student ages
and abilities. Consider data from studies involving fourth
graders, given that this particular grade level has been popular
in imagery research. In general, imagery instructions have
tended to boost criterion performance for fourth graders, but
even here the exceptions are noteworthy. For instance, imagery
instructions boosted the immediate test performance of
fourth graders who studied short (e.g., 12-sentence) stories
that could be pictorially represented (e.g., Levin & Divine-Hawkins,
1974), but in some studies, this benefit was found
only for students who were biased to use imagery or for skilled
readers (Levin, Divine-Hawkins, Kerst, & Guttman, 1974).
For reading longer narratives (e.g., narratives of 400 words or
more), imagery instructions have significantly benefited fourth
graders' free recall of text material (Gambrell & Jawitz, 1993;
Rasco, Tennyson, & Boutwell, 1975; see also Lesgold, McCormick,
& Golinkoff, 1975) and performance on multiple-choice

to benefit from using imagery (Oakhill & Patel, 1991; Pressley,
1976), but younger stud
attempting to generate me
(Guttman, Levin, & Pr
6.2c Materials. Simil
monic, investigations of
often used texts that are
that can be visualized or sh
terms. Across investigat
widely and include lon
e.g., R. C. Anderson & Ku
relatively short stories
1990; Maher & Sulliv
sages (Levin & Divin
With regard to these var
sion is that sometimes im
and sometimes they do not
tions whereby imagery he
for another kind of mate
effect for any given kind of material may not be due to the
material per se, but instead ma
uncontrolled factors, making it
any) characteristics of the materia
will be beneficial.
Fortunately, some investigat
tent of text materials when exa
use. In De Beni and Moè (20
tions that were easy to imagine,
description of a pathway that wa
ize, and another was abstract
imagine. As compared with instr
texts, instructions to use imag
easy-to-imagine texts and the spatia
recall of the abstract texts. M
dent only when students listen
read it (as discussed unde
above). Thus, the benefits of
strained to texts that directly su


Although the bulk of t


that were specifically c
have used the Metropol
dardized test that tap
extensive training in th
both studies failed to f
performance (Lesgold,
when participants were
skills to complete th
6.2d Criterion tasks. Th
within groups of studen
tions between imagery
rion task. Consider firs
college students. When
or short-answer questio
in the text, college stud
image (e.g., Gyeselinck
2009; Hodes, 1992; Rasco
earlier, these effects m
passages rather than lis
contrast, despite the fa
dents develop an integr
instructions did not sig
questions that require
information in a text (
sion questions about a
1992).

This pattern is also apparent from studies with sixth graders, who do show significant benefits of imagery use on measures involving the recall or summarization of text information (e.g., Kulhavy & Swenson, 1975), but show reduced or nonexistent benefits on comprehension tests and on criterion tests that require application of the knowledge (Gagne & Memory, 1978; Miccinati, 1982). In general, imagery instructions tend not to enhance students' understanding or application of the content of a text. One study demonstrated that training improved 8- and 9-year-olds' performance on inference questions, but in this case, training was extensive (three sessions), which may not be practical in some settings.

When imagery instructions do improve criterion performance, a question arises as to whether these effects are long lasting. Unfortunately, the question of whether the use of imagery protects against the forgetting of text content has not been widely investigated; in the majority of studies, criterion tests have been administered immediately or shortly after the target material was studied. In one exception, Kulhavy and Swenson (1975) found that imagery instructions benefited fifth and sixth graders' accuracy in answering questions that tapped the gist of the texts, and this effect was even apparent 1 week after the texts were initially read. The degree to which these long-term benefits are robust and generalize across a variety of criterion tasks is an open question.

6.3 Effects in representative educational contexts. Many of the studies on imagery use and text learning have involved

, when they are reading texts that easily lend themselves to ima
ginal representations. How much
ensure that students consistent
under the appropriate conditio
6.5 Imagery use for learn
Imagery can improve students' l
the promising work by Leu
potential utility of imagery use
duction is also more broadly a
mnemonic. Nevertheless, the be
constrained to imagery-friendly
ory, and further demonstrations o
technique (across different criter
relevant retention intervals) are
the use of imagery for learning t

7 Rereading

Rereading is one of the techniques that students most frequently report using during self-regulated study (Carrier, 2003; Hartwig & Dunlosky, 2012; Karpicke, Butler, & Roediger, 2009; Kornell & Bjork, 2007; Wissman, Rawson, & Pyc, 2012). For example, Carrier (2003) surveyed college students in an upper-division psychology course, and 65% reported using rereading as a technique when preparing for course exams. More recent surveys have reported similar results. Kornell and Bjork (2007) and Hartwig and Dunlosky (2012) asked students if they typically read a textbook, article, or


other source material more than once during
these two studies, 18% of students reported r
chapters, and another 62% reported rereading p
of the material. Even high-performing student
rereading regularly. Karpicke et al. (2009) aske
ates at an elite university (where students' aver
were above 1400) to list all of the techniques t
studying and then to rank them in terms of fr
Eighty-four percent of students included rerea
notes in their list, and rereading was also the t
nique (listed as the most frequently used tech
students). Students' heavy reliance on rereadin
regulated study raises an important question:
effective technique?

7.1 General description of rereading and
work. In an early study by Rothkopf (1968), u
read an expository text (either a 1,500-wor
making leather or a 750-word passage about
tory) zero, one, two, or four times. Reading w
and rereading was massed (i.e., each present
occurred immediately after the previous pres
a 10-minute delay, a cloze test was adminis
10% of the content words were deleted from the text and students were to fill in the missing words. As shown in Figure 6, performance improved as a function of number of readings.

Fig. 6. Mean percentage of correct responses on a final cloze test for learners who read an expository text zero, one, two, or four times in Rothkopf (1968). Means shown are overall means for two conditions, one in which learners read a 1,500-word text and one in which learners read a 750-word text. Values are estimated from original figures in Rothkopf (1968). Standard errors are not available. [x-axis: Number of Readings]

Why does rereading improve learning? Mayer (1983; Bromage & Mayer, 1986) outlined two basic accounts of rereading effects. According to the quantitative hypothesis, rereading simply increases the total amount of information encoded, regardless of the kind or level of information within the text. In contrast, the qualitative hypothesis assumes that rereading differentially affects the processing of higher-level and lower-level information within a text, with particular emphasis placed on the conceptual organization and processing of main ideas during rereading. To evaluate these hypotheses, several studies have examined free recall as a function of the kind or level of text information. The results have been somewhat mixed, but the evidence appears to favor the qualitative hypothesis. Although a few studies found that rereading produced similar improvements in the recall of main ideas and of details (a finding consistent with the quantitative hypothesis), several studies have reported greater improvement in the recall of main ideas than in the recall of details (e.g., Bromage & Mayer, 1986; Kiewra, Mayer, Christensen, Kim, & Risch, 1991; Rawson & Kintsch, 2005).

7.2 How general are the effects of rereading?

7.2a Learning conditions. Following the early work of Rothkopf (1968), subsequent research established that the effects of rereading are fairly robust across other variations in learning conditions. For example, rereading effects obtain regardless of whether learners are forewarned that they will be given the opportunity to study more than once, although Barnett and Seefeldt (1989) found a small but significant increase in the magnitude of the rereading effect among learners who were forewarned, relative to learners who were not forewarned. Furthermore, rereading effects obtain with both self-paced reading and experimenter-paced presentation. Although most studies have involved the silent reading of written material, effects of repeated presentations have also been shown when learners listen to an auditory presentation of text material (e.g., Bromage & Mayer, 1986; Mayer, 1983).2

One aspect of the learning conditions that does significantly moderate the effects of rereading concerns the lag between initial reading and rereading. Although advantages of rereading over reading only once have been shown with massed rereading and with spaced rereading (in which some amount of time passes or intervening material is presented between initial study and restudy), spaced rereading usually outperforms massed rereading. However, the relative advantage of spaced reading over massed rereading may be moderated by the length of the retention interval, an issue that we discuss further in the subsection on criterion tasks below (7.2d). The effect of spaced rereading may also depend on the length of the lag between initial study and restudy. In a recent study by Verkoeijen, Rikers, and Özsoy (2008), learners read a lengthy expository text and then reread it immediately afterward, 4 days later, or 3.5 weeks later. Two days after rereading, all participants completed a final test. Performance was greater for the group who reread after a 4-day lag than for the massed rereaders, whereas performance for the group who reread after a 3.5-week lag was intermediate and did not significantly differ from performance in either of the other two groups. With that said, spaced rereading appears to be effective at least across


moderate lags, with st


lags of several minutes,
One other learning cond
of practice, or dosage. M
single reading appear
majority of studies tha
have shown diminishin
als. However, an impor
involved massed rerea
spaced rereading trials
remains an open question.

Finally, although learners in most experiments have studied only one text, rereading effects have also been shown when learners are asked to study several texts, providing suggestive evidence that rereading effects can withstand interference from other learning materials.

7.2b Student characteristics. The extant literature is severely limited with respect to establishing the generality of rereading effects across different groups of learners. To our knowledge, all but two studies of rereading effects have involved undergraduate students. Concerning the two exceptions, Amlund, Kardash, and Kulhavy (1986) reported rereading effects with graduate students, and O'Shea, Sindelar, and O'Shea (1985) reported effects with third graders.

The extent to which rereading effects depend on knowledge level is also woefully underexplored. In the only study to date that has provided any evidence about the extent to which knowledge may moderate rereading effects (Arnold, 1942), both high-knowledge and low-knowledge readers showed an advantage of massed rereading over outlining or summarizing a passage for the same amount of time. Additional suggestive evidence that relevant background knowledge is not requisite for rereading effects has come from three recent studies that used the same text (Rawson, 2012; Rawson & Kintsch, 2005; Verkoeijen et al., 2008) and found significant rereading effects for learners with virtually no specific prior knowledge about the main topics of the text (the charge of the Light Brigade in the Crimean War and the Hollywood film portraying the event).

Similarly, few studies have examined rereading effects as a function of ability, and the available evidence is somewhat mixed. Arnold (1942) found an advantage of massed rereading over outlining or summarizing a passage for the same amount of time among learners with both higher and lower levels of intelligence and both higher and lower levels of reading ability (but see Callender & McDaniel, 2009, who did not find an effect of massed rereading over single reading for either higher- or lower-ability readers). Raney (1993) reported a similar advantage of massed rereading over a single reading for readers with either higher or lower working-memory spans. Finally, Barnett and Seefeldt (1989) defined high- and low-ability groups by a median split of ACT scores; both groups showed an advantage of massed rereading over a single reading for short-answer factual questions, but only high-ability learners showed an effect for questions that required application of the information.

illustrative but nonexhaustive list includes physics (e.g.,
Ohm's law), law (e.g., legal principles
(e.g., the construction of the Brooklyn
(e.g., how a camera exposure me
insects), geography (e.g., of Africa), and
treatment of mental disorders).
7.2d Criterion tasks. Across rereading studies,
monly used outcome measure has been free
consistently shown effects of both massed an
ing with very few exceptions. Several studies h
rereading effects on cue-based recall measures, s
the-blank tests and short-answer questions tapp
information. In contrast, the effects of rerea
tion are less certain, with weak or nonexistent effects on sentence-verification tasks and multiple-choice questions tapping information explicitly stated in the text (Callender & McDaniel, 2009; Dunlosky & Rawson, 2005; Hinze & Wiley, 2011; Kardash & Scholes, 1995). The evidence concerning the effects of rereading on comprehension is somewhat muddy. Although some studies have shown positive effects of rereading on answering problem-solving essay questions (Mayer, 1983) and short-answer application or inference questions (Karpicke & Blunt, 2011; Rawson & Kintsch, 2005), other studies using application or inference-based questions have reported effects only for higher-ability students (Barnett & Seefeldt, 1989) or no effects at all (Callender & McDaniel, 2009; Dunlosky & Rawson, 2005; Durgunoglu, Mir, & Ariño-Marti, 1993; Griffin, Wiley, & Thiede, 2008).

Concerning the durability of learning, most of the studies that have shown significant rereading effects have administered criterion tests within a few minutes after the final study trial, and most of these studies reported an advantage of massed rereading over a single reading. The effects of massed rereading after longer delays are somewhat mixed. Agarwal, Karpicke, Kang, Roediger, and McDermott (2008; see also Karpicke & Blunt, 2011) reported massed rereading effects after 1 week, but other studies have failed to find significant effects after 1-2 days (Callender & McDaniel, 2009; Cranney, Ahn, McKinnon, Morris, & Watts, 2009; Hinze & Wiley, 2011; Rawson & Kintsch, 2005).

Fewer studies have involved spaced rereading, although a relatively consistent advantage for spaced rereading over a single reading has been shown both on immediate tests and on tests administered after a 2-day delay. Regarding the comparison of massed rereading with spaced rereading, neither


schedule shows a consistent advantage on imme


similar number of studies have shown an advan
over massing, an advantage of massing over sp
differences in performance. In contrast, spa
sistently outperforms massed rereading on del
explore the benefits of spacing more generally
uted Practice section below.

7.3 Effects in representative educational contexts. Given that rereading is the study technique that students most commonly report using, it is perhaps ironic that no experimental research has assessed its impact on learning in educational contexts. Although many of the topics of the expository texts used in rereading research are arguably similar to those that students might encounter in a course, none of the aforementioned studies have involved materials taken from actual course content. Furthermore, none of the studies were administered in the context of a course, nor have any of the outcome measures involved course-related tests. The only available evidence involves correlational findings reported in survey studies, and it is mixed. Carrier (2003) found a nonsignificant negative association between self-reported rereading of textbook chapters and exam performance but a significantly positive association between self-reported review of lecture notes and exam performance. Hartwig and Dunlosky (2012) found a small but significant positive association between self-reported rereading of textbook chapters or notes and self-reported grade point average, even after controlling for self-reported use of other techniques.

7.4 Issues for implementation. One advantage of rereading is that students require no training to use it, other than perhaps being instructed that rereading is generally most effective when completed after a moderate delay rather than immediately after an initial reading. Additionally, relative to some other learning techniques, rereading is relatively economical with respect to time demands (e.g., in those studies permitting self-paced study, the amount of time spent rereading has typically been less than the amount of time spent during initial reading). However, in head-to-head comparisons of learning techniques, rereading has not fared well against some of the more effective techniques discussed here. For example, direct comparisons of rereading to elaborative interrogation, self-explanation, and practice testing (described in the Practice Testing section below) have consistently shown rereading to be an inferior technique for promoting learning.

7.5 Rereading: Overall assessment. Based on the available evidence, we rate rereading as having low utility. Although benefits from rereading have been shown across a relatively wide range of text materials, the generality of rereading effects across the other categories of variables in Table 2 has not been well established. Almost no research on rereading has involved learners younger than college-age students, and an insufficient amount of research has systematically examined the extent to

although rereading is relatively economical with respect to
time demands and training requirements when compared with
some other learning techniques, rereading is also
much less effective. The relative disadvantage of rere
other techniques is the largest strike against reread
the factor that weighed most heavily in our decision to
it a rating of low utility.

8 Practice testing

Testing is likely viewed by many students as an undesir
necessity of education, and we suspect that most stude
would prefer to take as few tests as possible. This view of tes
ing is understandable, given that most students' exper
with testing involves high-stakes summative assessments
are administered to evaluate learning. This view of testing
also unfortunate, because it overshadows the fact that te
also improves learning. Since the seminal study by Abbot
(1909), more than 100 years of research has yielded severa
hundred experiments showing that practice testing enh
learning and retention (for recent reviews, see Rawson &
losky, 2011; Roediger & Butler, 2011; Roediger, Putnam, &
Smith, 2011). Even in 1906, Edward Thorndike recommended
that "the active recall of a fact from within is, as a rule, bette
than its impression from without" (p. 123, Th
The century of research on practice testing sin
ported Thorndike's recommendation by dem
broad generalizability of the benefits of practic
Note that we use the term practice testing here
guish testing that is completed as a low-stakes o
practice or learning activity outside of class fro
assessments that are administered by an instruct
(b) to encompass any form of practice testing th
would be able to engage in on their own. For exam
testing could involve practicing recall of target
the use of actual or virtual flashcards, complet
problems or questions included at the end of text
or completing practice tests included in the electr
mental materials that increasingly accompany te

8.1 General description of practice testing and why i
should work. As an illustrative example of the
ing, Runquist (1983) presented undergraduates wi
word pairs for initial study. After a brief interval
participants completed filler tasks, half of the p
via cued recall and half were not. Participants
final cued-recall test for all pairs either 10 minu
later. Final-test performance was better for p
practice tested than pairs that were not (53% ver


10 minutes, 35% versus


illustrates the method o
conditions that do and d
other studies have comp
more stringent conditio
of the to-be-learned inf
Karpicke (2006b) prese
expository text for init
study trial or by a pract
recall was considerably b
the practice test than
versus 42%). As another
tion of the potency of
picke and Roediger (200
Swahili-English transla
cued recall until items w
first correct recall, ite
study cycles with no fur
cycles with no further
week later was substan
(80%) than after continued study (36%).

Why does practice testing improve learning? Whereas a wealth of studies have established the generality of testing effects, theories about why it improves learning have lagged behind. Nonetheless, theoretical accounts are increasingly emerging to explain two different kinds of testing effects, which are referred to as direct effects and mediated effects of testing (Roediger & Karpicke, 2006a). Direct effects refer to changes in learning that arise from the act of taking a test itself, whereas mediated effects refer to changes in learning that arise from an influence of testing on the amount or kind of encoding that takes place after the test (e.g., during a subsequent restudy opportunity).

Concerning direct effects of practice testing, Carpenter (2009) recently proposed that testing can enhance retention by triggering elaborative retrieval processes. Attempting to retrieve target information involves a search of long-term memory that activates related information, and this activated information may then be encoded along with the retrieved target, forming an elaborated trace that affords multiple pathways to facilitate later access to that information. In support of this account, Carpenter (2011) had learners study weakly related word pairs (e.g., "mother"-"child") followed either by additional study or a practice cued-recall test. On a later final test, recall of the target word was prompted via a previously unpresented but strongly related word (e.g., "father"). Performance was greater following a practice test than following restudy, presumably because the practice test increased the likelihood that the related information was activated and encoded along with the target during learning.

Concerning mediated effects of practice testing, Pyc and Rawson (2010, 2012b) proposed a similar account, according to which practice testing facilitates the encoding of more effective mediators (i.e., elaborative information connecting cues and targets) during subsequent restudy opportunities. Pyc

greater when items had received practice tests (39%) than when they had only been studied (17%). Importantly, the practice test condition also outperformed the study condition on secondary measures primarily tapping organizational processing and idiosyncratic processing.

8.2 How general are the effects of practice testing? Given the volume of research on testing effects, an exhaustive review of the literature is beyond the scope of this article. Accordingly, our synthesis below is primarily based on studies from the past 10 years (which include more than 120 articles), which we believe represent the current state of the field. Most of these studies compared conditions involving practice tests with conditions not involving practice tests or involving only restudy; however, we also considered more recent work pitting different practice-testing conditions against one another to explore when practice testing works best.

8.2a Learning conditions. The majority of research on practice testing has used test formats that involve cued recall of target information from memory, but some studies have also shown testing effects with other recall-based practice-test formats, including free recall, short-answer questions, and fill-in-the-blank questions. A growing number of studies using multiple-choice practice tests have also reported testing effects. Across these formats, most prior research has involved practice tests that tap memory for explicitly presented information. However, several studies have also shown testing effects for practice tests that tap comprehension, including short-answer application and multiple-choice inference-based questions (e.g., Agarwal & Roediger, 2011; Butler, 2010; C. I. Johnson & Mayer, 2009). Testing effects have also been shown in a study in which practice involved predicting (vs. studying) input-output values in an inductive function learning task (Kang, McDaniel, & Pashler, 2011) and a study in which participants practiced (vs. restudied) resuscitation procedures (Kromann,


Jensen, & Ringsted, 2009). Some research has


testing effects even when practice tests are o
wal et al., 2008; Weinstein, McDermott, & R
It is important to note that practice tests ca
ing even when the format of the practice tes
the format of the criterion test. For examp
shown cross-format effects of multiple-choic
on subsequent cued recall (Fazio, Agarwal, Mar
ger, 2010; Marsh, Agarwal, & Roediger, 2009;
Marsh, 2005), practice free recall on subseq
choice and short-answer inference tests (Mc
& Einstein, 2009), and practice cued recall on s
recall and recognition (Carpenter, Pashler,
Vaughn & Rawson, 2011).

Although various practice-test formats work, some work better than others. Glover (1989) presented students with a short expository text for initial study and then manipulated the format of the practice test (free recall, fill in the blank, or recognition) and the format of the final test (free recall, fill in the blank, or recognition). On all three final-test formats, performance was greater following free-recall practice than following fill-in-the-blank practice, which in turn was greater than performance following recognition practice. Similarly, Carpenter and DeLosh (2006) found that free-recall practice outperformed cued-recall and recognition practice regardless of whether the final test was in a free-recall, cued-recall, or recognition format, and Hinze and Wiley (2011) found that performance on a multiple-choice final test was better following cued recall of paragraphs than following fill-in-the-blank practice. Further work is needed to support strong prescriptive conclusions, but the available evidence suggests that practice tests that require more generative responses (e.g., recall or short answer) are more effective than practice tests that require less generative responses (e.g., fill in the blank or recognition).

In addition to practice-test format, two other conditions of learning that strongly influence the benefits of practice testing are dosage and timing. Concerning dosage, the simplest conclusion is that more is better. Some studies supporting this conclusion have manipulated the number of practice tests, and final-test performance has consistently been better following multiple practice tests than following a single practice test (e.g., Karpicke & Roediger, 2007a, 2010; Logan & Balota, 2008; Pavlik & Anderson, 2005). In other studies, experimenters have varied the number of practice tests to manipulate the level of success achieved during practice. For example, Vaughn and Rawson (2011) observed significantly greater final-test performance when students engaged in cued-recall practice until target items were recalled four to five times versus only once. Several other studies have shown that final-test performance improves as the number of correct responses during practice increases (e.g., Karpicke & Roediger, 2007b, 2008; Pyc & Rawson, 2009, 2012a; Rawson & Dunlosky, 2011), albeit with diminishing returns as higher criterion levels are achieved. Whereas these studies have involved manipulations of dosage within a practice session, other studies that have

when lags between trials within a session are longer rather
than shorter (e.g., Pashler, Zarow, & Triplett, 2003;
Anderson, 2005; Pyc & Rawson, 2009, 2012b), when
completed in different practice sessions rather
same session (e.g., Bahrick, 1979; Bahrick & H
nell, 2009; Rohrer, 2009; Rohrer & Taylor, 20
intervals between practice sessions are longer r
shorter (Bahrick et al., 1993; Carpenter, Pashler,
2009, although the optimal lag between sessions
on retention interval—see Cepeda et al., 2009; C
Rohrer, Wixted, & Pashler, 2008). We discuss lag e
ther in the Distributed Practice section below.
8.2b Student characteristics. A large majority of
involved college students as participants, but tes
have also been demonstrated across participant
varying ages. Studies involving nonundergraduat
have differed somewhat in the kind, dosage, or ti
tice testing involved, but some form of testing ef
demonstrated with preschoolers and kinderga
Morris, Nolan, & Singleton, 2007; Kratochwill,
Conzemius, 1977), elementary school students
Paulson, 1972; Bouwmeester & Verkoeijen, 2011;
Keller, & Atkinson, 1968; Gates, 1917; Metcalf
2007; Metcalfe, Kornell, & Finn, 2009; Myer
Modigliani, 1985; Rohrer, Taylor, & Sholar, 20
1939), middle school students (Carpenter et al
Morris, Nolan, et al., 2007; Glover, 1989; McDani
Huelser, McDermott, & Roediger, 2011; Metcal
Son, 2007; Sones & Stroud, 1940), high sch
(Duchastel, 1981; Duchastel & Nungester, 1982; Ma
2009; Nungester & Duchastel, 1982), and more a
dents, such as 3rd- and 4th-year medical-school
mann et al., 2009; Rees, 1986; Schmidmaier et
the other end of the continuum, testing effects ha
shown with middle-aged learners and wit
(Balota, Duchek, Sergent-Marshall, & Roedi
hara & Jacoby, 2008; Logan & Balota, 2008; M
Coane, & Duchek, 2011; Sumowski, Chiaravallo
2010; Tse, Balota, & Roediger, 2010).
In contrast to the relatively broad range of ages
the testing-effect literature, surprisingly minima
examined testing effects as a function of individua


in knowledge or abilit
learners with different
Ratcliffe, Murnane, and
undergraduates and ad
passages from an abno
completed a short-answ
and then took a final te
or 1 day later. Both gro
both time points (with
tively, on the material
the material that had n
vide encouraging eviden
across knowledge levels
conclusions can be draw
edge level moderates testing effects.

Likewise, minimal research has examined testing effects as a function of academically relevant ability levels. In a study by Spitzer (1939), 3,605 sixth graders from 91 different elementary schools read a short text and took an immediate test, to provide a baseline measure of reading comprehension ability. In the groups of interest here, all students read an experimental text, half completed a practice multiple-choice test, and then all completed a multiple-choice test either 1 or 7 days later. Spitzer reported final-test performance for the experimental text separately for the top and bottom thirds of performers on the baseline measure. As shown in Figure 7, taking the practice test benefited both groups of students. With that said, the testing effect appeared to be somewhat larger for higher-ability readers than for lower-ability readers (with approximately 20%, vs. 12%, improvements in accuracy), although Spitzer did not report the relevant inferential statistics.

the magnitude of the benefit depends on these factors remains an open question.
8.2c Materials. Many of the studies that have demonstrate
testing effects have involved relatively simple verbal mate
als, including word lists and paired associates. However, mos
of the sets of materials used have had some educational rele
vance. A sizable majority of studies using paired-associat
materials have included foreign-language translations (inclu
ing Chinese, Iñupiaq, Japanese, Lithuanian, Spanish,
Swahili) or vocabulary words paired with synonyms. Other
studies have extended effects to paired book titles and auth
names, names and faces, objects and names, and pictures
foreign-language translations (e.g., Barcroft, 2007; Carpe
& Vul, 2011; Morris & Fritz, 2002; Rohrer, 2009).
A considerable number of studies have also shown test
effects for factual information, including trivia facts and gen
eral knowledge questions (e.g., Butler, Karpicke, & Roediger,

[Figure 7. Legend: Practice Test, No Practice Test. Groups: Top Third on Baseline Measure and Bottom Third on Baseline Measure, each at Day Delay and Week Delay.]

Fig. 7. Mean accuracy on a final test administered 1 day or 1 week after a learning session that either did or did not include a practice test, for the top and bottom thirds of scorers on a baseline measure of ability, in Spitzer (1939). Error bars represent standard errors.


2008; T. A. Smith
& Kimball, 2010) and
2009), although facts h
benefits d
classroom unitsinformation
science, in (see psycholo
history, and Carroll e
penter et al., 2009; McDaniel
Although et most
al., 2011; McD
research
man, & Anderson, 2012). tice tests
Earlier and criterion
research showed met
also multiplication
tests helped children learn reported encouraging
facts an
lists (Atkinson & Paulson,
which1972; Fishman
practice et al.,
testing can
effects
Modigliani, 1985), and recent have been
studies have shown o
reporte
learning of definitions ences or the application
of vocabulary words (Met o
2007) and definitions of(Agarwal
key term &concepts
Roediger,from20
1988; C. I.
material (Rawson & Dunlosky, Johnson & May
2011).
An increasing numberMcDaniel
of studieset have
al., 2009),
shown incl
be
learning from text materials of various
ferent questions length
or differe
words to 2,000 words oring
more),
practice.
of various
For example,
text g
that practicing
encyclopedia entries, scientific journal free recall
articles, t
mance
sages), and on a wide range of on a subsequent
topics c
(e.g., Civil W
based
ics, bat echolocation, sea short-answer
otters, the big bang questio
the
Arctic exploration, toucans).
test. In Practice tests hav
fact, concept-mapp
ing free-recall
learning from video lectures and from practice
narrateddu a
mapping during
topics such as adult development, study.
lightning, Sim
neur
and art history (Butler dents with expository
& Roediger, 2007; Cranneytex
Vojdanoska, Cranney, &lowed either
Newell, 2010).by repeated
Although much of the short-answer
work on testing tests (withha
effects f
cepts from
bal materials, practice testing the been
has also texts. One
show
inference-based
learning of materials that include visual short-answ
or spatial
and concepts
including learning of features was better
and locations on mf
ter & Pashler, 2007; lowing
Rohrer etrestudy (see
al., 2010), Fig.
identi
(Jacoby, Wahlheim, & Coane,experiment
2010),are particularl
naming objec
et al., 2009; Fritz et al.,test involved
2007), far transfer
associating names
(Fielder & Shaughnessy, cepts
2008;from one
Morris & domain to
Fritz, 200
spatial locations of objects (Sommer,
students Schoell,
had to apply info
2008), learning symbols make inferences
(Coppens, Verkoeijen about t
2011), and identifying depicted parts of a flow
aircraft).
1989). Finally, recent work has extended testing effects to nondeclarative learning, including the learning of resuscitation skills (Kromann et al., 2009) and inductive learning of input-output functions (Kang, McDaniel, et al., 2011).
8.2d Criterion tasks. Although cued recall is the most commonly used criterion measure, testing effects have also been shown with other forms of memory tests, including recognition and fill-in-the-blank tests, as well as multiple-choice questions that tap memory for information explicitly stated in text material.
Regarding transfer, the modal method in testing-effect research has involved using the same questions tapping the same target information (e.g., the same cued-recall tests or multiple-choice questions) on practice tests and criterion tests. However, as described in the subsection on learning conditions (8.2a) above, many studies have also shown testing effects when learning of the same target information is evaluated using different test formats for practice and criterion tests. Furthermore, an increasing number of studies have shown that practice testing a subset of information influences memory for related but untested information
(J. C. K. Chan, 2
C. K. Chan, McDermott, & Roediger, 2006; Cra

[Figure 8. Legend: Practice Tests, Restudy. Panels: Experiment 1b (Facts, Concepts), Experiment 2 (Facts, Concepts), Exp. 3 (Concepts).]

Fig. 8. Accuracy on final tests that consisted of inference-based transfer questions tapping key facts or concepts, administered 1 week after a learning session that involved either practice tests or restudy, in Butler (2010). Error bars represent standard errors.

Finally, recent studies have also shown testing effects involving other forms of transfer. Jacoby et al. (2010)


presented learners w
names for initial stud
tional study of the pi
which learners were
retrieve the appropriat
correct answer. The
same families of birds
those families. Learn
new birds following p
only. Similarly, Kang
inductive function lear
either studied pairs of
for a given input value
The prediction group o
criterion test for both
pairs.
In addition to establishing testing effects across an array of outcome measures, studies have also demonstrated testing effects across many retention intervals. Indeed, in contrast to literatures on other learning techniques, contemporary research on testing effects has actually used short retention intervals less often than longer retention intervals. Although a fair number of studies have shown testing effects after short delays (0-20 minutes), the sizable majority of recent research has involved delays of at least 1 day, and the modal retention interval used is 1 week. The preference for using longer retention intervals may be due in part to outcomes from several studies reporting that testing effects are larger when final tests are administered after longer delays (J. C. K. Chan, 2009; Coppens et al., 2011; C. I. Johnson & Mayer, 2009; Kornell, Bjork, & Garcia, 2011; Roediger & Karpicke, 2006b; Runquist, 1983; Schmidmaier et al., 2011; Toppino & Cohen, 2009; Wenger, Thompson, & Bartling, 1980; Wheeler, Ewers, & Buonanno, 2003). It is impressive that testing effects have been observed after even longer intervals, including intervals of 2 to 4 weeks (e.g., Bahrick & Hall, 2005; Butler & Roediger, 2007; Carpenter, Pashler, Wixted, & Vul, 2008; Kromann et al., 2009; Rohrer, 2009), 2 to 4 months (e.g., McDaniel, Anderson, Derbish, & Morrisette, 2007; Morris & Fritz, 2002; Rawson & Dunlosky, 2011), 5 to 8 months (McDaniel et al., 2011; Rees, 1986), 9 to 11 months (Carpenter et al., 2009), and even 1 to 5 years (Bahrick et al., 1993). These findings are great news for students and educators, given that a key educational goal is durable knowledge and not just temporary improvements in learning.

8.3 Effects in representative educational contexts. As described above, much of the research on testing effects has involved educationally relevant materials, tasks, and retention intervals. Additionally, several studies have reported testing effects using authentic classroom materials (i.e., material taken from classes in which student participants were enrolled; Carpenter et al., 2009; Cranney et al., 2009; Kromann et al., 2009; McDaniel et al., 2007; Rawson & Dunlosky, 2011; Rees, 1986; Vojdanoska et al., 2010). Whereas the criterion measures in these studies involved experimenter-devised tests or no-stakes pop quizzes, research has also shown effects of practice testing on actual summative course assessments (Balch, 1998; Daniel & Broida, 2004; Lyle & Crawford, 2011; McDaniel et al., 2011; McDaniel et al., 2012).
For example, a study by McDaniel et al. (2012) involved undergraduates enrolled in an online psychology course on the brain and behavior. Each week, students could earn course points by completing an online practice activity up to four times. In the online activity, some information was presented for practice testing with feedback, some information was presented for restudy, and some information was not presented. Subsequent unit exams included questions that had been presented during the practice tests and also new, related questions focusing on different aspects of the practiced concepts. As shown in Figure 9, grades on unit exams were higher for information that had been practice tested than for restudied information or unpracticed information, for both repeated questions and for new related questions.

[Figure 9. Legend: Practice Tests, Restudy, No Practice. Groups: Repeated Questions, New Questions.]

Fig. 9. Grades on course exams covering items that were presented for practice testing, presented for restudy, or not presented during online learning activities that students completed for course points. The course exam included some questions that had been presented during practice tests as well as new questions tapping the same information. For simplicity, outcomes reported here are collapsed across two experiments reported by McDaniel, Wildman, and Anderson (2012).

8.4 Issues for implementation. Practice testing appears to be relatively reasonable with respect to time demands. Most research has shown effects of practice testing when the amount of time allotted for practice testing is modest and is equated with the time allotted for restudying. Another merit of practice testing is that it can be implemented with minimal training. Students can engage in recall-based self-testing in a relatively straightforward fashion. For example, students can self-test via cued recall by creating flashcards (free and low-cost flashcard software is also readily available) or by using the Cornell


note-taking system (which involves leaving a


when taking notes in class and entering key te
in it shortly after taking notes to use for sel
reviewing notes at a later time; for more deta
Ross, 2010). More structured forms of practic
multiple-choice, short-answer, and fill-in-the
often readily available to students via pract
questions included at the end of textbook c
electronic supplemental materials that accomp
books. With that said, students would likely b
basic instruction on how to most effectively
given that the benefits of testing depend on
dosage, and timing. As described above, practi
ticularly advantageous when it involves retr
ued until items are answered correctly mor
and across practice sessions, and with longe
shorter intervals between trials or sessions.
Concerning the effectiveness of practice test
other learning techniques, a few studies have
of practice testing over concept mapping, not
imagery use (Fritz et al., 2007; Karpicke &
McDaniel et al., 2009; Neuschatz, Preston,
Neuschatz, 2005), but the most frequent comp
involved pitting practice testing against unguid
modal outcome is that practice testing outper
although this effect depends somewhat on the
practice tests are accompanied by feedback inv
tation of the correct answer. Although many
shown that testing alone outperforms restudy,
have failed to find this advantage (in most of
racy on the practice test has been relatively l
the advantage of practice testing with feedbac
extremely robust. Practice testing with feedb
tently outperforms practice testing al
Another reason to recommend the implementa
back with practice testing is that it protects ag
tion errors when students respond incorrectly
test. For example, Butler and Roediger (2008
multiple-choice practice test increased intrusions of false alternatives on a final cued-recall test when no feedback was provided, whereas no such increase was observed when feedback was given. Fortunately, the corrective effect of feedback does not require that it be presented immediately after the practice test. Metcalfe et al. (2009) found that final-test performance for initially incorrect responses was actually better when feedback had been delayed than when it had been immediate. Also encouraging is evidence suggesting that feedback is particularly effective for correcting high-confidence errors (e.g., Butterfield & Metcalfe, 2001). Finally, we note that the effects of practice-test errors on subsequent performance tend to be relatively small, often do not obtain, and are heavily outweighed by the positive benefits of testing (e.g., Fazio et al., 2010; Kang, Pashler, et al., 2011; Roediger & Marsh, 2005). Thus, potential concerns about errors do not constitute a

9 Distributed practice

To-be-learned material is often encountered on more than one occasion, such as when students review their notes and then later use flashcards to restudy the materials, or when a topic is covered in class and then later studied in a textbook. Even so, students mass much of their study prior to tests and believe that this popular cramming strategy is effective. Although cramming is better than not studying at all in the short term, given the same amount of time for study, would students be better off spreading out their study of content? The answer to this question is a resounding "yes." The term distributed-practice effect refers to the finding that distributing learning over time (either within a single study session or across sessions) typically benefits long-term retention more than does massing learning opportunities back-to-back or in relatively close succession.


Given the volume of re
exhaustive review of the
article. Fortunately, this
sive review articles (e.
Pashler, Vul, Wixted, &
Spirgel, 2010; Dempste
ich, 1999; Janiszewski
vided foundations for
recent reviews (Cepeda e
use the term distribute
effects (i.e., the advanta
lag effects (i.e., the adva
spacing with shorter lag
our summary.

9.1 General description of distributed practice and why it should work. To illustrate the issues involved, we begin with a description of a classic experiment on distributed practice, in which students learned translations of Spanish words to criterion in an original session (Bahrick, 1979). Students then participated in six additional sessions in which they had the chance to retrieve and relearn the translations (feedback was provided). Figure 10 presents results from this study. In the zero-spacing condition (represented by the circles in Fig. 10), the learning sessions were back-to-back, and learning was rapid across the six massed sessions. In the 1-day condition (represented by the squares in Fig. 10), learning sessions were spaced 1 day apart, resulting in slightly more forgetting across sessions (i.e., lower performance on the initial test in each session) than in the zero-spacing condition, but students in the 1-day condition still obtained almost perfect accuracy by the sixth session. In contrast, when learning sessions were separated by 30 days, forgetting was much greater across sessions, and initial test performance did not reach the level observed in the other two conditions, even after six sessions (see triangles in Fig. 10). The key point for our present purposes is that the pattern reversed on the final test 30 days later, such that the best retention of the translations was observed in the condition in which relearning sessions had been separated by 30 days. That is, the condition with the most intersession forgetting yielded the greatest long-term retention. Spaced practice (1 day or 30 days) was superior to massed practice (0 days), and the benefit was greater following a longer lag (30 days) than a shorter lag (1 day).

[Figure 10. Legend: 0 Days Between Sessions, 1 Day Between Sessions, 30 Days Between Sessions. Horizontal axis: Session.]

Fig. 10. Proportion of items answered correctly on an initial test administered in each of six practice sessions (prior to actual practice) and on the final test 30 days after the final practice session as a function of lag between sessions (0 days, 1 day, or 30 days) in Bahrick (1979).

Many theories of distributed-practice effects have been proposed and tested. Consider some of the accounts currently under debate (for in-depth reviews, see Benjamin & Tullis, 2010; Cepeda et al., 2006). One theory invokes the idea of deficient processing, arguing that the processing of material during a second learning opportunity suffers when it is close in time to the original learning episode. Basically, students do not have to work very hard to reread notes or retrieve something from memory when they have just completed this same activity, and furthermore, they may be misled by the ease of this second task and think they know the material better than they really do (e.g., Bahrick & Hall, 2005). Another theory involves reminding; namely, the second presentation of to-be-learned material serves to remind the learner of the first learning opportunity, leading it to be retrieved, a process well known to enhance memory (see the Practice Testing section above). Some researchers also draw on consolidation in their explanations, positing that the second learning episode benefits from any consolidation of the first trace that has already happened. Given the relatively large magnitude of distributed-practice effects, it is plausible that multiple mechanisms may contribute to them; hence, particular theories often invoke different combinations of mechanisms to explain the effects.

9.2 How general are the effects of distributed practice? The distributed-practice effect is robust. Cepeda et al. (2006) reviewed 254 studies involving more than 14,000 participants altogether; overall, students recalled more after spaced study (47%) than after massed study (37%). In both Donovan and Radosevich's (1999) and Janiszewski et al.'s (2003) meta-analyses, distributed practice was associated with moderate effect sizes for recall of verbal stimuli. As we describe below, the distributed-practice effect generalizes across many of the categories of variables listed in Table 2.
9.2a Learning conditions. Distributed practice refers to a particular schedule of learning episodes, as opposed to a particular kind of learning episode. That is, the distributed-practice effect refers to better learning when learning episodes are spread out in time than when they occur in close succession, but those learning episodes could involve restudying material, retrieving information from memory, or practicing skills. Because our emphasis is on educational applications, we will not


draw heavily on the skill literature, given that


tossing, gymnastics, and music memorization
to our purposes. Because much theory on the d
tice effect is derived from research on the spa
sodes, we focus on that research, but we al
studies on distributed retrieval practice. In ge
practice testing is better than distributed stud
et al., 2009), as would be expected from the la
the benefits of practice testing.
One of the most important questions about distributed practice involves how to space the learning episodes; that is, how should the multiple encoding opportunities be arranged? Cepeda et al. (2006) noted that most studies have used relatively short intervals (less than 1 day), whereas we would expect the typical interval between educational learning opportunities (e.g., lecture and studying) to be longer. Recall that the classic investigation by Bahrick (1979) showed a larger distributed-practice effect with 30-day lags between sessions than with 1-day lags (Fig. 10); Cepeda et al. (2006) noted that "every study examined here with a retention interval longer than 1 month demonstrated a benefit from distribution of learning across weeks or months" (p. 370; "retention interval" here refers to the time between the last study opportunity and the final test).
However, the answer is not as simple as "longer lags are better"; the answer depends on how long the learner wants to retain information. Impressive data come from Cepeda, Vul, Rohrer, Wixted, and Pashler (2008), who examined people's learning of trivia facts in an internet study that had 26 different conditions, which combined different between-session intervals (from no lag to a lag of 105 days) with different retention intervals (up to 350 days). In brief, criterion performance was best when the lag between sessions was approximately 10-20% of the desired retention interval. For example, to remember something for 1 week, learning episodes should be spaced 12 to 24 hours apart; to remember something for 5 years, the learning episodes should be spaced 6 to 12 months apart. Of course, when students are preparing for examinations, the degree to which they can space their study sessions may be limited, but the longest intervals (e.g., intervals of 1 month or more) may be ideal for studying core content that needs to be retained for cumulative examinations or achievement tests that assess the knowledge students have gained across several years of education.

education, because when students are studying, they presumably are intentionally trying to learn.
9.2b Student characteristics. The majority of distributed-practice experiments have tested undergraduates, but effects have also been demonstrated in other populations. In at least some situations, even clinical populations can benefit from distributed practice, including individuals with multiple sclerosis (Goverover, Hillary, Chiaravalloti, Arango-Lasprilla, & DeLuca, 2009), traumatic brain injuries (Goverover, Arango-Lasprilla, Hillary, Chiaravalloti, & DeLuca, 2009), and amnesia (Cermak, Verfaellie, Lanzoni, Mather, & Chase, 1996). In general, children of all ages benefit from distributed study. For example, when learning pictures, children as young as preschoolers recognize and recall more items studied after longer lags than after shorter lags (Toppino, 1991; Toppino, Kasserman, & Mracek, 1991). Similarly, 3-year-olds are better able to classify new exemplars of a category if the category was originally learned through spaced rather than massed study (Vlach, Sandhofer, & Kornell, 2008). Even 2-year-olds show benefits of distributed practice, such that it increases their ability to produce studied words (Childers & Tomasello, 2002). These benefits of spacing for language learning also occur for children with specific language impairment (Riches, Tomasello, & Conti-Ramsden, 2005).
At the other end of the life span, older adults learning paired associates benefit from distributed practice as much as young adults do (e.g., Balota, Duchek, & Paullin, 1989). Similar conclusions are reached when spacing involves practice tests rather than study opportunities (e.g., Balota et al., 2006; Logan & Balota, 2008) and when older adults are learning to classify exemplars of a category (as opposed to paired associates; Kornell, Castel, Eich, & Bjork, 2010). In summary, learners of different ages benefit from distributed practice, but an issue is the degree to which the distributed-practice effect may be moderated by other individual characteristics, such as knowledge and motivation.
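
The lag guideline attributed above to Cepeda et al. (2008), a between-session lag of approximately 10-20% of the desired retention interval, can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the function name `suggested_lag`, the `LagRange` type, and the choice of days as the unit are hypothetical and introduced here for illustration, and the 10% and 20% bounds are treated as a rough rule of thumb rather than exact optima.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LagRange:
    """Suggested gap between study sessions, in days (illustrative type)."""
    shortest_days: float
    longest_days: float


def suggested_lag(retention_days: float,
                  low_fraction: float = 0.10,
                  high_fraction: float = 0.20) -> LagRange:
    """Return a lag range equal to 10-20% of the desired retention interval.

    The default fractions encode the approximate guideline quoted in the
    text; they are an assumption of this sketch, not fitted values.
    """
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    if not 0 < low_fraction <= high_fraction:
        raise ValueError("fractions must satisfy 0 < low <= high")
    return LagRange(retention_days * low_fraction,
                    retention_days * high_fraction)


if __name__ == "__main__":
    # Retention intervals chosen to mirror the examples in the text.
    for label, days in [("1 week", 7), ("5 years", 5 * 365)]:
        lag = suggested_lag(days)
        print(f"To retain for {label}: space sessions about "
              f"{lag.shortest_days:.1f} to {lag.longest_days:.1f} days apart")
```

Applied literally, the rule gives about 182.5 to 365 days (roughly 6 to 12 months) for a 5-year retention interval, in line with the example in the text. For a 1-week interval it gives about 0.7 to 1.4 days (roughly 17 to 34 hours), somewhat longer than the 12 to 24 hours quoted, which is a reminder that the percentage is approximate.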
Finally, the distributed-practice effect may depend on the type of processing evoked across learning episodes. In the meta-analysis by Janiszewski et al. (2003), intentional processing was associated with a larger effect size (M = .35) than was incidental processing (M = .24). Several things should be noted. First, the distributed-practice effect is sometimes observed with incidental processing (e.g., R. L. Greene, 1989; Toppino, Fearnow-Kenney, Kiepert, & Teremula, 2009); it is not eliminated across the board, but the average effect size is slightly (albeit significantly) smaller. Second, the type of processing learners engage in may covary with the intentionality

9.2c Materials. Distribut
observed with many types of to-
definitions (e.g., Dempster,
penter & DeLosh, 2005), trans
words (e.g., Bahrick & Hall
et al., 2008), texts (e.g., Raw
(e.g., Glover & Corkill, 1987), a
Rogers, 1973). Distributed study
formance in a range of domain
Glaser, 1964) and advertisin
Wickens, 2005). If we includ


skills, then the list of d


practice have been succe
to include mathematic
Rohrer, 2009), history
mons, 2011), and surger
others.
Not all tasks yield comparably large distributed-practice effects. For instance, distributed-practice effects are large for free recall but are smaller (or even nonexistent) for tasks that are very complex, such as airplane control (Donovan & Radosevich, 1999). It is not clear how to map these kinds of complex tasks, which tend to have a large motor component, onto the types of complex tasks seen in education. The U.S. Institute of Education Sciences guide on organizing study to improve learning explicitly notes that "one limitation of the literature is that few studies have examined acquisition of complex bodies of structured information" (Pashler et al., 2007, p. 6). The data that exist (which are reviewed below) have come from classroom studies and are promising.
9.2d Criterion tasks. We alluded earlier to the fact that distributed-practice effects are robust over long retention intervals, with Cepeda and colleagues (2008) arguing that the ideal lag between practice sessions would be approximately 10-20% of the desired retention interval. They examined learning up to 350 days after study; other studies have shown benefits of distributed testing after intervals lasting for months (e.g., Cepeda et al., 2009) and even years (e.g., Bahrick et al., 1993; Bahrick & Phelps, 1987). In fact, the distributed-practice effect is often stronger on delayed tests than immediate ones, with massed practice (cramming) actually benefitting performance on immediate tests (e.g., Rawson & Kintsch, 2005).
Much research has established the durability of distributed-practice effects over time, but much less attention has been devoted to other kinds of criterion tasks used in educational contexts. The Cepeda et al. (2009) meta-analysis, for example, focused on studies in which the dependent measure was verbal free recall. The distributed-practice effect has been generalized to dependent measures beyond free recall, including multiple-choice questions, cued-recall and short-answer questions (e.g., Reynolds & Glaser, 1964), frequency judgments (e.g., Hintzman & Rogers, 1973), and, sometimes, implicit memory (e.g., R. L. Greene, 1990; Jacoby & Dallas, 1981). More generally, although studies using these basic measures of memory can inform the field by advancing theory, the effects of distributed practice on these measures will not necessarily generalize to all other educationally relevant measures. Given that students are often expected to go beyond the basic retention of materials, this gap is perhaps the largest and most important to fill for the literature on distributed practice. With that said, some relevant data from classroom studies are available; we turn to these in the next section.

9.3 Effects in representative educational contexts. Most of the classroom studies that have demonstrated distributed-practice effects have involved spacing of more than just study opportunities. It is not surprising that real classroom exercises

teacher read and defined words; the students wrote down the definitions; the teacher r
in sentences, and stu
students wrote down
tences using the w
(including reading f
teacher instruction) a
tions and sentences)
rion test was administ
session, and students
of GRE vocabulary wo
spaced a week apart
spaced a minute ap
instruction and student practice was also involved in a demonstration of the benefits of distributed practice for learning phonics in first graders (Seabrook, Brown, & Solity, 2005).
Another study examined learning of statistics across two sections of the same course, one of which was taught over a 6-month period and the other of which covered the same material in an 8-week period (Budé, Imbos, van de Wiel, & Berger, 2011). The authors took advantage of a curriculum change at their university that allowed them to compare learning in a class taught before the university reduced the length of the course with learning in a class taught after the change. The curriculum change meant that lectures, problem-based group meetings, and lab sessions (as well as student-driven study, assignments, etc.) were implemented within a much shorter time period; in other words, a variety of study and retrieval activities were more spaced out in time in one class than in the other. Students whose course lasted 6 months outperformed students in the 8-week course both on an open-ended test tapping conceptual understanding (see Fig. 11) and on the final exam (Fig. 12). Critically, the two groups performed similarly on a control exam from another course (Fig. 12), suggesting that the effects of distributed practice were not due to ability differences across classes.
Finally, a number of classroom studies have examined the benefits of distributed practice tests. Distributed practice testing helps students in actual classrooms learn history facts (Carpenter et al., 2009), foreign language vocabulary (K. C. Bloom & Shuell, 1981), and spelling (Fishman et al., 1968).

9.4 Issues for implementation. Several obstacles may arise when implementing distributed practice in the classroom. Dempster and Farris (1990) made the interesting point that many textbooks do not encourage distributed learning, in that they lump related material together and do not review previously covered material in subsequent units. At least one formal content analysis of actual textbooks (specifically, elementary-school mathematics textbooks; Stigler, Fuson, Ham, & Kim, 1986) supported this claim, showing that American textbooks


grouped to-be-worked problems together (presumably at the end of chapters) as opposed to distributing them throughout the pages. These textbooks also contained less variability in sets of problems than did comparable textbooks from the former Soviet Union. Thus, one issue students face is that their study materials may not be set up in a way that encourages distributed practice.

Fig. 11. Points earned on an open-ended test tapping conceptual understanding of content from two sections of a course, one taught over an 8-week period and the other taught over a 6-month period, in Budé, Imbos, van de Wiel, and Berger (2011). Error bars represent standard errors.

Fig. 12. Final-exam scores in a critical course and a control course as a function of the length of the course (8 weeks or 6 months); data drawn from Budé, Imbos, van de Wiel, and Berger (2011). Standard errors are not available.

A second issue involves how students naturally study. Michael (1991) used the term procrastination scallop to describe the typical study pattern, namely, that time spent studying increases as an exam approaches. Mawhinney, Bostow, Laws, Blumenfield, and Hopkins (1971) documented this pattern using volunteers who agreed to study in an observation room that allowed their time spent studying to be recorded. With daily testing, students studied for a consistent amount of time across sessions. But when testing occurred only once every 3 weeks, time spent studying increased across the interval, peaking right before the exam (Mawhinney et al., 1971). In other words, less frequent testing led to massed study immediately before the test, whereas daily testing effectively led to study that was distributed over time. The implication is that students will not necessarily engage in distributed study unless the situation forces them to do so; it is unclear whether this is because of practical constraints or because students do not understand the memorial benefits of distributed practice.
With regard to the issue of whether students understand the benefits of distributed practice, the data are not entirely definitive. Several laboratory studies have investigated students' choices about whether to mass or space repeated studying of paired associates (e.g., GRE vocabulary words paired with their definitions). In such studies, students typically choose between restudying an item almost immediately after learning (massing) or restudying the item later in the same session (spacing). Although students do choose to mass their study under some conditions (e.g., Benjamin & Bird, 2006; Son, 2004), they typically choose to space their study of items (Pyc & Dunlosky, 2010; Toppino, Cohen, Davis, & Moors, 2009). This bias toward spacing does not necessarily mean that students understand the benefits of distributed practice per se (e.g., they may put off restudying a pair because they do not want to see it again immediately), and one study has shown that students rate their overall level of learning as higher after massed study than after spaced study, even when the students had experienced the benefits of spacing (e.g., Kornell & Bjork, 2008). Other recent studies have provided evidence that students are unaware of the benefits of practicing with longer, as opposed to shorter, lags (Pyc & Rawson, 2012b; Wissman et al., 2012).
In sum, because of practical constraints and students' potential lack of awareness of the benefits of this technique, students may need some training and some convincing that distributed practice is a good way to learn and retain information. Simply experiencing the distributed-practice effect may not always be sufficient, but a demonstration paired with instruction about the effect may be more convincing to students (e.g., Balch, 2006).

9.5 Distributed practice: Overall assessment. On the basis of the available evidence, we rate distributed practice as having high utility: It works across students of different ages, with a wide variety of materials, on the majority of standard laboratory measures, and over long delays. It is easy to implement


(although it may requi[...]cessfully in a number [...] research has examined [...]plex materials, the ex[...] that distributed practi[...] well. Future research [...] possible individual dif[...] that require higher-lev[...] isolate the contribution [...]tributed retrieval in educational contexts.

10 Interleaved practice
In virtually every kind of class at every grade level, students are expected to learn content from many different subtopics or problems of many different kinds. For example, students in a neuroanatomy course would learn about several different divisions of the nervous system, and students in a geometry course would learn various formulas for computing properties of objects such as surface area and volume. Given that the goal is to learn all of the material, how should a student schedule his or her studying of the different materials? An intuitive approach, and one we suspect is adopted by most students, involves blocking study or practice, such that all content from one subtopic is studied or all problems of one type are practiced before the student moves on to the next set of material. In contrast, recent research has begun to explore interleaved practice, in which students alternate their practice of different kinds of items or problems. Our focus here is on whether interleaved practice benefits students' learning of educationally relevant material.
Before we present evidence of the efficacy of this technique, we should point out that, in contrast to the other techniques we have reviewed in this monograph, many fewer studies have investigated the benefits of interleaved practice on measures relevant to student achievement. Nonetheless, we elected to include this technique in our review because (a) plenty of evidence indicates that interleaving can improve motor learning under some conditions (for reviews, see Brady, 1998; R. A. Schmidt & Bjork, 1992; Wulf & Shea, 2002) and (b) the growing literature on interleaving and performance on cognitive tasks is demonstrating the same kind of promise.

10.1 General description of interleaved practice and why it should work. Interleaved practice, as opposed to blocked practice, is easily understood by considering a method used by Rohrer and Taylor (2007), which involved teaching college students to compute the volumes of different geometric solids. Students had two practice sessions, which were separated by 1 week. During each practice session, students were given tutorials on how to find the volume for four different kinds of geometric solids and completed 16 practice problems (4 for each solid). After the completion of each practice problem, the correct solution was shown for 10 seconds. Students in a blocked-practice condition first read a tutorial on finding the volume of a given solid, which was immediately followed by the four practice problems for that kind of solid. Practice solving volumes for a given solid was then followed by the tutorial and practice problems for the next kind of solid, and so on. Students in an interleaved-practice group first read all four tutorials and then completed all the practice problems, with the constraint that every set of four consecutive problems included one problem for each of the four kinds of solids. One week after the second practice session, all students took a criterion test in which they solved two novel problems for each of the four kinds of solids. Students' percentages of correct responses during the practice sessions and during the criterion test are presented in Figure 13, which illustrates a typical interleaving effect: During practice, performance was better with blocked practice than interleaved practice, but this advantage dramatically reversed on the criterion test, such that interleaved practice boosted accuracy by 43%.

Fig. 13. Percentage of correct responses on sets of problems completed in practice sessions and on a delayed criterion test in Rohrer and Taylor (2007). Error bars represent standard errors.

One explanation for this impressive effect is that interleaving gave students practice at identifying which solution method (i.e., which of several different formulas) should be used for a given solid (see also, Mayfield & Chase, 2002). Put differently, interleaved practice helps students to discriminate between the different kinds of problems so that they will be more likely to use the correct solution method for each one. Compelling evidence for this possibility was provided by Taylor and Rohrer (2010). Fourth graders learned to solve mathematical problems involving prisms. For a prism with a given number of base sides (b), students learned to solve for the number of faces (b + 2), edges (b x 3), corners (b x 2), or angles (b x 6). Students first practiced partial problems: A term for a single component of a prism was presented (e.g., corners), the student had to produce the correct formula (i.e., for corners, the correct response would be "b x 2"), and then


feedback (the correct answer) was provided [...] partial problems, students practiced full pr[...] they were shown a prism with a number of [...] sides) and a term for a single component [...] students had to produce the correct formula (b[...] problem by substituting the appropriate va[...] Most important, students in a blocked-pr[...]pleted all partial- and full-practice problems [...]ture (e.g., angles) before moving onto the ne[...] an interleaved-practice group, each block [...] problems included one problem for each of [...]tures. One day after practice, a criterion tes[...] in which students were asked to solve full [...] not appeared during practice.
Accuracy during practice was greater for s[...] received blocked practice than for students [...] interleaved practice, both for partial probl[...] respectively) and for full problems (98% [...]trast, accuracy 1 day later was substantially h[...] who had received interleaved practice (77%) [...] who had received blocked practice (38%). As [...] Taylor (2006), a plausible explanation for th[...] interleaved practice helped students to disc[...] various kinds of problems and to learn the a[...] to apply for each one. This explanation w[...] detailed analysis of errors the fourth grader[...]ing the full problems during the criterion [...] errors involved cases in which students used a formula that was not originally trained (e.g., b x 8), whereas discrimination errors involved cases in which students used one of the four formulas that had been practiced but was not appropriate for a given problem. As shown in Figure 14, the two groups did not differ in fabrication errors, but discrimination errors were more common after blocked practice than after interleaved practice. Students who received interleaved practice [...]ently were better at discriminating among the kinds of problems and consistently applied the correct formula [...]
How does interleaving produce these benefits? One explanation is that interleaved practice promotes organizational processing and item-specific processing because it [...] students to more readily compare different kinds of problems. For instance, in Rohrer and Taylor (2007), it is possible [...] students were solving for the volume of one kind of solid (e.g., a wedge) during interleaved practice, the solution m[...] for the immediately prior problem involving a different [...] of solid (e.g., a spheroid) was still in working memory [...] hence encouraged a comparison of the two problem[...] different formulas. Another possible explanation [...] the distributed retrieval from long-term memory [...] afforded by interleaved practice. In particular, for [...] practice, the information relevant to complet[...] (whether it be a solution to a problem or memory [...] related items) should reside in working memory; [...]ticipants should not have to retrieve the solution. S[...] student completes a block of problems solving for vo[...] wedges, the solution to each new problem will [...] available from working memory. By contrast, for [...] practice, when the next type of problem is presented, the solution method for it must be retrieved from long-term memory. So, if a student has just solved for the volume of a [...] then must solve for the volume of a spheroid, he or she [...] retrieve the formula for spheroids from memory. [...] practice testing would boost memory for the retrieved information (for details, see the Practice Testing section [...]). This retrieval-practice hypothesis and the discriminative-contrast hypothesis are not mutually exclusive, and other mechanisms may also contribute to the benefits of interleaved practice.
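
The blocked and interleaved schedules described for Rohrer and Taylor (2007) in Section 10.1 can be made concrete with a short sketch. This is only an illustration under stated assumptions, not the authors' materials or procedure: the function names are hypothetical, two of the four solid labels are placeholders (the text names only wedges and spheroids as examples), and the constraint "every set of four consecutive problems included one problem for each of the four kinds of solids" is read here as applying to successive groups of four (problems 1-4, 5-8, and so on).

```python
import random

# Hypothetical labels: only "wedge" and "spheroid" are named in the text;
# the other two kinds of solids are placeholders for illustration.
KINDS = ["wedge", "spheroid", "solid_c", "solid_d"]


def blocked_schedule(kinds, problems_per_kind):
    """All problems for one kind are practiced before moving to the next kind."""
    return [kind for kind in kinds for _ in range(problems_per_kind)]


def interleaved_schedule(kinds, problems_per_kind, seed=0):
    """Build the schedule from successive groups, each holding one problem of every kind.

    The order within each group is shuffled (an assumption; the text does not
    say how the order within a set of four was chosen).
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(problems_per_kind):
        group = list(kinds)
        rng.shuffle(group)
        schedule.extend(group)
    return schedule


def satisfies_constraint(schedule, kinds):
    """Check that each successive group of len(kinds) problems covers every kind once."""
    size = len(kinds)
    groups = [schedule[i:i + size] for i in range(0, len(schedule), size)]
    return all(sorted(group) == sorted(kinds) for group in groups)


if __name__ == "__main__":
    # 16 practice problems per session, 4 for each kind of solid, as in the text.
    print("blocked:    ", blocked_schedule(KINDS, 4))
    print("interleaved:", interleaved_schedule(KINDS, 4))
```

In the blocked schedule, consecutive problems always call for the same formula, whereas in the interleaved schedule the learner must first decide which formula applies; this is the discrimination demand that the explanations above emphasize.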
Fig. 14. Types of errors made by fourth graders while solving mathematical problems on a delayed criterion test in Taylor and Rohrer (2010). Error bars represent standard errors.

10.2 How general are the effects of interleaved practice?
10.2a Learning conditions. Interleaved practice itself represents a learning condition, and it naturally covaries with distributed practice. For instance, if the practice trials for tasks of a given kind are blocked, the practice for the task is massed. By contrast, by interleaving practice across tasks of different kinds, any two instances of a task from a given set (e.g., solving for the volume of a given type of geometrical solid) would be separated by practice of instances from other tasks. Thus, at least some of the benefits of interleaved practice may reflect the benefits of distributed practice. However, some researchers have investigated the benefits of interleaved practice with spacing held constant (e.g., Kang & Pashler, 2012; Mitchell, Nash, & Hall, 2008), and the results suggested that spacing is not responsible for interleaving effects. For instance, Kang and Pashler (2012) had college students study paintings by various artists with the goal of developing a concept of each artists' style, so that the students could later correctly identify the artists who had produced paintings that had not been


presented during practi


paintings was either blo
Blencowe were presente
Richard Lindenberg, a
tant, a third group re
viewing the paintings o
ion, a cartoon drawing w
tion of each painting (t
temporal spacing in thi
same as that for the int
was best after interleav
than after either standa
tice. No differences oc
blocked-practice groups
will not consistently
This outcome is more c
contrast hypothesis tha
particular, on each trial
blocked practice presum
term memory) what the
style, yet doing so did
interleaved practice enc
cal differences among t
helped students discrim
the criterion test. Acc
practice may further e
rate concepts (e.g., a co
plars of different conce
instance, instead of pain
an interleaved fashion,
the same time. In this c
the paintings of the va
among them. Kang an
ous presentation of pa
about the same level o
dard interleaving did (65
practice were superior
finding involving stude
heim, Dunlosky,
Finally, the amount of
initially receive with
which interleaving all t
educational contexts,
type (e.g., how to find t
rally begin with initial
that concept or problem
in this section involved
interleaving began. The
is enough, and whether
dents learning to solve
practice before interle
task difficulty have bee
interleaving in the liter
1998; Wulf & Shea, 2002
same for cognitive tasks


10.2c Materials. The benefits of interleaved pr[...] been explored using a variety of cognitive task[...] from the simple (e.g., paired associate learning[...]tively complex (e.g., diagnosing failures of a piece of machinery). Outcomes have been mi[...] Healy, and Bourne (1998, 2002) had college st[...] French vocabulary words from different ca[...] body parts, dinnerware, and foods. Across mul[...] translation equivalents from the same category [...] during practice or were interleaved. Immedi[...]tice, students who had received blocked practice [...] translations than did students who had receive[...] practice (Schneider et al., 2002). One week after [...]rect recall was essentially the same in the bloc[...] group as in the interleaved-practice group. In a[...] (Schneider et al., 1998, Experiment 2), interleav[...] to somewhat better performance than blocked [...] delayed test, but this benefit was largely due to [...] error rate. Based on these two studies, it does [...] interleaved practice of vocabulary boosts re[...]
More promising are results from studies that h[...]gated students' learning of mathematics. We [...] described some of these studies above (Rohrer & [...] Taylor & Rohrer, 2010; but see Rau et al., 2010)[...] skills that have been trained include the use of [...]tions (Carlson & Shin, 1996; Carlson & Yaure, 19[...] algebraic skills (Mayfield & Chase, 2002). For [...] interleaved practice improved students' speed i[...]tistep Boolean problems, especially when studen[...]view the entire multistep problem during solu[...] Shin, 1996). For the latter, interleaving substan[...] students' ability to solve novel algebra problem[...]cuss in detail below).
Van Merriënboer and colleagues (de Croock & van Merriënboer, 2007; de Croock et al., 1998; van Merriënboer, de Croock, & Jelsma, 1997; van Merriënboer, Schuurman, de Croock, & Paas, 2002) trained students to diagnose problems that occurred in a distiller system in which different components could fail; practice at diagnosing failures involving each component was either blocked or interleaved during practice. Across their studies, interleaved practice sometimes led to better performance on transfer tasks (which involved new combinations of system failures), but it did not always boost performance, leading the authors to suggest that perhaps more practice was needed to demonstrate the superiority of interleaved practice (de Croock & van Merriënboer, 2007). Blocked and interleaved practice have also been used to train students to make complex multidimensional judgments (Helsdingen, van Gog, & van Merriënboer, 2011a, 2011b), with results showing that decision making on criterion tests was better after interleaved than blocked practice. One impressive outcome was reported by Hatala et al. (2003), who trained medical students to make electrocardiogram diagnoses for myocardial infarction, ventricular hypertrophy, bundle branch blocks, and ischemia. The criterion test was

[...]vals as long as 1 to 2 weeks. In some of these cases, interleaved practice benefited perform[...] Chase, 2002; Rohrer & Taylor, 2007), b[...]tial benefits of interleaving did not man[...] retention interval (e.g., de Croock & [...] Rau et al., 2010). In the latter cases, i[...] not have been potent at any retention [...] interleaved practice may not be poten[...] language vocabulary (Schneider et a[...] who have not received enough prac[...] (de Croock & van Merriënboer, 2007[...]

10.3 Effects in representative educational contexts. [...] seems plausible that motivated students c[...]leaving without help. Moreover, seve[...]cedures for instruction that could b[...] (e.g., Hatala et al., 2003; Mayfield & C[...] 2006; Rau et al, 2010). We highligh[...] here. Mayfield and Chase (2002) taug[...]lege students with poor math skills acro[...]ferent sessions, either a single algebr[...] previously introduced rules we[...]


sessions, either the rul


session was reviewed (
the rule learned in the
with the rules from ea
interleaved practice). Te
ing, during the session a
after practice ended. On
rules they had learned
novel combinations of
similarly at the begin
performance on both ap
was substantially better
benefits were still evid
cant) on the delayed retention test.

10.4 Issues for implementation. Not only is the result from Mayfield and Chase (2002) promising, their procedure offers a tactic for the implementation of interleaved practice, both by teachers in the classroom and by students regulating their study (for a detailed discussion of implementation, see Rohrer, 2009). In particular, after a given kind of problem (or topic) has been introduced, practice should first focus on that particular problem. After the next kind of problem is introduced (e.g., during another lecture or study session), that problem should first be practiced, but it should be followed by extra practice that involves interleaving the current type of problem with others introduced during previous sessions. As each new type of problem is introduced, practice should be interleaved with practice for problems from other sessions that students will be expected to discriminate between (e.g., if the criterion test will involve a mixture of several types of problems, then these should be practiced in an interleaved manner during class or study sessions). Interleaved practice may take a bit more time to use than blocked practice, because solution times often slow during interleaved practice; even so, such slowing likely indicates the recruitment of other processes (such as discriminative contrast) that boost performance. Thus, teachers and students could integrate interleaved practice into their schedules without too much modification.

10.5 Interleaved practice: Overall recommendations. On the basis of the available evidence, we rate interleaved practice as having moderate utility. On the positive side, interleaved practice has been shown to have relatively dramatic effects on students' learning and retention of mathematical skills, and teachers and students should consider adopting it in the appropriate contexts. Also, interleaving does help (and rarely hinders) other kinds of cognitive skills. On the negative side, the literature on interleaved practice is currently small, but it contains enough null effects to raise concern. Although the null effects may indicate that the technique does not consistently work well, they may instead reflect that we do not fully understand the mechanisms underlying the effects of interleaving and therefore do not always use it appropriately. For instance, in some cases, students may not have had enough

judge its utility for his or her own instructional or learning goals. We also realized that offering some general ratings (and the reasons behind them) might be useful to readers interested in quickly obtaining an overview on what technique may work best. To do so, we have provided an assessment of how each technique fared with respect to the generalizability of its benefits across the four categories of variables listed in Table 2, issues for implementation, and evidence for its effectiveness from work in representative educational contexts (see Table 4). Our goal for these assessments was to indicate both (a) whether sufficient evidence is available to support conclusions about the generalizability of a technique, issues for its implementation, or its efficacy in educational contexts, and, if sufficient evidence does exist, (b) whether it indicates that the technique works.3 For instance, practice testing received an assessment of Positive (P) for criterion tasks; this rating indicates that we found enough evidence to conclude that practice testing benefits student performance across a wide range of criterion tasks and retention intervals. Of course, it does not mean that further work in this area (i.e., testing with different criterion tasks) would not be valuable, but the extent of the evidence is promising enough to recommend it to teachers and students.
A Negative (N) rating indicates that the available evidence shows that the learning technique does not benefit performance for the particular category or issue. For instance, despite its popularity, highlighting did not boost performance across a variety of criterion tasks, so it received a rating of N for this variable.
A Qualified (Q) rating indicates that both positive and negative evidence has been reported with respect to a particular category or issue. For instance, the keyword mnemonic received a Q rating for materials, because evidence indicates that this technique does work for learning materials that are imagery friendly but does not work well for materials that cannot be easily imagined.
A rating of Insufficient (I) indicates that insufficient evidence is available to draw conclusions about the effects of a given technique for a particular category or issue. For instance, elaborative interrogation received an I rating for criterion tasks because we currently do not know whether its effects are durable across educationally relevant retention intervals. Any cell


Table 4. Utility Assessment and Ratings of Generalizability [...]

Technique                      Utility   Learners  Materials  Criterion tasks  Issues for implementation  Educational contexts
Elaborative interrogation      Moderate  P-I       P          I                P                          I
Self-explanation               Moderate  P-I       P          P-I              Q                          I
Summarization                  Low       Q         P-I        Q                Q                          I
Highlighting                   Low       Q         Q          N                P                          N
The keyword mnemonic           Low       Q         Q          Q-I              Q                          Q-I
Imagery use for text learning  Low       Q         Q          Q-I              P                          I
Rereading                      Low       I         P          Q-I              P                          I
Practice testing               High      P-I       P          P                P                          P
Distributed practice           High      P-I       P          P-I              P                          P-I
Interleaved practice           Moderate  I         Q          P-I              P                          P-I

Note: A positive (P) rating indicates that available evidence demonstrates efficacy of a learning technique with respect to a given variable or issue. A negative (N) rating indicates that a technique is largely ineffective for a given variable. A qualified (Q) rating indicates that the technique yielded positive effects under some conditions (or in some groups) but not others. An insufficient (I) rating indicates that there is insufficient evidence to support a definitive assessment for one or more factors for a given variable or issue.

in Table 4 with an I rating highlights the need for further systematic research.
Finally, some cells include more than one rating. In these cases, enough evidence exists to evaluate a technique on one dimension of a category or issue, yet insufficient evidence is available for some other dimension. For instance, self-explanation received a P-I rating for criterion tasks because the available evidence is positive on one dimension (generalizability across a range of criterion tasks) but is insufficient on another key dimension (whether the benefit of self-explanation generalizes across longer retention intervals). As another example, rereading received a Q-I rating for criterion tasks because evidence for the effectiveness of this technique over long retention intervals is qualified (i.e., under some learning conditions, it does not produce an effect for longer retention intervals), and insufficient evidence is available that is relevant to its effectiveness across different kinds of criterion tasks (e.g., rereading does boost performance on recall tasks, but little is known as to its benefits for comprehension). When techniques have multiple ratings for one or more variables, readers will need to consult the reviews for details.
Finally, we used these ratings to develop an overall utility assessment for each of the learning techniques. The utility assessments largely reflect how well the benefits of each learning technique generalize across the different categories of variables (e.g., for how many variables the technique received a P rating). For example, the keyword mnemonic and imagery use for text learning were rated low in utility in part because their effects are limited to materials that are amenable to imagery and because they may not work well for students of all ages. Even so, some teachers may decide that the benefits of techniques with low-utility ratings match their instructional goals for their students. Thus, although we do offer these easy-to-use assessments of each learning technique, we also encourage interested teachers and students to carefully read each review to make informed decisions about which techniques will best meet their instructional and learning goals.

Implications for research on learning techniques
A main goal of this monograph wa[...] recommendations for teachers and [...] utility of various learning tec[...] identify areas that have been u[...] require further research befor[...]tions for their use in education c[...] gaps are immediately apparent [...] highlight a few, we do not yet k[...] of the learning techniques will [...] ages, abilities, and levels of prior [...] few exceptions (e.g., practice tes[...] the degree to which many of th[...] learning (e.g., over a number [...] partly because investigations [...]cally involved a single session tha[...] criterion tests (for a discussion [...]gle-session research, see Raw[...] few techniques have been eva[...]tional contexts.
This appraisal (along with Table [...] for future research that could h[...] education. First, more research [...] degree to which the benefits [...] the variables listed in Table 2. Pa[...] investigations that evaluate the [...] among the variables limit or m[...] technique. Second, the benef[...] representative educational sett[...] explored. Easy-to-use version[...]


niques should be develo


tigations conducted
contexts. Ideally, the cr
stakes tests, such as p
achievement tests. We r
time-consuming and cos
cial for recommendin
a reasonable likelihood
achievement.

Implications for students, teachers, and student achievement
Pressley and colleagues (Pressley, 1986; Pressley, Goodchild, et al., 1989) developed a good-strategy-user model, according to which being a sophisticated strategy user involves "knowing the techniques that accomplish important life goals (i.e., strategies), knowing when and how to use those methods . . . and using those methods in combination with a rich network of nonstrategic knowledge that one possesses about the world" (p. 302). However, Pressley, Goodchild, et al. (1989) also noted that "many students are committed to ineffective strategies ... moreover, there is not enough professional evaluation of techniques that are recommended in the literature, with many strategies oversold by proponents" (p. 301). We agree and hope that the current reviews will have a positive impact with respect to fostering further scientific evaluation of the techniques.
Concerning students' commitment to ineffective strategies, recent surveys have indicated that students most often endorse the use of rereading and highlighting, two strategies that we found to have relatively low utility. Nevertheless, some students do report using practice testing, and these students appear to benefit from its use. For instance, Gurung (2005) had college students describe the strategies they used in preparing for classroom examinations in an introductory psychology course. The frequency of students' reported use of practice testing was significantly correlated with their performance on a final exam (see also Hartwig & Dunlosky, 2012). Given that practice testing is relatively easy to use, students who do not currently use this technique should be able to incorporate it into their study routine.
Why don't many students consistently use effective techniques? One possibility is that students are not instructed about which techniques are effective or how to use them effectively during formal schooling. Part of the problem may be that teachers themselves are not told about the efficacy of various learning techniques. Given that teachers would most likely learn about these techniques in classes on educational psychology, it is revealing that most of the techniques do not receive sufficient coverage in educational-psychology textbooks. We surveyed six textbooks (cited in the Introduction), and, except for mnemonics based on imagery (e.g., the keyword mnemonic), none of the techniques was covered by all of the books. Moreover, in the subset of textbooks that did

less time is spent teaching students to develop effective techniques and strategies to guide learning. As noted [...]mara (2010), "there is an overwhelming assumptio[...] educational system that the most important thing to [...] students is content" (p. 341, italics in original). One [...] here is that students who do well in e[...] learning is largely supervised, may s[...] are expected to regulate much of the[...] high school or college. Teaching [...]niques would not take much time aw[...] and would likely be most beneficial if [...] was consistently taught across mul[...] students could broadly experience the[...] class grades. Even here, however, re[...] train students to use the most effec[...]efit from further research. One key [...] age at which a given technique co[...] Teachers can expect that upper elem[...] capable of using many of the techn[...]dents may need some guidance on how to most effectively implement them. Certainly, i[...]dents have the self-regulatory [...] technique (and how much trai[...] is an important objective for f[...] how often students will need to b[...] the techniques to ensure tha[...] them when they are not instruc[...] of some of the learning tech[...] development that involves t[...] use the techniques would be [...]
Beyond training students t[...] could also incorporate some of [...] For instance, when beginning a [...] could begin with a practice test (with feedback) on the most important ideas from the previous section. When students [...] practicing problems from a unit on mathematics, recently studied problems could be interleaved with related problems from previous units. Teachers could also harness distributed practice by re-presenting the most important concep[...] activities over the course of several classes. When introducing key concepts or facts in class, teachers could engage students in explanatory questioning by prompting them to consider how the information is new to them, how it relates to what they already know, or why it might be true. Even homework assignments could be designed to take advantage of many of these techniques. In these examples (and in others provided in the Issues for Implementation subsections), teachers c[...]


implement a technique to help students learn, [...] whether students are themselves aware that a p[...] technique is being used.

We realize that many factors are responsible w[...] one student fails to achieve in school (Hattie, 20[...] that a change to any single factor may have a rel[...] effect on student learning and achievement. The [...] techniques described in this monograph will not be [...] improving achievement for all students, and [...]ously, they will benefit only students who are m[...] capable of using them. Nevertheless, when used [...] suspect that they will produce meaningful gains [...] performance in the classroom, on achievement tests, [...] tasks encountered across the life span. It is obvi[...] students are not using effective learning techniques [...] use the more effective techniques without muc[...] teachers should be encouraged to more cons[...] explicitly) train students to use learning techniques [...] engaged in pursuing various instructional and le[...]

Acknowledgments
We thank Reed Hunt, Mark McDaniel, Roddy Roedi[...] Thiede for input about various aspects of this monog[...]ciate constructive comments about a draft from Sea[...] MacKendrick, Richard Mayer, Hal Pashler, Dan Rob[...] Rohrer. Thanks also to Robert Goldstone, Detlev Leu[...] Rohrer for providing details of their research, and t[...] and Cindy Widuck for technical support.

Declaration of Conflicting Interests
The authors declared that they had no conflicts of [...] respect to their authorship or the publication of this a[...]

Funding
This research was supported by a Bridging Brain, Mind and Behavior Collaborative Award through the James S. McDonnell Foundation's 21st Century Science Initiative.

Notes
1. We also recommend a recent practice guide from the U.S. Institute of Education Sciences (Pashler et al., 2007), which discusses some of the techniques described here. The current monograph, however, provides more in-depth and up-to-date reviews of the techniques and also reviews some techniques not included in the practice guide.
2. Although this presentation mode does not involve reading per se, reading comprehension and listening comprehension processes are highly similar aside from differences at the level of decoding the perceptual input (Gernsbacher, Varner, & Faust, 1990).
3. We did not include learning conditions as a category of variable in this table because the techniques vary greatly with respect to relevant learning conditions. Please see the reviews for assessments of how well the techniques generalized across relevant learning conditions.

References
Agarwal, P. K., Karpicke, J. D., Kang, S. H. K., Roediger, H. L., III, & McDermott, K. B. (2008). Examining the testing effect with open- and closed-book tests. Applied Cognitive Psychology, 22, 861-876.
Agarwal, P. K., & Roediger, H. L., III. (2011). Expectancy of an open-book test decreases performance on a delayed closed-book test. Memory, 19, 836-852.
Ainsworth, S., & Burcham, S. (2007). The impact of text coherence on learning by self-explanation. Learning and Instruction, 17, 286-303.
Aleven, V., & Koedinger, K. R. (2002). An effective metacognitive strategy: Learning by doing and explaining with a computer-based cognitive tutor. Cognitive Science, 26, 147-179.
Amer, A. A. (1994). The effect of knowledge-map and underlining training on the reading comprehension of scientific texts. English for Specific Purposes, 13, 35-45.
Amlund, J. T., Kardash, C. A. M., & Kulhavy, R. W. (1986). Repetitive reading and recall of expository text. Reading Research Quarterly, 21, 49-58.
Anderson, L. W., & Krathwohl, D. R. (Eds.). (2001). A taxonomy for learning, teaching and assessing: A revision of Bloom's taxonomy of educational objectives: Complete edition. New York, NY: Longman.
Anderson, M. C. M., & Thiede, K. W. (2008). Why do delayed summaries improve metacomprehension accuracy? Acta Psychologica, 128, 110-118.
Anderson, R. C., & Hidde, J. L. (1971). Imagery and sentence learning. Journal of Educational Psychology, 62, 526-530.
Anderson, R. C., & Kulhavy, R. W. (1972). Imagery and prose learning. Journal of Educational Psychology, 63, 242-243.
Anderson, T. H., & Armbruster, B. B. (1984). Studying. In R. Barr (Ed.), Handbook of reading research (pp. 657-679). White Plains, NY: Longman.
Annis, L., & Davis, J. K. (1978). Study techniques: Comparing their effectiveness. American Biology Teacher, 40, 108-110.
Annis, L. F. (1985). Student-generated paragraph summaries and the information-processing theory of prose learning. Journal of Experimental Education, 51, 4-10.
Appleton-Knapp, S. L., Bjork, R. A., & Wickens, T. D. (2005). Examining the spacing effect in advertising: Encoding variability, retrieval processes, and their interaction. Journal of Consumer Research, 32, 266-276.
Armbruster, B. B., Anderson, T. H., & Ostertag, J. (1987). Does text structure/summarization instruction facilitate learning from expository text? Reading Research Quarterly, 22, 331-346.
Arnold, H. F. (1942). The comparative effectiveness of certain study techniques in the field of history. Journal of Educational Psychology, 32, 449-457.
Atkinson, R. C., & Paulson, J. A. (1972). An approach to the psychology of instruction. Psychological Bulletin, 78, 49-61.
Atkinson, R. C., & Raugh, M. R. (1975). An application of the mnemonic keyword method to the acquisition of a Russian vocabulary. Journal of Experimental Psychology: Human Learning and Memory, 104, 126-133.
Bahrick, H. P. (1979). Maintenance of knowledge: Questions about memory we forgot to ask. Journal of Experimental Psychology: General, 108, 296-308.


Bahrick, H. P., Bahrick, L[...] Maintenance of foreig[...] effect. Psychological Sc[...]
Bahrick, H. P., & Hall, L[...]ures to long-term reten[...] spacing effect. Journal [...]
Bahrick, H. P., & Phelp[...]lary over 8 years. Journ[...] Memory, and Cognition[...]
Balch, W. R. (1998). Pra[...] performance. Teaching [...]
Balch, W. R. (2006). En[...] experiment on the spac[...] 249-252.
Balota, D. A., Duchek, J. M., & Paullin, R. (1989). Age-related differences in the impact of spacing, lag, and retention interval. Psychology and Aging, 4, 3-9.
Balota, D. A., Duchek, J. M., Sergent-Marshall, S. D., & Roediger, H. L., III. (2006). Does expanded retrieval produce benefits over equal-interval spacing? Explorations of spacing effects in healthy aging and early stage Alzheimer's disease. Psychology and Aging, 21, 19-31.
Bangert-Drowns, R. L., Kulik, J. A., & Kulik, C.-L. C. (1991). Effects of frequent classroom testing. Journal of Educational Research, 85, 89-99.
Barcroft, J. (2007). Effect of opportunities for word retrieval during second language vocabulary learning. Language Learning, 57, 35-56.
Barnett, J. E., & Seefeldt, R. W. (1989). Read something once, why read it again? Repetitive reading and recall. Journal of Reading Behavior, 21, 351-360.
Bean, T. W., & Steenwyk, F. L. (1984). The effect of three forms of summarization instruction on sixth graders' summary writing and comprehension. Journal of Reading Behavior, 16, 297-306.
Bednall, T. C., & Kehoe, E. J. (2011). Effects of self-regulatory instructional aids on self-directed study. Instructional Science, 39, 205-226.
Bell, K. E., & Limber, J. E. (2010). Reading skill, textbook marking, and course performance. Literacy Research and Instruction, 49, 56-67.
Benjamin, A. S., & Bird, R. D. (2006). Metacognitive control of the spacing of study repetitions. Journal of Memory and Language, 55, 126-137.
Benjamin, A. S., & Tullis, J. (2010). What makes distributed practice effective? Cognitive Psychology, 61, 228-247.
Berry, D. C. (1983). Metacognitive experience and transfer of logical reasoning. Quarterly Journal of Experimental Psychology, 35A, 39-49.
Bishara, A. J., & Jacoby, L. L. (2008). Aging, spaced retrieval, and inflexible memory performance. Psychonomic Bulletin & Review, 15, 52-57.
Blanchard, J., & Mikkelson, V. (1987). Underlining performance outcomes in expository text. Journal of Educational Research, 80, 197-201.
Bloom, B. S., Engelhart, M., Furst, E. J., Hill, W., & Krathwohl, D. R. (1956). Taxonomy of educational objectives, Handbook I: Cognitive domain. New York, NY: Longman.
Bloom, K. C., & Shuell, T. J. (1981). Effects of massed and distributed practice on the learning and retention of second-language vocabulary. Journal of Educational Research, 74, 245-248.
Bouwmeester, S., & Verkoeijen, P. P. J. L. (2011). Why do some children benefit more from testing than others? Gist trace processing to explain the testing effect. Journal of Memory and Language, 65, 32-41.
Brady, F. (1998). A theoretical and empirical review of the contextual interference effect and the learning of motor skills. QUEST, 50, 266-293.
Bransford, J. D., & Franks, J. J. (1971). The abstraction of linguistic ideas. Cognitive Psychology, 2, 331-350.
Bretzing, B. H., & Kulhavy, R. W. (1979). Notetaking and depth of processing. Contemporary Educational Psychology, 4, 145-153.
Bretzing, B. H., & Kulhavy, R. W. (1981). Note-taking and passage style. Journal of Educational Psychology, 73, 242-250.
Bromage, B. K., & Mayer, R. E. (1986). Quantitative and qualitative effects of repetition on learning from technical text. Journal of Educational Psychology, 78, 271-278.
Brooks, L. R. (1967). The suppression of visualization by reading. The Quarterly Journal of Experimental Psychology, 19, 289-299.
Brooks, L. R. (1968). Spatial and verbal components of the act of recall. Canadian Journal of Psychology, 22, 349-368.
Brooks, L. W., Dansereau, D. F., Holley, C. D., & Spurlin, J. E. (1983). Generation of descriptive text headings. Contemporary Educational Psychology, 8, 103-108.
Brown, A. L., Campione, J. C., & Day, J. D. (1981). Learning to learn: On training students to learn from texts. Educational Researcher, 10, 14-21.
Brown, A. L., & Day, J. D. (1983). Macrorules for summarizing texts: The development of expertise. Journal of Verbal Learning and Verbal Behavior, 22, 1-14.
Brown, A. L., Day, J. D., & Jones, R. S. (1983). The development of plans for summarizing texts. Child Development, 54, 968-979.
Brown, L. B., & Smiley, S. S. (1978). The development of strategies for studying texts. Child Development, 49, 1076-1088.
Brozo, W. G., Stahl, N. A., & Gordon, B. (1985). Training effects of summarizing, item writing, and knowledge of information sources on reading test performance. Issues in Literacy: A Research Perspective - 34th Yearbook of the National Reading Conference (pp. 48-54). Rochester, NY: National Reading Conference.
Budé, L., Imbos, T., van de Wiel, M. W., & Berger, M. P. (2011). The effect of distributed practice on students' conceptual understanding of statistics. Higher Education, 62, 69-79.
Butler, A. C. (2010). Repeated testing produces superior transfer of learning relative to repeated studying. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36, 1118-1133.
Butler, A. C., Karpicke, J. D., & Roediger, H. L., III. (2008). Correcting a metacognitive error: Feedback increases retention of low-confidence correct responses. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 918-928.


Butler, A. C., & Roediger, H. L., III. (2007). Testing [...] term retention in a simulated classroom setting. E[...]nal of Cognitive Psychology, 19, 514-527.
Butler, A. C., & Roediger, H. L., III. (2008). Feed[...] the positive effects and reduces the negative effect[...] choice testing. Memory & Cognition, 36, 604-616.
Butterfield, B., & Metcalfe, J. (2001). Errors comm[...] confidence are hypercorrected. Journal of Experime[...]ogy: Learning, Memory, and Cognition, 21, 1491-1[...]
Callender, A. A., & McDaniel, M. A. (2007). The ben[...]ded question adjuncts for low and high structure bui[...] of Educational Psychology, 99, 339-348.
Callender, A. A., & McDaniel, M. A. (2009). The limi[...] rereading educational texts. Contemporary Educat[...]ogy, 34, 30-41.
Carlson, R. A., & Shin, J. C. (1996). Practice schedu[...] instantiation in cascaded problem solving. Journal o[...]tal Psychology: Learning, Memory, and Cognition, [...]
Carlson, R. A., & Yaure, R. G. (1990). Practice sched[...] of component skills in problem solving. Journal of [...] Psychology: Learning, Memory, and Cognition, 15[...]
Carpenter, S. K. (2009). Cue strength as a moderator [...] effect: The benefits of elaborative retrieval. Jour[...]mental Psychology: Learning, Memory, and Co[...] 1563-1569.
Carpenter, S. K. (2011). Semantic information activated during retrieval contributes to later retention: Support for the mediator effectiveness hypothesis of the testing effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37, 1547-1552.
Carpenter, S. K., & DeLosh, E. L. (2005). Application of the testing and spacing effects to name learning. Applied Cognitive Psychology, 19, 619-636.
Carpenter, S. K., & DeLosh, E. L. (2006). Impoverished cue support enhances subsequent retention: Support for the elaborative retrieval explanation of the testing effect. Memory & Cognition, 34, 268-276.
Carpenter, S. K., & Pashler, H. (2007). Testing beyond words: Using tests to enhance visuospatial map learning. Psychonomic Bulletin & Review, 14, 474-478.
Carpenter, S. K., Pashler, H., & Cepeda, N. J. (2009). Using tests to enhance 8th grade students' retention of U.S. history facts. Applied Cognitive Psychology, 23, 760-771.
Carpenter, S. K., Pashler, H., & Vul, E. (2006). What types of learning are enhanced by a cued recall test? Psychonomic Bulletin & Review, 13, 826-830.
Carpenter, S. K., Pashler, H., Wixted, J. T., & Vul, E. (2008). The effects of tests on learning and forgetting. Memory & Cognition, 36, 438-448.
Carpenter, S. K., & Vul, E. (2011). Delaying feedback by three seconds benefits retention of face-name pairs: The role of active anticipatory processing. Memory & Cognition, 39, 1211-1221.
Carr, E., Bigler, M., & Morningstar, C. (1991). The effects of the CVS strategy on children's learning. Learner Factors/Teacher Factors: Issues in Literacy Research and Instruction - 40th Yearbook of the National Reading Conference (pp. 193-200). Rochester, NY: National Reading Conference.
Carrier, L. M. (2003). College students' choices of study strategies. Perceptual & Motor Skills, 96, 54-56.
Carroll, M., Campbell-Ratcliffe, J., Murnane, H., & Perfect, T. (2007). Retrieval-induced forgetting in educational contexts: Monitoring, expertise, text integration, and test format. European Journal of Cognitive Psychology, 19, 580-606.
Carvalho, P. F., & Goldstone, R. L. (2011, November). Comparison between successively presented stimuli during blocked and interleaved presentations in category learning. Paper presented at the 52nd Annual Meeting of the Psychonomic Society, Seattle, WA.
Cashen, M. C., & Leicht, K. L. (1970). Role of the isolation effect in a formal educational setting. Journal of Educational Psychology, 61, 484-486.
Cepeda, N. J., Coburn, N., Rohrer, D., Wixted, J. T., Mozer, M. C., & Pashler, H. (2009). Optimizing distributed practice: Theoretical analysis and practical implications. Experimental Psychology, 56, 236-246.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal recall tasks: A review and quantitative synthesis. Psychological Bulletin, 132, 354-380.
Cepeda, N. J., Vul, E., Rohrer, D., Wixted, J. T., & Pashler, H. (2008). Spacing effects in learning: A temporal ridgeline of optimal retention. Psychological Science, 19, 1095-1102.
Cermak, L. S., Verfaellie, M., Lanzoni, S., Mather, M., & Chase, K. A. (1996). Effect of spaced repetitions on amnesia patients' recall and recognition performance. Neuropsychology, 10, 219-227.
Challis, B. H. (1993). Spacing effects on cued-memory tests depend on level of processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 389-396.
Chan, J. C. K. (2009). When does retrieval induce forgetting and when does it induce facilitation? Implications for retrieval inhibition, testing effect, and text processing. Journal of Memory and Language, 61, 153-170.
Chan, J. C. K. (2010). Long-term effects of testing on the recall of nontested materials. Memory, 18, 49-57.
Chan, J. C. K., McDermott, K. B., & Roediger, H. L., III. (2006). Retrieval-induced facilitation: Initially nontested material can benefit from prior testing of related material. Journal of Experimental Psychology: General, 135, 553-571.
Chan, L. K. S., Cole, P. G., & Morris, J. N. (1990). Effects of instruction in the use of a visual-imagery strategy on the reading comprehension competence of disabled and average readers. Learning Disability Quarterly, 13, 2-11.
Chi, M. T. H. (2009). Active-constructive-interactive: A conceptual framework for differentiating learning activities. Topics in Cognitive Science, 1, 73-105.
Chi, M. T. H., de Leeuw, N., Chiu, M.-H., & LaVancher, C. (1994). Eliciting self-explanations improves understanding. Cognitive Science, 18, 439-477.
Childers, J. B., & Tomasello, M. (2002). Two-year-olds learn novel nouns, verbs, and conventional actions from massed or distributed exposures. Developmental Psychology, 38, 967-978.


Cioffi, G. (1986). Relation[...] reported by college stude[...] 25, 220-231.
Condus, M. M., Marshall, K. J., & Miller, S. R. (1986). Effects of the keyword mnemonic strategy on vocabulary acquisition and maintenance by learning disabled children. Journal of Learning Disabilities, 19, 609-613.
Coppens, L. C., Verkoeijen, P. P. J. L., & Rikers, R. M. J. P. (2011). Learning Adinkra symbols: The effect of testing. Journal of Cognitive Psychology, 3, 351-357.
Craik, F. I., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671-684.
Cranney, J., Ahn, M., McKinnon, R., Morris, S., & Watts, K. (2009). The testing effect, collaborative learning, and retrieval-induced facilitation in a classroom setting. European Journal of Cognitive Psychology, 21, 919-940.
Crawford, C. C. (1925a). The correlation between college lecture notes and quiz papers. Journal of Educational Research, 12, 282-291.
Crawford, C. C. (1925b). Some experimental studies of the results of college note-taking. Journal of Educational Research, 12, 379-386.
Crouse, J. H., & Idstein, P. (1972). Effects of encoding cues on prose learning. Journal of Educational Psychology, 68, 309-313.
Cull, W. L. (2000). Untangling the benefits of multiple study opportunities and repeated testing for cued recall. Applied Cognitive Psychology, 14, 215-235.
Daniel, D. B., & Broida, J. (2004). Using web-based quizzing to improve exam performance: Lessons learned. Teaching of Psychology, 31, 207-208.
De Beni, R., & Moè, A. (2003). Presentation modality effects in studying passages. Are mental images always effective? Applied Cognitive Psychology, 17, 309-324.
de Bruin, A. B. H., Rikers, R. M. J. P., & Schmidt, H. G. (2007). The effect of self-explanation and prediction on the development of principled understanding of chess in novices. Contemporary Educational Psychology, 32, 188-205.
de Croock, M. B. M., & van Merriënboer, J. J. G. (2007). Paradoxical effect of information presentation formats and contextual interference on transfer of a complex cognitive skill. Computers in Human Behavior, 23, 1740-1761.
de Croock, M. B. M., van Merriënboer, J. J. G., & Paas, F. (1998). High versus low contextual interference in simulation-based training of troubleshooting skills: Effects on transfer performance and invested mental effort. Computers in Human Behavior, 14, 249-267.
de Koning, B. B., Tabbers, H. K., Rikers, R. M. J. P., & Paas, F. (2011). Improved effectiveness of cueing by self-explanations when learning from a complex animation. Applied Cognitive Psychology, 25, 183-194.
Delaney, P. F., & Knowles, M. E. (2005). Encoding strategy changes and spacing effects in free recall of unmixed lists. Journal of Memory and Language, 52, 120-130.
Delaney, P. F., Verkoeijen, P. P. J. L., & Spirgel, A. (2010). Spacing and the testing effects: A deeply critical, lengthy, and at times discursive review of the literature. Psychology of Learning and Motivation, 53, 63-147.
Dempster, F. N. (1987). Effects of variable encoding and spaced presentations on vocabulary learning. Journal of Educational Psychology, 79, 162-170.
Dempster, F. N., & Farris, R. (1990). The spacing effect: Research and practice. Journal of Research and Development in Education, [...], 97-101.
Denis, M. (1982). Imaging while reading text: A study of individual differences. Memory & Cognition, 10, 540-545.
Didierjean, A., & Cauzinille-Marmèche, E. (1997). Eliciting self-explanations improves problem solving: What processes are involved? Current Psychology of Cognition, 16, 325-351.
Di Vesta, F. J., & Gray, G. S. (1972). Listening and note taking. Journal of Educational Psychology, 63, 8-14.
Doctorow, M., Wittrock, M. C., & Marks, C. (1978). Generative processes in reading comprehension. Journal of Educational Psychology, 70, 109-118.
Donovan, J. J., & Radosevich, D. J. (1999). A meta-analytic review of the distribution of practice effect: Now you see it, now you don't. Journal of Applied Psychology, 84, 795-805.
Dornisch, M. M., & Sperling, R. A. (2006). Facilitating learning from technology-enhanced text: Effects of prompted elaborative interrogation. Journal of Educational Research, 99, 156-165.
Duchastel, P. C. (1981). Retention of prose following testing with different types of test. Contemporary Educational Psychology, 6, 217-226.
Duchastel, P. C., & Nungester, R. J. (1982). Testing effects measured with alternate test forms. Journal of Educational Research, 75, 309-313.
Dunlosky, J., & Rawson, K. A. (2005). Why does rereading improve metacomprehension accuracy? Evaluating the levels-of-disruption hypothesis for the rereading effect. Discourse Processes, 40, 37-56.
Durgunoglu, A. Y., Mir, M., & Ariño-Martí, S. (1993). Effects of repeated readings on bilingual and monolingual memory for text. Contemporary Educational Psychology, 18, 294-317.
Dyer, J. W., Riley, J., & Yekovich, F. R. (1979). An analysis of three study skills: Notetaking, summarizing, and rereading. Journal of Educational Research, 73, 3-7.
Einstein, G. O., Morris, J., & Smith, S. (1985). Note-taking, individual differences, and memory for lecture information. Journal of Educational Psychology, 77, 522-532.
Fass, W., & Schumacher, G. M. (1978). Effects of motivation, subject activity, and readability on the retention of prose materials. Journal of Educational Psychology, 70, 803-807.
Faw, H. W., & Waller, T. G. (1976). Mathemagenic behaviors and efficiency in learning from prose materials: Review, critique and recommendations. Review of Educational Research, 46, 691-720.
Fazio, L. K., Agarwal, P. K., Marsh, E. J., & Roediger, H. L., III. (2010). Memorial consequences of multiple-choice testing on immediate and delayed tests. Memory & Cognition, 38, 407-418.


Fishman, E. J., Keller, L., & Atkinson, R. C. (1968). [...] distributed practice in computerized spelling drill[...] Educational Psychology, 59, 290-296.
Foos, P. W. (1995). The effect of variations in text [...] opportunities on test performance. Journal of Expe[...]cation, 63, 89-95.
Foos, P. W., & Fisher, R. P. (1988). Using tests as learn[...]ties. Journal of Educational Psychology, 80, 179-18[...]
Fowler, R. L., & Barker, A. S. (1974). Effectiveness o[...] for retention of text material. Journal of Applied Ps[...] 358-364.
Friend, R. (2001). Effects of strategy instruction on summary writing of college students. Contemporary Educational Psychology, 26, 3-24.
Fritz, C. O., Morris, P. E., Acton, M., Voelkel, A. R., & Etkind, R. (2007). Comparing and combining retrieval practice and the keyword mnemonic for foreign vocabulary learning. Applied Cognitive Psychology, 21, 499-526.
Fritz, C. O., Morris, P. E., Nolan, D., & Singleton, J. (2007). Expanding retrieval practice: An effective aid to preschool children's learning. Quarterly Journal of Experimental Psychology, 60, 991-1004.
Gagne, E. D., & Memory, D. (1978). Instructional events and comprehension: Generalization across passages. Journal of Reading Behavior, 10(4), 321-335.
Gajria, M., & Salvia, J. (1992). The effects of summarization instruction on text comprehension of students with learning disabilities. Exceptional Children, 58, 508-516.
Gambrell, L. B., & Jawitz, P. B. (1993). Mental imagery, text illustrations, and children's story comprehension and recall. Reading Research Quarterly, 28(3), 265-276.
Garner, R. (1982). Efficient text summarization: Costs and benefits. Journal of Educational Research, 75, 275-279.
Gates, A. I. (1917). Recitation as a factor in memorizing. Archives of Psychology, 6, 1-104.
Gernsbacher, M. A., Varner, K. R., & Faust, M. E. (1990). Investigating differences in general comprehension skill. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 430-445.
Giesen, C., & Peeck, J. (1984). Effects of imagery instruction on reading and retaining a literary text. Journal of Mental Imagery, 8, 79-90.
Glover, J. A. (1989). The "testing" phenomenon: Not gone but nearly forgotten. Journal of Educational Psychology, 81, 392-399.
Glover, J. A., & Corkill, A. J. (1987). Influence of paraphrased repetitions on the spacing effect. Journal of Educational Psychology, 79, 198-199.
Glover, J. A., Zimmer, J. W., Filbeck, R. W., & Plake, B. S. (1980). Effects of training students to identify the semantic base of prose materials. Journal of Applied Behavior Analysis, 13, 655-667.
Goldenberg, G. (1998). Is there a common substrate for visual recognition and visual imagery? Neurocase, 141-147.
Goldstone, R. L. (1996). Isolated and interrelated concepts. Memory & Cognition, 24, 608-628.
Goverover, Y., Arango-Lasprilla, J. C., Hillary, F. G., Chiaravalloti, N., & DeLuca, J. (2009). Application of the spacing effect to improve learning and memory for functional tasks in traumatic brain injury: A pilot study. The American Journal of Occupational Therapy, 63, 543-548.
Goverover, Y., Hillary, F. G., Chiaravalloti, N., Arango-Lasprilla, J. C., & DeLuca, J. (2009). A functional application of the spacing effect to improve learning and memory in persons with multiple sclerosis. Journal of Clinical and Experimental Neuropsychology, 31, 513-522.
Greene, C., Symons, S., & Richards, C. (1996). Elaborative interrogation effects for children with learning disabilities: Isolated facts versus connected prose. Contemporary Educational Psychology, 21, 19-42.
Greene, R. L. (1989). Spacing effects in memory: Evidence for a two-process account. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 371-377.
Greene, R. L. (1990). Spacing effects on implicit memory tests. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 1004-1011.
Griffin, T. D., Wiley, J., & Thiede, K. W. (2008). Individual differences, rereading, and self-explanation: Concurrent processing and cue validity as constraints on metacomprehension accuracy. Memory & Cognition, 36, 93-103.
Gurung, R. A. R. (2005). How do students really study (and does it matter)? Teaching of Psychology, 32, 239-241.
Gurung, R. A. R., Weidert, J., & Jeske, A. (2010). Focusing on how students study. Journal of the Scholarship of Teaching and Learning, 10, 28-35.
Guttman, J., Levin, J. R., & Pressley, M. (1977). Pictures, partial pictures, and young children's oral prose learning. Journal of Educational Psychology, 69(5), 473-480.
Gyeselinck, V., Meneghetti, C., De Beni, R., & Pazzaglia, F. (2009). The role of working memory in spatial text processing: What benefit of imagery strategy and visuospatial abilities? Learning and Individual Differences, 19, 12-20.
Hall, J. W. (1988). On the utility of the keyword mnemonic for vocabulary learning. Journal of Educational Psychology, 80, 554-562.
Hamilton, R. J. (1997). Effects of three types of elaboration on learning concepts from text. Contemporary Educational Psychology, 22, 299-318.
Hare, V. C., & Borchardt, K. M. (1984). Direct instruction of summarization skills. Reading Research Quarterly, 20, 62-78.
Hartley, J., Bartlett, S., & Branthwaite, A. (1980). Underlining can make a difference: Sometimes. Journal of Educational Research, 73, 218-224.
Hartwig, M. K., & Dunlosky, J. (2012). Study strategies of college students: Are self-testing and scheduling related to achievement? Psychonomic Bulletin & Review, 19, 126-134.
Hatala, R. M., Brooks, L. R., & Norman, G. R. (2003). Practice makes perfect: The critical role of mixed practice in the acquisition of ECG interpretation skills. Advances in Health Sciences Education, 8, 17-26.


Hattie, J. A. Jitendra,
C. (2009). A. K
Visib
analyses What
relating researc
to achiev
Hayati, A. with
M., & learning
Shariatifa
of College Reading and
Johnson, C. Le
I
Head, M. H., Readence,
media J
learni
tion of 629.
summary writing
Reading Research and L.
Johnson, Ins
L
Helder, E., & passage and
Shaughnessy,s
multitaskingtion, 28,
improve 18-3
nam
Helsdingen, Kahl,
A., B.,
van &
Gog,W
The effects of practice
tion to sc
facilit
on learning learning
and sett
transfer o
Educational Psychology,
Cognitive Psy
Helsdingen, Kang,
A., van S. H.
Gog, K
T
effects of practice
testingschedul
on lea
Learning and18, 998-1005.
Instruction,
Hidi, S., & Anderson,
Kang, S. V. K.
H. (1
demands, cognitive
ing is operat
advant
Review of Educational
Applied Re
Cogn
Hintzman, D. L.,
Kang, & S.
Rogers
H. K.
memory. Memory
K., & & Cogn
Mozer,
Hinze, S. R., & Wiley,
learning?J. (2
Jou
using completion tests.
Kardash, C. M
M.
Hodes, C. L. (1992).
and The ef
repeated
illustrations: A compariso
recall of pers
nal of 20, 201-221.
Research and Devel
Hoon, P. W. (1974). Efficac
chological Reports, 35, 1
Hunt, R. R. (1995). The sub
really did. Psychonomic B
Hunt, R. R. (2006). The c
research. In R. R. Hunt &
and memory (pp. 3-25). Ne
Hunt, R. R., & Smith, R. E.
general: The power of dist
tion. Memory & Cognition
Hunt, R. R., & Worthen,
memory. New York, NY: O
Idstein, P., & Jenkins, J.
reading. The Journal of E
Jacoby, L. L., & Dallas, M.
biographical memory and
mental Psychology: Gene
Jacoby, L. L., Wahlheim, C
learning of natural conce
classification, and metaco
chology: Learning, Memor
Janiszewski, C., Noel, H.,
of the spacing effect in ve
on advertising repetition a
sumer Research, 30, 138-
Jenkins, J. J. (1979). Four
and memory experiments
Levels of processing in hu
NJ: Lawrence Erlbaum.
Karpicke, J. D., & Bauernschmidt, A. (2011). Spaced retrieval: Absolute spacing enhances learning regardless of relative spacing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37, 1250-1257.
Karpicke, J. D., & Blunt, J. R. (2011). Retrieval practice produces more learning than elaborative studying with concept mapping. Science, 331, 772-775.
Karpicke, J. D., Butler, A. C., & Roediger, H. L., III. (2009). Metacognitive strategies in student learning: Do students practice retrieval when they study on their own? Memory, 17, 471-479.
Karpicke, J. D., & Roediger, H. L., III. (2007a). Expanding retrieval practice promotes short-term retention, but equally spaced retrieval enhances long-term retention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 704-719.
Karpicke, J. D., & Roediger, H. L., III. (2007b). Repeated retrieval during learning is the key to long-term retention. Journal of Memory and Language, 57, 151-162.
Karpicke, J. D., & Roediger, H. L., III. (2008). The critical importance of retrieval for learning. Science, 319, 966-968.
Karpicke, J. D., & Roediger, H. L., III. (2010). Is expanding retrieval a superior method for learning text materials? Memory & Cognition, 38, 116-124.
Keys, N. (1934). The influence on learning and retention of weekly as opposed to monthly tests. Journal of Educational Psychology, 427-436.
Kiewra, K. A., Mayer, R. E., Christensen, M., Kim, S.-I., & Risch, N. (1991). Effects of repetition on recall and note-taking: Strategies for learning from lectures. Journal of Educational Psychology, 83, 120-123.


Kika, F. M., McLaughlin, T. F., & Dixon, J. (1992).
quent testing of secondary algebra students. Jour
tional Research, 85, 159-162.
King, A. (1992). Comparison of self-questioning, sum
notetaking-review as strategies for learning from le
can Education Research Journal, 29, 303-323.
King, J. R., Biggs, S., & Lipsky, S. (1984). Students' se
and summarizing as reading study strategies. Journ
Behavior, 16, 205-218.
Klare, G. R., Mabry, J. E., & Gustafson, L. M. (195
ship of patterning (underlining) to immediate rete
acceptability of technical material. Journal of App
ogy, 39, 40-42.
Kornell, N. (2009). Optimising learning using flashc
more effective than cramming. Applied Cognitive
23, 1297-1317.
Kornell, N., & Bjork, R. A. (2007). The promise
self-regulated study. Psychonomic Bulletin & Revi
224.
Kornell, N., & Bjork, R. A. (2008). Learning concep
ries: Is spacing the "enemy of induction?" Psycholog
19, 585-592.
Kornell, N., Bjork, R. A., & Garcia, M. A. (2011). Why tests appear to prevent forgetting: A distribution-based bifurcation model. Journal of Memory and Language, 65, 85-97.
Kornell, N., Castel, A. D., Eich, T. S., & Bjork, R. A. (2010). Spacing as the friend of both memory and induction in young and older adults. Psychology and Aging, 25, 498-503.
Kosslyn, S. M. (1981). The medium and the message in mental imagery: A theory. Psychological Review, 88, 46-66.
Kratochwill, T. R., Demuth, D. M., & Conzemius, W. C. (1977). The effects of overlearning on preschool children's retention of sight vocabulary words. Reading Improvement, 14, 223-228.
Kromann, C. B., Jensen, M. L., & Ringsted, C. (2009). The effects of testing on skills learning. Medical Education, 43, 21-27.
Kulhavy, R. W., Dyer, J. W., & Silver, L. (1975). The effects of notetaking and test expectancy on the learning of text material. Journal of Educational Research, 68, 363-365.
Kulhavy, R. W., & Swenson, I. (1975). Imagery instructions and the comprehension of text. British Journal of Educational Psychology, 45, 47-51.
Lawson, M. J., & Hogben, D. (1998). Learning and recall of foreign language vocabulary: Effects of a keyword strategy for immediate and delayed recall. Learning and Instruction, 8(2), 179-194.
Lee, H. W., Lim, K. Y., & Grabowski, B. L. (2010). Improving self-regulation, learning strategy use, and achievement with metacognitive feedback. Educational Technology Research & Development, 58, 629-648.
Leeming, F. C. (2002). The exam-a-day procedure improves performance in psychology classes. Teaching of Psychology, 29, 210-212.
Leicht, K. L., & Cashen, V. M. (1972). Type of highlighted material and examination performance. Journal of Educational Research, 65, 315-316.
Lesgold, A. M., McCormick, C., & Golinkoff, R. M. (1975). Imagery training and children's prose learning. Journal of Educational Psychology, 67(5), 663-667.
Leutner, D., Leopold, C., & Sumfleth, E. (2009). Cognitive load and science text comprehension: Effects of drawing and mentally imagining text content. Computers in Human Behavior, 25, 284-289.
Levin, J. R., & Divine-Hawkins, P. (1974). Visual imagery as a prose learning process. Journal of Reading Behavior, 6, 23-30.
Levin, J. R., Divine-Hawkins, P., Kerst, S. M., & Guttman, J. (1974). Individual differences in learning from pictures and words: The development and application of an instrument. Journal of Educational Psychology, 66(3), 296-303.
Levin, J. R., Pressley, M., McCormick, C. B., Miller, G. E., & Shriberg, L. K. (1979). Assessing the classroom potential of the keyword method. Journal of Educational Psychology, 71(5), 583-594.
Logan, J. M., & Balota, D. A. (2008). Expanded vs. equal interval spaced retrieval practice: Exploring different schedules of spacing and retention interval in younger and older adults. Aging, Neuropsychology, and Cognition, 15, 257-280.
Lonka, K., Lindblom-Ylänne, S., & Maury, S. (1994). The effect of study strategies on learning from text. Learning and Instruction, 4, 253-271.
Lorch, R. F. (1989). Text-signaling devices and their effects on reading and memory processes. Educational Psychology Review, 1, 209-234.
Lorch, R. F., Jr., Lorch, E. P., & Klusewitz, M. A. (1995). Effects of typographical cues on reading and recall of text. Contemporary Educational Psychology, 20, 51-64.
Lyle, K. B., & Crawford, N. A. (2011). Retrieving essential material at the end of lectures improves performance on statistics exams. Teaching of Psychology, 38, 94-97.
Maddox, G. B., Balota, D. A., Coane, J. H., & Duchek, J. M. (2011). The role of forgetting rate in producing a benefit of expanded over equal spaced retrieval in young and older adults. Psychology and Aging, 26, 661-670.
Magliano, J. P., Trabasso, T., & Graesser, A. C. (1999). Strategic processing during comprehension. Journal of Educational Psychology, 91, 615-629.
Maher, J. H., & Sullivan, H. (1982). Effects of mental imagery and oral and print stimuli on prose learning of intermediate grade children. Educational Technology Research & Development, 30, 175-183.
Malone, L. D., & Mastropieri, M. A. (1991). Reading comprehension instruction: Summarization and self-monitoring training for students with learning disabilities. Exceptional Children, 58, 270-283.
Marschark, M., & Hunt, R. R. (1989). A reexamination of the role of imagery in learning and memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 710-720.
Marsh, E. J., Agarwal, P. K., & Roediger, H. L., III. (2009). Memorial consequences of answering SAT II questions. Journal of Experimental Psychology: Applied, 15, 1-11.


Marsh, E. J., & Butler, A
tings. In D. Reisberg (Ed
chology.
Mastropieri, M. A., Scruggs, T. E., & Mushinski Fulk, B. J. (1990). Teaching abstract vocabulary with the keyword method: Effects on recall and comprehension. Journal of Learning Disabilities, 23, 92-107.
Mathews, C. O. (1938). Comparison of methods of study for immediate and delayed recall. Journal of Educational Psychology, 29, 101-106.
Matthews, P., & Rittle-Johnson, B. (2009). In pursuit of knowledge: Comparing self-explanations, concepts, and procedures as pedagogical tools. Journal of Experimental Child Psychology, 104, 1-21.
Mawhinney, V. T., Bostow, D. E., Laws, D. R., Blumenfield, G. J., & Hopkins, B. L. (1971). A comparison of students studying behavior produced by daily, weekly, and three-week testing schedules. Journal of Applied Behavior Analysis, 4, 257-264.
Mayer, R. E. (1983). Can you repeat that? Qualitative effects of repetition and advance organizers on learning from science prose. Journal of Educational Psychology, 75, 40-49.
Mayfield, K. H., & Chase, P. N. (2002). The effects of cumulative practice on mathematics problem solving. Journal of Applied Behavior Analysis, 35, 105-123.
McDaniel, M. A., Agarwal, P. K., Huelser, B. J., McDermott, K. B., & Roediger, H. L., III. (2011). Test-enhanced learning in a middle school science classroom: The effects of quiz frequency and placement. Journal of Educational Psychology, 103, 399-414.
McDaniel, M. A., Anderson, J. L., Derbish, M. H., & Morrisette, N. (2007). Testing the testing effect in the classroom. European Journal of Cognitive Psychology, 19, 494-513.
McDaniel, M. A., & Donnelly, C. M. (1996). Learning with analogy and elaborative interrogation. Journal of Educational Psychology, 88, 508-519.
McDaniel, M. A., Howard, D. C., & Einstein, G. O. (2009). The read-recite-review study strategy: Effective and portable. Psychological Science, 20, 516-522.
McDaniel, M. A., & Pressley, M. (1984). Putting the keyword method in context. Journal of Educational Psychology, 76, 598-609.
McDaniel, M. A., Wildman, K. M., & Anderson, J. L. (2012). Using quizzes to enhance summative-assessment performance in a web-based class: An experimental study. Journal of Applied Research in Memory and Cognition, 1, 18-26.
McNamara, D. S. (2010). Strategies to read and learn: Overcoming learning by consumption. Medical Education, 44, 240-346.
Metcalfe, J., & Kornell, N. (2007). Principles of cognitive science in education: The effects of generation, errors, and feedback. Psychonomic Bulletin & Review, 14, 225-229.
Metcalfe, J., Kornell, N., & Finn, B. (2009). Delayed versus immediate feedback in children's and adults' vocabulary learning. Memory & Cognition, 37, 1077-1087.
Metcalfe, J., Kornell, N., & Son, L. K. (2007). A cognitive-science based programme to enhance study efficacy in a high and low risk setting. European Journal of Cognitive Psychology, 19, 743-768.
Miccinati, J. L. (1982). The influence of a six-week imagery training program on children's reading comprehension. Journal of Reading Behavior, 14(2), 197-203.
Michael, J. (1991). A behavioral perspective on college teaching. The Behavior Analyst, 14, 229-239.
Miller, G. E., & Pressley, M. (1989). Picture versus question elaboration on young children's learning of sentences containing high- and low-probability content. Journal of Experimental Child Psychology, 48, 431-450.
Mitchell, C., Nash, S., & Hall, G. (2008). The intermixed-blocked effect in human perceptual learning is not the consequence of trial spacing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 237-242.
Morris, P. E., & Fritz, C. O. (2002). The improved name game: Better use of expanding retrieval practice. Memory, 10, 259-266.
Moulton, C.-A., Dubrowski, A. E., MacRae, H., Graham, B., Grober, E., & Reznick, R. (2006). Teaching surgical skills: What kind of practice makes perfect? Annals of Surgery, 244, 400-409.
Myers, G. C. (1914). Recall in relation to retention. Journal of Educational Psychology, 5, 119-130.
Neuschatz, J. S., Preston, E. L., Toglia, M. P., & Neuschatz, J. S. (2005). Comparison of the efficacy of two name-learning techniques: Expanding rehearsal and name-face imagery. American Journal of Psychology, 118, 79-102.
Nist, S. L., & Hogrebe, M. C. (1987). The role of underlining and annotating in remembering textual information. Reading Research and Instruction, 27, 12-25.
Nist, S. L., & Kirby, K. (1989). The text marking patterns of college students. Reading Psychology: An International Quarterly, 10, 321-338.
Nungester, R. J., & Duchastel, P. C. (1982). Testing versus review: Effects on retention. Journal of Educational Psychology, 74, 18-22.
Oakhill, J., & Patel, S. (1991). Can imagery training help children who have comprehension problems? Journal of Research in Reading, 14(2), 106-115.
Olina, Z., Resier, R., Huang, X., Lim, J., & Park, S. (2006). Problem format and presentation sequence: Effects on learning and mental effort among US high school students. Applied Cognitive Psychology, 20, 299-309.
O'Reilly, T., Symons, S., & MacLatchy-Gaudet, H. (1998). A comparison of self-explanation and elaborative interrogation. Contemporary Educational Psychology, 23, 434-445.
Ormrod, J. E. (2008). Educational psychology: Developing learners (6th ed.). Upper Saddle River, NJ: Pearson Education.
O'Shea, L. J., Sindelar, P. T., & O'Shea, D. J. (1985). The effects of repeated readings and attentional cues on reading fluency and comprehension. Journal of Reading Behavior, 17, 129-142.
Ozgungor, S., & Guthrie, J. T. (2004). Interactions among elaborative interrogation, knowledge, and interest in the process of constructing knowledge from text. Journal of Educational Psychology, 96, 437-443.


Palincsar, A. S., & Brown, A. L. (1984). Reciprocal t
prehension-fostering and comprehension-monitori
Cognition and Instruction, 1, 117-175.
Pashler, H., Bain, P., Bottge, B., Graesser, A., K
McDaniel, M., & Metcalfe, J. (2007). Organizing ins
study to improve student learning (NCER 2007
ington, DC: National Center for Education Research,
Education Sciences, U.S. Department of Education.
Pashler, H., Zarow, G., & Triplett, B. (2003). Is te
of tests helpful even when it inflates error rates
Experimental Psychology: Learning, Memory, and
29, 1051-1057.
Pauk, W., & Ross, J. Q. (2010). How to study in col
Boston, MA: Wadsworth.
Pavlik, P. I., Jr., & Anderson, J. R. (2005). Practice
effects on vocabulary memory: An activation-based
spacing effect. Cognitive Science, 29, 559-586.
Peterson, S. E. (1992). The cognitive functions of u
study technique. Reading Research and Instruction
Pressley, M. (1976). Mental imagery helps eight-year-
what they read. Journal of Educational Psychology,
Pressley, M. (1986). The relevance of the good strat
to the teaching of mathematics. Educational Psych
139-161.
Pressley, M., Goodchild, F., Fleet, F., Zajchowski, R., & Evans, E. D. (1989). The challenges of classroom strategy instruction. The Elementary School Journal, 89, 301-342.
Pressley, M., Johnson, C. J., Symons, S., McGoldrick, J. A., & Kurita, J. A. (1989). Strategies that improve children's memory and comprehension of text. The Elementary School Journal, 90, 3-32.
Pressley, M., & Levin, J. R. (1978). Developmental constraints associated with children's use of the keyword method of foreign language vocabulary learning. Journal of Experimental Child Psychology, 26, 359-372.
Pressley, M., McDaniel, M. A., Turnure, J. E., Wood, E., & Ahmad, M. (1987). Generation and precision of elaboration: Effects on intentional and incidental learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 291-300.
Pyc, M. A., & Dunlosky, J. (2010). Toward an understanding of students' allocation of study time: Why do they decide to mass or space their practice? Memory & Cognition, 38, 431-440.
Pyc, M. A., & Rawson, K. A. (2009). Testing the retrieval effort hypothesis: Does greater difficulty correctly recalling information lead to higher levels of memory? Journal of Memory and Language, 60, 437-447.
Pyc, M. A., & Rawson, K. A. (2010). Why testing improves memory: Mediator effectiveness hypothesis. Science, 330, 335.
Pyc, M. A., & Rawson, K. A. (2012a). Are judgments of learning made after correct responses during retrieval practice sensitive to lag and criterion level effects? Memory & Cognition, 40, 976-988.
Pyc, M. A., & Rawson, K. A. (2012b). Why is test-restudy practice beneficial for memory? An evaluation of the mediator shift hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38, 737-746.
Pylyshyn, Z. W. (1981). The imagery debate: Analogue media versus tacit knowledge. Psychological Review, 88, 16-45.
Ramsay, C. M., Sperling, R. A., & Dornisch, M. M. (2010). A comparison of the effects of students' expository text comprehension strategies. Instructional Science, 38, 551-570.
Raney, G. E. (1993). Monitoring changes in cognitive load during reading: An event-related brain potential and reaction time analysis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 51-69.
Rasco, R. W., Tennyson, R. D., & Boutwell, R. C. (1975). Imagery instructions and drawings in learning prose. Journal of Educational Psychology, 67, 188-192.
Rau, M. A., Aleven, V., & Rummel, N. (2010). Blocked versus interleaved practice with multiple representations in an intelligent tutoring system for fractions. In V. Aleven, J. Kay, & J. Mostow (Eds.), Intelligent tutoring systems (pp. 413-422). Berlin/Heidelberg, Germany: Springer-Verlag.
Raugh, M. R., & Atkinson, R. C. (1975). A mnemonic method for learning a second-language vocabulary. Journal of Educational Psychology, 67, 1-16.
Rawson, K. A. (2012). Why do rereading lag effects depend on test delay? Journal of Memory and Language, 66, 870-884.
Rawson, K. A., & Dunlosky, J. (2011). Optimizing schedules of retrieval practice for durable and efficient learning: How much is enough? Journal of Experimental Psychology: General, 140, 283-302.
Rawson, K. A., & Kintsch, W. (2005). Rereading effects depend upon the time of test. Journal of Educational Psychology, 97, 70-80.
Rawson, K. A., & Van Overschelde, J. P. (2008). How does knowledge promote memory? The distinctiveness theory of skilled memory. Journal of Memory and Language, 58, 646-668.
Rea, C. P., & Modigliani, V. (1985). The effect of expanded versus massed practice on the retention of multiplication facts and spelling lists. Human Learning, 4, 11-18.
Recht, D. R., & Leslie, L. (1988). Effect of prior knowledge on good and poor readers' memory of text. Journal of Educational Psychology, 80, 16-20.
Rees, P. J. (1986). Do medical students learn from multiple choice examinations? Medical Education, 20, 123-125.
Rewey, K. L., Dansereau, D. F., & Peel, J. L. (1991). Knowledge maps and information processing strategies. Contemporary Educational Psychology, 16, 203-214.
Reynolds, J. H., & Glaser, R. (1964). Effects of repetition and spaced review upon retention of a complex learning task. Journal of Educational Psychology, 55, 297-308.
Riches, N. G., Tomasello, M., & Conti-Ramsden, G. (2005). Verb learning in children with SLI: Frequency and spacing effects. Journal of Speech, Language, and Hearing Research, 48, 1397-1411.
Rickard, T. C., Lau, J. S.-H., & Pashler, H. (2008). Spacing and the transition from calculation to retrieval. Psychonomic Bulletin & Review, 15, 656-661.
Rickards, J. P., & August, G. J. (1975). Generative underlining strategies in prose recall. Journal of Educational Psychology, 67, 860-865.


Rickards, J. P., & Denne
lining and adjunct quest
tional Science, 8, 81-90
Rinehart, S. D., Stahl, S
of summarization train
Research Quarterly, 21,
Rittle-Johnson, B. (2006)
nation and direct instru
Roediger, H. L., III, &
retrieval practice in long
ences, 15, 20-27.
Roediger, H. L., III, & Ka
memory: Basic research
tice. Perspectives on Ps
Roediger, H. L., III, & K
ing: Taking memory tes
logical Science, 17, 249
Roediger, H. L., III, &
negative consequences
Experimental Psycholog
31, 1155-1159.
Roediger, H. L., III, Putn
efits of testing and thei
chology of Learning an
Rohrer, D. (2009). The ef
lems. Journal for Resear
Rohrer, D., & Taylor, K
distributed practice on t
Applied Cognitive Psyc
Rohrer, D., & Taylor, K.
lems improves learning
Rohrer, D., Taylor, K., &
fer of learning. Journa
Memory, and Cognition
Rose, D. S., Parks, M.,
Imagery-based learning:
comprehension with dr
tional Research, 94(1),
Ross, S. M., & Di Vesta
strategy for enhancing
cational Psychology, 68
Rothkopf, E. Z. (1968). T
inspection. Journal of E
Runquist, W. N. (1983). S
Memory & Cognition, 1
Santrock, J. (2008). Ed
McGraw-Hill.
Schmidmaier, R., Ebersbach, R., Schiller, M., Hege, I., Hlozer, M., & Fischer, M. R. (2011). Using electronic flashcards to promote learning in medical students: Retesting versus restudying. Medical Education, 45, 1101-1110.
Schmidt, R. A., & Bjork, R. A. (1992). New conceptualizations of practice: Common principles in three paradigms suggest new concepts for training. Psychological Science, 3, 207-217.
Schmidt, S. R. (1988). Test expectancy and individual-item versus relational processing. The American Journal of Psychology, 101, 59-71.
Schneider, V. I., Healy, A. F., & Bourne, L. E. (1998). Contextual interference effects in foreign language vocabulary acquisition and retention. In A. F. Healy & L. E. Bourne (Eds.), Foreign language learning (pp. 77-90). Mahwah, NJ: Lawrence Erlbaum.
Schneider, V. I., Healy, A. F., & Bourne, L. E. (2002). What is learned under difficult conditions is hard to forget: Contextual interference effects in foreign vocabulary acquisition, retention, and transfer. Journal of Memory and Language, 46, 419-440.
Schworm, S., & Renkl, A. (2006). Computer-supported example-based learning: When instructional explanations reduce self-explanations. Computers & Education, 46, 426-445.
Scruggs, T. E., Mastropieri, M. A., & Sullivan, G. S. (1994). Promoting relational thinking: Elaborative interrogation for students with mild disabilities. Exceptional Children, 60, 450-457.
Scruggs, T. E., Mastropieri, M. A., Sullivan, G. S., & Hesser, L. S. (1993). Improving reasoning and recall: The differential effects of elaborative interrogation and mnemonic elaboration. Learning Disability Quarterly, 16, 233-240.
Seabrook, R., Brown, G. D. A., & Solity, J. E. (2005). Distributed and massed practice: From laboratory to classroom. Applied Cognitive Psychology, 19, 107-122.
Seifert, T. L. (1993). Effects of elaborative interrogation with prose passages. Journal of Educational Psychology, 85, 642-651.
Seifert, T. L. (1994). Enhancing memory for main ideas using elaborative interrogation. Contemporary Educational Psychology, 19, 360-366.
Shapiro, A. M., & Waters, D. L. (2005). An investigation of the cognitive processes underlying the keyword method of foreign vocabulary learning. Language Teaching Research, 9(2), 129-146.
Shriberg, L. K., Levin, J. R., McCormick, C. B., & Pressley, M. (1982). Learning about "famous" people via the keyword method. Journal of Educational Psychology, 74(2), 238-247.
Simmons, A. L. (2011). Distributed practice and procedural memory consolidation in musicians' skill learning. Journal of Research in Music Education, 59, 357-368.
Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning and Memory, 4, 592-604.
Slavin, R. E. (2009). Educational psychology: Theory and practice (9th ed.). Upper Saddle River, NJ: Pearson Education.
Smith, B. L., Holliday, W. G., & Austin, H. W. (2010). Students' comprehension of science textbooks using a question-based reading strategy. Journal of Research in Science Teaching, 47, 363-379.
Smith, T. A., & Kimball, D. R. (2010). Learning from feedback: Spacing and the delay-retention effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36, 80-95.
Snowman, J., McCown, R., & Biehler, R. (2009). Psychology applied to teaching (12th ed.). Boston, MA: Houghton Mifflin.
Sobel, H. S., Cepeda, N. J., & Kapler, I. V. (2011). Spacing effects in real-world classroom vocabulary learning. Applied Cognitive Psychology, 25, 763-767.


Sommer, T., Schoell, E., & Büchel, C. (2008). Associa
of the memory for object-location associations as re
testing effect. Acta Psychologica, 128, 238-248.
Son, L. K. (2004). Spacing one's study: Evidence for
control strategy. Journal of Experimental Psycholog
Memory, and Cognition, 30, 601-604.
Sones, A. M., & Stroud, J. B. (1940). Review, wit
ence to temporal position. Journal of Educational
665-676.
Spitzer, H. F. (1939). Studies in retention. Journal of Educational Psychology, 30, 641-656.
Spörer, N., Brunstein, J. C., & Kieschke, U. (2009). Improving students' reading and comprehension skills: Effects of strategy instruction and reciprocal teaching. Learning and Instruction, 19, 272-286.
Spurlin, J. E., Dansereau, D. F., O'Donnell, A., & Brooks, L. W. (1988). Text processing: Effects of summarization frequency on text recall. Journal of Experimental Education, 56, 199-202.
Stein, B. L., & Kirby, J. R. (1992). The effects of text absent and text present conditions on summarization and recall of text. Journal of Reading Behavior, 24, 217-231.
Stein, B. S., & Bransford, J. D. (1979). Constraints on effective elaboration: Effects of precision and subject generation. Journal of Verbal Learning and Verbal Behavior, 18, 769-777.
Sternberg, R. J., & Williams, W. M. (2010). Educational psychology (2nd ed.). Upper Saddle River, NJ: Pearson.
Stigler, J. W., Fuson, K. C., Ham, M., & Kim, M. S. (1986). An analysis of addition and subtraction word problems in American and Soviet elementary mathematics textbooks. Cognition and Instruction, 3, 153-171.
Stordahl, K. E., & Christensen, C. M. (1956). The effect of study techniques on comprehension and retention. Journal of Educational Research, 49, 561-570.
Sumowski, J. F., Chiaravalloti, N., & DeLuca, J. (2010). Retrieval practice improves memory in multiple sclerosis: Clinical application of the testing effect. Neuropsychology, 24, 267-272.
Taylor, K., & Rohrer, D. (2010). The effects of interleaved practice. Applied Cognitive Psychology, 24, 837-848.
Thiede, K. W., & Anderson, M. C. M. (2003). Summarizing can improve metacomprehension accuracy. Contemporary Educational Psychology, 28, 129-160.
Thomas, M. H., & Wang, A. Y. (1996). Learning by the keyword mnemonic: Looking for long-term benefits. Journal of Experimental Psychology: Applied, 2, 330-342.
Thompson, S. V. (1990). Visual imagery: A discussion. Educational Psychology, 10(2), 141-182.
Thorndike, E. L. (1906). The principles of teaching based on psychology. New York, NY: A.G. Seiler.
Todd, W. B., & Kessler, C. C., III. (1971). Influence of response mode, sex, reading ability, and level of difficulty on four measures of recall of meaningful written material. Journal of Educational Psychology, 62, 229-234.
Toppino, T. C. (1991). The spacing effect in young children's free recall: Support for automatic-process explanations. Memory & Cognition, 19, 159-167.
Toppino, T. C., & Cohen, M. S. (2009). The testing effect and the retention interval. Experimental Psychology, 56, 252-257.
Toppino, T. C., Cohen, M. S., Davis, M. L., & Moors, A. C. (2009). Metacognitive control over the distribution of practice: When is spacing preferred? Journal of Experimental Psychology, 35, 1352-1358.
Toppino, T. C., Fearnow-Kenney, M. D., Kiepert, M. H., & Teremula, A. C. (2009). The spacing effect in intentional and incidental free recall by children and adults: Limits on the automaticity hypothesis. Memory & Cognition, 37, 316-325.
Toppino, T. C., Kasserman, J. E., & Mracek, W. A. (1991). The effect of spacing repetitions on the recognition memory of young children and adults. Journal of Experimental Child Psychology, 51, 123-138.
Tse, C.-S., Balota, D. A., & Roediger, H. L., III. (2010). The benefits and costs of repeated testing on the learning of face-name pairs in healthy older adults. Psychology and Aging, 25, 833-845.
van Hell, J. G., & Mahn, A. C. (1997). Keyword mnemonics versus rote rehearsal: Learning concrete and abstract foreign words by experienced and inexperienced learners. Language Learning, 47(3), 507-546.
van Merriënboer, J. J. G., de Croock, M. B. M., & Jelsma, O. (1997). The transfer paradox: Effects of contextual interference on retention and transfer performance of a complex cognitive skill. Perceptual & Motor Skills, 84, 784-786.
van Merriënboer, J. J. G., Schuurman, J. G., de Croock, M. B. M., & Paas, F. G. W. C. (2002). Redirecting learners' attention during training: Effects on cognitive load, transfer test performance and training efficiency. Learning and Instruction, 12, 11-37.
Vaughn, K. E., & Rawson, K. A. (2011). Diagnosing criterion level effects on memory: What aspects of memory are enhanced by repeated retrieval? Psychological Science, 22, 1127-1131.
Verkoeijen, P. P. J. L., Rikers, R. M. J. P., & Özsoy, B. (2008). Distributed rereading can hurt the spacing effect in text memory. Applied Cognitive Psychology, 22, 685-695.
Vlach, H. A., Sandhofer, C. M., & Kornell, N. (2008). The spacing effect in children's memory and category induction. Cognition, 109, 163-167.
Vojdanoska, M., Cranney, J., & Newell, B. R. (2010). The testing effect: The role of feedback and collaboration in a tertiary classroom setting. Applied Cognitive Psychology, 24, 1183-1195.
Wade, S. E., Trathen, W., & Schraw, G. (1990). An analysis of spontaneous study strategies. Reading Research Quarterly, 25, 147-166.
Wade-Stein, D., & Kintsch, W. (2004). Summary street: Interactive computer support for writing. Cognition and Instruction, 22, 333-362.
Wahlheim, C. N., Dunlosky, J., & Jacoby, L. L. (2011). Spacing enhances the learning of natural concepts: An investigation of mechanisms, metacognition, and aging. Memory & Cognition, 39, 750-763.
Wang, A. Y., & Thomas, M. H. (1995). Effects of keywords on long-term retention: Help or hindrance? Journal of Educational Psychology, 87, 468-475.
Wang, A. Y., Thomas, M. H., & Ouellette, J. A. (1992). Keyword mnemonic and retention of second-language vocabulary words. Journal of Educational Psychology, 84, 520-528.


Weinstein, Y., McDerm[...] comparison of study st[...]ing questions, and gener[...] Psychology: Applied, 16[...]
Wenger, S. K., Thomps[...] facilitates subsequent re[...]chology: Human Learnin[...]
Wheeler, M. A., Ewers[...] rates of forgetting follow[...] 571-580.
Willerman, B., & Melvin, B. (1979). Reservations about the keyword mnemonic. The Canadian Modern Language Review, 35(3), 443-453.
Willoughby, T., Waller, T. G., Wood, E., & MacKinnon, G. E. (1993). The effect of prior knowledge on an immediate and delayed associative learning task following elaborative interrogation. Contemporary Educational Psychology, 18, 36-46.
Willoughby, T., & Wood, E. (1994). Elaborative interrogation examined at encoding and retrieval. Learning and Instruction, 4, 139-149.
Wissman, K. T., Rawson, K. A., & Pyc, M. A. (2012). How and when do students use flashcards? Memory, 20, 568-579.
Wittrock, M. C. (1990). Generative processes of comprehension. Educational Psychologist, 24, 345-376.
Wollen, K. A., Cone, R. S., Britcher, J. C., & Mindemann, K. M. (1985). The effect of instructional sets upon the apportionment of study time to individual lines of text. Human Learning, 4, 89-103.
Woloshyn, V. E., Paivio, A., & Pressley, M. (1994). Use of elaborative interrogation to help students acquire information consistent with prior knowledge and information inconsistent with prior knowledge. Journal of Educational Psychology, 86, 79-89.
Woloshyn, V. E., Pressley, M., & Schneider, W. (1992). Elaborative interrogation and prior-knowledge effects on learning of facts. Journal of Educational Psychology, 84, 115-124.
Woloshyn, V. E., & Stockley, D. B. (1995). Helping students acquire belief-inconsistent and belief-consistent science facts: Comparisons between individual and dyad study using elaborative interrogation, self-selected study and repetitious-reading. Applied Cognitive Psychology, 9, 75-89.
Woloshyn, V. E., Willoughby, T., Wood, E., & Pressley, M. (1990). Elaborative interrogation facilitates adult learning of factual paragraphs. Journal of Educational Psychology, 82, 513-524.
Wong, B. Y. L., Wong, R., Perry, N., & Sawatsky, D. (1986). The efficacy of a self-questioning summarization strategy for use by underachievers and learning disabled adolescents in social studies. Learning Disabilities Focus, 2, 20-35.
Wong, R. M. F., Lawson, M. J., & Keeves, J. (2002). The effects of self-explanation training on students' problem solving in high school mathematics. Learning and Instruction, 12, 233-262.
Wood, E., & Hewitt, K. L. (1993). Assessing the impact of elaborative strategy instruction relative to spontaneous strategy use in high achievers. Exceptionality, 4, 65-79.
Wood, E., Miller, G., Symons, S., Canough, T., & Yedlicka, J. (1993). Effects of elaborative interrogation on young learners' recall of facts. Elementary School Journal, 94, 245-254.
Wood, E., Pressley, M., & Winne, P. H. (1990). Elaborative interrogation effects on children's learning of factual content. Journal of Educational Psychology, 82(4), 741-748.
Wood, E., Willoughby, T., Bolger, A., Younger, J., & Kaspar, V. (1993). Effectiveness of elaboration strategies for grade school children as a function of academic achievement. Journal of Experimental Child Psychology, 56, 240-253.
Woolfolk, A. (2007). Educational psychology (10th ed.). Boston, MA: Allyn & Bacon.
Wulf, G., & Shea, C. H. (2002). Principles derived from the study of simple skills do not generalize to complex skill learning. Psychonomic Bulletin & Review, 9, 185-211.
Yates, F. A. (1966). The art of memory. London, England: Pimlico.
Yu, G. (2009). The shifting sands in the effects of source text summarizability on summary writing. Assessing Writing, 14, 116-137.
Zaromb, F. M., & Roediger, H. L., III. (2010). The testing effect in free recall is associated with enhanced organizational processes. Memory & Cognition, 38, 995-1008.
