Cognitive Load Theory: Implications For Medical Education: AMEE Guide No. 86
AMEE GUIDE
Abstract
Cognitive Load Theory (CLT) builds upon established models of human memory that include the subsystems of sensory, working
and long-term memory. Working memory (WM) can only process a limited number of information elements at any given time. This
constraint creates a “bottleneck” for learning. CLT identifies three types of cognitive load that impact WM: intrinsic load (associated
Med Teach Downloaded from informahealthcare.com by Central Michigan University on 12/25/14
with performing essential aspects of the task), extraneous load (associated with non-essential aspects of the task) and germane
load (associated with the deliberate use of cognitive strategies that facilitate learning). When the cognitive load associated with a
task exceeds the learner’s WM capacity, performance and learning are impaired. To facilitate learning, CLT researchers have
developed instructional techniques that decrease extraneous load (e.g. worked examples), titrate intrinsic load to the
developmental stage of the learner (e.g. simplify task without decontextualizing) and ensure that unused WM capacity is dedicated
to germane load, i.e. cognitive learning strategies. A number of instructional techniques have been empirically tested. As learners
progress, curricula must also attend to the expertise-reversal effect. Instructional techniques that facilitate learning among early
learners may not help and may even interfere with learning among more advanced learners. CLT has particular relevance to
medical education because many of the professional activities to be learned require the simultaneous integration of multiple and
varied sets of knowledge, skills and behaviors at a specific time and place. These activities possess high “element interactivity” and
therefore impose a cognitive load that may surpass the WM capacity of the learner. Applications to various medical education
For personal use only.
Introduction

Successful learning requires the interplay of multiple processes, including those in the cognitive, affective (i.e. motivation and emotion), social (i.e. interaction with and experience of others), environmental (i.e. location or setting) and metacognitive (i.e. thinking about one’s thinking) domains. Given the complexity of learning, it is not surprising that many, sometimes competing and often overlapping theories of learning have been put forward. Schunk (2012) recently categorized learning theories into neuroscience, behaviorism, social cognition, information processing, constructivism, cognitive learning, motivation, self-regulation and development (Schunk 2012). With the plethora of theories arising from disparate academic disciplines, the vocabulary can be obtuse and the arguments intense.

The debates around learning theories can be reminiscent of the story of the elephant and the six blind men (Mallisena et al. 1933). The six blind men were asked to determine what an elephant looked like by feeling different parts of the elephant’s body. They of course came to very different conclusions. The blind man who feels a leg says the elephant is like a pillar; the one who feels the tail says the elephant is like a rope; the one who feels the trunk says the elephant is like a tree branch; the one who feels the ear says the elephant is like a hand-held fan; the one who feels the belly says the elephant is like a wall; and the one who feels the tusk says the elephant is like a solid pipe. Resolution to the conflict only occurs when an “enlightened one” points out that each is describing one part of the whole. Similarly, in medical education, we have multiple theories. Each captures a “part of the whole”. However, no “enlightened one” or unifying theory of learning has (yet) emerged. Therefore, educators must select from amongst these theories and then adapt and apply them as appropriate.

Cognitive Load Theory (CLT), first described by John Sweller in 1988 (Sweller 1988), represents an important cognitive learning theory, which is receiving increasing recognition in medical education. CLT integrates three key components of the cognitive architecture: memory systems (sensory, working and long-term memory; LTM), learning processes and types of cognitive load imposed on working memory (WM). CLT has particular relevance to medical education because the tasks and professional activities to be learned require the simultaneous integration of multiple and varied sets of knowledge, skills and behaviors at a specific time and place. These tasks may overload the learner. CLT helps us understand how and why learners in the health professions struggle with mastering the complex concepts and developing toward expertise. CLT has also generated new instructional approaches that hold promise (van Merriënboer & Kirschner 2013). This guide will help medical educators understand CLT and how it can be used to optimize learning. We will
Correspondence: John Q. Young, MD, MPP, Vice Chair, Department of Psychiatry, Zucker Hillside Hospital, Hofstra North Shore-LIJ School of
Medicine, 75-59 263rd, Glen Oaks, NY 11004, USA. Tel: 718-470-4891; Fax: 718-962-7717; E-mail: [email protected]
ISSN 0142-159X print/ISSN 1466-187X online/14/50371–384 © 2014 Informa UK Ltd.
DOI: 10.3109/0142159X.2014.889290
J. Q. Young et al.
summarize CLT and the cognitive architecture it assumes and then explore how CLT informs instructional technique and curriculum design in medical education.

Practice points
• Cognitive Load Theory (CLT) builds upon an established model of human memory that includes the subsystems of sensory, working and long-term memory.
• Working memory (WM) can only process seven elements of information at any given time. This constraint creates a “bottleneck” for learning.
• CLT delineates three types of cognitive load that impact WM: intrinsic (essential to the task), extraneous (not essential to the task) and germane (load imposed by the learner’s deliberate use of cognitive strategies to facilitate learning, i.e. schemata construction).
• When the cognitive load associated with a task exceeds the learner’s WM capacity, performance and learning are impaired.
• CLT has particular relevance to medical education because the tasks are complex and may impose a cognitive load that surpasses the WM capacity of the learner.
• To facilitate learning, CLT focuses on instructional techniques that decrease extraneous load (e.g. worked examples), titrate intrinsic load to the developmental stage of the learner (e.g. simplify task without decontextualizing) and ensure that unused WM capacity is dedicated to germane load, i.e. cognitive strategies that facilitate learning.
• CLT is also consistent with an approach to curricular design called 4C/ID, which includes several important elements: authentic learning tasks, supportive information that is adapted to the expertise of the learner, feedback and opportunities for part-task practice as necessary.

CLT and the human memory system

CLT is about memory and builds upon a pre-existing model of human memory developed by Atkinson and Shiffrin in the 1960s (Atkinson & Shiffrin 1968). Figure 1 depicts how the three components of memory proposed by the Atkinson and Shiffrin model relate to each other. In short, information enters the mind through the sensory memory system. This sub-system can simultaneously process huge amounts of visual and auditory information, but retains the information only for a very short period of time (milliseconds). Information raised to awareness enters the domain of WM. WM (re-)organizes the information so that it may be efficiently stored as packages in LTM. The LTM has theoretically limitless capacity in terms of duration and volume, but a route map is required to find the information. The WM encodes the information with this route map to enable retrieval when the information is needed in the future.

Unlike sensory memory and LTM, WM is not infinite. In a famous 1956 article, Miller postulated that the WM cannot process more than about seven independent units at a time (Miller 1956), an assertion that subsequent research has confirmed. The arrows in Figure 1 show the flow of information.

Sensory memory

Learning progresses through distinctive pathways of the human memory system (Issa et al. 2011). This process starts with the sensory memory system. CLT is based on the dual channel principle, the notion that learners have separate channels for perceiving and processing auditory and visual information (Paivio 1986). In medical education, the majority of sensory information comes in the form of sounds (e.g. spoken words) and images (e.g. printed words and pictures), though touch and smell are also important. Printed words and pictures (e.g. graphs and facial expression of a patient) are perceived by the eyes and briefly held in the visual sensory memory system (also called iconic memory). Spoken words and other sounds (e.g. heartbeat and the patient’s answer to a question) are perceived by the ears and briefly held in the auditory sensory memory system (echoic memory). The sensory memory system has enormous capacity: the visual and auditory systems perceive a vast amount of incoming information but can hold any given piece of information for only a very brief period of time (from less than 0.25 to 2 seconds) (Mayer 2010). Most of the perceived information does not reach conscious awareness. But when a learner attends to information in sensory memory, such as the words of an attending clinician describing the pathophysiology of congestive heart failure, the information moves to WM.

Working memory

A learner must have intact capacity for attention in order to “screen out” irrelevant stimuli (e.g. the bird chirping outside or a peer rustling through his backpack during a lecture) and “screen in” the relevant words and images (e.g. the patient’s history or rash) from the sensory memory system for processing in the WM (Mayer 2010). As noted above, a learner’s WM can hold no more than seven (±2) information elements at a time (Miller 1956) and can actively process (i.e. organize, compare and contrast) no more than two to four elements at any given moment (Kirschner et al. 2006). In addition, WM can only hold an information element for a few seconds with almost all information lost after 30 seconds unless it is actively refreshed by rehearsal (e.g. repeating to oneself an important laboratory value or phone number that one has verbally received until one is able to write it down). The limited capacity of WM has a profound impact on the rate of learning. Many learning tasks, especially complex clinical activities, entail more than seven units of information. For the learner to work within these constraints, all of the information elements must be combined and organized into a few meaningful units, also called “chunks”. Information processing in WM refers to mentally rearranging the words and images into a coherent cognitive
Cognitive Load Theory
[Figure 1 (diagram; caption not recoverable). Sensory memory (echoic and iconic): retention 25–2000 milliseconds; large capacity. Working memory: retention 15–30 seconds; capacity limited (7 ± 2 units). Long-term memory: retention and capacity theoretically infinite. Arrow labels: images, attention, encoding, storage, retrieval.]
representation (or schema) and connecting this with relevant prior knowledge activated from LTM. This occurs, for example, when we construct the schema of chicken noodle soup after observing a bowl filled with a steaming yellow liquid with noodles and small bits of white meat or when the student examines the tracings of an ECG (visual images) and identifies normal sinus rhythm with ST elevation (a cognitive representation). In each case, multiple pieces of information are re-arranged into one representation, which can then be activated in the WM as one single element.

Dual-channel theory places an additional constraint on

and/or will be used. Illness scripts represent a type of schemata (Bowen 2006; Boshuizen & Schmidt 1992); for example, the illness script for a major depressive episode organizes the various symptoms and signs into one construct. This reduces the number of individual information elements from nine or more (symptoms and signs) to one schema or chunk and helps the learner differentiate a major depressive episode from similar illnesses such as dysthymia. Thus, schemata organize knowledge in LTM and substantially reduce WM load because even a highly complex schema can be retrieved and processed as one information element in
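The chunking argument for illness scripts can be made concrete with a small sketch. It is illustrative only: the capacity of seven is the Miller (1956) figure cited in the text, treated here as a hard limit for simplicity, while the `Schema` class, the helper functions and the nine placeholder symptom labels are hypothetical names invented for this example and are not part of CLT or of any clinical instrument.

```python
from dataclasses import dataclass

WM_CAPACITY = 7  # "about seven" units (Miller 1956); a hard limit is a simplifying assumption


@dataclass(frozen=True)
class Schema:
    """A chunk held in LTM: many elements that are retrieved as ONE unit in WM."""
    name: str
    elements: tuple


def wm_units(items):
    """Each item costs one WM unit, whether it is a raw element or a whole schema."""
    return len(items)


def fits_in_wm(items, capacity=WM_CAPACITY):
    """True when the items can be held in WM at the same time."""
    return wm_units(items) <= capacity


# Hypothetical labels standing in for "nine or more" symptoms and signs.
symptoms = tuple(f"symptom_{i}" for i in range(1, 10))

# Early learner: every symptom is a separate information element.
novice_view = list(symptoms)

# More advanced learner: the same information packaged as one illness script.
illness_script = Schema("major depressive episode", symptoms)
expert_view = [illness_script]

print(wm_units(novice_view), fits_in_wm(novice_view))  # 9 False
print(wm_units(expert_view), fits_in_wm(expert_view))  # 1 True
```

The information content is identical in both views; only the packaging differs, which is why the same case can overload one learner and leave another with capacity to spare.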
board configuration for five seconds from a real game and then asked to reproduce that configuration on a new board, chess masters were able to reproduce the board with 70% accuracy compared with 30% accuracy for amateur players. Yet, when random board configurations were used, both groups performed equally poorly. Masters were only superior on configurations taken from real games (Chase & Simon 1973; Sweller & van Merrienboer 2013). These results have been replicated in a variety of other areas, including baseball (Chiesi et al. 1979), electronics (Egan & Schwartz 1979) and algebra (Sweller & Cooper 1985). It can take years and thousands of hours of practice to obtain the knowledge associated with high levels of problem-solving skills (Ericsson & Charness 1994). To use the computer analogy from above, the seven documents that experts are able to open in WM have much more information that is of higher quality and better organized than non-experts.

Cognitive load and CLT

Although schemata are stored in LTM, their construction and refinement occurs in WM. CLT was initially developed by John Sweller in the 1980s (Sweller 1988). As described above, CLT starts with the premise that each learner has limited WM. This premise has important implications for instructional design. Because learning requires the processing of information in WM, learning suffers when the cognitive load of the task exceeds the WM capacity of the trainee. Therefore, CLT prioritizes optimizing information processing in WM. CLT identifies three types of cognitive load:
(1) Intrinsic load: load associated with the task.
(2) Extraneous load: load not essential to the task.

nentially. This makes trial and error (or random) testing of possible combinations effectively impossible when there is a high degree of interactivity. An example from Anatomy and Physiology can illustrate this. Learning the anatomy of the heart, including the four chambers, the septum, the valves, etc., has relatively low element interactivity, i.e. the names of structures do not change due to interactions between the parts. In contrast, learning about cardiac output has much higher element interactivity: preload, afterload and contractility (three information elements) interact to determine stroke volume (another information element), which in turn interacts with heart rate (yet another element) to determine cardiac output. A change in one factor (such as preload) will influence other factors (such as stroke volume). The higher degree of interactivity increases the intrinsic load.

The intrinsic load imposed by element interactivity can be modulated by the learner’s expertise (i.e. the availability and automaticity of their schemata). When a more advanced learner already possesses a schema that incorporates some or all of the interacting elements into a single element (e.g. the construct of stroke volume, which then entails the three elements of preload, afterload and contractility), the intrinsic load of that learning task is reduced. Therefore, intrinsic load generated by a task cannot be altered by instructional interventions without either simplifying the task to be learned or first enhancing the expertise of the learners by providing preparatory training prior to the task.

Extraneous cognitive load

Extraneous load refers to the load imposed upon the trainee’s WM but not necessary for learning the task at hand, i.e. for schemata construction or automation. CLT emphasizes how
instructional techniques can inadvertently impose extraneous load by, for example, providing insufficient guidance and thereby forcing learners to employ weak problem-solving methods such as trial and error or to search for information needed to complete the task. Similarly, when information necessary for learning is distributed in space (e.g. requiring multiple textbooks or with the physical separation of the written text from the accompanying pictures) or time (e.g. across different lectures), scarce WM resources are used to search for the information and bring it together. A teacher provides visual overload when he shows full text slides but allows too little time for the learners to read them; if, in addition, he gives simultaneous verbal information that does not align with the (visual) slides, distracting (extraneous) cognitive load is introduced that will impair both channels of information. Extraneous load arises when information that is

[Figure 3. The composition of cognitive load in early and advanced learners performing a similar task. Panels: (A) Cognitive load of an early learner performing the task (extraneous, intrinsic); (B) Cognitive load of an advanced learner performing the task with no intention to learn (extraneous, intrinsic); (C) Cognitive load of an advanced learner performing the task and learning (extraneous, intrinsic, germane).]
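The three panels of Figure 3 can be read as a simple budget. The sketch below is only one way to express that reading. It assumes the loads add up against a fixed WM capacity, which the text's phrase "total load" suggests but does not formalize. The capacity of 10 units, the load values and the `Scenario` class are hypothetical and chosen purely so the arithmetic is easy to follow.

```python
from dataclasses import dataclass

WM_CAPACITY = 10  # arbitrary units; hypothetical value for illustration only


@dataclass
class Scenario:
    label: str
    extraneous: int
    intrinsic: int
    intends_to_learn: bool

    def task_load(self) -> int:
        # Load needed just to perform the task (assumed to be additive).
        return self.extraneous + self.intrinsic

    def spare_capacity(self) -> int:
        # WM left over once the task itself is covered; never negative.
        return max(0, WM_CAPACITY - self.task_load())

    def germane_load(self) -> int:
        # Spare capacity becomes germane load only if the learner devotes it to learning.
        return self.spare_capacity() if self.intends_to_learn else 0

    def overloaded(self) -> bool:
        return self.task_load() > WM_CAPACITY


# Extraneous load is held constant across panels, as in Figure 3.
scenarios = [
    Scenario("A: early learner", extraneous=3, intrinsic=8, intends_to_learn=True),
    Scenario("B: advanced, not learning", extraneous=3, intrinsic=4, intends_to_learn=False),
    Scenario("C: advanced, learning", extraneous=3, intrinsic=4, intends_to_learn=True),
]

for s in scenarios:
    print(s.label, s.task_load(), s.germane_load(), s.overloaded())
# A: early learner 11 0 True
# B: advanced, not learning 7 0 False
# C: advanced, learning 7 3 False
```

Under these assumed numbers the early learner is overloaded by the task alone, while the advanced learner has the same spare capacity in panels B and C and differs only in whether it is used for learning.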
the task is high for that particular learner. If the task-associated intrinsic load is low, then the extraneous load may not harm learning as long as the total load remains within the learner’s WM limitations (Carlson et al. 2003).

Germane cognitive load

Germane load refers to the load imposed by the mental processes necessary for learning (such as schemata formation and automation) to occur. There is some debate as to whether germane load constitutes its own category or is best understood as a constituent of intrinsic load. We conceptualize germane load as representing the effort associated with learning that is separate and in addition to the effort associated with holding the relevant interacting elements in WM, i.e. the intrinsic load of performing the task. Put simply, germane load can be viewed as the learner’s level of concentration devoted to learning (as opposed to performing the task). Germane load is regulated by the individual. When the extraneous and/or intrinsic load are too high and approach or exceed the learner’s WM limits, there will be insufficient WM resources available for the germane load necessary for learning (e.g. combining the new information elements with already existing schemata in LTM).

Figure 3 shows how a novice and advanced trainee will experience the same task differently with respect to cognitive load. In all three scenarios, extraneous load is the same. For the novice (Figure 3A), the task is complex and requires more effort merely to execute. The intrinsic load caused by the task is high for this learner, whose WM will become easily overwhelmed, leaving no WM resources for learning (germane load) and, in this case, insufficient WM for the task itself. Figure 3(B) illustrates the principal difference between

Theoretically, however, there is unused WM space that can be allocated to learning (germane load) as is shown in Figure 3(C).

Measuring cognitive load

As a result of organizing knowledge elements into a cognitive schema which can then be treated as one element in WM, an identical task may surpass the WM capacity in one learner but not in a more skilled other learner. It is therefore important to account for the interaction between the cognitive load imposed by a given task and the learner’s level of competence and the quality of her schemata at that time. The concept of “mental effort” does this by representing the proportion of a learner’s WM capacity that is allocated to a given task. Mental effort varies directly with cognitive load and inversely with freely available cognitive capacity. A number of measurement techniques have been tested, including learner self-rating of effort (during the task) or difficulty (after the task), response time to a secondary task presented during the task, performance (e.g. number of errors per task) and psychophysiological measures (e.g. heart rate variability or electrical skin conductance; van Merriënboer & Sweller 2005; DeLeeuw & Mayer 2008). Learner self-rating has been the most commonly used strategy because it is inexpensive and has established validity (Paas et al. 2003). Moreover, self-rating instruments have recently been developed that aim to measure not only overall cognitive load but also intrinsic (e.g. rate the complexity of the topic covered in the activity), extraneous (e.g. rate the clarity of the instruction for the activity) and germane (e.g. rate how much the activity enhanced your understanding of the topic) load separately (DeLeeuw & Mayer 2008; Leppink et al. 2013). CLT researchers have also developed measures of the quality
of the available schemata, a critical determinant of how much of WM’s resources will be allocated for a given task (Kalyuga 2009).

CLT and the learning processes

CLT is applicable to all activities in life that involve executing tasks, but has been mostly studied in the setting of education. As already discussed, CLT focuses on the management of WM during learning. The two major learning processes are schemata construction and automation. Learners construct schemata (also referred to as scripts) during knowledge acquisition and problem-solving by combining and re-combining elements together into larger and more refined chunks. Several cognitive processes facilitate this process, including the following: (1) activating prior knowledge; (2) comparing new information with what they already know; and (3) elaborating knowledge, i.e. incorporating new elements into schemata already stored in LTM or obtaining already schematized information from other people such as supervisors or peers (Taylor & Hamdy 2013).

With extensive practice, a schema can become fully automated and can act as a central processor, organizing information and knowledge without conscious effort, and, therefore, without burdening WM. With automation, familiar

Synergies with other learning perspectives

There are, of course, numerous theories of learning. A number of these other perspectives complement CLT and help us develop a broader view of learning. We briefly describe the following theories: situated cognition, self-regulation and emotion and motivation theories. These theories argue that familiarity with the task and the schema(ta) activated (i.e. chunking) are critical elements of cognitive load but do not provide a complete view of cognitive load in a given situation.

Situated cognition argues that thinking is “situated” or nested in the specifics of the encounter (Brown et al. 1989). In other words, participants other than the learner (e.g. in a clinical setting, the patient and perhaps the nurse and/or the attending) and the environment (e.g. ambulatory or inpatient) influence and interact with the learner. From this perspective, the physician’s cognitive reserve (load) is influenced by these participant and encounter factors. The greater the number of elements and their interactivity, the greater the expected impact on cognitive load.

Similar to situated cognition, self-regulation argues for an emergent result (outcome) based on a variety of interactive elements (Cleary & Zimmerman 2001). Self-regulation divides performance into three phases: forethought (before), performance (during) and reflection (after) phases. As such, cognitive load is influenced by the varying cognitive demands of these
have built an elaborate LTM in their professional domain, with efficient chunks, schemata or scripts and have used these pathways to arrive at solutions on a regular basis, which in turn enhances their ease of retrieval (Ericsson 2006).

Expert clinical reasoning utilizes two modes of thinking: a rapid generation of ideas that appear as recognized patterns and a slow analytic reasoning process (Eva 2005). For routine problems, and for very experienced experts, pattern recognition is dominant and leads to correct solutions most of the time. Novel problems, on the other hand, require analytic reasoning. Novices cannot adequately recognize patterns and draw conclusions based upon them; doing this leads to guessing. They must always use analytic reasoning. Nobel prize winner Kahneman has named these two modes: System 1 thinking for the rapid pattern recognition and System 2 thinking for the slow analytic reasoning (Kahneman 2011). Other system models make this same distinction between automatic processing, which is fast, unconscious, inflexible and intuitive because it uses mental shortcuts (System 1), and controlled processing, which is slow, conscious, flexible and effortful (System 2) (Shiffrin & Schneider 1977; Kahneman 2011). Importantly, Systems 1 and 2 not only work in parallel but also interact with each other. In particular, System 2 can be employed to monitor the quality of the answers provided by System 1; and if it is convinced that our intuition is wrong, then it is capable of correcting or overriding the automatic judgments. Novices and experts thus differ from each other in both System 1 and 2 processing.

Van Merriënboer (2013) describes the implications of System 1–System 2 models for training complex skills such as clinical reasoning (van Merriënboer 1997, 2013; van Merriënboer & Kirschner 2013). First, it is clear that practice aimed at the development of such skills must attend to the development of both Systems 1 and 2 processing, and that learners must also learn to co-ordinate both types of processing. In other words, practice must aim at the development of routine aspects of behavior as well as the development of non-routine aspects of behavior, such as conscious reasoning (i.e. use of domain knowledge to infer tentative problem solutions) and conscious decision-making (i.e. use of cognitive strategies to approach problems in a systematic fashion). For a novice learner, those aspects that need to be developed into System 1 behaviors are called recurrent skills (van Merriënboer 1997); they are treated as being consistent from problem situation to problem situation. Critical to the development of recurrent skills is repetitive practice. For example, after vast amounts of repetitive practice, pathologists become expert microscope users because they have developed cognitive rules that drive particular actions under particular circumstances: their finger movements to zoom in, zoom out, and position the slide are directly (unconsciously) driven by System 1 regardless of whether the slide is showing infectious, vascular, nutritional or other injuries. Repetitive practice also yields cognitive expertise; for example, the ability to immediately distinguish normal from abnormal tissue. In contrast, those aspects that need to be developed into non-routine, System 2 behaviors are called non-recurrent skills; these behaviors differ from problem situation to problem situation. Critical to the development of non-recurrent skills is variability of practice (Paas & Van Merrienboer 1994), meaning that learners should practice on problems that differ in the same dimensions as in the real world. For example, only after seeing bipolar illness or pneumonia in multiple settings and scenarios do physicians become adept at modifying their diagnostic and treatment strategies to the various (typical and atypical) presentations of those illnesses.

Complex skills such as clinical reasoning develop over time as a function of practice. According to traditional phase models (Dreyfus & Dreyfus 1980), an expert would simply be described as someone who has automated most of his or her task performance. CLT, however, is more in line with System models according to which experts not only differ from novices in that they have automated many routine aspects of tasks (i.e. superior System 1 functioning), but their deep understanding of the domain (i.e. in rich cognitive schemata) also allows them to recognize and interpret new problem situations in more general terms, to monitor and to reflect on the quality of their own performance and to detect and correct errors (i.e. superior System 2 functioning).

The dual reasoning systems model in clinical reasoning has recently been critiqued and the cognitive continuum theory has been proposed as an alternative (Custers 2013). This perspective argues that mental processing does not consist of two distinct modes (either one or the other). Rather mental processing occurs along a continuum, with System 1 and System 2 representing the two poles. Furthermore, cognitive continuum theory argues that most clinical situations require a mode of thinking somewhere in between pure System 1 (intuitive) or System 2 (rational) thinking, i.e. a form of quasi-rational thinking. From an educational point of view, the task is to prepare trainees to move between these modes effectively and appropriately, whether these modes represent dual systems or lie upon a continuum.

“Encapsulation” has been proposed as the mechanism by which schemata are automated and effective System 1 thinking emerges (Boshuizen & Schmidt 1992; Schmidt & Boshuizen 1993). Studies on novices, intermediate learners and experts demonstrate that biomedical knowledge is efficiently stored in LTM, but that increasing levels of expertise are associated with less conscious application of that knowledge. That knowledge, however, has not been forgotten (erased) but rather embedded within more elaborate schemata. These schemata in clinical medicine that constitute the chunks of encapsulated knowledge in LTM have been called “illness scripts”. Illness scripts include three features of a disease entity: causal factors and etiology (called “enabling factors”), physical disease mechanism (“fault”) and the resulting signs, symptoms and prognosis (“consequences”) (Feltovich & Barrows 1984; Schmidt & Boshuizen 1993; Custers et al. 1998). When necessary, illness scripts and the embedded biomedical knowledge can be unpacked and the elements used separately. In terms of memory architecture, the expert deals with familiar clinical situations stored as illness scripts in LTM as single units in the WM, only to be de-capsulated when something unfamiliar happens. This frees much of the WM to enable the processing of other information.

The lesson for the teaching of clinical reasoning is that starting with studying complete (whole-task) cases that are
relatively simple, and thus not necessarily “authentic”, can help with building rough illness scripts that can be refined later. By repeatedly analyzing cases with relatively few features, students eventually develop the ability to intuitively recognize groups of features that are caused by the same disease (Custers 2013). With this foundation, students may enter the clinical environment and experience the subtle differences in cases that enable refinement.

CLT and instructional design

Medical education is in reality a continuum of activities that spans undergraduate medical education, graduate medical education and continuing medical education. CLT provides a framework for the design and implementation of these activities. In particular, CLT contends that we can best facilitate [...] techniques helpful to early learners (e.g. decreasing extraneous load or simplifying the task) are not helpful to experts and can even result in worse performance (van Merriënboer & Sweller 2010). Although many examples of the expertise-reversal effect have been reported in the literature (Kalyuga 2007), the use of worked examples and conventional problems provides a good illustration. Novice learners can only solve conventional problems through weak-method problem solving (e.g. means-ends analysis), which, in turn, imposes a high extraneous cognitive load and does not help novice learners to construct cognitive schemas in LTM. Thus, novice learners learn more from studying worked examples than from solving the equivalent problems. For more advanced learners, worked examples become superfluous because they have already developed useful schemas in LTM. The presentation of worked examples may even interfere with the schemas they have available in memory. Thus, in contrast to novice learners, more advanced learners learn more from solving conventional problems than from studying the equivalent worked examples. Learning how to complete a history provides another example. To assist early learners in medicine, mnemonic aids are often used to facilitate recall, such as “OPQRST” for characterizing chest pain (onset, provocation, quality, radiation, severity and timing). These mnemonics can be extremely helpful when students face their first clinical encounters, but do not help experienced physicians. Use of these acronyms can in fact slow down their practice.

CLT has particular relevance to medical education in the clinical workplace because the tasks and professional activities to be learned require the simultaneous integration of multiple and varied sets of knowledge, skills, and behaviors at a specific time and place. These activities possess high element interactivity and therefore impose a cognitive load that may surpass the WM capacity of the learner. A specific approach when the learning tasks at hand are complex is to provide scaffolds (worked examples are often recommended) and to simplify tasks without de-contextualizing them (whole-task approaches are often recommended). When a task is very complex, peer collaboration has been recommended to alleviate individual cognitive load (Schunk 2012). This may imply that learners within a group would divide parts of the task among themselves.

Based on CLT, instructional approaches that have been proposed are whole-task approaches, the elaborate four-component instructional design (4C/ID) approach that is consistent with both CLT and System 1–System 2 theory, and numerous empirically derived instructional techniques. All apply to what has been called “complex learning”, i.e. the [...] been “learned” in school to what they must “do” at work (Konkola et al. 2007). One reason for this is that the educational programs decompose the real life tasks into fragments that are taught at different moments in the curriculum. The acknowledged need to integrate the teaching of similar topics (e.g. heart) from different disciplines (e.g. physiology, pathology, pharmacology) and courses has led to horizontally and vertically integrated curricula (O’Neill et al. 2000), but the whole-task approach goes a step further. In terms of the WM, the problem of fragmentation is that transfer requires the retrieval and combination of too many separate elements from the LTM. The WM simply cannot combine all these elements, even less so under the time and other pressures of the working environment. The learner lacks the bigger chunks, stored in the LTM, that constitute the combinations of the required elements. Think of specific declarative (what to do) and procedural (how to do it) knowledge, together with psychomotor skill and with the right attitude or context sensitivity that are all necessary to perform the task in real life. Whole-task training approaches are holistic in the sense that from the start all of these are combined and resemble the real life situation.

Several innovations in medical education over the past decades employ a whole-task approach and may be successful because of their effect on the regulation of WM processes. Problem-based learning is such a holistic approach (Dolmans et al. 2013), as are horizontal integration (Harden et al. 1984) and vertical integration (Wijnen-Meijer et al. 2010). A recent discussion about the risk of fragmentation of medical competencies by applying a competency-based approach (Grant 1999; Lurie et al. 2009) has led to the concept of entrustable professional activities, a more holistic framework
Table 1. Summary of the four-component instructional design (4C/ID) approach to medical education.

[...] aspects of the whole task

2. Supportive information
   Explain how to systematically approach tasks in the domain and how the domain is organized
   Promote elaboration of new information through self-explanation, questioning, group discussion, etc.
   Promote reflection through cognitive feedback

Schema automation for recurrent aspects of the whole task

3. Procedural information
   Tell how to perform routine aspects of the task (how-to instructions)
   Promote the formation of automated schemas through providing just-in-time instructions precisely when learners need them during whole-task performance
   Promote learning of routines through corrective and immediate feedback

4. Part-task practice
   Provide repetitive practice for selected routine aspects of the whole task
for clinical training and assessment (ten Cate 2005; ten Cate & Scheele 2007; Pangaro & ten Cate 2013). Curriculum designers in medicine struggle with the requirements to include an ever-expanding knowledge base. The traditional and all too common solution of granting many experts space so that each may add small contributions leads to fragmentation and transfer problems. Another example is the scheduling of short clerkships, i.e. one or two or four weeks (Holmboe et al. 2011). As a reaction, longitudinal clerkships (up to one year) are being proposed (Hirsh et al. 2012).

Lengthening clinical attachments to allow for time to digest stimuli and establish coherence in the LTM is one way. Another option is to narrow down clinical experiences to a small domain, but intensify the work and responsibility in that domain. It is remarkable how well junior students are able to grasp enough of the ins and outs of those tasks to practice at a high level in patient care (Chen et al. 2014).

The 4C/ID approach

An elaborate instructional model that fully aligns with CLT is the 4C/ID approach to complex learning (van Merriënboer & Kirschner 2013). The four components (see Table 1 for a summary) focus on the specification of the following:

(1) A series of learning tasks. These should be holistic, integrated tasks authentically resembling vocational or professional practice. The series consists of tasks that increase in complexity. If the task were a patient consultation, the simple version would be a patient who communicates well, with a single question, a clear disease that requires a routine approach for diagnosis and treatment and has excellent prognosis. The complex case would be a patient with impaired communication ability, with multiple seemingly unrelated signs and symptoms, which requires elaborate investigations that may end in differential critical diagnoses with difficult treatment and suboptimal prognosis.

(2) Supportive information that is typically studied by the learner before the task requires this information. This information can be specific content information to guide a thinking step (e.g. information about the adverse effects of specific medications when anticipating a consultation of a known patient), as the information is not (yet) present and retrievable from LTM. Supportive information is relevant for the development of non-recurrent aspects of a task (System 2) and can be seen as scaffolding that should be reduced in the course of skill acquisition at one particular level of complexity. Supportive information usually has high element interactivity, which makes it less useful to be presented during the task execution.

(3) Procedural information telling the trainee what to do, step by step. It is relevant for the development of recurrent aspects of a task (System 1). As it provides little element interactivity, it can best be presented during the task, exactly when the learner needs it. Procedural information includes direct feedback information about the task execution. A clinical supervisor could provide this information.

(4) Part-task learning opportunities to rehearse and store chunks in LTM that enable gearing at higher complexity levels in subsequent tasks. One should be careful to apply part-task rehearsals as they should not become stand-alone learning tasks, but in some cases, it is very helpful to practice subtasks. In surgical specialties, suturing is evidently a part-task that can be well practiced separately.

The 4C/ID approach has been elaborated into a 10-step procedure in a book that provides many more details (van Merriënboer & Kirschner 2013).

CLT-derived instructional techniques

CLT researchers have used their model of learning to generate and test a number of instructional techniques aimed at managing cognitive load (Sweller 2005; Plass et al. 2010; Sweller et al. 2011; Sweller & van Merriënboer 2013). Table 2 describes a number of these techniques and how they might apply to medical education. The techniques are organized by the four principal instruction strategies of CLT: minimize extraneous load; manage intrinsic load when
Table 2. Applying CLT to medical education: Instructional techniques to manage cognitive load.

Each technique is listed under its CLT instructional strategy with a description, followed by its application to medical education in three settings: classroom, workplace and self-directed learning.

Strategy: Decreasing extraneous load

Worked example
Description: Provide learner with at least one demonstration of the problem solution path rather than requiring the learner to search for the solution themselves
Classroom: In renal physiology, the instructor provides an example of how she calculated the anion-gap before asking students to do so on their own
Workplace: During their first week of the Internal Medicine clerkship, the student observes an attending interview a patient. The attending stops to explain each step
Self-directed learning: An intern reviews the progress notes of their supervising resident to gain a better understanding of the components of good written documentation

Problem completion
Description: Provide learner with a partially completed problem and ask them to complete the missing steps
Classroom: To help the trainee learn how to calculate the sensitivity and specificity of a diagnostic test, the trainee is given a worksheet of partially completed problems
Workplace: The student observes the surgery until the end when asked to do the final suturing. The student is informed that a patient is having angina and then is asked to do a history and physical exam
Self-directed learning: When doing chart reviews, the student will first develop a differential diagnosis after reading the history and physical exam but before reading the assessment

Split-attention
Description: Two or more sources of information that cannot be understood in isolation are presented separately in space or time. For example, a diagram and text are unintelligible without the other and presented on separate pages
Classroom: Because copies of the lecture slides have not been distributed, a resident allocates almost all of her concentration to copying the content of the projected slides rather than making connections between the presented information and her prior knowledge
Workplace: At the beginning of a year-long rotation, the students receive orientation for activities they will not perform until the mid-point. On the inpatient ward, the laboratory values, medications administered and the physician clinical documentation are each kept in separate places
Self-directed learning: While studying the cardiovascular system at home, a student reads a textbook in which the diagram of the heart’s anatomy occurs on a separate page from the text describing the function of each anatomic structure

Modality
Description: Present information in visual and auditory modalities rather than only one modality. Working memory includes two partially independent processors, the “visual-spatial sketchpad” and the “phonological loop”. If all information is presented in visual modality, the visual system may become overloaded. By off-loading some information to the auditory system, the load on the visual system is reduced and learning may be facilitated
Classroom: Auditory explanations (rather than written explanations on the screen) accompany a digital animation of how the lungs work. (Instructional videos created by the Khan Academy illustrate this principle)
Workplace: When receiving sign-out from another resident on a patient in the ICU, the laboratory data and trends are presented in graphical format while the overall assessment is presented verbally
Self-directed learning: When practicing suturing, the student listens to instructions on an audio tape (rather than reading written instructions) while practicing on the pig’s foot

Transient information
Description: For a long, complex statement that includes a large number of novel, interacting elements, the learner has difficulty processing the statement in working memory; spoken statements are transient. If the statement is presented in written form, relevant sections can be repeatedly reconsidered because written statements remain permanently (physically) available
Classroom: Introduce pauses in the complex statement to reduce the number of interacting elements and facilitate easier memorization. If auditory explanations accompany a long and complex animation of brain functioning, it may be helpful to divide the animation into segments or allow for stop and replay.
Workplace: Before a resident is asked to offer her diagnosis and treatment recommendations for a complicated patient, the resident is allowed to first read the written history and physical exam
Self-directed learning: When taking histories for the first time, the student writes down key information elicited from the patient to help remember

Redundancy
Description: Information presented that is not required for learning, imposing extraneous cognitive load
Classroom: A diagram illustrating electrical conduction in the heart may not need a statement describing the flow of impulses.
Workplace: Interruptions such as a pager beeping while learning how to place an indwelling catheter
Self-directed learning: Students text on their smart phones about social plans during lecture

Strategy: Manage intrinsic load

Isolated elements
Description: For learning tasks that possess exceptionally high levels of element interactivity due to intrinsic cognitive load, it may be preferable to reduce the intrinsic cognitive load in the early stages to avoid overload. This is done by presenting only some of the elements initially. Strategies to help learners develop “chunks” of combined information for parts of the task to relieve overall load
Classroom: The Krebs cycle is taught in stages.
Workplace: Students learn how to take a history and how to perform a physical exam separately before being asked to do a full history and physical
Self-directed learning: With a goal of learning how to critically appraise an article, the trainee first studies the criteria for internal validity before focusing on the criteria for external validity

Progress from low- to high-physical fidelity
Description: Work from tasks with a low physical fidelity (e.g. paper-based patient cases) to tasks with a higher physical fidelity (e.g. scenarios using mannequins or standardized or real patients)
Classroom: Students are expected to work through a paper-case with increased time pressure
Workplace: Students learn how to interview patients by observing, practicing with standardized patients, interacting with real patients with direct supervision, interacting with real patients with asynchronous supervision
Self-directed learning: In practicing a case presentation, the student may first practice privately and then practice in front of a peer or friend more casually and then practice simulating the actual setting

Progress from simple to complex
Description: Gradually increase the complexity (number of interacting elements)
Classroom: Problem-based learning curricula progress from simple to more complex paper cases in which the presentation is not typical and the diagnosis is less clear
Workplace: As residents master their clinical skills, they are allowed to treat increasingly complex patients with more autonomy. This may include treating patients with higher acuity, more co-morbidities, higher-risk treatments, and/or less than expected response
Self-directed learning: While preparing for a licensing exam, the trainee starts with easier problems before progressing to more difficult problems

Strategy: Optimize germane load

Contextual interference
Description: A type of variability where different versions of a task (A, B and C) are practiced in a random (ACBBCABAC) rather than blocked order (AAA-BBB-CCC)
Classroom: In a lecture on recognizing heart murmurs, the instructor first plays each type (e.g. mitral valve regurgitation) repeatedly and then plays them in random order asking students to indicate their answer via audience response system
Workplace: A student first learns how to treat depression in a specialized depression clinic where all patients have already been diagnosed with depression and then rotates in a clinic where patients present without a diagnosis
Self-directed learning: In studying for a licensing exam, the student first does practice questions on a given topic and then takes practice tests in which all topics are covered

Variability
Description: Learning is enhanced when the variability of the task/problem is increased. This increases the number of interacting elements associated with the intrinsic cognitive load
Classroom: In a class on evidence-based medicine, the students are asked to critique the internal validity of different types of studies (RCT, observational and meta-analysis) of varying quality
Workplace: For clinical training for a given common disease, the program ensures exposure to variable presentation, age, gender, setting, co-morbid medical diagnoses of a given disease and exposes the resident to patients with similar symptoms who do not have the disease
Self-directed learning: The resident decides to moonlight at a clinic with a very different patient population than his training program

Imagination
Description: Learners perform at a higher level when asked to imagine a concept or procedure than those asked to study the same concept or procedure. Tends to apply to more expert learners when they need to entrench and automate schemata
Classroom: In a lecture on mindfulness, students are led through a guided exercise
Workplace: The attending asks the resident to imagine each step of a lumbar puncture before practicing the procedure
Self-directed learning: The student repeatedly visualizes herself doing the entire physical exam

(continued)
necessary; optimize germane load; and address the expertise-reversal effect. Examples are given for each of these strategies for three different types of instructional settings: classroom, workplace and self-directed learning. Most of these instructional techniques have been empirically tested and proved to be effective in multiple studies.

Conclusion

CLT builds upon a cognitive architecture that includes a model of human memory (sensory, working, and long-term) and assumptions about how learning occurs (schemata construction that is refined in WM and may then be encoded and automated in LTM via conscious practice). The theory draws attention to how WM, with its limited capacity, represents a “bottleneck” in the formation of LTM (or learning). Therefore, [...] learning by increasing germane load. A number of instructional techniques have been developed and empirically tested. Importantly, as learners progress, curricula must also attend to the expertise-reversal effect.

CLT offers a framework and a rich set of tools with which to design instruction. Its application to medical education is relatively new. Future research will need to identify which instructional techniques are most effective at managing cognitive load in the setting of medical education and how these techniques interact with the developmental stage of the trainee. Future research will also need to determine how best to simplify and sequence “whole-tasks” for the early learner and then how to address the expertise-reversal effect for the more advanced learner. Developing valid methods for measuring cognitive load and its components will be critical to testing both the applicability of the theory and the efficacy of techniques derived from the theory.

Notes on Contributors

JOHN Q. YOUNG, MD, MPP, is the Vice Chair for Education in the Department of Psychiatry at Hofstra North Shore-LIJ School of Medicine and The Zucker Hillside Hospital.

JEROEN VAN MERRIENBOER, PhD, is a Professor of Learning and Instruction in the Department of Educational Development and Research, Faculty of Health, Medicine and Life Sciences, Maastricht University. He is the Program Director of Research in Education, School of Health Professions Education.

STEVEN DURNING, MD, PhD, is a Professor of Medicine and Pathology at the Uniformed Services University (USU). He directs the Introduction to Clinical Reasoning Course and is a general internist.

OLLE TEN CATE, PhD, is a Professor of Medical Education and Director of the Center for Research & Development of Education at the University Medical Center Utrecht, the Netherlands.

Declaration of interest: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the article.

References

Atkinson RC, Shiffrin RM. 1968. Human memory: A proposed system and its control processes. In: Spence KW, Taylor Spence J, editors. Psychology of learning and motivation. New York: Academic Press, 89–195.
Boshuizen HP, Schmidt HG. 1992. On the role of biomedical knowledge in clinical reasoning by experts, intermediates and novices. Cogn Sci 16:153–184.
Bowen JL. 2006. Educational strategies to promote clinical diagnostic reasoning. N Engl J Med 355:2217–2225.
Brown JS, Collins A, Duguid P. 1989. Situated cognition and the culture of learning. Educ Res 18:32–42.
[...] practice by experts, non-experts, and novices. J Appl Sport Psychol 13:185–206.
Custers EJ. 2013. Medical education and cognitive continuum theory: An alternative perspective on medical problem solving and clinical reasoning. Acad Med 88:1074–1080.
Custers EJFM, Boshuizen HPA, Schmidt HG. 1998. The role of illness scripts in the development of medical diagnostic expertise: Results from an interview study. Cogn Instr 16:367–398.
Deleeuw KE, Mayer RE. 2008. A comparison of three measures of cognitive load: Evidence for separable measures of intrinsic, extraneous, and germane load. J Educ Psychol 100:223–234.
Dolmans DH, Wolfhagen IH, Van Merrienboer JJ. 2013. Twelve tips for implementing whole-task curricula: How to make it work. Med Teach 35(10):801–805.
Dreyfus SE, Dreyfus HL. 1980. A five-stage model of the mental activities involved in directed skill acquisition. Washington, DC: Storming Media.
Egan DE, Schwartz BJ. 1979. Chunking in recall of symbolic drawings. Mem Cognit 7:149–158.
Ericsson KA. 2006. The Cambridge handbook of expertise and expert performance. Cambridge, NY: Cambridge University Press.
Ericsson KA, Charness N. 1994. Expert performance: Its structure and acquisition. Am Psychol 49:725–747.
Eva KW. 2005. What every teacher needs to know about clinical reasoning. Med Educ 39:98–106.
Feltovich PJ, Barrows HS. 1984. Issues of generality in medical problem solving. In: Schmidt HG, Volder ML, Learning DISOP-B, editors. Tutorials in problem-based learning: New directions in training for the health professions. Assen, The Netherlands: Van Gorcum, 128–142.
Grant J. 1999. The incapacitating effects of competence: A critique. Adv Health Sci Educ Theory Pract 4:271–277.
Harden RM, Sowden S, Dunn WR. 1984. Educational strategies in curriculum development: The SPICES model. Med Educ 18:284–297.
Hirsh D, Walters L, Poncelet AN. 2012. Better learning, better doctors, better delivery system: Possibilities from a case study of longitudinal integrated clerkships. Med Teach 34:548–554.
Holmboe E, Ginsburg S, Bernabeo E. 2011. The rotational approach to medical education: Time to confront our assumptions? Med Educ 45:69–80.
Issa N, Schuller M, Santacaterina S, Shapiro M, Wang E, Mayer RE, Darosa DA. 2011. Applying multimedia design principles enhances learning in medical education. Med Educ 45:818–826.
Kahneman D. 2011. Thinking, fast and slow. New York: Farrar, Straus and Giroux.
Kalyuga S. 2007. Expertise reversal effect and its implications for learner-tailored instruction. Educ Psychol Rev 19:509–539.
Kalyuga S. 2009. Knowledge elaboration: A cognitive load perspective. Learn Instr 19:402–410.
Kirschner PA, Sweller J, Clark RE. 2006. Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educ Psychol 41:75–86.
Konkola R, Tuomi-Gröhn T, Lambert P, Ludvigsen S. 2007. Promoting learning and transfer between school and workplace. J Educ Work 20:211–228.
La Rochelle JS, Durning SJ, Pangaro LN, Artino AR, Van der Vleuten CPM, Schuwirth L. 2011. Authenticity of instruction and student performance: A prospective randomised trial. Med Educ 45:807–817.
Leppink J, Paas F, Van der Vleuten CP, Van Gog T, Van Merrienboer JJ. 2013. Development of an instrument for measuring different types of cognitive load. Behav Res Methods 45(4):1058–1072.
Lurie SJ, Mooney CJ, Lyness JM. 2009. Measurement of the general competencies of the accreditation council for graduate medical education: A systematic review. Acad Med 84:301–309.
Mallisena, Dhruva AB. 1933. Syadvadamanjari of Mallisena. Poona: Bhandarkar Oriental Research Institute.
Mayer RE. 2010. Applying the science of learning to medical education. Med Educ 44:543–549.
Miller GA. 1956. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychol Rev 63:81–97.
Plass JL, Moreno R, Brünken R. 2010. Cognitive load theory. Cambridge, NY: Cambridge University Press.
Ryan RM, Deci EL. 2000. Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. Am Psychol 55:68–78.
Schmidt HG, Boshuizen HP. 1993. On acquiring expertise in medicine. Educ Psychol Rev 5:205–221.
Schunk DH. 2012. Learning theories: An educational perspective. Boston: Pearson.
Shiffrin RM, Schneider W. 1977. Controlled and automatic human information processing: II. Perceptual learning, automatic attending and a general theory. Psychol Rev 84:127–190.
Sweller J. 1988. Cognitive load during problem solving: Effects on learning. Cogn Sci 12:257–285.
Sweller J. 2005. Implications of cognitive load theory for multimedia learning. New York, NY: US, Cambridge University Press.
Sweller J, Ayres PL, Kalyuga S. 2011. Cognitive load theory. New York: Springer.
Sweller J, Cooper GA. 1985. The use of worked examples as a substitute for problem solving in learning algebra. Cogn Instr 2:59–89.
Sweller J, Van Merrienboer JJG. 2013. Instructional design for medical education. In: Walsh K, editor. Oxford textbook of medical education. Oxford, UK: Oxford University, 74–85.
Sweller J, Van Merrienboer JJG, Paas FGWC. 1998. Cognitive architecture and instructional design. Educ Psychol Rev 10:251–296.
Taylor DC, Hamdy H. 2013. Adult learning theories: Implications for learning and teaching in medical education: AMEE Guide No. 83. Med Teach 35(11):e1561–e1572.
Ten Cate O. 2005. Entrustability of professional activities and competency-based training. Med Educ 39:1176–1177.
Ten Cate O, Scheele F. 2007. Competency-based postgraduate training: Can we bridge the gap between theory and clinical practice? Acad Med 82: