Caveat: Being an adaptation of a section of a chapter in my
Doctoral thesis, this is a fairly challenging article which may require
solid grounding in Applied Linguistics and Cognitive Theories of
Skill Acquisition.
1. L2-Acquisition as skill acquisition: the Anderson Model
The Anderson Model, called ACT* (Adaptive Control of Thought),
was originally created as an account of the way students internalize
geometry rules. It was later developed as a model of L2-learning
(Anderson, 1980, 1983, 2000). The fundamental epistemological
premise of adopting a skill-development model as a framework for
L2-acquisition is that language is governed by the same principles that regulate any other cognitive skill. Scholars such as McLaughlin (1987), Levelt (1989), O’Malley and Chamot (1990) and Johnson (1996) have produced persuasive arguments in favour of this notion.

Although ACT* constitutes my espoused theory of L2 acquisition, I endorse Anderson’s claim that his model alone cannot give a completely satisfactory account of L2-acquisition. It can, however, be used effectively to conceptualize at least three important dimensions of L2-acquisition which are relevant to the type of instructional approaches implemented in many schools:
(1) the acquisition of grammatical rules in explicit L2-instruction,
(2) the developmental mechanisms of language processing and
(3) the acquisition of Learning Strategies.
Figure 1: The Anderson Model (adapted from Anderson, 1983)

The basic structure of the model is illustrated in Figure 1, above.


Anderson posits three kinds of memory: Working Short-Term Memory (WSTM), Declarative Memory and Production (or Procedural) Memory. Working Memory shares the same features discussed in previous blogs (see ‘Eight important facts about Working Memory’), while Declarative and Production Memory may be seen as two subcomponents of Long-Term Memory (LTM). The model is based on the assumption that human cognition is regulated by cognitive structures (Productions) made up of ‘IF’ conditions and ‘THEN’ actions. These are activated every time the brain processes information: whenever a learner is confronted with a problem, the brain searches for a Production that matches the data pattern associated with it. For example:

IF the goal is to form the present perfect of a verb and the person is 3rd singular / THEN form the 3rd singular of ‘have’

IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed / THEN form the past participle of the verb
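
To make the IF/THEN format more concrete, here is a minimal sketch in Python of how such Productions might be represented and matched against the contents of WSTM. The Production class, the tiny past-participle lookup and the matching loop are my own illustrative assumptions, not Anderson’s formalism.

```python
from dataclasses import dataclass
from typing import Callable

IRREGULAR = {"go": "gone", "be": "been", "see": "seen"}

def past_participle(verb: str) -> str:
    # naive declarative lookup: stored irregular form if known, otherwise add '-ed'
    return IRREGULAR.get(verb, verb + "ed")

@dataclass
class Production:
    name: str
    condition: Callable[[dict], bool]   # the 'IF' side, tested against WSTM
    action: Callable[[dict], dict]      # the 'THEN' side, writes results back to WSTM

p1 = Production(
    "form-have-3sg",
    condition=lambda wm: wm.get("goal") == "present-perfect" and wm.get("person") == "3sg",
    action=lambda wm: {**wm, "aux": "has"},
)
p2 = Production(
    "form-past-participle",
    condition=lambda wm: wm.get("goal") == "present-perfect" and "aux" in wm,
    action=lambda wm: {**wm, "participle": past_participle(wm["verb"])},
)

def matching_cycle(wm: dict, productions: list) -> dict:
    # keep firing any Production whose 'IF' side matches the pattern in WSTM
    # until no Production changes the state any further
    changed = True
    while changed:
        changed = False
        for p in productions:
            if p.condition(wm):
                new_wm = p.action(wm)
                if new_wm != wm:
                    wm, changed = new_wm, True
    return wm

print(matching_cycle({"goal": "present-perfect", "person": "3sg", "verb": "go"}, [p1, p2]))
# -> {... 'aux': 'has', 'participle': 'gone'}
```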


The creation of a Production is a long and careful process since
Procedural Knowledge, once created, is difficult to alter.
Furthermore, unlike declarative units, Productions control
behavior, thus the system must be circumspect in creating them.
Once a Production has been created and proved to be successful, it
has to be automatized in order for the behavior that it controls to
happen at naturalistic rates. According to Anderson (1985), this
process goes through three stages:

(1) a Cognitive Stage, in which the brain learns a description of a skill;
(2) an Associative Stage, in which it works out a method for executing the skill;
(3) an Autonomous Stage, in which the execution of the skill becomes more and more rapid and automatic.

In the Cognitive Stage, confronted with a new task requiring a skill that has not yet been proceduralised, the brain retrieves from LTM all the declarative representations associated with that skill, using the interpretive strategies of Problem-solving and Analogy to guide behavior. This procedure is very time-consuming, as all the stages of a process must be specified in great detail and in serial order in WSTM. Although each stage is a Production, the operation of Productions in interpretation is slow and burdensome, as it is under conscious control and involves retrieving declarative knowledge from LTM. Furthermore, since this declarative knowledge must be kept in WSTM, there is a risk of cognitive overload leading to error.
Thus, for instance, in translating a sentence from the L1 into the L2, the brain will have to consciously retrieve the rules governing the use of every single L2-item, applying them one by one. In the case
of complex rules whose application requires performing several
operations, every single operation will have to be performed in
serial order under conscious attentional control. For example, in
forming the third person of the Present perfect of ‘go’, the brain may
have to: (1) retrieve and apply the general rule of the present perfect
(have + past participle); (2) perform the appropriate conjugation of
‘have’ by retrieving and applying the rule that the third person of
‘have’ is ‘has’; (3) recall that the past participle of ‘go’ is irregular;
(4) retrieve the form ‘gone’.
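
The four steps above can be made concrete with a small sketch of the interpretive route, in which every step is an explicit retrieval from a declarative store that must then be held in WSTM. The dictionary of ‘facts’ and the function name are illustrative assumptions, not Anderson’s notation.

```python
# illustrative stand-in for declarative LTM
DECLARATIVE_LTM = {
    "present perfect rule": "have + past participle",
    "3rd singular of 'have'": "has",
    "'go' is irregular": True,
    "past participle of 'go'": "gone",
}

def present_perfect_of_go_interpretively() -> str:
    wstm = []                                                 # everything is held here, serially
    wstm.append(DECLARATIVE_LTM["present perfect rule"])      # step 1: retrieve the general rule
    aux = DECLARATIVE_LTM["3rd singular of 'have'"]           # step 2: conjugate 'have'
    wstm.append(aux)
    is_irregular = DECLARATIVE_LTM["'go' is irregular"]       # step 3: recall 'go' is irregular
    participle = DECLARATIVE_LTM["past participle of 'go'"] if is_irregular else "goed"
    wstm.append(participle)                                   # step 4: retrieve 'gone'
    return f"{aux} {participle}"

print(present_perfect_of_go_interpretively())   # has gone
```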

Producing language by these means is extremely inefficient. Thus, the brain tries to sort the information into more efficient Productions. This is achieved by Compiling (‘running together’) the Productions that have already been created, so that larger groups of Productions can be used as one unit. The Compilation process consists of two sub-processes: Composition and Proceduralisation. Composition takes a sequence of Productions that follow each other in solving a particular problem and collapses them into a single Production that has the effect of the sequence. This lessens the number of steps referred to above and speeds up processing. Thus, the Productions
P1 IF the goal is to form the present perfect of a verb / THEN form the simple present of ‘have’

P2 IF the goal is to form the present perfect of a verb and the appropriate form of ‘have’ has just been formed / THEN form the past participle of the verb

would be composed as follows:

P3 IF the goal is to form the present perfect of a verb / THEN form the simple present of ‘have’ and THEN the past participle of the verb
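
A minimal sketch of what Composition amounts to, assuming each Production’s action is modelled as a plain function from the current WSTM state (a dict) to an updated state; the function names and the naive ‘-ed’ rule are illustrative assumptions.

```python
def p1_form_have(wm: dict) -> dict:
    # P1: IF the goal is the present perfect THEN form the simple present of 'have'
    return {**wm, "aux": "has" if wm.get("person") == "3sg" else "have"}

def p2_form_participle(wm: dict) -> dict:
    # P2: IF 'have' has just been formed THEN form the past participle (naive regular rule)
    return {**wm, "participle": wm["verb"] + "ed"}

def compose(first, second):
    # P3: a single Production whose action has the effect of the P1-P2 sequence
    return lambda wm: second(first(wm))

p3 = compose(p1_form_have, p2_form_participle)
print(p3({"goal": "present-perfect", "person": "3sg", "verb": "play"}))
# -> {..., 'aux': 'has', 'participle': 'played'} in a single matching step
```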
An important point made by Anderson is that newly composed
Productions are weak and may require multiple creations before
they gain enough strength to compete successfully with the
Productions from which they are created. Composition does not
replace Productions; rather, it supplements the Production set.
Thus, a composition may be created on the first opportunity but
may be ‘masked’ by stronger Productions for a number of
subsequent opportunities until it has built up sufficient strength
(Anderson, 2000). This means that even if the new Production is
more effective and efficient than the stronger Production, the latter
will be retrieved more quickly because its memory trace is stronger.

The process of Proceduralisation eliminates clauses in the condition of a Production that require information to be retrieved from LTM and held in WSTM. As a result, proceduralised knowledge becomes available much more quickly than non-proceduralised knowledge. For example, the Production P2 above would become:

IF the goal is to form the present perfect of a verb / THEN form ‘have’ and then form the past participle of the verb

The processes of Composition and Proceduralisation will, after repeated performance, eventually produce:

IF the goal is to form the present perfect of ‘play’ / THEN form ‘has played’
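
A sketch of what Proceduralisation amounts to under the same kind of assumptions: the declarative retrievals are carried out once, when the Production is created, and their results are built directly into the action, so nothing needs to be fetched from LTM at run time.

```python
IRREGULAR_PARTICIPLES = {"go": "gone", "see": "seen"}   # illustrative stand-in for LTM

def proceduralise_present_perfect(verb: str):
    # retrieval happens here, at creation time, not at run time
    participle = IRREGULAR_PARTICIPLES.get(verb, verb + "ed")
    form = f"has {participle}"
    # the returned Production consults nothing: its action simply emits the form
    return lambda: form

p_play = proceduralise_present_perfect("play")
print(p_play())   # has played
```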
For Anderson it seems reasonable to suggest that Proceduralisation
only occurs when LTM knowledge has achieved some threshold of
strength and has been used some criterion number of times. The mechanism through which the brain decides which Production should be applied in a given context is what Anderson calls Matching. When the brain is confronted with a problem, activation spreads from WSTM to Procedural Memory in search of a solution, i.e. a Production that matches the pattern of information in WSTM. If such a match is possible, a Production will be retrieved. If the pattern to be matched in WSTM corresponds to the ‘condition side’ (the ‘IF’) of a proceduralised Production, the matching will be quicker, with the ‘action side’ (the ‘THEN’) of the Production being deposited in WSTM and made immediately available for performance (execution). It is at this intermediate
stage of development that most serious errors in acquiring a skill
occur: during the conversion from Declarative to Procedural
knowledge, unmonitored mistakes may slip into performance.

The final stage consists of the process of Tuning, made up of the three sub-processes of Generalisation, Discrimination and Strengthening. Generalisation is the process by which Production rules become broader in their range of applicability, thereby allowing the speaker to generate and comprehend utterances never before encountered. Where two existing Productions partially overlap, it may be possible to combine them to create a greater level of generality by deleting a condition that was different in the two original Productions. Anderson (1982) gives the following example of generalisation from language acquisition, in which P6 and P7 become P8:
P6 IF the goal is to indicate that a coat belongs to me / THEN say ‘My coat’

P7 IF the goal is to indicate that a ball belongs to me / THEN say ‘My ball’

P8 IF the goal is to indicate that object X belongs to me / THEN say ‘My X’
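
A sketch of Generalisation along the lines of Anderson’s example, assuming each Production is modelled as a simple function: P6 and P7 differ only in one constant (‘coat’ vs ‘ball’), and replacing that constant with a variable yields the broader P8.

```python
def p6(goal):
    # original, object-specific Production
    return "My coat" if goal == ("possess", "coat") else None

def p7(goal):
    # second object-specific Production, identical apart from the constant
    return "My ball" if goal == ("possess", "ball") else None

def p8(goal):
    # generalised Production: the object slot is now a variable X
    kind, x = goal
    return f"My {x}" if kind == "possess" else None

print(p6(("possess", "coat")))    # My coat
print(p8(("possess", "pencil")))  # My pencil -- an utterance never produced before
```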

Discrimination is the process by which the range of application of a Production is restricted to the appropriate circumstances (Anderson, 1983). These two processes would account for the way language learners over-generalise rules but then learn, over time, to discriminate between, for example, regular and irregular verbs. Discrimination requires that we have examples of both correct and incorrect applications of the Production in our LTM.
Both processes are inductive in that they try to identify from
examples of success and failure the features that characterize when
a particular Production rule is applicable. These two processes
produce multiple variants on the conditions (the ‘IF’ clause(s) of a
Production) controlling the same action. Thus, at any point in time
the system is entertaining as its hypothesis not just a single
Production but a set of Productions with different conditions to
control the action.
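
A sketch of Discrimination under the same assumptions: the overgeneral ‘-ed’ Production is restricted by an added condition based on stored examples of success and failure, here represented by a small table of verbs whose ‘-ed’ forms have attracted negative feedback.

```python
KNOWN_IRREGULAR = {"go": "gone", "eat": "eaten"}   # built up from corrected errors

def past_participle_general(verb: str) -> str:
    # the original, overgeneral Production: applies '-ed' to everything
    return verb + "ed"

def past_participle_discriminated(verb: str) -> str:
    # the discriminated Productions: '-ed' only when the verb is not a stored
    # irregular; otherwise retrieve the stored irregular form
    return KNOWN_IRREGULAR.get(verb, verb + "ed")

print(past_participle_general("go"))          # goed  (overgeneralisation error)
print(past_participle_discriminated("go"))    # gone
```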

Since they are inductive processes, Generalisation and Discrimination will sometimes err and produce incorrect Productions. As I shall discuss later in this article, there are possibilities for Overgeneralisation and useless Discrimination, two phenomena that are widely documented in L2-acquisition research (Ellis, 1994). Thus, the system may simply create Productions that are incorrect, either because of misinformation or because of mistakes in its computations.

ACT* uses the Strengthening mechanism to identify the best problem-solving rules and eliminate wrong Productions. Strengthening is the process by which better rules are strengthened and poorer rules are weakened. This takes place in ACT* as follows: each time a condition in WSTM activates a Production from Procedural Memory, causes an action to be deployed and receives no negative feedback, the Production becomes more robust. Because it is more robust, it can withstand occasional negative feedback and will be more strongly activated when it is called upon:

The strength of a Production determines the amount of activation it receives in competition with other Productions during pattern matching. Thus, all other things being equal, the conditions of a stronger Production will be matched more rapidly and so repress the matching of a weaker Production. (Anderson, 1983: 251)

Thus, if a wrong Interlanguage item has acquired greater strength in a learner’s LTM than the correct L2-item, when activation spreads the former is more likely to be activated first, giving rise to error. It is worth pointing out that, just as the strength of a Production increases with successful use, there is a power-law of decay in strength with disuse.
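
A sketch of how Strengthening and strength-based conflict resolution might look; the update amounts, the decay exponent and the two competing forms are illustrative assumptions, not parameters given by Anderson (1983).

```python
strength = {"goed (overgeneralised form)": 3.0, "gone (target form)": 1.0}

def practise(name: str, negative_feedback: bool = False) -> None:
    # successful, unchallenged use strengthens a Production; negative feedback weakens it
    strength[name] += -0.5 if negative_feedback else 1.0

def current_strength(name: str, time_since_use: float, d: float = 0.5) -> float:
    # power-law decay with disuse: strength falls off as (time since use) ** -d
    return strength[name] * time_since_use ** -d

def select(candidates):
    # all other things being equal, the stronger Production is matched first,
    # which is how an entrenched Interlanguage error keeps beating the weaker
    # but correct form until the latter is strengthened enough
    return max(candidates, key=lambda name: strength[name])

print(select(["goed (overgeneralised form)", "gone (target form)"]))  # wrong form wins for now
print(current_strength("gone (target form)", time_since_use=4.0))     # weaker still after disuse
```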

2. Extending the model: adding a ‘Procedural-to-Procedural route’ to L2-acquisition

One limitation of the model is that it does not account for the fact that sometimes unanalysed L2-chunks of language are acquired through rote learning or frequent exposure. This happens quite frequently in classroom settings, for instance with set phrases used in everyday teacher-to-student communication (e.g. ‘Open the book’, ‘Listen up!’). As a solution to this issue, Johnson (1996) suggested extending the model by allowing for the existence of a ‘Procedural-to-Procedural route’ to acquisition, whereby some unanalysed L2-items can be automatised with use, ‘jumping’, as it were, the initial Declarative Stage posited by Anderson.

This means that teaching memorised unanalysed chunks can work in synergy with explicit language teaching, as happens in my approach.
