Mohammad Rahimi
and
Mahboobeh Saadat
Shiraz University
A Verbal Protocol Analysis of a C-Test
E-mail Addresses:
[email protected]
mahsaadat@ gmail.co
Abstract
The C-test is widely known as a test of overall language proficiency. Most of
the evidence in this regard has been obtained through correlational studies.
Nonetheless, the construct validity of the C-test is only partially established.
Moreover, such studies do not reveal anything about the mental processes going
on in the mind of the testees. Verbal protocol analysis has been recommended as
an important tool to validate the C-test. A C-test consisting of 5 texts with 100
deletions was given to a sample of 26 Iranian English seniors, and subsequently
a retrospective verbal protocol analysis was carried out to learn what happened
in the mind of the testees while they were restoring the test items. The results of
the study showed that the subjects used 13 different strategies, consisting of
both bottom-up and top-down processes. However, the use of different
strategies varied as a function of both the types of items in the C-test and
the proficiency level of the subjects. The results of the study support the construct
validity of the C-test as a test of overall language proficiency.
Key words: C-test, construct validity, language proficiency, reduced
redundancy, retrospection, verbal protocol.
www.SID.ir
Introduction
Tests of reduced redundancy have been widely appreciated for being
highly valid and eminently authentic. These tests reflect the
sociolinguistic-integrative approach to language testing according to
which knowledge of a language necessarily requires the ability to function
when there is reduced redundancy through the use of what Oller (1979)
calls an expectancy grammar. In fact, as Feldmann and Stemmer (1987, p.
255) state, “Comprehension of input leads us to form certain expectations
about what will come next, be it the next letter, the next word, or the next
sentence.” Klein-Braley (1997) believes that the concept of reduced
redundancy can serve as a good criterion to measure the learner’s
language proficiency.
The cloze, widely used as a test of overall language proficiency, is an
example of tests of reduced redundancy. This test consists of a passage in
which every nth word--usually, every fifth, sixth or seventh word--is
deleted. The results of many studies have lent support to the validity of
cloze as a measure of overall language proficiency by establishing high
correlation between the scores of the subjects on this test and those
obtained from discrete-point proficiency tests such as TOEFL and UCLA
placement test (Chapelle and Abraham, 1990; Oller, 1988; Alderson,
1979, 1980; Darnell, 1968 to name a few). Nonetheless, cloze, as Klein-Braley
and Raatz (1985) and Klein-Braley (1997) state, suffers from some
rather serious shortcomings mainly pertinent to the deletion and scoring
procedures employed, reliability and validity of the test, as well as the
fact that the use of a single text may make the test biased.
To eliminate the above-mentioned drawbacks, Klein-Braley and
Raatz (1985) offered the C-test technique. The C-test has been widely
used and praised as a valid test of overall language proficiency. Numerous
studies have found high correlations between the C-test and other
IJAL, Vol. 8, No. 2, September 2005
integrative tests, say, the cloze test and dictation, and other tests of
language proficiency (Jafarpur, 2001; Klein-Braley, 1997; Mochizuki,
1994; Katona, 1992; Neghishi, 1987). They have all been indicative of the
empirical validity of the C-test. For instance, Jafarpur (2001) has found a
high correlation between the scores on the C-test and the English
Placement Test and a relatively high correlation between the C-test and
the cloze. Inasmuch as the scores show high reliability and concurrent
validity, he concludes that the C-test has advantages over the cloze. Klein-Braley
(1997) has shown that the C-test highly correlates with other tests
of reduced redundancy and a language proficiency test--DELTA, the
Duisburg English Language Test for Advanced Students. Accordingly,
she is convinced that the C-test is the best representative of reduced
redundancy tests of general language proficiency. Mochizuki (1994) has
demonstrated that the C-test highly correlates with two language
proficiency tests--STEP and CELT. He concludes that the C-test seems to
be a promising means of assessing overall language proficiency. Having
found a high correlation between the C-test scores and those of a language
proficiency test in the case of Hungarian subjects, Dornyei and Katona
(1992) come to the conclusion that the C-test is a highly valid and reliable
integrative instrument for measuring the overall language proficiency.
Moreover, they consider it a better measure of general language
proficiency than the cloze test. Finally, Neghishi (1987) observes a high
correlation between the C-test scores and the scores obtained from a
language proficiency test--ELBA.
In addition, Klein-Braley (1985) produces various types of evidence
in support of the C-test as a measure of general language proficiency. For
instance, she claims that processing the C-test requires, at least, some of
the mechanisms involved in normal language processing inasmuch as
type-token ratio and mean sentence length, two popular indices of text
difficulty and readability, can predict C-test difficulty as well. Hastings
(2002) demonstrates that:
A C-test measures the ability to apply and integrate contextual,
semantic, syntactic, morphological, lexical, and orthographic
information and knowledge pertaining to a particular written
language. Furthermore, the processing that is required for a
successful C-test performance seems comparable to natural
language processing in both length and complexity, and may in
fact have much in common with natural language performance.
However, he admits that his study, being merely an exploratory error
analysis of the C-test, fails to answer definitively what a C-test measures.
Sigott (2002) disputes claims that dismiss the C-test as predominantly
a test of lower-order skills. The results of his study suggest that the
individual test taker's characteristics and those of the individual C-test
passage determine whether high-level processing is triggered by an item
or not. He argues that the facility index at text level and the word class to
which an item belongs are not reliable predictors of high-level processing
since a significant number of the subjects engaged in high-level
processing to restore both easy and difficult items from all four classes
under study.
The validity of the C-test, as a test of overall language proficiency,
however, has been criticized based on the results of a series of studies. For
instance, Jafarpur (1995), emphasizing that there is nothing unique about
the Rule of Two, refutes the claims made for the C-test. Particularly, he
shows that “The Rule of Two produces a sizable number of
nonfunctioning items” (p. 97). In addition, he argues that the
C-test cannot discriminate among examinees of
different proficiency levels. Sigott (1995) comments that the C-test items
are sensitive to aspects of vocabulary, syntactic competence, and sentence
level grammar. Having reviewed different evidence pertinent to what the
C-test measures, Chapelle (1994) reaches no definite result as to whether
the C-test is a valid test of overall language proficiency or not. Hood
(1990) claims that the C-test scores are more indicative of general reading
skill than general language skill. In fact, he does not find any evidence
showing the supremacy of the C-test over the cloze test. Kamimoto
(1992) believes that the C-test tends to measure the subjects’ vocabulary
and grammatical competence and hence its processing occurs at the micro
level. He relates this fact to the deletion procedure employed in the design
of the C-test. Stemmer (1991) demonstrates that different results may be
obtained from the C-test depending on individual text characteristics.
Furthermore, the fact that function words are restored more successfully
than content words and that text understanding rarely exceeds the
proposition border convinces Stemmer that the current form of
the C-test does not tap general language proficiency. Similarly, Cleary
(1988) asserts that the C-test fails to appropriately measure general
language proficiency. Cohen et al. (1984), like Kamimoto (1992),
believe that C-test processing is more at the micro level. They posit that
due to the type of deletion procedure employed in the C-test, the testee
pays more attention to such aspects as vocabulary and grammar than
higher levels of language. Singleton and Little (cited in Chapelle 1994)
consider the C-test responses as a source of evidence showing second
language lexical development and processing.
The latter group of studies calls into question the validity of the C-test
as a test of general language proficiency. Even the correlational studies
that prove the empirical validity of the C-test cannot guarantee its
construct validity because as Kamimoto (1992, p. 69) states,
This method of statistical analysis gives no access to what really goes on
in the students’ minds when they take a C-test. Correlational studies
only show an outcome of what has already taken place and prevent us
from knowing whether students resort to either integrative skills or
discrete-point skills..... In short, studies only on correlational studies are
not sufficient for the purpose of an inquiry into what a C-test measures.
In addition, Grotjahn (1986) mentions three reasons why correlational
studies are inadequate for construct validation of tests. First, he
argues, the construct validity of a test is only partially established with
the help of other tests; indeed, he regards this approach as
circular. Second, the validation of tests through correlational studies
does not tell us anything about the mental processes going on in the mind
of the learner. Finally, he contends that the results of such studies heavily
depend on the number and type of variables included in the study.
To determine what exactly a measure taps, some scholars have
suggested verbal protocol analysis--introspective and retrospective
techniques--which help researchers figure out what is really going on in
the mind of the testee while taking the test. As Grotjahn (1986)
remarks, “... in validating (language) tests we also have to analyze the
mental processes in the test-taking subject” (p. 162). Green (1998, p. 7)
states, “The fundamental underlying assumption for protocol analysis is
that information that is heeded as a task is being carried out is represented
in a limited capacity short-term memory, and may be reported following
an instruction to either talk aloud or think aloud.” Similarly, Ericsson and
Simon (1984) maintain that introspective and retrospective reports may
tap some of the testee’s cognitive processes.
Indeed, Babaii and Ansary (2001) in a retrospective analysis of the
C-test found that the learners used four major types of cues with varying
frequencies to restore the items in the C-test: automatic processing,
lexical adjacency, sentential cues, and top-down cues. They came to the
conclusion that C-testing is a reliable and valid procedure mirroring the
reduced redundancy principle.
Feldmann and Stemmer (1987), through think-aloud protocols and
retrospective interviews, showed that what a C-test measures seemed
to vary according to the deletion in the test. They found that the subjects
used bottom-up and top-down processing depending on the item that was
deleted and their own level of proficiency and that a skilled reader would
use both strategies. According to Adams and Collins (1979), bottom-up
strategies are adopted when the information in the text is novel or does
not fit the learner’s ongoing hypotheses about the content of the text; top-down
processing helps the reader to resolve ambiguities or to make a
choice between alternative interpretations of the data. Feldmann and
Stemmer (1987) identified different strategies adopted by the subjects
while taking the C-test. They initially attempted to put these strategies
on a continuum ranging from bottom-up to top-down strategies. However,
they finally admitted that it was not possible to unambiguously put the
strategies used by the learners on such a continuum and that in some
cases they even failed to make a clear distinction between a bottom-up
and a top-down strategy. These researchers enumerate some of the
strategies used by the subjects as follows: recall by structural analysis, by
adding letters/syllables to the item beginning, by repetition, by search for
meaning, by looking for external help, by substitution, and recall of past
situations.
Storey (1997) is of the opinion that one may find varying degrees of
construct validity for different items in a discourse cloze, another measure
of reduced-redundancy: “If the item is able to generate processes
identified in a theoretical model of the reading process it can be shown to
have a good level of construct validity. If alternative processes, irrelevant
to the underlying construct, are generated, the validity of the item is called
into question” (p. 227). He continues, “If test items generate other
processes, then they are not testing what they are designed to test, in other
words, they lack construct validity” (p. 226). In his study, Storey noticed
that the subjects analyzed the rhetorical structure of the text more deeply
for restoring deleted discourse markers. Hence, he concluded that the
construct validity of such items was established. However, when the
subjects used a variety of surface matching to restore the deleted cohesive
ties, he called the validity of such items into question.
Some verbal protocol analyses carried out in the case of the C-test
have cast doubt on its construct validity as a test of general language
proficiency. Grotjahn (1986) maintains, “The C-test is very economical
and, above all, a highly reliable measurement instrument. However, what
it measures, i.e., its construct validity, is in my opinion thus far not very
clear” (p. 161). Chapelle and Abraham (1990) believe that C-testing is
mostly a measure of grammatical competence rather than textual
competence. Finally, Cohen et al. (1984) and Kamimoto (1992) insist
that the cognitive processes in the case of the C-test are more at the micro
level than the macro level. They believe that due to the deletion
procedure, the learner uses the lexical and grammatical processes to
provide the response to the test.
As the results of the aforementioned studies reveal, construct validity
of the C-test, examined through verbal protocol analysis, is still in a state
of indeterminacy. As such, this paper is a further attempt at investigating
the construct validity of the C-test through a retrospective analysis of the
processes going on in the minds of the testees.
Method
Subjects
The subjects of the study were 26 Iranian English seniors taking a course
in language testing with the first researcher. They were native speakers of
Persian and enjoyed different levels of proficiency in English. They were
in their twenties and of both sexes, 18 females and 15 males.
Instruments
The instrument utilized in the study was a C-test consisting of 5 texts with
100 deletions. To construct the test, six short passages with a variety of
interesting subjects and different levels of difficulty, judged by the Flesch
Reading Ease readability scale (Microsoft Word, 1995), were selected
from Rakhshanfar and Jahrudi (n.d.). The texts were arranged from the
easiest to the most difficult one as recommended by Klein-Braley and
Raatz (1985). The difficulty levels of the texts were 94, 93, 88, 86, 84,
and 68, respectively. The first and the last sentences of the texts were left
intact. Starting from the second word of the second sentence, half of the
letters of every second word were deleted. Each text yielded 20 items, so
there were 120 items on the whole.
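The deletion rule described above can be illustrated with a minimal Python sketch. It is not the authors' own procedure. It assumes that odd-length words lose the larger half (which agrees with item fragments quoted in this paper, such as "a--" and "li---"), that one-letter words are left intact but still counted, and that the word count runs continuously across the mutilated sentences. The function names `mutilate` and `make_c_test` and the sample passage are hypothetical; the middle sentence is modelled on item fragments quoted in the results section.

```python
def mutilate(word: str) -> str:
    """Keep the first half of a word and replace the rest with dashes."""
    # Separate trailing punctuation so that "Europe," keeps its comma.
    core = word.rstrip(".,;:!?")
    tail = word[len(core):]
    if len(core) < 2:
        # Assumption: one-letter words are left intact.
        return word
    # Assumption: odd-length words lose the larger half ("are" -> "a--").
    keep = len(core) // 2
    return core[:keep] + "-" * (len(core) - keep) + tail


def make_c_test(sentences: list[str]) -> list[str]:
    """Mutilate every second word, leaving the first and last sentences intact."""
    result = [sentences[0]]
    position = 0  # assumption: word positions run on across the middle sentences
    for sentence in sentences[1:-1]:
        words = []
        for word in sentence.split():
            position += 1
            # Start with the second word, then take every second word.
            words.append(mutilate(word) if position % 2 == 0 else word)
        result.append(" ".join(words))
    result.append(sentences[-1])
    return result


# Illustrative passage only; it is not one of the texts used in the study.
passage = [
    "Lions live in Africa.",
    "There are no wild lions in Europe, but there are captive lions in European zoos.",
    "Many visitors come to see them.",
]
for line in make_c_test(passage):
    print(line)
```

Under these assumptions each text yields its items mechanically, which is why the number of items depends only on text length and not on the test constructor's judgment.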
As suggested by Klein-Braley (1997), the test was subsequently
given to a control group of 6 EFL teachers. Their scores on the test ranged
between a low of 112 and a high of 120, thus over 90% correct on
average. Furthermore, it was piloted with a group of subjects similar to
the target group. In other words, the 6 texts along with the Shiraz University
Placement Test were given to 25 Iranian English majors. The correlation
between the C-test and the proficiency test was 0.69. The reliability
coefficient obtained for the C-test scores as measured through KR-21 was
0.88. KR-21 is generally considered unsuitable for estimating the
reliability of tests of reduced redundancy, because the items in such tests,
unlike multiple-choice ones, are not independent. Yet, Brown (2002),
based on a series of studies, claims that KR-21 only underestimates the
reliability of tests with dependent items.
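For readers unfamiliar with the coefficient, KR-21 needs only the number of items k, the mean score M, and the score variance s^2: r = (k / (k - 1)) * (1 - M(k - M) / (k * s^2)). The sketch below is a minimal Python illustration of that formula. The function name `kr21` and the sample scores are hypothetical, and the use of the population variance is an assumption, since the paper does not say which variance estimate was used.

```python
from statistics import mean, pvariance


def kr21(scores: list[float], n_items: int) -> float:
    """Kuder-Richardson formula 21 from total scores and the number of items."""
    k = n_items
    m = mean(scores)
    # Assumption: population variance; a sample variance would change the value slightly.
    var = pvariance(scores)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))


# Hypothetical total scores on a 100-item test; not data from this study.
example_scores = [40, 60, 80]
print(round(kr21(example_scores, n_items=100), 3))
```

Because the formula treats all items as equally difficult and independent, it tends to give a conservative figure for tests whose items depend on one another, which is the point attributed to Brown (2002) above.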
Finally, based on the results of the item analysis, text five which
contained a higher number of mal-functioning items, as compared with
other texts, was ultimately removed from the test so that the final version
consisted of five texts with 100 deletions. A copy of this C-test
appears in the Appendix.
Procedure
The administration of the C-test and the subsequent interviews were
conducted by the first researcher with whom the subjects were taking a
course. In order to gain further information about how to efficiently
conduct the verbal protocol analysis, before carrying out the main verbal
protocol analysis, the C-test was given to another group of English majors
and then a retrospective pilot study was done with 7 of them to see what
kinds of explanation they might put forward. As such, the researcher
could get some idea of how to elicit information from the target subjects.
As for the target group, one session before administering the main
C-test, another C-test was given to them to complete in class as a warm
up. Then, in the same session, they were asked to say why they had
provided each of the responses. Whenever they failed to do so, the
instructor tried to help them with encouraging remarks to continue their
explanations so that they finally verbalized the strategy they used. The
next session, the main C-test was given to the subjects and they were
asked to be attentive to the way they reconstructed the texts.
The retrospection started from the afternoon of the very day the
subjects took the test and continued for two days so that all the subjects
would take part in it and report the strategies they used for all the items as
far as they could. It was carried out in the subjects’ native language--Persian--so
that the subjects could explicate exactly what had happened in
their mind. It was thought that conducting the explanation in English
might have made the task either impossible or very difficult for them to
do in some cases. Each subject was given his/her own paper and, starting
from the first blank, was asked to say why he/she had given a particular
response. Their explanations were tape recorded for later analysis.
Data Analysis
The recorded explanations were transcribed and then coded based on the
type of reason(s) the subjects had mentioned for their responses. Coding
was done independently by both researchers for each case, and the results were then
compared so that any disagreement was resolved by consensus. In addition,
the frequencies of the different types of reasons were obtained and were
then changed into percentages to determine the most frequent ones. To
determine if there was any difference between the type and the percentage
of the strategies used by the subjects with different levels of proficiency,
the top and bottom 25% were identified as the high and low ability
groups, respectively. In addition, the percentage of the strategies used in
each individual text was obtained once for all the subjects and next for the
two proficiency groups.
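The tallying steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `strategy_percentages` and `ability_groups`, the strategy labels, and the scores are hypothetical, and the group size is obtained with Python's built-in rounding because the paper does not say how 25% of the sample was rounded.

```python
from collections import Counter


def strategy_percentages(codes: list[str]) -> dict[str, float]:
    """Convert a list of coded strategies into percentages, most frequent first."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.most_common()}


def ability_groups(scores: dict[str, float], fraction: float = 0.25):
    """Return the top and bottom `fraction` of subjects ranked by C-test score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Assumption: group size is the rounded share of the sample, at least one subject.
    size = max(1, round(len(ranked) * fraction))
    return ranked[:size], ranked[-size:]


# Hypothetical codes and scores; not data from this study.
codes = ["translation", "translation", "translation", "syntactic analysis"]
print(strategy_percentages(codes))

scores = {"s1": 90, "s2": 80, "s3": 70, "s4": 60}
high, low = ability_groups(scores)
print(high, low)
```

Running the same percentage function once over all coded reasons, once per text, and once per ability group reproduces the three levels of analysis named in this section.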
Results and discussion
The following strategies were detected in the subjects’ explanations. They
are presented through the classification proposed by Feldmann and
Stemmer (1987).
1. Structural analysis
a. Syntactic analysis
The subject analyses the syntactic structure of the sentence to retrieve a
word. For instance, in the case of “Lions a-- found....” the subjects usually
indicated, “I have used ‘are’ because ‘Lions’ is plural and the previous
sentence is in the simple present tense.”
b. Formal indicators
The subject uses a formal syntactic indicator to guess the missing word.
For instance, in the case of “the grass----- of Afr---” a subject said, “I
wrote the word Africa because the word starts with a capital letter.
Usually, the names of countries and cities start with a capital letter.”
2. Adding letters/syllables to the item beginning
The subject guesses the missing word just by getting help from the
undeleted part of the word and adding some letters to it. For example, in
the case of “in Euro---- zoos” a subject said, “I guessed it was European
because of the first four letters. It was quite clear.”
3. Using past situations
The subject has already seen the word, so just by noting the beginning of
the word he/she can retrieve the whole word. For instance, in the case of
“the grass---- of ...” a subject said, “I had seen the word grassland many
times, so I easily wrote it.” As for some other items, the subject uses the
word before or after the item and since he/she has already seen the same
two words together, he/she guesses the missing word. For instance, in the
case of “So-- people ...” a subject said, “In many cases, I have seen the
word people preceded by some before.”
4. Translation to mother tongue (translation of immediately following or
preceding words)
For instance, in the case of “In win--- its ... .” a subject said, “I guessed
the word with regard to the meaning of the sentence. It means /dar
zemestan/ (in winter).”
5. Using the co-text, preceding/following sentence(s) (including the
introductory and the final sentence of the text)
For instance, in the case of “There a-- no wi-- lions i- Europe, b-- there
a-- captive li--- in ....” a subject said, “I read the sentence preceding and
following the missing word and since they seemed to be in contrast, I
wrote but.”
6. Using mother tongue meaning equivalent
The subject, translating the whole sentence, guesses what specific word
should be used in his/her mother tongue. Then he/she looks for its
equivalent in the target language. For example, in the case of “... that set-- low ov-- the ar-- ... .” a subject said, “Reading the whole sentence, I
guessed it should be something like /ruye/ (over), so I wrote the word
over.”
7. Using the general meaning of the text
The subject uses the general meaning and idea of the text to restore a
missing word. For instance, in the case of “there a-- captive li--- in ... .” a
subject said, “I guessed the word should be lion, since the whole text is
about lions.”
8. Using external help (other C-test texts, introductory or final part of the
text)
The subject retrieves the missing word because he/she has seen the same
word in previous texts or in some other tests. For instance, in “t-- black s- ... .” a subject said, “I saw the with black in the previous text, so I
guessed it should be the here, too.”
9. Using reference
a. Retrieving the word by referring to the same lexical item
repeated before
The subject guesses the word because he/she has already seen the same
word in the text. For example, in the case of “So-- other people fa--- ... .”
a subject said, “I wrote the word faint here because it was used in the first
sentence of the text.”
b. Retrieving the item because it is morphologically related to
another item in the text
The subject retrieves the missing word by referring to a lexical item which
is related to it and is mentioned in the same text. For instance in the case
of “... in Euro---- zoos.” a subject said, “I wrote European since in the
previous sentence I had seen the word Europe.”
c. Retrieving the missing word by substituting a pronoun for a
lexical item mentioned before
For example, in the case of “So-- people fa--- if th-- ... .” a subject said, “I
guessed it would be the word they because it refers to the word people in
the same sentence.”
10. Using inference
a. Inferring from the meaning of a lexical item/a phrase
The subject retrieves the missing word by inferring from a lexical item/a
phrase mentioned in the same text. For instance, in the case of “... but
Ger--- soup ... ” a subject said, “I guessed it should be German because in
the previous sentence we had the word Chinese referring to a country, so I
guessed here we must have the name of a country.”
b. Inferring from the meaning of a sentence
For instance, in the case of “... while oth--- like ... .” a subject said,
“Reading the previous sentence and this sentence, I got that the writer is
contrasting two groups of people, so since we had some people in the
previous sentence, I guessed it must be others.”
11. Juxtaposition
The subject restores the missing word because of its co-occurrence with
the preceding and/or the following word (he/she has seen such a
combination before). For instance, in “And th---, as though--- resting o... ”
a subject said, “I chose though because I usually see it coming with
as.”
12. Using background knowledge
The subject uses his/her background knowledge to restore a missing word.
For example, in the case of “So-- people fa-- in crow---.” a subject said,
“I guessed it should be crowds because usually people faint in busy
places.”
13. No strategy (automatic processing)
The subject cannot explain why he/she has written a particular item. For
instance, in the case of “... people fa--- if th-- ... ” a subject said, “I don’t
know why I’ve written they. I just guessed it should be they.”
In all, 13 strategies were discerned. Feldmann and Stemmer (1987)
divide the strategies used to retrieve the C-test items into two groups of
bottom-up and top-down strategies. However, it is not possible to draw a
clear demarcation line between the two types of strategies. That is, the
difference between the two types is a matter of degree, rather than type.
Thus, we can say that background knowledge is much closer to the top-down
end of the continuum, whereas adding letters/syllables to the item
beginning is closer to the bottom-up end of the continuum. Other
strategies, such as looking for external help, can be put nearly in the
middle of the continuum.
Storey (1997) believes that when the restoration of the deleted item
requires reference to material outside the sentence providing the
immediate context for the item, it is done at the macro level. In contrast,
when the restoration is done within the immediate context, it is done at
the micro level.
Taking into account the two types of categorization proposed by
Feldmann and Stemmer (1987) and Storey (1997), the strategies used by
the subjects in the present study were divided into two groups: top-down
strategies and bottom-up strategies. Thus, syntactic analysis, using formal
indicators, adding letters/syllables to the item beginning, translation to
mother tongue, and using mother tongue equivalent fall within the bottom-up
category, whereas using past situations, using co-text, using the general
meaning of the text, using external help, inferring from the meaning of a
lexical item/a phrase, inferring from the meaning of a sentence,
background knowledge, and juxtaposition fall within the other category.
Besides, strategies related to reference, i.e., referring to the same lexical
item, referring to a morphologically related item, and substituting a
pronoun for a lexical item may fall within either category depending on
whether the item referred to occurs within the same sentence or the
preceding sentences. The farther the deleted item is from the item referred
to, the closer the strategy used by the subject tends to be to the top-down
end of the continuum and vice versa.
As the strategies mentioned earlier indicate, the testees employed
different types of strategies in completing the C-test. In fact, this supports
the claims made for the construct validity of the C-test as a test of overall
language proficiency. The results of the present study are in line with
those of Feldmann and Stemmer (1987) and Babaii and Ansary (2001). As
Babaii and Ansary (2001) say, “... to the extent that the C-test triggers
both macro- and micro-aspects of the language, it confirms well to the
principle of reduced redundancy which fundamentally emphasizes that
both a global and a local knowledge are required to supply the missing
elements in a distorted linguistic message” (p. 216).
It is also worth examining how frequently each of these strategies
was used by the subjects. Table 1 shows the percentages of the
strategies used by the subjects in this study.
Table 1
Percentages of strategies used by the subjects

Strategy                                                      Percentage
Translation to mother tongue                                  26.3
Syntactic analysis                                            20.3
Adding letters or syllables to item beginning                 14.3
Referring to the same lexical item                            10.4
Inferring from the meaning of the same lexical item/phrase    9.6
Juxtaposition                                                 4.3
No strategy identifiable                                      3.5
Using the general meaning of the text                         2.8
Inferring from the meaning of a sentence                      2
Substituting a pronoun for a lexical item                     1.5
Formal indicators                                             1
Mother tongue equivalent                                      1
Background knowledge                                          0.9
Past situations                                               0.8
External help                                                 0.8
Referring to a morphologically related item                   0.5
Using co-text, preceding and/or following sentence(s)         0.4
  (including the introductory and the final sentence
  of the text)

Top-down                                                      22.3
Bottom-up                                                     74.2*

*The sum of the percentages of top-down and bottom-up strategies in this
table and the following tables excludes “no strategy identifiable”.
As is evident from Table 1, the highest percentage belongs to
translation to mother tongue (26.3%), which is a bottom-up strategy. The
next highest percentage, too, belongs to another bottom-up strategy, i.e.,
syntactic analysis (20.3%). In fact, it can be said that about half of the
strategies used by the subjects fall within these two categories. As such,
these results are in line with those of Chapelle and Abraham (1990) who
claim that C-testing most likely results in tests of more grammatical and
less textual competence. The next strategy used by the subjects is
reference to the same lexical item (10.4%), which is, as mentioned before,
a middle-of-the-roader strategy. (In 6% of the cases the item referred to
was in a sentence other than the one in which the deleted item appeared,
and in 4.4% of the cases it was in the same sentence.) The lowest
percentage is that of using co-text ... (0.4%). On the whole, it was found
that 74.2% of the strategies were bottom-up and 22.3% top-down, while in
3.5% of the cases no strategy was identifiable. Thus, although both types of processes
were employed by the subjects and this confirms the claims made on the
C-test as a measure of general language proficiency, it seems that the
testees did not complete the C-test as a whole but acted on the individual
items independent of each other. This finding is in line with those of
Cohen et al. (1984) and Kamimoto (1992) who state that in C-testing,
processing is more at micro-level than macro-level. However, the results
are in contrast with those of Dornyei and Katona (1992) who claim that
the C-test is quite integrative and the aspect which is less efficiently
measured in the C-test is grammar.
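The category totals reported above can be re-added as a quick consistency check. The following is a minimal sketch in Python, not part of the original analysis; the variable names are chosen here for illustration, and the figures are those quoted in the paragraph above.

```python
# Percentages quoted in the text for the whole C-test (Table 1).
bottom_up = 74.2
top_down = 22.3
no_strategy = 3.5

# The three categories should account for all reported strategy uses.
total = round(bottom_up + top_down + no_strategy, 1)

# Combined share of the two most frequent strategies ("about half").
translation = 26.3
syntactic_analysis = 20.3
top_two = round(translation + syntactic_analysis, 1)

print(total)    # 100.0
print(top_two)  # 46.6
```

The first sum shows that the three categories account for all reported strategy uses; the second gives the exact share behind the phrase "about half".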
Feldmann and Stemmer (1987) believe what a C-test measures seems
to vary according to the deletions in the test. Storey (1997), too, holds that
the items on a C-test have varying degrees of construct validity.
Accordingly, it may be assumed that different texts in the C-test may yield
quite different results. In order to verify the above assumption in the case
of the C-test utilized in this study, the analysis done earlier for the whole
C-test was done for each individual text of the test. Table 2 illustrates the
results.
Table 2
Percent strategies used by the subjects in individual texts of the C-test

Strategy                                                    Text 1  Text 2  Text 3  Text 4  Text 5
Translation to mother tongue                                23      27.3    24.6    31      27
Syntactic analysis                                          26      25.6    13.8    22.5    12.7
Adding letters or syllables to item beginning               17.9    10.3    17.2    12.9    12.9
Referring to the same lexical item                          4.7     16.6    9.8     9.9     12
Inferring from the meaning of the same lexical item/phrase  4.5     2.5     21      7.5     13.3
Juxtaposition                                               4.7     2.5     4       4       5.8
No strategy identifiable                                    1.8     6.7     2.6     4       2.6
Using the general meaning of the text                       5       2.5     1.5     0.15    4.3
Inferring from the meaning of a sentence                    1       0.6     1.7     1       4.5
Substituting a pronoun for a lexical item                   1.5     3.4     0.8     2       1.5

Bottom-up                                                   79      73.1    65.3    76.2    65
Top-down                                                    21      26.9    34.7    23.8    34.9

Note: For the seven rows below, the source does not give a value for every
text, and the text to which each value belongs cannot be recovered. The
values for Texts 1 to 4 are listed in source order.
Formal indicators: 3, 0.15
Mother tongue equivalent: 1.5, 0.4, 1.1, 1.2
Background knowledge: 2.6, 1
Past situations: 0.3, 0.1, 1.7, 0.6
External help: 0.04, 3.5
Referring to a morphologically related item: 1.7
Using co-text, preceding/following sentence(s) (including the introductory
and the final sentence of the text): 0.8, 0.5, 0.16, 0.05
The Text 5 column lists six values for these seven rows, in source order:
0.7, 0.2, 1, 0.4, 0.4, 0.7.
Translation to mother tongue and syntactic analysis are the strategies
used most frequently in texts one, two, and four.
This, however, does not hold true for texts 3 and 5. In these two texts,
although the strategy most frequently used is translation to mother
tongue, which is a bottom-up strategy, the next most frequent strategy is
inference from a lexical item/phrase, which is a top-down one.
Interestingly enough, as Table 2 indicates, in these two texts the overall
percentage of top-down strategies is higher than in the other texts
(34.7% in text 3 and 34.9% in text 5).
As mentioned earlier, the difference observed in the percentage of
using different strategies in particular and top-down and bottom-up
strategies in general might be due to the nature of the texts and the deleted
items. A scrutiny of the deleted items shows that the reason cannot be
related to whether the deleted items are content words or function words,
because in almost all the texts about 90% of the deleted items are content
words. Nor can the difference be attributed to the readability of the
texts, since presumably the difficulty level of the texts increases as we
proceed from text one to text five.
However, since inferring from a lexical item/phrase, which is a top-down
strategy, is used much more frequently in these two texts than in the
other texts, the reason might be that the vocabulary in these two texts
was much easier or the topic was more familiar to the subjects.
Feldmann and Stemmer (1987) maintain that a skilled reader will
activate both top-down and bottom-up processing simultaneously. Yet,
the more proficient the subjects are, the more they will be able to
exploit the redundancy of the text. To examine to what extent this idea
holds true in the case of the subjects participating in the present study, the
top and bottom 25% of the subjects were selected as the high and low
proficiency groups. Then, the percentage of the strategies used by each
of the two groups on the whole test as well as on individual texts was
obtained to see if any difference would be observed. Table 3 shows the
results pertaining to the whole test.
Table 3
Percent strategies used by high and low ability groups

Strategy                                                    High    Low
Translation to mother tongue                                19.5    34.8
Syntactic analysis                                          23      17
Adding letters or syllables to item beginning               10.6    14.4
Referring to the same lexical item                          12      9.5
Inferring from the meaning of the same lexical item/phrase  13      9.1
Juxtaposition                                               3.5     3.3
No strategy identifiable                                    2.2     2.5
Using the general meaning of the text                       3.6     2.6
Inferring from the meaning of a sentence                    3       1.7
Substituting a pronoun for a lexical item                   2.7     1.1
Formal indicators                                           1.4     0.8
Mother tongue equivalent                                    1       1.2
Background knowledge                                        1.4     0.5
Past situations                                             0.8     0.2
External help                                               1       0.7
Referring to a morphologically related item                 0.8     0.3
Using co-text, preceding/following sentence(s)              0.5     0.4
  (including the introductory and the final sentence of
  the text)

Bottom-up                                                   64.5    77.9
Top-down                                                    33.5    19.6
As Table 3 indicates, in the high ability group the highest percentage
is that of syntactic analysis (23%), whereas in the low ability group it
belongs to translation to mother tongue (34.8%). The order is reversed for
the second most frequent strategy, i.e., translation to mother
tongue (19.5%) in the high ability group and syntactic analysis (17%) in
the low ability group. One explanation for this phenomenon may be that
the high ability group, being more proficient in all aspects of
the language, including grammar, attempted to restore the
missing items via syntactic analysis, while the low ability group, not being
so proficient, simply resorted to the easiest way to restore the items, i.e.,
translation to their mother tongue. Both strategies, however, involve
bottom-up processing. An examination of the third most frequent
strategy and the overall percentage of the strategies used by each group
nonetheless shows that the high ability group tend to use top-down
strategies more frequently than the low ability group. The third most
frequent strategy used by the high ability group is inference from a lexical
item/phrase (13%), which is a top-down strategy, whereas in the low ability
group it is adding letters/syllables (14.4%), which is a bottom-up strategy. In addition, a
comparison of the overall percentage of the strategies used by the low and
high ability groups shows that the percentage of the bottom-up strategies
used in the low ability group (77.9%) is higher than that of the high ability
group (64.5%). In contrast, the percentage of the top-down strategies used
by the high ability group (33.3%) is higher than that of the low ability
group (19.6%). These results confirm the idea of Feldmann and Stemmer
(1987) indicating that the high ability group tend to use more top-down
strategies than the low ability group in restoring the missing items.
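The contrast between the two groups can be made concrete by relating bottom-up to top-down use within each group. The following is a minimal sketch in Python, not part of the original analysis; the dictionary layout and names are assumptions made for illustration, and the figures are those quoted in the paragraph above (the text gives 33.3% for top-down use in the high ability group, whereas Table 3 lists 33.5; the sketch uses the figure from the text).

```python
# Overall percentages for the two ability groups, as quoted in the text.
groups = {
    "high": {"bottom_up": 64.5, "top_down": 33.3},
    "low": {"bottom_up": 77.9, "top_down": 19.6},
}

# Bottom-up uses per top-down use, for each group.
ratios = {
    name: round(shares["bottom_up"] / shares["top_down"], 1)
    for name, shares in groups.items()
}

# Percentage-point gap in top-down use between the two groups.
top_down_gap = round(groups["high"]["top_down"] - groups["low"]["top_down"], 1)

print(ratios)        # {'high': 1.9, 'low': 4.0}
print(top_down_gap)  # 13.7
```

On these figures, the low ability group used roughly four bottom-up strategies for every top-down one, against roughly two for the high ability group.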
In order to determine if there is any difference between the
performance of the high and the low ability groups on individual texts of
the test, the percentage of the strategies used by either group on each text
was determined. Table 4 shows the results.
Table 4
Percent strategies used by high and low ability groups in individual texts of the C-test

Strategy                                                    T1 high  T1 low   T2 high  T2 low   T3 high  T3 low   T4 high  T4 low   T5 high  T5 low
Translation to mother tongue                                14.9     25.6     20.3     37.5     19.7     28       24.4     44.6     12       38.3
Syntactic analysis                                          27.4     21       31.9     18       14.3     15.3     21.8     20.7     14.8     9.8
Adding letters or syllables to item beginning               14.9     19.8     5.8      14.8     16.3     12.7     16       9.1      10       15.8
Referring to the same lexical item                          6.5      1.7      18.8     14.8     8.8      12.7     11.8     8.3      14.1     9.8
Inferring from the meaning of the same lexical item/phrase  7.1      6.4      5.8      1.6      26.5     21.2     8.4      7.4      17.4     9
Juxtaposition                                               4.2      3.5      1.4      3.9      2.7      0.8      2.5      0.8      6.7      7.5
No strategy identifiable                                    3        1.2      4.3      3.1      1.4      4.2      0.8      3.3      1.3      0.8
Inferring from the meaning of a sentence                    1.2      1.7      1.4      1.6      2.7      0.8      1.7      1.7      8        4.5

Bottom-up                                                   70.3     79       68.1     86       59.1     65.5     73.2     82       49.5     80.5
Top-down                                                    26.7     20       27.6     10.9     39.5     29.9     26       14.7     49.2     19

Note: For the rows below, the source does not give a value for every
column, and the column to which each value belongs cannot be recovered.
The values for T1 high to T5 high are listed in source order.
Using the general meaning of the text: 5.4, 7, 1.4, 3.1, 1.4, 0.8, 10; T5 low 2.3
Substituting a pronoun for a lexical item: 2.4, 1.7, 5, 2.3, 2, 0.8, 4.2, 0.8; T5 low 1.5
Formal indicators: 4.2, 2.3, 0.7, 2
Mother tongue equivalent: 0.6, 2.9, 0.7, 0.8, 1.4, 0.8, 2.5, 0.8
Background knowledge: 5.4, 2.3, 1.4
Past situations: 0.6, 0.7, 0.7, 0.8, 0.8, 1.3
External help: 0.7, 4.2, 2.5
Referring to a morphologically related item: 2.4, 1.7, 0.8, 0.7
Using co-text, preceding/following sentence(s) (including the introductory
and the final sentence of the text): 1.2, 0.7, 0.7, 0.8, 1.3
The T5 low column lists three entries for the last seven of these rows, in
source order: 0.8, 0.8, -.
The performance of the two groups of subjects shown in Table 4
indicates that texts 1, 2, and 4 follow more or less the same pattern
observed in all other cases, i.e., the strategy most frequently used in each
case by both the high and low ability groups is a bottom-up strategy
(translation to mother tongue or syntactic analysis), and in all cases the
percentage of the top-down strategies used by the high ability group is
higher than that of the low ability group. Nonetheless, in the case of texts
three and five things are different. As the table illustrates, in both cases
the strategy most frequently used by the high ability group is a top-down
strategy, i.e., inference from a lexical item/phrase. Likewise, the overall
percentage of the top-down strategies used by the high ability group in the
case of these two texts is much higher than in the other texts (39.5% for text
three and 49.2% for text five). In contrast to the other four texts, text five
is the only case where the percentage of the bottom-up strategies used by
the high ability group does not greatly exceed that of the top-down
strategies (49.5% and 49.2%, respectively). These findings support the
idea proposed by Feldmann and Stemmer (1987) that what a C-test
measures varies according to the deletions in the test and the point made
by Storey (1997) that items have varying degrees of construct validity.
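The per-text comparison described above can be retraced from the bottom-up and top-down totals in Table 4. The following is a minimal sketch in Python, not part of the original analysis; the data layout and variable names are assumptions made for illustration, and only the totals rows of Table 4 are used.

```python
# (bottom-up %, top-down %) per text and ability group, from the totals
# rows of Table 4.
table4 = {
    "T1": {"high": (70.3, 26.7), "low": (79.0, 20.0)},
    "T2": {"high": (68.1, 27.6), "low": (86.0, 10.9)},
    "T3": {"high": (59.1, 39.5), "low": (65.5, 29.9)},
    "T4": {"high": (73.2, 26.0), "low": (82.0, 14.7)},
    "T5": {"high": (49.5, 49.2), "low": (80.5, 19.0)},
}

# Percentage-point gap in top-down use (high minus low) for each text.
gaps = {
    text: round(g["high"][1] - g["low"][1], 1)
    for text, g in table4.items()
}

# Does the high ability group use more top-down strategies in every text?
high_always_more_top_down = all(gap > 0 for gap in gaps.values())

# Text in which bottom-up exceeds top-down by the smallest margin for
# the high ability group.
smallest_margin_text = min(
    table4, key=lambda t: table4[t]["high"][0] - table4[t]["high"][1]
)

print(gaps)
print(high_always_more_top_down)
print(smallest_margin_text)
```

On the Table 4 totals, the gap is positive for every text and is widest for text five, which is also the text where the high ability group's bottom-up and top-down shares come closest to each other.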
Conclusions
The results of this study show that the subjects have used 13 different
strategies, consisting of both bottom-up and top-down processes.
Although the subjects tended to use the bottom-up strategies considerably more
frequently than the top-down ones for restoring the items, this pattern was
found not to prevail throughout the five texts included in the C-test. In
other words, depending on the content of the text and the deleted lexical
items, the type of strategy used by the subjects and its percentage varied.
It was also noticed that the type and the percentage of the strategies used
by the subjects with different proficiency levels on the whole test as well
as individual texts were, to some extent, different.
All in all, the results of the study are indicative of the construct
validity of the C-test as a test of overall language proficiency. This,
nonetheless, does not mean that all aspects of language are measured equally
through a C-test; it all depends on the texts included in the test and the
proficiency level of the subjects taking it. In fact, the subjects’ knowledge
of lower levels of language such as vocabulary and syntax is engaged
more while they are restoring the test items.
However, as Grotjahn (1986) states, when reporting retrospectively
especially in delayed cases, the subject may convey information that is not
related, in one way or another, to the real corresponding activity carried
out in his/her mind. In other words, the subject may give some
explanation for providing a particular response, but the real cognitive
processes carried out in his/her mind might be something quite different.
Specifically, since the data for the present study were elicited within two
days after the subjects had taken the test, some of the subjects, especially
those who were interviewed after a longer lapse of time between the
administration of the test and the retrospection, might have forgotten why
they had provided a particular response, so the strategy they reported may
not have been exactly the one they used while doing the test. As such, introspective
and/or retrospective verbal protocol analyses with a shorter time lapse are
needed to verify the results reported here.
Received 10 November 2004
Accepted 5 July 2005
Acknowledgments
The researchers would like to thank very sincerely Prof. Jafarpur without
whose kind and supportive contributions and comments the study would
have been impossible. Grateful thanks are also extended to two IJAL
anonymous reviewers for their insightful remarks and suggestions. Any
remaining deficiencies are, of course, ours.
References
Adams, M. J., Collins, A. (1970). A schema-theoretic view of reading.
In Freedle, R. O. (Ed.), New Directions in Discourse
Processing (pp. 1-22). Norwood, NJ: Ablex.
Babaii, E., Ansary, H. (2001). The C-test: A valid operationalization
of reduced redundancy principle? System, Vol. 29, pp. 209-219.
Brown, J. D. (2002). Do cloze tests work? Or, is it just an illusion?
Second Language Studies, Vol. 21, pp. 79-125.
Chapelle, C.A., (1994). Are C-tests valid measures for L2 vocabulary
research? Second Language Research, Vol.10, pp. 157-187.
Chapelle, C., Abraham, R., (1990). Cloze method: What difference
does it make? Language Testing, Vol. 7, pp. 121- 146.
Cleary, C. (1988). The C-test in English: Left-hand deletions. RELC
Journal, Vol. 19, pp. 26-38.
Cohen, A.D., Segal, M. and Wiss, R., (1984). The C-test in Hebrew.
Language Testing, Vol. 1, pp. 221- 225.
Dornyei, Z., Katona, L., (1992). Validation of the C-test amongst Hungarian
EFL learners. Language Testing, Vol. 9, pp. 187-206.
Ericsson, K., Simon, H., (1984). Protocol Analysis: Verbal Reports as
Data. Cambridge: Cambridge University Press.
Feldmann, U., Stemmer, B., (1987). Thin_ aloud a_ retrospective da_
in c-te_ taking: diffe_ languages-diff_ learners-sa_ approaches?
In Faerch, C., Kasper, C., (Eds.), Introspection in Second
Language Research. (pp. 251-267). Multilingual Matters,
Clevedon.
Green, A., (1998). Verbal Protocol Analysis in
Language Testing Research. Cambridge: Cambridge University
Press.
Grotjahn, R., (1986). Test validation and cognitive psychology: some
methodological considerations. Language Testing, Vol. 3, pp.
159-185.
Hastings, A. J., (2002). Error analysis of an English C-test: Evidence
for integrated processing. In Grotjahn, R. (ed.), Der C-Test.
Theoretische Grundlagen und praktische Anwendungen, (Vol. 4).
AKS, Bochum, (pp. 53-66).
Hood, M., (1990). The C-test: a viable alternative to the use of the
cloze procedure in testing? In: Arena, L., (ed.), Language
Proficiency. (pp. 173-189.), New York. Plenum Press.
Jafarpur, A., (1995). Is C-Test superior to cloze? Language Testing,
Vol.12, pp.194-216.
--------, (2001). A comparative study of a C-test and a cloze test. In:
Grotjahn, R., (Ed.), Der C-Test. Theoretische Grundlagen und
praktische Anwendungen, (Vol. 4). AKS, Bochum, (pp. 21-41).
Kamimoto, T. (1992). An inquiry into what a C-Test measures.
Fukuoka Women’s Junior College Studies, Vol. 44, pp. 67-79.
Klein-Braley, C., (1985). A cloze-up on the C-Test: A study in the
construct validation of authentic tests, Language Testing, Vol. 2,
pp. 76-104.
----------, (1997). C-tests in the context of reduced redundancy testing:
an appraisal. Language Testing, Vol. 14, pp. 47-84.
Klein-Braley, C., Raatz, U., (1985). A survey of research on the C-test.
Language Testing, Vol. 1, pp. 134-146.
Microsoft word, (1985)-95. Microsoft Word, Arabic Edition, Version
3.1. Microsoft Corp.
Mochizuki, A., (1994). Four kinds of Texts, their reliability and
validity. JALT Journal, Vol.16, pp. 41-54.
Oller, J.W. Jr. (1979). Language Tests at School. Longman Group
Ltd., London.
Rakhshanfar, M.R., Jahrudi, H., (n.d.). Selected English Reading
Books. Khajeh Nasir Technical College, Tehran.
Sigott, G., (1995). The C-test: some factors of difficulty. Arbeiten aus
Anglistik und Amerikanistik, Vol. 20, pp. 43-53.
---------, (2002). High-level processes in C-Test taking? In: Grotjahn,
R., (Ed.), Der C-Test. Theoretische Grundlagen und praktische
Anwendungen,(Vol. 4). AKS, Bochum, (pp.67-82).
Singleton, D., Little, D., (1991). The second language lexicon: some
evidence from university-level learners of French and German.
Second Language Research, Vol.7, pp. 62-81.
Stemmer, B. (1991) What's on a C-test taker's mind? Mental processes
in C-test taking. University of Dr. N. Brockmeyer, Bochum.
Storey, P., (1997). Examining the test-taking process: a cognitive
perspective on the discourse cloze test. Language Testing, Vol. 14,
pp. 214-231.
Appendix (The C-test)
Text 1
The lion is called the king of beasts. Lions a - - found liv- - - wild i - the
grass- - - - - of Afr - - - . They hu- - smaller ani - - - - and fe- - on th - -. There a
- - no wi - - lions i - Europe, b- - there a- - captive li - - - in Euro - - - - zoos.
T- - male li - - is a beau - - - - - animal. Ro - - - his head he has a ring of long
hair called a mane. When the lion is young, the hair of his mane is yellow.
When he is old, the hair is sometimes black. The female lion, or lioness, does
not have a mane. Lions are dangerous animals. A lion can kill a man.
Text 2
People faint when the normal blood supply to the brain is suddenly cut
down. This c-- happen i- they a - - surprised o- shocked b - sudden ne - - or b
- something th - - see. So - - people fa- - - if th - - see oth - - - hurt. So - people fa - - - in cro - - - . Others fa- - - if th - - are i- a room th - - is h - and stuffy. If a person faints while standing, lay him down. If his face is pale,
lift his feet. If he is sitting down when he faints, place his head between his
knees. Loosen any tight clothing that might keep him from breathing easily. If
possible, place a cold, wet cloth on his forehead.
Text 3
The Black Sea gets its name from the color of its water. In win - - - its co
- - - is ve - - dark. Th- - is cau - - - by fo - - that set - - - low ov - - the ar- and c - - off sunl - - - - . The Bl - - - Sea i- 748 mi- - - from ea - - to we - - ; it i
- 374 mi - - - from no - - - to so - - -. Four countries- Russia, Romania, Bulgaria,
and Turkey- border the sea. Several large rivers empty into it; the Danube,
Dnieper, Don, Bug, and Kuban are a few. The deepest part of the sea is in its
south central region. Many ports line the sea. Grain, lumber and sugar are the
main exports that pass through these ports. Fishing is good in the Black Sea and
supports many of the people on its coasts.
Text 4
We have just climbed out of a spaceship onto the surface of the moon.
Behind u - is t- - ship, ha- - in t - - sunlight a- - half i - deep sha - - - . A few
mi - - - ahead i - a wall o- mountains towe- - - - against t- - black s - -. And
th - - -, as tho- - - resting o- the moun- - - - - , is a gr- - - ball o - light beaut - - - - - colored in blue and green and brown with a patch of dazzling white at the
top. It is our own faraway world- the earth. We take a step and rise like prize
jumpers- up, float, and down again. Hopping carefully, we explore the valleys,
the sloping crater walls, the shadowy crater floors. Not a sound can be heard- there is no air to carry sound, no wind; there are no smells, no plants, no
animals. There is nothing but rock and dust, blinding sunlight and cold black
shadows.
Text 5
People in different countries may eat the same food but they prepare it very
differently. For exa- - - - , Chinese so - - is th - - and cl - - - , but Ger - - - soup
i- thick a - - heavy. So - - people li- - raw me - - , while oth - - - like me - only i- it i- well-cooked. Ma - - people li - - butter fr- - - and fi - - , but th - - are peo - - - in India who like it melted into an oil before they eat it. Many
people in the East like plain boiled rice, but in some countries people like theirs
made into a sweet pudding.