DOCUMENT RESUME
ED 244 987                                              TM 840 310
AUTHOR          Herman, Joan L.; Dorr-Bremme, Donald W.
TITLE           Teachers and Testing: Implications from a National
                Study. Draft.
INSTITUTION     California Univ., Los Angeles. Center for the Study
                of Evaluation.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
PUB DATE        Apr 84
NOTE            38p.; Paper presented at the Annual Meeting of the
                American Educational Research Association (68th, New
                Orleans, LA, April 23-27, 1984).
PUB TYPE        Speeches/Conference Papers (150) -- Reports -
                Research/Technical (143)
EDRS PRICE      MF01/PC02 Plus Postage.
DESCRIPTORS     Administrator Attitudes; Educational Needs;
                *Educational Testing; Elementary Secondary Education;
                *National Surveys; Principals; Teacher Attitudes;
                *Teacher Behavior; Teacher Education; *Teacher Made
                Tests; Test Selection; *Test Use
IDENTIFIERS     *Curriculum Embedded Tests
ABSTRACT
This paper presents findings from a study of
teachers' and principals' testing practices. The research included a
nation-wide survey, exploratory fieldwork in preparation for the
survey, and a case study inquiry on testing costs. Teachers and
principals share misgivings with some of the research community about
the appropriateness of required tests for some students, and about
their quality and equity. Teachers seem to use test results
temperately, as one of many sources of information. As a result of
required testing, more time is spent in teaching basic skills and
less attention can be paid to other subject areas. The survey also
suggests that those in the education and testing communities have
paid far too little attention to the matter of teachers' assessment
skills. Teachers essentially receive neither training nor any kind of
supervision nor any supporting resources in the development of their
own tests. Given their frequency and importance at the elementary
school level, the findings also suggest curriculum-embedded testing
as another neglected area of inquiry. Finally, formal measures should
have three important qualities: a close match to curriculum,
immediate availability and accessibility, and feelings of ownership.
(BW)
***********************************************************************
Reproductions supplied by EDRS are the best that can be made
from the original document.
***********************************************************************
DRAFT
Teachers and Testing:
Implications from a National Study
Joan L. Herman
And
Donald W. Dorr-Bremme
Center for the Study of Evaluation
UCLA
U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)
This document has been reproduced as
received from the person or organization
originating it.
Minor changes have been made to improve
reproduction quality.
Points of view or opinions stated in this document
do not necessarily represent official NIE
position or policy.
"PERMISSION TO REPRODUCE THIS
MATERIAL HAS BEEN GRANTED BY
TO THE EDUCATIONAL RESOURCES
INFORMATION CENTER (ERIC)."
The research presented herein was performed pursuant
to a grant from the National Institute of Education.
However, the opinions expressed herein do not necessarily reflect the position or policy of the National
Institute of Education, and no official endorsement by
the National Institute of Education should be inferred.
Paper Presented at the Annual Meeting
of the American Educational Research Association
New Orleans, LA
April 1984
Teachers and Testing:
Implications from a National Study
Joan L. Herman and Donald Dorr-Bremme
Center for the Study of Evaluation
UCLA
Abstract
This paper presents findings from a national survey of teachers' and
principals' testing practices. Implications are drawn for staff
development and training in test development and selection, clinical
decision making, and assessment of higher level skills; for quality
control in curriculum-embedded testing; and for structuring district and
school testing programs to facilitate their use by teachers.
Introduction
Fueled by school board accountability concerns, minimum competency
mandates, evaluation requirements for federal, state and local programs,
and the growth of curriculum-embedded and continuum-based assessment
systems, the achievement testing enterprise in American schools has
significant scope and visibility and has become the subject of
considerable public discussion and debate. Critics have attacked the
arbitrariness of current testing practices (Baker, 1978), have expressed
concerns about their validity and bias (Perrone, 1978), have accused
testing of narrowing the curriculum, and have questioned the value of
traditional testing amidst changing functions of education (Tyler, 1978).
The quality of available tests continues to be controversial (CSE, 1979;
The Huron Institute, 1978), at least one major teachers' organization has
called for a moratorium on the use of standardized tests, and vigorous
legal battles have been launched.
Responding to these various challenges, advocates of testing have
reaffirmed its importance and reasserted the variety of purposes that
current tests can and do serve. Supporters have maintained, for example,
that testing promotes accountability, facilitates more accurate placement
and selection decisions, and yields information useful for curricular and
instructional improvement.
The testing controversy rages on while the nation's considerable
investment in achievement testing continues. Although the stakes in the
debate are high, public policy in this arena has plodded on without
the benefit of basic information about the nature of testing as it actually
occurs and is used in schools. How much testing really goes on? How are
test results used? What functions do tests serve for teachers and
principals? What are the effects on schools of various local, state and
federal mandates? These and similar questions have gone largely
unaddressed.
A few studies have indicated teachers' reservations about the limited use
of one type of achievement measure -- the norm-referenced standardized
test (Airasian, 1970; Boyd et al, 1975; Goslin, 1965; Goslin, Epstein and
Hilloch, 1965; Resnick, 1981; Salmon-Cox, 1971; Statz and Beck, 1979).
Beyond this, however, the landscape of testing practices and test use in
American schools has remained unexplored.
In this context, the UCLA Center for the Study of Evaluation's (CSE)
three-year study provides educational policy-makers with basic, new
information on classroom achievement testing across the United States.
Conducted from 1979 through 1983, CSE's research was designed to take a
comprehensive picture of national testing practices. It investigated a
wide range of types of formal assessment measures (e.g., commercially
produced norm- and criterion-referenced tests, curriculum-embedded
measures, tests of minimum competency and functional literacy, and
district-, school-, and teacher-developed tests) as well as some less
formal means for gauging student progress and achievement (teachers'
observations of and interactions with learners). Within this broad range,
inquiry focused on achievement testing practices in reading/English and
in mathematics, basic skills areas which are the subject of continuing
public concern. Teachers and principals at both elementary and secondary
grade levels served as primary subjects for the nationwide survey,
addressing those grade levels which had been identified in prior research
as important transition points and the targets of frequent testing.
A nation-wide survey of teachers and principals was central to the
study, and results of this survey form the basis of the report that
follows. The research also included exploratory fieldwork in preparation
for the survey and, following the survey, case study inquiry on testing
costs. During these phases of the project, intensive interviews were
conducted with approximately 100 school-level educators in five school
districts across the country.
Below, we first provide a brief description of the survey sample, then
continue with survey findings on three major questions:
1. How much and what kinds of achievement testing take place in
   the nation's schools?
2. How important are the results of different types of assessment
   in teachers' routine tasks?
3. What are schools' and districts' administrative practices
   with regard to testing and test use?
We conclude by considering the study's findings in light of the current
testing controversy and explore the implications for teacher
training, quality control, and for structuring district and school testing
programs to facilitate their use by teachers in the classroom.
The Survey Sample*
The survey addressed a nation-wide sample of principals and teachers
drawn through a successive, random-selection procedure. First, a
nationally representative probability sample of 114 school districts
was drawn, stratified on the basis of district size, minimum competency
testing policy, socioeconomic status, urban-suburban-rural locale, and
geographic region of the country. (A lattice sampling technique was
used to select cells from the matrix defined by these five stratifying
variables, and then random sampling to select districts within a cell.)
Next, from within these districts, size permitting, two elementary
schools and two high schools were randomly selected using a procedure
that facilitated (where possible) inclusion of schools at levels serving
both higher- and lower-income populations. Finally, in each of these
schools, principals received directions for randomly drawing four
teachers for inclusion in the study. Directions for elementary principals
guided the random selection of two fourth-grade and two sixth-grade
teachers; those for high school principals, the random selection of two
teachers of tenth-grade English and two of tenth-grade mathematics. The
principal and each of the four participating teachers received
questionnaires that elicited detailed information on their individual
and school testing practices, as well as related contextual and
attitudinal data.
*A detailed description of the sampling procedure and results is
contained in a separate report (Choppin et al., 1981). This information
has not been reproduced here in order to avoid redundancy. Readers
interested in more information regarding the sample and procedure used
to draw it are referred to that earlier work.
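The staged draw described for the survey sample can be illustrated with a small sketch. It is not the study's procedure: the lattice selection of cells is not reproduced (the selected cells are simply given as input), and the district names, cell labels, school rosters, and function names are all hypothetical.

```python
import random

def sample_within_cells(frame, selected_cells, per_cell, rng):
    """Randomly draw up to `per_cell` districts from each selected cell."""
    chosen = []
    for cell in selected_cells:
        pool = sorted(name for name, c in frame.items() if c == cell)
        chosen.extend(rng.sample(pool, min(per_cell, len(pool))))
    return chosen

def sample_schools(schools, rng, per_level=2):
    """Draw up to `per_level` elementary and high schools, size permitting."""
    picked = {}
    for level in ("elementary", "high"):
        pool = sorted(s for s, lvl in schools.items() if lvl == level)
        picked[level] = rng.sample(pool, min(per_level, len(pool)))
    return picked

rng = random.Random(1984)  # fixed seed so the illustration is repeatable

# Hypothetical frame: district name -> stratification cell (size, locale).
frame = {"D1": ("large", "urban"), "D2": ("large", "urban"),
         "D3": ("small", "rural"), "D4": ("small", "rural"),
         "D5": ("small", "rural"), "D6": ("large", "suburban")}
selected_cells = [("large", "urban"), ("small", "rural")]
districts = sample_within_cells(frame, selected_cells, per_cell=1, rng=rng)

# Hypothetical district with three elementary schools and one high school.
schools = {"E1": "elementary", "E2": "elementary", "E3": "elementary",
           "H1": "high"}
picked = sample_schools(schools, rng)
```

A district too small to supply two schools at a level simply contributes what it has, which is one reading of "size permitting."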
Returns were obtained from 220 principals, 475 elementary school
teachers, and 363 high school teachers in 91 of the 114 districts
sampled. Return rates from all principals and from teachers at the
elementary level were approximately 60%. About 50% of the high school
teachers in the sample responded. To correct for differential return
rates by sampling cell and to approximate a nationally representative
distribution of respondents, weightings were applied in all descriptive
analyses. The results reported below, therefore, represent weighted
estimates of national testing practices, test use patterns, and
principal and teacher perceptions on testing-related issues.
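The paper does not spell out its weighting scheme. One common way to correct for differential return rates is to weight each respondent by the inverse of the return rate in his or her sampling cell; the sketch below assumes that approach, and the cell labels, counts, and reported hours are hypothetical.

```python
def cell_weights(sampled, returned):
    """Inverse-return-rate weight per cell: questionnaires sent / returned."""
    return {cell: sampled[cell] / returned[cell]
            for cell in sampled if returned.get(cell)}

def weighted_mean(responses, weights):
    """Mean of (cell, value) responses, each weighted by its cell's weight."""
    total = sum(weights[cell] * value for cell, value in responses)
    norm = sum(weights[cell] for cell, _ in responses)
    return total / norm

# Hypothetical cells: both were sent 10 questionnaires, but cell B
# returned only half as many as cell A.
sampled = {"A": 10, "B": 10}
returned = {"A": 8, "B": 4}
weights = cell_weights(sampled, returned)

# Hypothetical annual testing hours reported by each respondent.
responses = [("A", 10.0)] * 8 + [("B", 16.0)] * 4
unweighted = sum(value for _, value in responses) / len(responses)
weighted = weighted_mean(responses, weights)
```

In this toy case the unweighted mean is 12.0 hours and the weighted mean is 13.0 hours, because the under-responding cell is scaled back up to its share of the sample.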
How Much Testing Goes on in Schools?
Survey results show that the typical student in the upper elementary
grades spends, on the average, about 10 hours a year taking reading
tests and somewhat more than 12 hours a year taking mathematics tests.¹
(See Table 1.) Test-taking time, then, seems to comprise a little over
five percent of the time often allocated annually to formal instruction
in each of these subjects. (This figure assumes one hour of daily
instruction in each subject for 177 school days per year.)
The typical tenth-grade student enrolled in English, survey results
indicate, spends about 26 hours a year completing English tests. This
constitutes in the neighborhood of twenty percent of his or her annual
time in English class. For the typical tenth grader enrolled in
mathematics, taking math tests consumes a little over 24 hours each
year -- roughly eighteen percent of the time spent annually in
mathematics class. (Here, the percentages given assume daily classes of
45 minutes in each subject, over 177 days per school year.) Clearly, on
the average nationally, the frequency and duration of testing in the
high school subjects exceed those in the equivalent
upper-elementary-school subjects. (Refer again to Table 1.)
¹ It is likely that survey results underestimate actual time. The
survey asked teachers to fill in all the tests they give over the year
and to estimate the student time required for each. It is moot whether
they consistently included all tests.
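The percentages quoted in this section follow from simple arithmetic on the Table 1 times. The sketch below reproduces that arithmetic under the paper's stated assumptions (177 school days; 60-minute daily classes in the elementary subjects, 45-minute classes in tenth grade). The assignment of each time to a row is read from the flattened table, and the names used are illustrative only.

```python
SCHOOL_DAYS = 177  # school days per year, as assumed in the text

def share_of_class_time(test_hours, test_minutes, daily_class_minutes):
    """Annual testing time as a percentage of annual class time."""
    testing = test_hours * 60 + test_minutes
    instruction = SCHOOL_DAYS * daily_class_minutes
    return 100 * testing / instruction

# (hours, minutes, assumed daily class minutes), times as read from Table 1.
table1 = {
    "elementary reading": (9, 56, 60),
    "elementary mathematics": (12, 28, 60),
    "10th grade English": (26, 34, 45),
    "10th grade mathematics": (24, 18, 45),
}

shares = {subject: round(share_of_class_time(*row), 1)
          for subject, row in table1.items()}
```

The tenth-grade shares come out at 20.0 and 18.3 percent, matching the text. Elementary reading comes out at 5.6 percent; under the same one-hour assumption, the 12 hrs. 28 min. reported for elementary mathematics works out closer to seven percent.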
The annual times for testing reported are estimates of students'
test-taking times. They can probably only serve as rough indicators of
the times that the teachers in question spend giving tests in the
classroom. On-site interviews (Dorr-Bremme, 1982) suggest that elementary
teachers spend only about a quarter to a third of their total time on
testing actually giving tests in the classroom. That is, for each hour
they devote to giving a reading or math test, they typically spend
another two or three hours in such activities as preparing for testing
(e.g., constructing and dittoing the test, reviewing directions for
standardized testing), correcting and grading tests (or checking over
students' standardized-test answer sheets), recording scores, etc. (Time
spent consulting test results and otherwise "using" them is not included
here.) Thus, elementary-school teachers' annual time on testing far
exceeds the typical student's. (Case studies in two elementary schools
found that teachers spent on the average of 200 to 250 hours per year,
in and out of class, in achievement testing in all subject areas -- or
roughly 12 to 15 percent of their reported annual work time.) Resources
were not available for detailed case studies in high schools, but
pre-survey interview data indicate that the average testing time per
year of high-school teachers is also much greater than their students'.
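The step from "a quarter to a third" of testing time to "another two or three hours" per hour of test giving is a ratio calculation. A minimal sketch of it follows; the function name is illustrative, not from the study.

```python
from fractions import Fraction

def other_hours_per_hour_given(share_giving):
    """Hours of preparation, grading, and recording per hour of test giving,
    when giving tests makes up `share_giving` of total testing time."""
    return (1 - share_giving) / share_giving

at_one_third = other_hours_per_hour_given(Fraction(1, 3))    # 2 hours
at_one_quarter = other_hours_per_hour_given(Fraction(1, 4))  # 3 hours
```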
Table 1
Time Devoted to Testing in Typical Classes

                                 Total Amount of    No. of Test       Average
                                 Class Time Spent   Sessions for      Length
                                 on Testing         Typical Student   of Session
                                 per Annum
Elementary School (Grades 4-6)
  --Reading Tests                 9 hrs. 56 min.         22            27 min.
  --Mathematics Tests            12 hrs. 28 min.         23            32 min.
10th Grade English Class         26 hrs. 34 min.         49            32 min.
10th Grade Mathematics Class     24 hrs. 18 min.         45            33 min.
Table 2
Time Devoted to Required Testing,
As a Percentage of Total Time
For Typical Classes

                                 Percentage        Percentage        Percentage
                                 Time on Testing   Time on Testing   Testing Time
                                 Required by       Required by       Devoted to
                                 State             Local School      Non-Required
                                                   District          Tests
Elementary School (Grades 4-6)
  --Reading                           30                29                41
  --Mathematics                       21                25                54
10th Grade English Class              12                14                74
10th Grade Mathematics Class           9                14                77
How much of the testing just described is required by the educational
hierarchy beyond the school? How much is undertaken at the discretion of
teachers? Table 2 provides data to answer these questions.
Elementary teachers in the sample report that about half the
testing they conduct both in reading and in math is required by their
state or school district. At the high school level, about one quarter
of the classroom assessment in both English and mathematics results from
state or school-district mandates. Notice, however, that since high
school students on the average spend twice as much time annually being
tested as elementary students do, these percentages suggest that the
actual number of hours spent in required testing is quite similar at
both levels of schooling. Notice, too, that a greater proportion of
assessment in the high school subjects is voluntary: conducted at the
discretion of the individual teacher.
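The claim that required testing takes a similar number of hours at both levels can be checked roughly by multiplying the Table 1 annual times by the Table 2 required percentages (state plus district). The sketch below does so. The elementary percentages are as printed; the tenth-grade split (12 and 14 for English, 9 and 14 for mathematics) is an assumption, inferred from the surviving figures and from each row summing to 100.

```python
# Annual testing time in minutes, as read from Table 1.
annual_minutes = {
    "elementary reading": 9 * 60 + 56,
    "elementary mathematics": 12 * 60 + 28,
    "10th grade English": 26 * 60 + 34,
    "10th grade mathematics": 24 * 60 + 18,
}

# Percent of testing time required by state plus district (Table 2).
# The tenth-grade district figure of 14 is an inferred value (assumption).
required_pct = {
    "elementary reading": 30 + 29,
    "elementary mathematics": 21 + 25,
    "10th grade English": 12 + 14,
    "10th grade mathematics": 9 + 14,
}

required_hours = {
    subject: round(annual_minutes[subject] / 60 * required_pct[subject] / 100, 1)
    for subject in annual_minutes
}
```

On these figures, required testing comes to roughly six to seven hours a year in every subject, even though total testing time at the high school level is about twice the elementary total.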
What types of tests are used most heavily? Which types consume
larger proportions of classroom testing time? As Table 3 shows, tests
developed by individual teachers and schools and, at the elementary
level, those which accompany curriculum materials, occupy the great
majority of classroom testing time. Of all the test types listed, these
are the types over which teachers have most control. They can administer
them when they deem appropriate; they can design (or readily adapt)
the content to suit their own teaching emphases. Most teachers
interviewed said that these types of tests fit best with their
instructional schedules and curricula. And, from their points of view,
these are the most valid instruments of those listed for such routine
tasks as on-going planning of teaching, grading, etc. The predominance of
locally developed tests at the secondary level supports the notion that
high school teachers have more control over classroom assessment than do
elementary school teachers. But heavy use of locally developed tests in
the high schools may also reflect that they have fewer suitable
commercial testing materials available. Comprehensive curricular
programs -- including texts with coordinated workbooks, tests, etc. --
are more widely available for teachers of the elementary grades.
Finally, note that the two types of testing most often generated
by state policy -- minimum competency testing and state assessment --
consume on the average very small proportions of classroom testing
time.
How are Test Results Used?
Long lists of tests' purposes have been provided in almost every
test and measurement text in education. Lists of such purposes usually
include selection, placement, remediation, instructional improvement,
teacher assessment, accountability, and so on. But to what extent do
these ideals represent reality? The survey questionnaires sampled a
variety of potential purposes and examined the extent to which the
results of particular types of tests and other methods of assessment
actually serve each.
Teachers also were asked to rate the importance of a variety of
assessment types for activities in which they routinely engage. The
results in Table 4 show that both elementary and secondary teachers do
see test results of various types as useful in making a variety of
decisions. Clearly, however, teachers accord the highest importance to
their own observations of students' work and to their own clinical
Table 3
Types of Tests Used,
As a Percentage of the Total Time
Devoted to Testing
Elementary
Teachers
TYPE OF TEST
Reading
Math
10th
Grade
English
Teachers
10th
Grade
Mathematics
Teachers
5
1
8
17
8
5
2
35
74
76
Tests which form part of a
statewide assessment program
3
3
Required Minimum Competency Tests
1
2
Tests included with curriculum
materials
28
35
Other commercially published tests
17
18
Locally developed and district
adopted tests
13
School or teacher developed tests
37
judgments. For initially grouping or placing students in a curriculum,
for changing students from one group or curriculum to another, and for
assigning grades, nearly every teacher respondent reported that their
"own observations and students' classwork" is a crucial or important
source of information. The great majority of respondents also indicate
that the results of the tests they themselves develop also figure as
crucial or important in these decisions. Many elementary school teachers
also responded that the "results of tests included with the curriculum
being used" are quite influential in their instructional decision-making.
These results indicate that while teachers do not attribute heavy
importance to the results of required tests, they do view them as
somewhat useful sources of data for decisions about initial planning and
placement of students in groups or curriculum, and even for decisions
about reassigning students to different instructional groups or curricula
throughout the year. In this last process, they probably serve as a
kind of benchmark for judging individual students' "capabilities." For
example, imagine a situation where a student is performing poorly in his
or her instructional group. A teacher might examine standardized test
results to determine whether the problem is "low ability" or whether
other factors such as motivation seem a more likely explanation, and
then base instructional decisions accordingly.
It is apparent from these results that teachers use a variety of
sources to make each kind of decision listed; they do not rely only
upon a single information source. As one teacher stated:
Table 4
Importance of Test Results for Teacher Decision-Making
in Elementary and Secondary Schools*
Decision Area:
Standardized Test Batteries
District Continuum or Minimum Competency Tests
Tests Included with Curriculum
Teacher-Made Tests
Teacher Observations/Opinions
ELEMENTARY
3.39
Planning teaching at
beginning of the
school year
2.53
(0.74)
2.60
(0.79)
Initial grouping or
Placement of students
2.51
(0.74)
(0.82)
2.91
(0.74)
3.12
(0.83)
(0.78)
2.52
(0.81)
3.04
(0.79)
(0.74)
3.12
(0.84)
3.66
(0.72)
1.62
(0.76)
1.81
(0.81)
2.89
3.38
(0.79)
(0.74)
3.69
(0.72)
Changing a student from
one group or curriculum
to another, providing
remedial or accelerated
work
Deciding on report card
grades
2.52
2.59
(0.76)
3.58
SECONDARY
Planning teaching at
the beginning of the
school year
2.22
2.38
(0.84)
(0.93)
3.59
(0.60)
2.28
2.46
2.48
(0.92)
(0.98)
(0.92)
3.04
(0.87)
3.84
(0.85)
Changing students from
one group or curriculum
to another, providing
remedial or accelerated
work
2.52
(0.95)
2.59
(0.86)
2.67
(0.93)
3.27
(0.76)
(0.66)
Deciding on report card
grades
1.36
(0.66)
1.45
(0.64)
2.29
(0.96)
3.65
(0.62)
(0.65)
Initial grouping or
placement of students
* 4-point scale:
4 = Crucial Importance; 1 = Unimportant or not used
3.61
3.68
"You can't count a score on one test too heavily. The kid
could be sick or tired or just not feel up to doing it that
day. Maybe his parents had a fight the night before. Maybe
he doesn't try. Maybe he doesn't test well." (Choppin et
al., 1981)
Not only do survey respondents indicate that they consult several
sources of information about students' achievement in making particular
instructional decisions, respondents -- and particularly those at the
elementary school level -- also report thinking that many kinds of
assessment techniques give them crucial and/or important information.
The data in Table 5 are illuminating here: over half the elementary
school teachers surveyed report giving heavy weight to each of many
sources of information in planning their teaching, in making initial
groupings and placements, and in modifying instruction throughout the
year.
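The kind of tabulation behind Table 5 can be sketched as follows: count the teachers who rate at least a threshold number of information sources as crucial or important, then express that count as a percentage. The ratings, the threshold, and the function name below are hypothetical, and the sketch is unweighted, whereas the survey's descriptive analyses applied weights.

```python
def share_reporting_many(ratings_by_teacher, threshold):
    """Percent of teachers rating at least `threshold` sources 3 or 4
    on a 4-point scale (4 = crucial importance, 1 = unimportant)."""
    hits = sum(
        1 for ratings in ratings_by_teacher
        if sum(rating >= 3 for rating in ratings) >= threshold
    )
    return 100 * hits / len(ratings_by_teacher)

# Hypothetical ratings of five information sources by four teachers.
ratings_by_teacher = [
    [4, 4, 3, 3, 2],  # four sources rated 3 or higher
    [4, 3, 2, 2, 1],  # two sources
    [3, 3, 3, 3, 3],  # five sources
    [4, 4, 4, 1, 1],  # three sources
]
percent_many = share_reporting_many(ratings_by_teacher, threshold=4)
```

With "many" set at four sources, half of these hypothetical teachers qualify.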
What are Schools' and Districts' Administrative Practices in the Area
of Testing and Test Use?
A growing literature suggests that district and/or school leadership
is a significant determinant of whether and how educational innovations
and practices are sustained (Berman & McLaughlin, 1978; Bank &
Williams, 1982; Edmonds, 1979). Thus, the Test Use in Schools survey
examined the practices of school and district administrators in: (1)
making, and holding teachers accountable for, curricular decisions based
on test scores; (2) monitoring and/or supporting school and classroom
testing practices; and (3) providing information and staff development
on testing.
Making and holding teachers accountable for test-score-based
curricular decisions. The school and district administrative practices
in this area that were included on the survey appear in Table 6.
Table 5
Proportion of Teachers who Report Considering Many Types of Assessment Information
Critical/Important for Given Activities
Planning
Teaching at
Beginning of
School Year
Number of Sources of
Information Given in
Question on Survey
Initial
Grouping
or Placement
of Students
Changing
Grouping
or
Placement
Deciding
on Report
Card
Grades
4
7
Proportion of
Elementary Teachers
who Indicated That
at Least this many
functioned as Critical
and/or Important
for the Given Activity
50%
71%
62%
40%
Proportion of
High School Teachers
33%
47%
49%
20%
6
Number of Sources
Defined as "Many"
for Purposes of
this Analysis
As the table shows, school and district administrators hardly ever
establish specific test-score goals for individual schools or teachers.
However, district administrators occasionally do check to see that areas
in the curriculum that test scores indicate need improvement are in fact
being emphasized in their schools; principals monitor their staff
members' teaching fairly often toward this same end, particularly in
lower SES schools. Often, too (but not, on the whole, as a matter of
routine), school administrators meet with teachers in groups or
individually to review test scores and highlight their implications for
curricular emphases.
Table 6 also indicates that test scores function in making and
holding teachers accountable for decisions on curricular emphases less
frequently at the secondary-school level than they do in elementary
schools. Perhaps this occurs in relation to districts' practices in
returning test results. Secondary principals find that scores are only
rarely returned by their district such that they can be used in
curricular decision making. In elementary schools, the
curriculum-embedded tests that accompany basal reading and math series
can be used as a basis for cross-classroom analysis of achievement
patterns when standardized-test results and other scores are not
forthcoming from the district office. (Recall that the use of commercial,
curriculum-embedded tests is more prevalent in the elementary grades.)
Monitoring and supporting testing practices. Table 7 displays
those school and district practices examined in this area. Of all the
practices examined, only one seems to occur more than occasionally:
district monitoring of the district testing program. Release time for
Table 6
Making and Holding Teachers Accountable for Test-Score-Based
Curricular Decisions

                                             Principals' Reports*    Teachers' Reports*
                                             Elementary  Secondary   Elementary  Secondary
SCHOOL ADMINISTRATOR(S)
Meets with teachers to review scores and
identifies areas that need extra emphasis       3.09       2.94         2.84       2.05
Observes teachers; reviews their plans
to ensure areas indicated by tests are
being emphasized                                3.23       3.07         2.66       2.31
Takes test scores into account in evaluating
teachers and/or establishes test-score goals
for teachers to meet                            1.57       1.55         1.46       1.27
DISTRICT ADMINISTRATOR(S)
Returns test results such that they can be
used in school's curricular decision making     2.63       2.03
Observes, reviews school plans and/or
requires reports to assure school is
emphasizing skills that test scores
show need work                                  2.84       2.67         Not Asked
Establishes specific test-score goals
for school                                      2.12       2.33

*Mean ratings on four-point scale:
4 = happens regularly, routinely; 3 = not regular or routine but happens
fairly often; 2 = not regular or routine and happens rarely; 1 = does not
happen at all.
teachers to develop tests is on the whole a rare phenomenon. So, too,
are administrative reviews of student performance on such instruments as
(a) teacher-constructed tests and (b) unit and chapter tests.
(Although not specified in Table 7, the latter test types were mentioned
explicitly in the questionnaire item.) These results suggest that there
is little monitoring of teachers' classroom testing schedules. They
also indicate that one type of measure upon which teachers rely heavily
-- tests that they themselves construct -- is most often written
individually and with no supervisory review.
Providing staff development and information about testing and test
results. Principals were asked to comment on the frequency with which
they and district administrators provided in-service experiences germane
to testing and test results. In addition, teachers were asked to report
on the occurrence of particular types of staff development over the last
two years. The responses of principals and teachers to these questions
are shown in Tables 8 and 9.
According to principals, staff development for teachers in the
area of assessment occurs occasionally, i.e., with a frequency that on
the average falls about midway between survey categories "very often"
and "rarely." It appears that such staff development is generally
initiated slightly more frequently by district administration than by
principals.
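Reading a mean rating as "about midway" between two survey categories amounts to locating it between adjacent scale points. The sketch below does this for the four means printed in Table 8. Their assignment to the school and district rows is read from the flattened table and should be treated as an assumption, as should the helper name.

```python
import math

# Scale labels as given in the table footnotes.
SCALE = {
    4: "happens regularly, routinely",
    3: "not regular or routine but happens fairly often",
    2: "not regular or routine and happens rarely",
    1: "does not happen at all",
}

def bracketing_categories(mean_rating):
    """Return the scale labels immediately below and above a mean rating."""
    return SCALE[math.floor(mean_rating)], SCALE[math.ceil(mean_rating)]

# Mean ratings from Table 8 (row and column assignment assumed).
school = {"elementary": 2.62, "secondary": 2.48}
district = {"elementary": 2.73, "secondary": 2.71}

low_label, high_label = bracketing_categories(school["elementary"])
district_higher = all(district[level] > school[level] for level in school)
```

All four means fall between 2 and 3, and the district means exceed the school means at both levels, which is the pattern the paragraph describes.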
Of all the topics listed, more teachers report participating in
sessions devoted to: (a) analysis and explanation of test results,
(b) directions for administering required tests, and (c) how to
interpret and use the results of different types of tests. Staff
Table 7
Monitoring and Supporting Testing Practices
Principals
Reports*
Teachers' Reports*
Secondary
Elementary
.Secondary
2.30 (1.10)
2.32 (1.10)
1.78 (1.17)
2.43 (1.02)
1.62 (0.92)
2.17 (1.07)
3.09 (0.95)
2.85 (1.07)
Elementary
SCHOOL ADMINISTRATOR(S)
Requires teachers to turn in test scores/
grades on classroom tests and/or assignments
Requires teachers to turn in copies of
Not Asked
tests they construct
DISTRICT ADMINISTRATOR(S)
Conducts observations and/or requires reports
Not Asked
to see that all aspects of district testing
program are properly carried out
Provides release time and/or extra pay for
2.12 (1.03)
2.33 (0.98)
teachers to develop tests or curricular
materials including tests
*Mean ratings on four-point scale:
4 = happens regularly, routinely; 3 = not regular or routine but happens fairly often;
2 = not regular or routine and happens rarely; 1 = does not happen at all.
Table 8
Providing Staff Development and Information About Testing

                                          Principals' Reports on Frequency*
                                          Elementary         Secondary
SCHOOL ADMINISTRATOR(S)
Brings in speakers, workshops, printed
material to update teachers' assessment
skills                                    2.62 (0.87)**      2.48 (0.77)
DISTRICT ADMINISTRATOR(S)
Brings in speakers, workshops, printed
material to update teachers' assessment
skills                                    2.73 (0.98)        2.71 (0.90)

* Mean ratings on four-point scale: 4 = happens regularly, routinely;
3 = not regular or routine but happens fairly often; 2 = not regular
or routine and happens rarely; 1 = does not happen at all.
** Numbers in parentheses are standard deviations.
Table 9
Percentages of Teachers Reporting Participation
in Staff Development
Elementary
Topic
(1)
(2)
(3)
(4)
(5)
Analysis and explanation of state,
district, or school test results
84
How to administer tests required by
state, district, and/or school
n
(procedures to follow, etc.)
Secondary
English
70
59
Alternative ways (other than tests)
to assess student achievement
54
How to tie what is taught more closely
to the skills, content covered on
required tests
60
46
78
How to interpret and use results of
different types of tests (e.g., norm=
referenced and criterion-referenced
tests and their applications)
Secondary
Math
35
21
50
37
25
Presentation of published materials
designed to prepare students for
particular tests or to improve
test-taking skills
41
32
29
(7)
Training in the use of test results
to improve instruction
35
21
19
(8)
How to construct or select
good tests
20
23
18
(6)
25
development devoted to increasing teachers' routine classroom assessment
skills, these data indicate, occurs much less frequently. Thus, for
example, only about a fifth of the teachers in each category report
receiving instruction in "how to construct or select good tests," an
area in which teachers see a critical need. (See Ward, 1983.)
Information on other means of assessment (alternatives to testing) was
equally rare for secondary teachers, although some 54% of the elementary
teachers did report staff development on this topic. Training in the
use of test results to improve instruction was evidently provided for
35% of the elementary teachers and about 20% of the secondary teachers
sampled.
Finally, it is worth noting that secondary teachers, overall,
report receiving staff development in topics related to testing less
often than elementary teachers do.
Resources in support of testing. In a set of questionnaire items
separate from those discussed just above, teachers were asked to comment
on the availability and use of four resources which could support their
classroom testing efforts. Teachers' responses to these items (Table 10)
are presented in this section since the availability of each of these
resources can be interpreted as due, at least in part, to the
initiatives of school or district administrators. This is particularly
true for item banks of test questions and computerized scoring and
analysis of tests. In the case of the other two items included (other
teachers with whom I plan and develop tests, someone to help grade tests
and assignments), administrators can structure organizational
arrangements that facilitate their availability and use.

The list of resources included in the survey instrument was selected on
the basis of considerable fieldwork and piloting. Nevertheless, each
resource was unavailable to a large proportion of respondents. The
exception, of course, was "other teachers with whom I plan and develop
tests or other evaluation assignments," but only about a quarter of the
elementary-school teachers and a similar fraction of the
secondary-school teachers reported taking advantage of this resource
frequently. Some 45% of the secondary teachers reported constructing
tests with others a few times a year, and fieldwork suggests that this
often occurs as teachers in the same department conjointly devise
midterm and final exams.

Computerized test scoring and analysis was reported as used a few times
annually by a quarter to a third of both the elementary and secondary
teachers sampled. Fieldwork indicates that these reports may reflect the
use of optical scanning machines for certain standard (including
norm-referenced, standardized) tests. Some districts, however, have
developed computer programs for scoring unit and chapter tests and
simultaneously analyzing individual students' strengths and weaknesses
on the skills they cover.

A final point: in general, nearly all those teachers who have access to
the resources listed report using them at least sometime during the
school year.

Table 10

Available Resources for Testing: Percentages of Teachers Reporting

                                                      AVAILABLE
                                         ----------------------------------
                                                    Used Once    Used at
                                  NOT               to Several   Least
Resource                       AVAILABLE  Not Used  Times/Year   Once/Month

Item banks of test questions
upon which I draw in making
up my tests.
   Elementary                      71         4          8          16
   Secondary                       51         8         24          16

Other teachers with whom I
plan and develop tests or
other evaluation assignments.
   Elementary                      37        12         26          24
   Secondary                       21        10         45          24

Someone who helps me read,
grade, or correct tests and
assignments.
   Elementary                      69         6          4          21
   Secondary                       70         5          4          21

Quick, computerized scoring
and analysis of tests.
   Elementary                      64         2         30           4
   Secondary                       58        16         22           4

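As a rough check on the final point made in the text, the Table 10 percentages can be re-expressed as the share of teachers with access to each resource who report using it at all. The Python sketch below is illustrative only: the dictionary layout and the function name `share_using_among_those_with_access` are assumptions introduced here, not part of the survey analysis, and the inputs are the rounded percentages printed in the table.

```python
# Rounded percentages transcribed from Table 10. Column order:
# (not available, available but not used,
#  used once to several times a year, used at least once a month).
# The structure and names below are hypothetical, for illustration only.
TABLE_10 = {
    ("Item banks of test questions", "Elementary"): (71, 4, 8, 16),
    ("Item banks of test questions", "Secondary"): (51, 8, 24, 16),
    ("Other teachers with whom I plan and develop tests", "Elementary"): (37, 12, 26, 24),
    ("Other teachers with whom I plan and develop tests", "Secondary"): (21, 10, 45, 24),
    ("Someone who helps me read, grade, or correct", "Elementary"): (69, 6, 4, 21),
    ("Someone who helps me read, grade, or correct", "Secondary"): (70, 5, 4, 21),
    ("Quick, computerized scoring and analysis of tests", "Elementary"): (64, 2, 30, 4),
    ("Quick, computerized scoring and analysis of tests", "Secondary"): (58, 16, 22, 4),
}


def share_using_among_those_with_access(row):
    """Percent of teachers with access to a resource who report using it."""
    _not_available, not_used, used_yearly, used_monthly = row
    with_access = not_used + used_yearly + used_monthly
    return 100.0 * (used_yearly + used_monthly) / with_access


for (resource, level), row in TABLE_10.items():
    share = share_using_among_those_with_access(row)
    print(f"{resource:<52}{level:<12}{share:5.1f}%")
```

On these rounded figures the share lies between roughly 80% and 94% for seven of the eight rows; the exception is computerized scoring among secondary teachers, at about 62%. Because the published percentages are rounded, the results should be read as approximate.
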
Conclusions
We began this discussion by noting the public controversy over the
quality and usefulness of testing, a controversy which has been marked
by more rhetoric than empirical evidence and one which has centered
primarily on standardized tests and large scale assessments. What do the
survey results have to say about these concerns and, more particularly,
about concerns for the potential misuse and abuse of test results?

Teachers and principals do share misgivings with some in the research
community about the appropriateness of required tests for some students,
and about their quality and equity. Survey findings here, however, allay
some concerns about the inappropriate use of tests by classroom
teachers. Teachers (and principals, according to findings not reported
here) seem to use test results temperately -- as one of many sources of
information. They do not give undue weight to any single source, but
rather evaluate available data in combination with their own
observations to reach decisions. Test results, according to the findings
presented here, are thus being used, but not abused.

The influence of test results on school and classroom decision-making is
one direct impact of tests, but another impact is felt in the very
presence of required testing in the schools. As a result of required
testing, school personnel agree that more time is spent in teaching
basic skills -- English and math -- and less attention can be paid to
other subject areas, and principals and teachers, particularly in lower
SES schools, are strongly encouraged to emphasize those skills which are
included on required tests. The findings thus confirm the validity of
some concerns about the effect of testing on the curriculum. Admittedly,
tests alone have not caused the curriculum to narrow. Rather, the
narrowing is a consequence of the importance ascribed by society at
large to test scores and of a societal emphasis on basic skills.
Nonetheless, it might be well for both the public and policymakers to
consider whether the limited sample of skills assessed by most
standardized tests represents an adequate curriculum and whether test
developers, rather than teachers, administrators, school boards and the
public, ought to be defining the curriculum.

What else does the CSE research have to tell us? First, the survey
suggests that those in the education and testing communities have paid
far too little attention to the matter of teachers' assessment skills.
For the most part, as mentioned above, the debate on testing has been
played out in exchanges about the relative merits of normed and
criterion-referenced measures, in discussions of cultural and linguistic
biases in standardized tests, in sociopolitical controversy over
proficiency testing, and so on. It has focused on measures employed
nationwide or statewide that generally have been developed by commercial
testing concerns or by other large agencies that employ
psychometricians. It is appropriate for us to be concerned about the
qualities and social implications of such tests. Although they figure
less heavily in principals' and teachers' decisions and they consume
only small proportions of classroom time, tests of this type do exert
significant influence in major educational gate-keeping decisions.
However, the quality of teachers' assessment skills, their skills as
test developers and as clinical diagnosticians, has largely escaped
attention. Yet the cumulative record of teacher-made tests, the grades
in which they result, as well as the teachers' informal judgments of
children's competence clearly influence students' educational careers in
major ways, perhaps to a degree exceeding that of more formal testing.
What is more, students, particularly secondary students, spend large
proportions of their testing time taking teacher-developed and
teacher-scheduled tests.

What do we know about the quality of teacher-developed tests? Very
little. And the little we know is far from encouraging. Almost twenty
years ago, Ebel (1967) identified common errors in teacher-developed
tests and urged better training for teachers in this area. More recent
research indicates that teachers remain poorly prepared in assessment
(Rudman and others, 1980; Yeh and others, 1981), a finding which is not
surprising in light of preservice and inservice requirements and
opportunities for teachers. Few states explicitly require competence in
testing for teacher certification (Woellner, 1979), and studies have
indicated that while most teachers have had at least one measurement
course, attention to teacher-developed tests and clinical assessment
skills is virtually non-existent (Gullickson, 1984; Ward, 1983). The
results reported here indicate that inservice training does little to
fill the gap. Only about one-fifth of the teachers in our survey
received inservice experience related to the selection and construction
of good tests or in the use of testing for classroom decisionmaking and
to improve instruction; according to other studies, these are two areas
which teachers rate as most important and in which they agree they need
help (Gullickson, 1984; Ward, 1983). Clearly, teachers need training
opportunities if they are to be competent test developers, skilled
analysts, and literate consumers of test information.

Although the study reported here did not directly address the issue of
the quality of teacher-made tests, its findings combined with those
cited above give cause for some pessimism. Teachers essentially receive
neither training nor any kind of supervision nor any supporting
resources in the development of their own tests. One of the few studies
which have examined explicitly the quality issue raises additional
concern. Fleming and Chambers (1983) analyzed teacher-developed tests in
Cleveland schools and found that teachers can deal with many of the
technical requirements for classroom tests, such as arrangement of test
questions, format of test questions, and the avoidance of obvious
technical flaws; however, almost one-fifth exhibited errors in mechanics
and technical conventions. More disturbing is the fact that the vast
majority of test questions reviewed focused on lower-level skills,
requiring recall of terms, factual knowledge, rules and principles; test
items requiring synthesis and higher level applications accounted for
only a very small percentage of the questions. Many have noted that
tests communicate expectations to students and identify for them the
important knowledge and skills that are required for particular courses;
the objectives that really matter for students are those embedded in the
tests on which their grades are based (Bloom, 1981). Concerns were
expressed earlier, and appropriately so, about curricular narrowing
associated with required tests; an equally important issue may be the
extent to which the curriculum is being narrowed to memory and rote
learning as a function of teacher-developed tests. Teachers, in short,
not only need training in test development, but they apparently also
need particular assistance in assessing (and perhaps in teaching) higher
level skills.

Given their frequency and importance at the elementary school level, the
findings reported here also suggest curriculum-embedded testing as
another neglected area of inquiry. As with teacher-developed tests, we
know very little about the quality of these measures, and, again, what
we do know does not give cause for optimism. For example, analyses of
commonly-used basal series have criticized their failure to utilize
common research-based design principles (Quellmalz and Herman, 1978),
and informal perusal of some recent tests indicates some serious flaws,
e.g., tests which claim to be diagnostic on the basis of one item per
objective. It may well be that some quality assurance mechanisms are
needed.

As we think about training requirements for teachers and quality control
for commercial tests, it might be well also to explore other testing
supports that might be provided for teachers. When taken seriously, test
development is an arduous and time consuming process. One might wonder
whether teachers, in fact, have the time and energy to produce good
tests or whether a better approach might be to explore ways to better
enable them to capitalize on and use the efforts of others. Item banks
are one possibility, either representing the pooled efforts of teachers
within a school/district or commercially available options (although
they currently exist, both are likely to have quality control problems).
With micro-computers on almost every school campus, the technological
requirements are in place for easily accessible tests that can be
customized to teachers' unique needs and classroom instructional
programs. These same computers can be used to facilitate onerous test
scoring, recording, grading and management tasks.

While we work to improve the quality of teacher and curriculum embedded
tests, we must also strive to improve the usefulness of more formal
measures. CSE's study suggests three general but highly important
qualities that more formal measures should have, qualities which are
inherent in the teacher-developed and curriculum-embedded tests that
teachers use most frequently: a close match to curriculum, immediate
availability and accessibility, and feelings of ownership. That is,
formal measures must reflect what is being taught in class, and they
must be sensitive to teachers' intentions and emphases as teachers
themselves perceive them. Moreover, teachers must be able to administer
these measures to students when they feel it appropriate, and the
results must be both understandable and available promptly. Finally, the
content, format and timing of the measures must be under the control and
discretion of individual teachers and teachers must feel their needs and
input have been influential. Many commercial, state, district, and
school testing programs do not reflect these characteristics, and the
results are predictable: elaborate systems that are of little use to
teachers and that teachers little use. Counter-examples, however, also
can be identified; and where these occur we have found that teachers
routinely use more formal measures, representing more sophisticated
technology and higher technical quality, rather than their own tests.

In summary, our research suggests several complementary avenues for
improving the quality and use of tests in schools. First, given the time
devoted to teacher-developed tests, it seems well worth considering
teachers' preparation for the role of achievement assessor and their
competence in that role. Similarly, given the time and importance
accorded curriculum embedded tests, we would do well to examine and
better assure the quality of those tests. Finally, we need to
investigate ways to provide teachers with tests which they can use
routinely, which reflect sound test procedures, and which meet their
needs.

REFERENCES
Airasian, P.W. The Effects of Standardized Testing and Test Information
   on Teachers' Perceptions and Practices. Paper presented at the annual
   meeting of the American Educational Research Association, San
   Francisco, 1979.
Baker, E.L. Is Something Better than Nothing? Metaphysical Test Design.
   Paper presented at the 1978 CSE Measurement and Methodology
   Conference, Los Angeles, 1978.
Boyd, J., Jacobsen, K., McKenna, B.H., Stake, R.E., & Yashinsky, J. A
   Study of Testing Practices in the Royal Oak (Michigan) Public
   Schools. Royal Oak: Royal Oak Michigan School District, 1975.
Center for the Study of Evaluation. CSE Criterion-Referenced Test
   Handbook. Los Angeles, CA: Center for the Study of Evaluation, 1979.
Ebel, R.L. Improving the Competence of Teachers in Educational
   Measurement. In J. Flynn and H. Garber (Eds.), Assessing Behavior:
   Readings in Educational and Psychological Measurement. Reading, MA:
   Addison-Wesley, 1967.
Goslin, D.A., Epstein, R., & Hallock, B.A. The Use of Standardized
   Tests in Elementary Schools. Second Technical Report. New York:
   Russell Sage Foundation, 1965.
Huron Institute. Summary of the Spring Conference of the National
   Consortium on Testing. Cambridge, MA: Huron Institute, 1978.
Peronne, V. Remarks to the National Conference on Achievement Testing
   and Basic Skills. Paper presented at the National Conference on
   Achievement Testing and Basic Skills, Washington, D.C., March 1978.
Resnick, L.B. Introduction: Research to Inform a Debate. Phi Delta
   Kappan, 1981, 62(9), 623-624.
Rudman, H.C., Kelly, J.L., Wanous, D.S., Mehrens, W.A., Clark, C.M., &
   Porter, A.C. Integrating Assessment with Instruction: A Review
   (1922-1980). East Lansing, MI: Institute for Research on Teaching,
   1980.
Salmon-Cox, L. Teachers and Tests: What's Really Happening? Phi Delta
   Kappan, 1981, 62(9), 631-634.
Stetz, F.J., & Beck, M. Teachers' Opinions of Standardized Test Use and
   Usefulness. Paper presented at the annual meeting of the American
   Educational Research Association, San Francisco, 1979.
Tyler, R. What's Wrong with Standardized Testing. Today's Education,
   1977, 66(2), 35-58.
Woellner, R.S. Let's Use Test for Teaching: Standardized Test Results
   Can Provide the Basis for a Program of Instruction. Teacher, 1979,
   90(2), 62-63, 179-181.
Yeh, J.P., Herman, J.L., & Rudner, L.M. Teachers and Testing: A Survey
   of Test Use. Report No. 166. Los Angeles: Center for the Study of
   Evaluation, 1981.