Case Studies of Evaluation Utilization in Gifted Education
CAROL TOMLINSON, LORI BLAND,
TONYA MOON, and CAROLYN CALLAHAN
Carol Tomlinson, Lori Bland, Tonya Moon, and Carolyn Callahan • Curry School of Education, Department of Educational Studies, University of Virginia, 405 Emmet St., Charlottesville, VA 22903-2495.

The findings do not reflect the positions or policies of the Office of Educational Research and Improvement or the United States Department of Education.
INTRODUCTION
Numerous reasons exist for evaluations, among them: improving effectiveness of programs
and program personnel, reducing uncertainties, assisting with decision-making and goal-setting, seeking justification for decisions, meeting legal requirements, fostering public
relations, enhancing the professional stature of the evaluator or program administrator,
boosting staff morale, mustering program support, and changing policy, law or procedure
(Alkin, 1980; Bissell, 1979; Mathis, 1980; Ostrander, Goldstein, & Hull, 1978; Raizen &
Rossi, 1981). Nonetheless, the literature of education is replete with examples of evaluation
findings which never resulted in program enhancement, improvement, or development.
Disregard for findings of educational evaluation is costly in effort, money, and human
terms when potential program improvements are stillborn (Datta, 1979; King, Thompson,
& Pechman, 1981).
Because of a general lack of public understanding of and support for programs for
the gifted, and keen competition for scarce resources, the survival of programs for gifted
learners may depend on carefully planned evaluations which yield useful information that educational decision makers can translate into documentation of effectiveness and into action to improve programs (Dettmer, 1985; Renzulli, 1984).
Smith (1981) calls for developing more information about evaluation practice through
use of both conceptual reviews and empirical study. A review of literature (Tomlinson, Bland, & Moon, 1993), covering both general evaluation utilization and evaluation utilization in the field of gifted education, provided a conceptual base for the
study reported in this article. The review delineates factors affecting evaluation utilization, including internal and external factors, message source, message content, and message receiver. These factors served as an organizer for the investigation reported here, which uses empirical methods to compare evaluation designs and practices used in programs for the gifted in a variety of school districts. The result is a set of profiles of evaluation utilization which have been called for (Smith, 1981) but which have been scant in the literature of evaluation in general (for examples, see Dawson & D'Amico, 1985; Mutschler, 1984) and absent in the literature of evaluation of programs for the gifted in particular. The research reported here is unique in its use of multiple cases in an empirical study of evaluation utilization.
A study by Hunsaker and Callahan (1993) provided a conceptual framework for the
study reported here. They described the current state of practice in evaluating programs
for the gifted, and factors associated with “strong” and “weak” evaluation designs and
practices in gifted programs. Hunsaker and Callahan described as weaker those reports
which:
1) were disseminated solely to district administrators as opposed to broader stakeholder audiences;
2) failed to include recommendations for action; and
3) lacked apparent mechanisms for translating findings into action.
Reports categorized as stronger were disseminated more broadly, included
recommendations for action, and outlined mechanisms for translating findings into
positive program change.
Based on the Hunsaker and Callahan findings and categories, the current study sought
to determine the degree to which representative evaluation reports from districts using
stronger and weaker practices generally adhered to utility standards outlined by the Joint
Committee on Standards for Educational Evaluation (1981), and the degree to which
reports from stronger and weaker districts were utilized for positive program change.
METHODS
Background and Selection of Sites for Study
Hunsaker and Callahan (1993) collected several hundred evaluation reports on
programs for the gifted from educational databases, an appeal through professional
journals, and direct mail requests to state-level gifted coordinators and over 5,000 school
districts. While many of the reports received consisted only of program descriptions and/or evaluation instruments, 70 reports also contained evaluation plans and evaluation results. These 70 reports served as an initial pool of cases from which researchers
in the current study selected sites for investigation in regard to evaluation utilization.
In a first sort of reports from the initial pool of seventy, Hunsaker and Callahan divided reports according to those giving no recommendations for program change, those giving recommendations, and those going beyond recommendations toward implementation by forming implementation committees, developing policies to support implementation, and implementing suggested changes. Those giving no recommendations
were considered examples of “weak” practice, while those going beyond recommendations
toward implementation were considered examples of “strong” practice.
Within the two categories of “weak” and “strong” practice, Hunsaker and Callahan
conducted a second sort according to range of evaluation audiences of reports, applying
the researchers’ belief that dissemination to a broader range of audiences is more useful
than dissemination to a narrower set of stakeholders. Reports highlighted by this sort were
then arranged in chronological order, with the six most recently conducted evaluations
in the “strong” and “weak” categories given preference for study based on the pragmatic
conclusion that the more recent the evaluation, the more valuable it would be in conducting
a case study because of the likelihood that key personnel involved in the evaluation process
would still be available, and that their recollection of events would be more complete.
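The two-pass sort lends itself to a compact illustration. The Python sketch below is ours, not the researchers'; the Report fields and the select_sites function are assumptions standing in for whatever records the original screening actually used.

    from dataclasses import dataclass

    @dataclass
    class Report:
        district: str
        year: int          # year the evaluation was conducted
        recommends: bool   # report includes recommendations for action
        implements: bool   # report goes beyond recommendations toward implementation
        audiences: int     # breadth of dissemination (number of audience groups reached)

    def select_sites(reports, per_group=6):
        # First sort: "weak" practice = no recommendations;
        # "strong" practice = beyond recommendations toward implementation.
        weak = [r for r in reports if not r.recommends]
        strong = [r for r in reports if r.implements]
        # Second sort: prefer broader dissemination, then the most recent evaluations.
        key = lambda r: (r.audiences, r.year)
        strong.sort(key=key, reverse=True)
        weak.sort(key=key, reverse=True)
        return strong[:per_group], weak[:per_group]

Applied to the pool of 70 reports, a routine like this would return the six most recent, most broadly disseminated exemplars in each category, mirroring the 12 sites selected.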
Researchers used the 12 exemplar districts as sites for the current study. Six cases served
as examples of “strong” practice, and six as cases of “weak” practice. The 12 represented
great diversity in geography (mid-Atlantic, Northeast, Midwest, West Coast), size (from a
district with only three schools to a district with 179 schools), and program design (including
differentiation in the regular classroom, pullout programs, schools within schools, separate
classes, schoolwide enrichment models, or combinations of delivery systems).
Three university researchers each interviewed persons from four school districts. One
researcher worked with the four “strongest” districts and one with persons from the four
“weakest” districts. A third researcher was blind to the strong/weak labeling throughout
the interview process, in order to serve as a check on the method used to rate districts.
This researcher interviewed two districts from the “strong” category and two from the
“weak” category. Two of her districts were “weakest of the strong” and two were “strongest
of the weak,” creating in essence a “middle” category.
Definition
For purposes of this study, evaluation utility was defined as use of formative and/or summative evaluation information to affect a program for gifted learners in at least one of three ways: altering ways in which program participants, evaluation audiences, and/or decision-makers thought about the program; changing the decision-making process and/or decisions made by stakeholders in the program; or invoking some action regarding implementation of the program.
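Read operationally, the definition treats utility as present when any one (or more) of the three effects occurred. A minimal Python encoding of that logic, our paraphrase rather than an instrument the authors used, might be:

    from enum import Flag, auto

    class Utility(Flag):
        NONE = 0
        THINKING = auto()    # altered how participants, audiences, or decision-makers thought
        DECISIONS = auto()   # changed the decision-making process or the decisions made
        ACTION = auto()      # invoked action regarding program implementation

    def exhibits_utility(observed: Utility) -> bool:
        return observed != Utility.NONE  # at least one of the three ways suffices

    # e.g., exhibits_utility(Utility.THINKING | Utility.ACTION) -> True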
Data Collection and Analysis
Initial contact for this study was made by sending letters to school superintendents
and contact persons in the 12 selected school districts, asking for cooperation in the study.
Phone calls were then made to district contact persons to determine key participants in
the evaluation process (e.g., program evaluators, coordinators of gifted programs, teachers
in the gifted program, general classroom teachers) and to arrange for initial interviews.
Additional respondents were also identified from evaluation reports or by initial
interviewees as the study progressed. In two school districts, only one participant was
available. In each of the others, between two and seven interviewees participated.
Telephone interviews were conducted in two phases by three university researchers
with training and experience in qualitative research and program evaluation. Initially,
interviewers used a four-question interview protocol:
1) Tell me about the process your district used to evaluate the gifted program.
2) What were the outcomes of the evaluation?
3) How did the evaluation process affect district personnel's thinking about, or planning for, programs for the gifted?
4) How was information gathered through the evaluation process used?
Researchers asked followup questions to extend and clarify responses to the initial
questions. A second round of interviews followed with questions derived from the utility
standards in the Standards for Evaluations of Educational Programs, Projects, and
Materials (Joint Committee on Standards for Educational Evaluation, 1981) with followup
questions used to clarify informants’ answers (see Figure 1).
Audience Identification: When you planned your gifted program evaluation, did you involve particular groups in the planning process so they would be more aware of the program and its evaluation? If so, who, how? When you planned your evaluation, did you talk about who might need or use the results? Can you give me some examples of such groups and how you planned the evaluations to ensure that findings would be useful to these groups?

Evaluator Credibility: What thoughts did you have about the qualifications or requirements of people who might plan or conduct or report findings of your gifted program evaluations?

Information Scope and Selection: What plans did you make for determining questions you asked, who you asked, and how much data you collected in evaluation of your program for the gifted?

Valuational Interpretation: How did you decide the ways in which you interpreted information collected? How did you share these methods of interpretation with others in your division or community?

Report Clarity: How did you report out your findings? (If there is a formal written or oral report) What did you include in the report?

Report Dissemination: How did you decide who should be told about findings of the gifted program evaluation? Who was told?

Report Timeliness: What was the turnaround time between conducting the gifted program evaluation and sharing findings with people who received them?

Evaluation Impact: Can you describe ways in which you interacted with different groups in your district or community to encourage that action be taken as a result of your findings?

Other: What else do you feel we should know about processes, procedures, or issues which arise in your district as you evaluate programs for the gifted and use findings from these evaluations?

Figure 1. Round Two Interview Protocol

As interviews were conducted, summaries were sent to informants for verification or modification as necessary. Following all interviews and member checks, content analysis of interviews was conducted, with an informant’s complete interview serving as a coding
unit, and using pre-ordinate and emergent categories. Pre-ordinate categories included factors suggested by the literature as impacting use of evaluation findings and factors suggested to be important in the related study referenced earlier (Hunsaker & Callahan, 1993). Emergent categories were those which were repeated within and among the interviews (e.g., informal evaluation, committee involvement, changes recommended, changes made). Information was aggregated first for the three interview categories separately (strong, blind, and weak). This resulted in separate profiles of, and factors distinguishing, “strong” and “weak” districts related to their use of evaluation findings.

A district’s evaluation documents were reviewed prior to interviews in order to develop a basis for followup questions, and following interviews for triangulation of information with the interviews. Additional triangulation was obtained by interviewing several people in most school districts and by interviewing several districts in both the “strong” and “weak” categories.
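To make the coding procedure concrete, here is a minimal Python sketch, assuming simple keyword matching; the category lists below paraphrase examples from the text and the literature review, and crude string counting stands in for the researchers' qualitative judgment.

    from collections import Counter

    PRE_ORDINATE = ["message source", "message content", "message receiver"]
    EMERGENT = ["informal evaluation", "committee involvement",
                "changes recommended", "changes made"]

    def code_interview(transcript: str) -> Counter:
        # One complete interview is the coding unit.
        text = transcript.lower()
        return Counter({cat: text.count(cat)
                        for cat in PRE_ORDINATE + EMERGENT if cat in text})

    def aggregate(group_transcripts) -> Counter:
        # Aggregate separately for the strong, blind, and weak groups.
        total = Counter()
        for t in group_transcripts:
            total += code_interview(t)
        return total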
RESULTS
Commonalities among the Groups
There are three points which should be made regarding commonalities among the
three groups.
1. It is important to note that the “blind group” did, indeed, serve as a check and
verification that the sorting process described earlier accurately delineated
districts with weaker evaluation reports which differed in marked ways from
districts with stronger evaluation reports. That is, the case which was “strongest
of the weak” produced a profile much more like that of the weaker group than
of the stronger, while the case which was “weakest of the strong” appeared more
like the stronger group than the weaker. (Typical profiles will be presented later.)
Perhaps coincidentally and perhaps not, the two districts nearest the middle of the 12 “exchanged positions” during the course of the study. This phenomenon
will be discussed later.
2. It is important to note areas of kinship shared by all 12 districts studied. All 12 showed an interest in evaluation of gifted programs, as indicated by their submission of evaluation reports and their willingness to participate in the interview process. A related commonality substantiates this conclusion: all 12 districts had some sort of plan to evaluate programs for the gifted.
Thus while the reports and procedures are discussed in terms of “weak” and
“strong,” it is likely that even the “weak” districts are ahead of the game in the
evaluation of gifted programs when compared with many districts which have
no systematic intent to evaluate or plan for doing so.
3. A third commonality was unexpected and important to note. Our
assumption was that districts using weaker evaluation practices would exhibit
little, if any, use of evaluation information. In fact, however, all 12 districts used
the information gathered through evaluation to bring about some level of change
in programming. It cannot, therefore, be concluded that evaluation utility was
absent in the weaker districts and present in the stronger ones. What the study
revealed was a continuum of evaluation processes and procedures, yielding a
continuum of utility results (Tomlinson, Bland, & Callahan, 1992), and distinct
profiles of stronger and weaker districts related to Joint Committee utility
standards.
Profiles of Weaker and Stronger Districts
Factors along which continua of evaluation practice developed among the districts
studied were:
1) purposes for evaluation (with stronger districts exhibiting more policy-driven
evaluations);
2) methods of evaluation and data analysis (with stronger districts emphasizing both
process and outcome data, a broader range of outcome documentation, both
qualitative and quantitative data analysis, and more sophisticated data analysis);
3) implementation plans (with stronger districts using more specific, multifaceted
and institutionalized implementation processes and procedures to ensure use of
results);
4) evaluation reports (with stronger districts using a more formal reporting format
which was written in varied forms based upon multiple needs of multiple
audiences);
5) participants in the evaluation process (with stronger districts using a greater
number of human data sources, a greater variety of representatives from the data
sources, and a greater variety of roles for participants in the evaluation process);
6) qualifications of program personnel (with stronger districts more likely to involve
a staff member or volunteer trained or experienced in gifted education or
evaluation, and more likely to have cooperative working relationships between
experts in the two fields); and
7) nature of the change resulting from the evaluation process (with stronger districts tending to have results more focused on specific program elements rather than more general or global in nature) (see Figure 2).
Profiles of weaker and stronger districts presented here were amalgamated by
abstracting characteristics from the continua in order to construct profiles of typical
districts. Doing so enables comparison of the impact of the evaluation process in weaker
and stronger settings. Quotations used in the amalgamated profiles are taken directly from
interviews with informants in the weaker and stronger districts.
Figure 2. Factors Delineating Profiles of Weaker Evaluation Practices and Stronger Evaluation Practices

Evaluation Purpose
District A (weak): new gifted coordinator needs to know how the program is functioning.
District B (strong): district policy to evaluate all programs.

Method of Evaluation
District A (weak): Likert-scale questionnaire directed at parents, measuring satisfaction with the program.
District B (strong): Likert-scale and open-ended questionnaire directed at parents, students, teachers, and administrators, measuring satisfaction with the program; achievement test data from students; focus group interviews with parents, students, and teachers; analysis of program documents and curriculum.

Data Analysis
District A (weak): tally of responses.
District B (strong): descriptive statistics; content analysis; inferential statistics; document analysis.

Implementation Plan
District A (weak): none exists.
District B (strong): recommendations provided; goal setting based upon recommendations; board action; policy development; development of a plan; procedures and resources for implementation; timeline for tasks; directions for further study based upon recommendations.

Evaluation Report
District A (weak): program description, questionnaire, and tally of responses.
District B (strong): executive summary for teachers; formal report for administrators and school board; dissemination via newsletter to parents and other program stakeholders.

Stakeholder Participation
District A (weak): data sources (see methods section); committee consists of gifted coordinator, teachers of the gifted, and building principals; roles: data source, survey design.
District B (strong): greater number of human data sources (see method and analysis section); representatives from data sources participate as members of the evaluation committee (parents, program teachers, regular classroom teachers, parents of students not enrolled in the program, school board members, administrators, program coordinators); roles for participants: data source, evaluation committee team member (including evaluation design, data collection, and dissemination), implementation team member (including planning, implementation, evaluation).

Qualifications of Personnel
District A (weak): program coordinator has training or experience in gifted education.
District B (strong): staff member trained and experienced in gifted education; staff member trained and experienced in evaluation; cooperative relationship between the two fields.

Nature of the Change
District A (weak): evaluation goals not stated, therefore changes not tied to goals (“random”); nature of services was changed to better meet the needs of students; additional program resources secured; staff development provided to clarify misconceptions; schedules and other program elements were changed to assist general instruction in the school; information on the program provided to parents.
District B (strong): goals are focused on specific program elements (“systematic”); results tied to evaluation goals; nature of services was changed to better meet the needs of identified students; additional staff provided to assist with meeting the needs of the students and the parameters of the new program initiatives; additional resources provided by the school board to enact those changes; staff development implemented to prepare for the change in services provided.

Profile of a “Weaker” Evaluation Process

The coordinator of programs for the gifted in the school district may be new in her job, and the current program for gifted students may be new as well. She wants to know “whether the program works,” and in addition, she has a sense that she is accountable for what is happening in the program. This will require some sort of documentation, probably an evaluation. A procedure for evaluating will evolve, but not a strong policy to direct evaluation. “Lack of support and funding (for evaluation) are real problems.”
There seem to be two approaches to deciding what to do next: either “doing what they did last year,” or “winging it.” Feeling that it would be better for several individuals to be involved in the process, the coordinator “forms a committee.” “Committee members include representatives of teachers of the gifted, coordinators, principals,” and perhaps parents or school board members. After several meetings with committee members, questionnaires are developed “to address concerns.” Most are Likert-type surveys “with a few open-ended questions.” It is perceived to be advantageous if the form is short and the questions few. “Questionnaires are distributed to cooperating teachers, students, and parents.”

The coordinator herself distributes the surveys, collects them, and analyzes results by “tabulating frequencies and percentages, and noting every comment that was made.” Within a month or two of administering the survey, the coordinator shares “results with committee members for discussion about recommendations on program improvement or development.” “The information is then shared with the superintendent who, in turn, informs the school board of the additional opportunities for students” which could come about as a result of the evaluation.
Evaluation findings in a weaker district tend to be directed less by specific evaluation goals than by a more general evaluation focus. That is, the district generates a set of questions designed to “see what people are thinking,” and often uses the same set of questions year after year. Nonetheless, findings result in program change. “We saw that regular classroom teachers were unclear about goals and contents of our special classes for gifted learners, so we provided additional information for teachers and we made sure students in our resource rooms knew how to tell their regular classroom teachers what kinds of learning took place in there.” “Afterward, it appeared there was greater clarity among regular classroom teachers about the special classes.” “We found out from our surveys that our resource room schedule resulted in identified students reentering their regular classrooms at a point in the regular classroom schedule that interrupted instruction for their classmates and teachers and generally made things awkward for everybody.” This finding resulted in a shifting of the resource room schedule, “so that it more closely matched the regular classroom schedule.” “We could see that identified students continued receiving special services even in the absence of indicators that the students were achieving in the program.” As a result, “students now have to show a pattern of achievement to stay within the program.”
Profile of a “Stronger” Evaluation Process
In this district, the coordinator of gifted programs has filled her current position for
some time. She is aware of the political mandate for evaluation which exists in her district
for programs for the gifted as it does “for all other programs with a curriculum.” There is
a policy which both requires and supports evaluation. She also understands the power of
evaluation to improve the program and “to build awareness of and support for what we
are doing.” “We work hard to look at ourselves honestly,” she says. “We realize when we
need to change, and that is healthy.” She is also aware of the role of evaluation in the political process: “Politically, evaluation findings allow support to be built for programs.”
Here, evaluation is an ongoing and multifaceted process. “There is formative evaluation of everything specialists do in the classroom with general teachers.” “The teachers tell us what is working and what we can modify. In the process, they also come to understand our goals better too.” And there are feedback sheets on “how teachers feel about administration of the testing program we are in charge of” “to assist us with the management of testing.” “We are very diligent in following through with findings.” “There is at least one kind of survey every semester: periodic surveys of building principals, students and teachers in that school.” There are “standard, self-monitoring devices in place in schools” and staff there with enduring responsibility for interpreting findings to building personnel as they relate to that school.
There is a team of district professionals who can collaborate on evaluation procedures: at times members of the gifted education staff who also have strong credentials in evaluation, at times a partnership between a district evaluation department and members of the gifted education staff. In smaller districts, the expert in evaluation might
members of the gifted education staff. In smaller districts, the expert in evaluation might
be a teacher or a building level administrator with an advanced degree which led to training
and/or experience in evaluation. Thus while one person assumes responsibility for the
evaluation process as it relates to gifted education, it is a leadership responsibility, and
not sole responsibility. There is a steering committee for gifted programs which plays a
key role in evaluation, but there are other groups and committees which are engaged in
the process as well. “We don’t want to rely just on one source.”
There is also a clear awareness of the varied stakeholders in the district. Stakeholders are a part of evaluation planning, execution, and followup. Stakeholder committees assist in determining specific program areas to be studied and propose questions whose answers could be valuable in providing program support. “We want them to have all the information they need...” “...to understand what we are about...” “...to keep them apprised of findings so there are no surprises in the end...” “...so they will buy into the evaluation...” and “...support program changes which follow.” When findings are generated, they are brought back to stakeholder committees “first orally, and then in preliminary reports...” “...to give the stakeholders a chance to see whether the findings made sense and to determine if the recommendations are feasible.”
In addition to process evaluation, the district employs outcome indicators. “The
school board pays some attention to achievement data. ” “Recently, we conducted a panel
study comparing test data for all students. In our self-contained program, all scores went
up, which is amazing given the likelihood of regression to the mean. There was also strong
evidence that these programs were benefiting minority achievement.” “We have begun
using portfolios as a means of assessing the impact of the critical and creative thinking
components in our program.”
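The informant's point about regression to the mean can be made precise with a standard identity; this note is our addition, not part of the district's report. Assuming scores on two test administrations share a common mean \mu and variance, with test-retest correlation \rho, the expected retest score given a first score x is

    E[X_2 | X_1 = x] = \mu + \rho (x - \mu),    0 <= \rho < 1,

so a group selected for high initial scores (x > \mu) is expected, on average, to drift back toward \mu on retest. Across-the-board gains in such a group therefore run against the statistical pull the informant describes.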
From time to time, external evaluations of the program are conducted. “There is a
built-in suspicion that if the gifted/talented staff is conducting all the evaluations, they
can’t be really legitimate.” “A few years ago there was a huge external evaluation with
university support to set a future direction for our gifted programs. The process was useful
and we have built steadily on its findings.”
Data analysis is done with appropriate technical support and qualitative and/or
quantitative methods appropriate to the questions asked and evaluation formats used. A
final, formal report is released, on a preset timeline, to appropriate groups including stakeholders, school board, and staff, with report summaries frequently made available to news media and parent groups. The formal report is written in a format similar to that of a
research study, with appropriate data tables and accompanying explanations. A standard
part of the report is an implementation section, “outlining what is to be done as a result
of the evaluation findings, who has oversight responsibility for the new plans, and a timeline
for completion.” There is also a plan in place “to monitor next year how we’ve done with
our commitment.”
Program changes which result from evaluation are tied specifically to evaluation goals.
A goal to examine a perception that a high dropout rate existed among gifted high school
students led to data analysis indicating that the perception was inaccurate. As a result
of the evaluation and data analysis, “information was provided to stakeholders to address
the misperception.” Resulting from an evaluation goal to examine identification and
participation of culturally diverse students in the program, “identification procedures were
revised to facilitate proportional representation of minorities in programs for the gifted,
and strategies were implemented to improve achievement for culturally diverse students.”
An evaluation goal of assessing instructional effectiveness indicated a need to provide
support for teachers attempting to meet the needs of identified gifted learners. Thus, “a
position was created to provide assistance to teachers.” While both “strong” and “weak”
evaluations produced findings which were translated into positive program change, the
way in which findings were generated and acted upon tended to be more linear or systematic
than random. That is, a question is asked, an answer found, plans made for appropriate
changes, policy-makers lobbied, staff training provided, and so forth. In the following
evaluation, there is followup to determine the effectiveness of the change, and new
questions are generated as appropriate.
A Tale from the Blind Group
It was at least symbolic that the two districts directly in the middle of the ranking
of the 12 “changed places” as the study unfolded. The one which had been ranked as
“strongest among the weak” had clearly moved up in the world since its original materials
had been received. A new coordinator had come aboard: one who used terms like
“portfolio assessment” and “outcome-based evaluation.” She was moving away from sole
use of attitude surveys. “We need to look at performance and program benefits in
achievement instead of just whether parents, students, and teachers like the program.”
She has used the drawings of primary students to study attitude changes about science
and scientists in youngsters who have participated in a magnet program where they work
directly with scientists, compared with youngsters who have not had that opportunity.
She is working to integrate some evaluation components of services for gifted learners
into the evaluation processes of individual schools, and she talks about working with other
administrators and board members, as well as use of evaluation data showing a gap
“between predicted and actual test scores of gifted students for action at both local and
state levels.”
In the district which was initially classified as “weakest among the strong,” there was
a clear backslide. In this setting, there had once been a coordinator of gifted programs
who worked with a strong and knowledgeable planning committee on the district-mandated evaluation process. “Two people (building-level administrators) who worked
on the committee had Ph.D.s (in evaluation), and the other (a teacher) was working on
one.” “There were also consultants involved in developing the evaluation processes and
procedures.” From both oral reports and evaluation documents, the evaluation system
was judged to be effective.
At some point, staff assignments changed, and the new coordinator inherited and
elected to maintain the previous evaluation design. Talking about the evaluation plan,
she explained that she “wasn’t quite sure how decisions were made regarding questions to be asked in the evaluation process.” “The chief audience for the evaluation findings was the Gifted and Talented Planning Committee.” “Principals were also given results of the evaluation by schools and helped to analyze them.” “Principals who had preconceptions probably didn’t change as a result of the meetings, but those who were open to suggestions and wanted to listen were helped to make changes.” “Ultimately these meetings were instrumental in leading to a shift in the program model used for the district’s gifted program.” “There was no systematic followup on these meetings to see whether plans had been executed.”
At this point, the “new” coordinator has moved on. A new program has been put
in place “based on evaluation findings.” The “school board has adopted the new program,
but not funded it.” “There is no evaluation procedure in place for the new program, “
“ . . .and there is no staff to work on evaluation.““ Regular classroom teachers are supposed
EVALUATION
PRACTICE, 15(2), 1994
to assume responsibility for the (new) model as well as their own assignments. It makes
their attitude toward the program negative. There is no acknowledgement of what they
are doing.” Interviewees from this district suggested that the evaluation procedure had
become institutionalized and potent during the tenure of the administrator who led in
its development. Left to supervision by someone unskilled in programming for gifted learners and evaluation, and lacking political savvy, the evaluation process continued somewhat like a driverless vehicle. It amassed information, but without leadership in developing implementation plans and interpreting the need for proposed changes to varied stakeholder groups, information was used to make decisions detrimental to participants in the program which the evaluation was designed to strengthen.
A Cross-Group Comparison
It appears that the great difference between those school districts categorized as having weaker evaluation reports and those having stronger ones lies in sharply contrasting levels of awareness of the need for evaluation and of support for the evaluation process. The intent to evaluate, and to do so to the best of one’s capacity, exists in both settings, and, in fact, there are indications of success in both groups as measured by positive program changes which arise from evaluation findings. In the stronger settings, those in charge of the evaluation process understand the value of evaluation as a field of study, and evaluation is a prescribed feature of the planning cycle for all programs. Leaders may use vocabulary like “stakeholders,” “formative evaluation,” “outcome indicators,” and “chi-square.” They
understand the peculiar pitfalls of measuring academic growth in students who top-out
on tests, and can discuss the use of portfolios, comparison of achievement and aptitude
scores, and regression to the mean. They have a level of political sophistication which
helps them to see both a need and a means for building networks of support through
evaluation processes for the programs which they administer. Further, they seek out
technical and collegial support in the evaluation process, a reality which further enhances
the range and potency of the evaluation.
By contrast, coordinators in the districts categorized as weaker sense a need to know “how things are going,” and they use the only tool at their disposal: common sense. They work alone (or perceive that they do), then join forces with others via committee, gaining a sense of partnership and feeling reinforced in their common-sense strategies. Evaluation is seen as worthwhile, but proceeds more as a reactive than a proactive process.
The study does not indicate that the number of personnel or the level of fiscal resources is the sole determiner of the difference between “weak” and “strong” districts in regard to use of evaluation findings. In fact, both the “strong” and “weak” groups contained small school districts
(with fewer administrative staff members) and larger ones (with more administrative staff
members). Both categories also contained urban (relatively poorer) and suburban
(relatively more affluent) districts. Stronger school districts tended to be those in which
a leader in the evaluation process understood the need to involve persons with evaluation
expertise and drew on that expertise through establishing partnerships with other
administrators when possible, and through placing teachers and/or community members
knowledgeable of evaluation on the steering committee when administrative partnerships
were unavailable. It was not the case that stronger districts were strong in evaluation by
virtue of fiscal superiority, but rather because a decision was made to allocate available
fiscal and personnel resources to evaluation. In these districts, evaluation was an integral
component of job descriptions of program coordinators, upper-level administrators
conveyed both expectations for and support of evaluation to program coordinators, office
support was provided for conducting evaluations and disseminating findings, and there
was an institutional expectation that evaluation would result in positive program
development. In districts using weaker evaluation practices, there was little or no apparent
institutional valuing of evaluation. Rather, evaluation was more likely perceived as a
process to be feared because it might cause problems for the programs being evaluated.
Not surprisingly, therefore, substantial support for evaluation (in financing, encouragement, time, and personnel) was lacking in the weaker districts.
Decisive Factors in Use of Findings
This study indicates two key factors which promote use of evaluation findings in the districts studied: will and skill. It appears that the will to evaluate on the part of some key personnel in a district, supplemented with systematic procedures and resources for doing so, results in generation of evaluation findings and translation of those findings into program change. This will to evaluate existed in both the weaker and stronger districts studied. The second factor, skill in evaluation and related processes, appears to be the demarcation between the two categories of districts and affects the robustness of program change stemming from evaluation findings. Utilization appeared more likely, and changes from the findings more potent and systemic, in direct relationship to the following conditions:
1. Evaluation of programs for the gifted was a part of a district-wide policy requiring routine evaluation for all program areas.
2. Systematic written plans were in place delineating steps and procedures for ensuring implementation of findings.
3. Multiple stakeholders were consistently involved in planning, monitoring, and reviewing the evaluation process and its findings.
4. Stakeholders played an active role in planning for and advocating before policy makers for program change based on evaluation findings.
5. Key program personnel were knowledgeable about gifted education, evaluation, the political processes in their districts, and the interconnectedness of the three. In instances where program administrators with expertise in both gifted education and evaluation were not available, leaders involved volunteer steering committee members with such expertise.
DISCUSSION AND RECOMMENDATIONS
The study reported is qualitative and thus does not claim broad generalizability.
Nonetheless, findings from these cases should be useful for extrapolation in informing
further research about evaluation utility and in examining evaluation utility in a variety
of settings, including but not limited to programs for gifted learners (Cronbach, Ambron,
Dornbusch, Hess, Hornik, Phillips, Walker, & Weiner, 1980).
The cases studied support key assumptions in the theoretical literature regarding
factors affecting evaluation utilization, finding that the Joint Committee Standards (1981)
were more closely adhered to in the school districts with stronger evaluation reports. The
study does not support earlier findings from our literature review (Tomlinson et al., 1993)
that stakeholders are less likely to agree with evaluation reports which they believe are
written by female evaluators and which are written by researchers as opposed to evaluators
or content specialists (Braskamp, Brown, & Newman, 1981). Evaluations in stronger and
weaker districts were conducted, written, and presented by both males and females.
The cases also indicate that where intent to evaluate programs for the gifted exists,
some form of evaluation is likely to evolve. Even when such evaluation schemes are
relatively weaker, at least in comparison to evaluation reports that closely follow utility
standards such as those developed by the Joint Committee (1981), utilization of evaluation
findings can and does occur in ways that result in positive program change, at least in
the short-term. It appears a reasonable hypothesis that, over time, program quality and
support would be more positively affected through use of systematic evaluation processes
and procedures than through use of random evaluation processes and procedures.
Additional study is necessary to make that determination.
Generally more robust evaluation designs and procedures appear to evolve when
responsible personnel have specific training in evaluation, in gifted education, and in
problems of evaluating programs for the gifted (e.g., evaluation challenges created by
programs which are long-term, individualized, complex, and poorly measured by
traditional standardized means (Tomlinson et al., 1993)), and/or when they have support
in the way of policy expectations and well-trained colleagues or volunteers. Such program
personnel thus have access to vocabulary, procedures, and a level of political sophistication
which enable them to maximize the capacity of evaluation both to chart program growth
and amass program support, including economic support for the gifted program.
Datta (1989) points to lack of economic resources as a major factor in reduced
evaluation efforts in education. The current study indicates that while such resources
facilitate evaluation of programs for the gifted, they are not a sole determiner of evaluation
effectiveness or evaluation utility in a district.
The clearest need emerging from the study is for training of program personnel in
evaluation. Further, there is a need to appropriately apply a full range of evaluation
methodology to the problems presented in assessment of student growth in programs for
the gifted. Even many of the “strong” districts showed only fledgling movement in the
direction of experimental design to demonstrate student growth (Beggs, Mouw, & Barton,
1989; Callahan, 1983; Carter, 1986; Payne & Brown, 1982), and few appear to have tapped
the range of possibilities of qualitative design for evaluating programs for the gifted
(Janesick, 1989; Lundsteen, 1987).
Certainly the weaker districts have need for personnel or volunteers with knowledge
of the value of evaluation and how to employ varied data collection modes (Gilberg, 1983;
Janesick, 1989; Rimm, 1982); how to address concerns of both internal and external
audiences by asking questions which are relevant, useful, and important and which will
thus directly facilitate positive and powerful decision-making (Callahan, 1986); how to
identify decision-makers at various levels as well as actions over which they have control
(Callahan, 1986; Dettmer, 1985; Gilberg, 1983; Renzulli, 1984; and Rimm, 1982); and how
to find out what course of action will result from data supplied as well as how to make
recommendations with an eye toward program improvement (Gilberg, 1983). To function
at a lesser state is to compromise the positive possibilities of evaluation.
ACKNOWLEDGMENTS
The work reported herein was sponsored by The National Research Center on the Gifted
and Talented under the Jacob K. Javits Gifted and Talented Students Education Act
(Grant No. R206R00001) and administered by the Office of Educational Research and
Improvement and the United States Department of Education.
REFERENCES
Alkin, M. (1980). Uses and users of evaluation. In E.L. Baker (Ed.), Evaluating federal education
programs (Report No. CSE-R-153; ERIC No. ED 205 599) (pp. 39-52). Los Angeles, CA:
University of California at Los Angeles, Center for the Study of Evaluation.
Beggs, D., Mouw, J., & Barton, J. (1989). Evaluating gifted programs: Documenting individual
and programmatic outcomes. Roeper Review, 12, 73-76.
Bissell, J. (1979). Program impact evaluations: An introduction for managers of Title VII projects. A draft guidebook (ERIC No. ED 209 301). Los Alamitos, CA: Southwest Regional Laboratory for Educational Research and Development.
Braskamp, L., Brown, R., & Newman, D. (1981). Studying evaluation utilization through
simulations. Evaluation Review, 6, 114-126.
Callahan, C. (1983). Issues in evaluating programs for the gifted. Gifted Child Quarterly, 27, 3-7.
Callahan, C. (1986). Asking the right questions: The central issue in evaluating programs for the
gifted and talented. Gifted Child Quarterly, 30, 38-42.
Carter, K. (1986). Evaluation design: Issues confronting evaluators of gifted programs. Gifted Child
Quarterly, 30, 88-92.
Cronbach, L., Ambron, S., Dornbusch, S., Hess, R., Hornik, R., Phillips, D., Walker, D., & Weiner, S. (1980). Toward reform of program evaluation. San Francisco, CA: Jossey-Bass.
Datta, L. (1979). O thou that bringest the tidings to lions: Reporting the findings of educational
evaluations. Paper presented at the annual Johns Hopkins University National Symposium
on Educational Research, Baltimore, MD.
Datta, L. (1989). Education information: Production and quality deserve increased attention,
statement before the Subcommittee on Government Information and Regulation. Committee
on Government Affairs, U.S. Senate. Washington, DC: General Accounting Office.
Dawson, J., & D’Amico, J. (1985). Involving program staff in evaluation studies: A strategy for increasing information use and enriching the data base. Evaluation Review, 9, 173-188.
Dettmer, P. (1985). Gifted program scope, structure and evaluation. Roeper Review, 7, 146-152.
Gilberg, J. (1983). Formative evaluation of gifted and talented programs. Roeper Review, 6, 43-44.
Hunsaker, S., & Callahan, C. (1993). Evaluation of gifted programs: Current practice. Journal for the Education of the Gifted, 16, 190-200.
Janesick, V. (1989). Stages of developing a qualitative evaluation plan for a regional high school of excellence in upstate New York. Paper presented at the American Evaluation Association, San Francisco, CA.
Joint Committee on Standards for Educational Evaluation. (1981). Standards for evaluations of educational programs, projects, and materials. New York: McGraw-Hill.
King, J., Thompson, B., & Pechman, E. (1981). Improving evaluation use in local school settings (NIE-G-80-0082; ERIC No. ED 214 998). Washington, DC: National Institute of Education.
Lundsteen, S. (1987). Qualitative assessment of gifted education. Gifted Child Quarterly, 31, 25-29.
Mathis, W. (1980, April). Evaluating: The policy implications. Paper presented at the American Educational Research Association, Boston, MA.
Mutschler, E. (1984). Evaluating practice: A study of research utilization by practitioners. Social Work, 29, 332-337.
Ostrander, S., Goldstein, P., & Hull, D. (1978). Toward overcoming problems in evaluation research:
A beginning perspective on power. Evaluation and Program Planning, I, 187-193.
Payne, D., & Brown, C. (1982). The use and abuse of control groups in program evaluation. Roeper
Review, 5, 11-14.
Raizen, S., & Rossi, P. (Eds.). (1981). Program evaluation in education: When? How? To what
ends?(Report No. ISBN-O-309-03143-5; ERIC No. ED 205 614). Washington, DC: National
Academy Press.
Renzulli, J. (1984). Evaluating programs for the gifted: Four questions about the larger issues. Gifted
Education International, 2, 83-87.
Rimm, S. (1982). Evaluation of gifted programs: As easy as ABC. Roeper Review, 5, 8-11.
Smith, N. (1981). Evaluation studies: Evaluating evaluation methods. Studies in Educational Evaluation, 7, 173-181.
Tomlinson, C., Bland, L., & Callahan, C. (1992, November). Use of evaluation findings in programs for the gifted. Paper presented at the meeting of the National Association for Gifted Children, Los Angeles, CA.
Tomlinson, C., Bland, L., & Moon, T. (1993). Evaluation utilization: A review of the literature with implications for gifted education. Journal for the Education of the Gifted, 16, 171-189.