Assessment and Evaluation

ASSESSMENT is the systematic collection of data to monitor the success of a program or course in
achieving intended learning outcomes (ILOs) for students. Assessment is used to determine:
What students have learned (outcome)
The way they learned the material (process)
Their approach to learning before, during, or after the program or course
You can assess students before instruction to get a baseline of what students know (for example, by
administering a pretest). During instruction, assessment can be used to determine what students are
learning so you can adjust your teaching if needed. Quizzes and mud cards (cards on which students
identify the muddiest point that remains for them after class) are two methods of this kind of
formative assessment. After instruction, you can use assessment for two purposes: (1) to determine
if there has been a change in knowledge (final exams can be used for summative assessment); and
(2) to provide you with information to revise the class or program.

EVALUATION is a judgment by the instructor or educational researcher about whether the program or
instruction has met its intended learning outcomes (ILOs).

Types of Assessment and Evaluation

Assessment and evaluation studies may take place at the subject, department, or institutional level,
and range in size and scope from a pilot study to a complex project that addresses a number of
different topics, involves hundreds of students, and includes a variety of methodologies. Typically,
assessment efforts are divided into two types: formative and summative. Below, each is described briefly
along with a third less frequently seen type called process assessment. Included, as well, is a grid that
classifies different assessment methodologies.

Formative Assessment

Formative assessment implies that the results will be used in the formation and revision of an
educational effort. Formative assessments are used to improve educational programs. This
type of assessment is the most common form of assessment in higher education, and it constitutes a
large proportion of TLL's assessment work. Since educators are continuously looking for ways to
strengthen their educational efforts, this type of constructive feedback is valuable.
Summative Assessment

Summative assessment is used for the purpose of documenting outcomes and judging value. It is
used for providing feedback to instructors about the quality of a subject or program, reporting to
stakeholders and granting agencies, producing reports for accreditation, and marketing the attributes
of a subject or program. In practice, studies of this type are rarely exclusively summative; they
usually contain some aspects of formative assessment as well.
Process Assessment

Process assessment begins with the identification of project milestones to be reached, activities to be
undertaken, products to be delivered, and/or projected costs likely to be incurred in the course of
attaining a project's final goals. The process assessment determines whether the project has stayed on
schedule, produced its deliverables, and met its cost estimates. The degree of difference from the expected
process is used to evaluate success.
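
To make this concrete, here is a minimal sketch in Python of such a planned-versus-actual comparison; every milestone, schedule, and cost figure below is invented for illustration.

    # Hypothetical process-assessment check: compare planned and actual
    # figures for each milestone and report the deviation. All numbers
    # here are invented placeholders.
    planned = {"prototype review": {"week": 6,  "cost": 12000},
               "pilot study":      {"week": 10, "cost": 20000},
               "final report":     {"week": 14, "cost": 5000}}
    actual  = {"prototype review": {"week": 7,  "cost": 13500},
               "pilot study":      {"week": 10, "cost": 21000},
               "final report":     {"week": 15, "cost": 5000}}

    for milestone, plan in planned.items():
        done = actual[milestone]
        print(f"{milestone}: schedule {done['week'] - plan['week']:+d} wk, "
              f"cost {done['cost'] - plan['cost']:+d}")
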
Methods of Measuring Learning Outcomes Grid

How colleges and universities can measure and report on the knowledge and abilities their students
have acquired during their college years is an issue of growing interest. The Methods of Measuring
Learning Outcomes Grid provides a way to categorize the range of methodologies that can be used to
assess the value added by a college education.

The Assessment and Evaluation Process

A sequence of activities should be followed in order to implement an assessment.
Below you will find a description of each step in the process and useful guidelines for approaching
them.

Identify Intended Learning Outcomes
Identify Research Questions
Develop Research Design
Select Sampling Frame
Select Appropriate Data Collection Methods
Construct Measurement Instruments
Select Appropriate Data Analysis Techniques
Consider Communication and Dissemination of Findings

Identify Intended Learning Outcomes

The assessment and evaluation process begins by helping instructors or program managers identify
the intended learning outcomes (cognitive, attitudinal, and behavioral) of a class or a project. Those
outcomes are then refined and operationally defined to make them measurable.

Intended learning outcomes lay the groundwork for a meaningful assessment. For example, an
intended learning outcome of "students will demonstrate team leadership skills" should lead to an
assessment where these skills are observed or documented in some manner. Similarly, an intended
learning outcome of "students will be able to develop Matlab code to help solve authentic engineering
problems" should lead to an assessment that includes the elements to measure that skill.

Intended Learning Outcomes

What Are Intended Learning Outcomes?

It may be best to start with what intended learning outcomes aren't. They aren't simply a list of the
topics to be covered in the course. Certainly, there will be a body of knowledge that students should
know and understand by the time the course is complete. But if the goals for what students should
achieve stop there, there may be many missed opportunities for providing them with a more
productive learning experience.
An intended learning outcome should describe what students should know or be able to do at the end
of the course that they couldn't do before. Intended learning outcomes should be about student
performance. Good intended learning outcomes shouldn't be too abstract ("the students will
understand what good literature is"), too narrow ("the students will know what a ground is"), or
restricted to lower-level cognitive skills ("the students will be able to name the countries in Africa").

Each individual intended learning outcome should support the overarching goal of the course, that is,
the thread that unites all the topics that will be covered and all the skills students should have mastered
by the end of the semester. Best practice dictates that intended learning outcomes be kept to no more
than half a dozen.
Writing Intended Learning Outcomes

Experts often talk about using the acronym SKA to frame learning objectives. SKA stands for:

Skills: What students should be able to do by the time the course is completed.

Knowledge: What students should know and understand by the time the course is completed.

Attitudes: What the students' opinions will be about the subject matter of the course by the time it is completed.

It is best to identify the skills, knowledge, and attitudes the students should gain throughout the course
by writing sentences that begin:

By the time the students finish the course, they should be able to . . .
and then supplying a strong action verb. Examples of verbs that define student performance in a
particular area include:

explain
list
describe
demonstrate
calculate
report
compare
analyze
Some instructors use well-defined learning taxonomies to create intended learning outcomes for their
course. Learning taxonomies, the most well-known of which is Bloom's Taxonomy of Objectives for
the Cognitive Domain (1956), categorize cognitive tasks, usually in increasingly sophisticated order.

Two Examples of Taxonomies of Educational Outcomes

Bloom's Taxonomy of Intended Learning Outcomes

Knowledge
is defined as the remembering of previously learned material
represents the lowest level of learning
involves recalling or reciting: facts, observations, or definitions

Comprehension
is defined as the ability to grasp the meaning of material
represents the lowest level of understanding
involves explaining, interpreting, or translating

Application
refers to the ability to use learned material in new and concrete situations
requires a higher level of understanding than comprehension
involves applying: rules, methods, laws, principles

Analysis
refers to the ability to break down material into its component parts so that its
organizational structure may be understood
represents a higher level than previous categories because it requires
understanding of both the content and structural form of the material
involves analyzing relationships, distinguishing between facts and inferences,
evaluating data relevance

Synthesis
refers to the ability to put parts together to form a new whole
represents creative behaviors, with emphasis on the formulation of new patterns
or structures
involves proposing plans, writing speeches, creating classification schema

Evaluation
is concerned with the ability to judge the value of material for a given purpose
represents the highest level because it includes elements of all other categories
plus conscious value judgments based on criteria
involves judging logical consistency and adequacy of data support for conclusions

Feisel-Schmitz Technical Taxonomy of Intended Learning Outcomes

Judge: To be able to critically evaluate multiple solutions and select an optimum solution

Solve: Characterize, analyze, and synthesize to model a system (provide appropriate assumptions)

Explain: Be able to state the outcome/concept in their own words

Compute: Follow rules and procedures (substitute quantities correctly into equations and arrive at a correct result, "plug and chug")

Define: State the definition of the concept or be able to describe it in a qualitative or quantitative manner

Made available by:
Diane Soderholm, Ph.D., Instructional Designer
MIT Department of Aeronautics & Astronautics
2005

Identify Research Questions

Research questions narrow the focus of your assessment or evaluation, and restate the purpose of the
assessment as specific questions to be answered. Research questions are generated for both
quantitative and qualitative research, but differ slightly in their focus.

Quantitative research questions might describe students' reactions to a classroom innovation, e.g.,
"How long do students spend on homework after implementation of a flipped classroom?"; compare
groups of students on a learning outcome, e.g., "How do students enrolled in online and in-person
classes differ with regard to time spent reading the textbook?"; or explore a relationship between two
variables, e.g., "How does an increased choice of electives relate to students' satisfaction with their
major?"

Qualitative research questions are open-ended, more general questions that are designed to explore
the participants' perspective, e.g., "How do students who travel abroad describe their experience?"

Develop Research Design

With well-defined goals, suggestions for metrics and methods are easier to make. If quantitative
methods are used, published tests and questionnaires are recommended. These instruments have the
advantage of established validity and reliability and can be discussed in terms of previous uses and
findings. It is also possible to custom-tailor and/or create measures for the needs of specific research
questions or educational goals. For example, TLL educational researchers have developed a set
of Learning Behavior Surveys designed to measure student attitudes about those aspects of the
educational environment that contribute to their learning. In many cases, qualitative measures are
appropriate. These include focus groups, interviews, classroom observations, and think-aloud
protocols.

A mixed method (quantitative and qualitative) approach is often the most useful, and both direct and
indirect approaches can be employed to assess learning. The direct approach uses quantitative
performance assessments such as:

Portfolios
Oral presentations
Exams
Problem sets
Pretest-posttest comparisons of learning
The indirect approach uses qualitative and quantitative approaches, such as:

Naturalistic descriptive observations
Focus groups
Journals
Structured and open-ended interviews
Surveys
Where possible, historical or matched comparison groups and experimental procedures are used.
However, because we are an applied laboratory and our research settings are generally real
classrooms, the preferred measures and methods are adjusted to whatever approach best answers
the questions being asked in a specific situation.

Select Sampling Frame

A sampling frame is the target population to which you hope to generalize your findings. The target
population usually shares some defining characteristic, e.g., mechanical engineering majors at MIT.
From the sampling frame or target population, you need to select your sample to be studied. This
sample should be as representative of your target population as possible.
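
As a concrete illustration, the short Python sketch below draws a proportional (stratified) random sample from a hypothetical sampling frame; the strata, population sizes, and sample size are all invented for illustration.

    import random

    # Hypothetical sampling frame: mechanical engineering majors keyed by
    # class year. Names and group sizes are invented placeholders.
    frame = {
        "sophomore": [f"soph_{i}" for i in range(120)],
        "junior": [f"jun_{i}" for i in range(110)],
        "senior": [f"sen_{i}" for i in range(95)],
    }

    def stratified_sample(frame, total_n, seed=0):
        """Draw a sample in which each stratum is represented in
        proportion to its share of the sampling frame."""
        rng = random.Random(seed)
        pop_size = sum(len(members) for members in frame.values())
        sample = []
        for members in frame.values():
            n = round(total_n * len(members) / pop_size)
            sample.extend(rng.sample(members, n))
        return sample

    # Roughly 60 students, drawn proportionally across class years.
    print(len(stratified_sample(frame, total_n=60)))
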

Select Appropriate Data Collection Methods

The choice of appropriate data collection methods should be based on the research questions, design,
sample, and the possible data sources. The technique used for data collection should gather
information that will allow the research questions to be answered, take into account the characteristics
of the sample, and provide information that is linked to each intended learning outcome. Some
common data collection methods include observations, interviews, focus groups, surveys, and the use
of secondary data such as test scores. TLL researchers will work with faculty to develop a matrix that
links the intended learning outcomes to research questions and data collection strategies to ensure
that the data collected will allow the research questions to be accurately addressed.
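
One lightweight way to represent such a matrix is sketched below in Python; the outcomes, questions, and data sources are hypothetical placeholders rather than an actual TLL instrument.

    # Hypothetical planning matrix linking each intended learning outcome
    # (ILO) to a research question and the data sources that address it.
    matrix = [
        {"ilo": "Students will demonstrate team leadership skills",
         "question": "How do students' leadership behaviors change "
                     "over the team project?",
         "data_sources": ["classroom observations", "peer evaluations"]},
        {"ilo": "Students can develop Matlab code for authentic problems",
         "question": "How well do students' programs solve the "
                     "capstone design problem?",
         "data_sources": ["problem sets", "final project rubric scores"]},
    ]

    # Sanity check: every ILO must be covered by at least one data source.
    for row in matrix:
        assert row["data_sources"], f"No data source for ILO: {row['ilo']}"
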

Select Appropriate Data Analysis Techniques

There are many well-developed methods available for conceptually or statistically analyzing the
different kinds of data that can be gathered. When analyzing qualitative data, one can develop
taxonomies or rubrics to group student comments collected by questionnaires and/or made in
classroom discussions. The frequency of certain types of comments can be described, compared
between categories, and investigated for change across time or differences between classes.
Frequency data and chi-square analysis can supplement the narrative interpretation of such
comments. For the analysis of quantitative data, a variety of statistical tests are available, ranging from
the simple (t-tests) to the more complex (such as the use of factor analysis to develop scales).
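
As a minimal illustration of both kinds of analysis, the Python sketch below runs a paired t-test on pretest-posttest scores and a chi-square test on comment-category frequencies using scipy.stats; all of the numbers are invented for illustration.

    from scipy import stats

    # Hypothetical pretest and posttest scores for the same ten students.
    pretest = [52, 61, 48, 70, 55, 63, 58, 49, 66, 60]
    posttest = [68, 72, 59, 81, 70, 71, 74, 55, 79, 73]

    # Paired t-test: did scores change significantly from pre to post?
    t_stat, p_value = stats.ttest_rel(posttest, pretest)
    print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")

    # Chi-square test on the frequency of comment categories (for example,
    # from a rubric applied to open-ended responses) in two class sections.
    observed = [[34, 12, 9],    # section A: positive, neutral, negative
                [21, 15, 18]]   # section B: positive, neutral, negative
    chi2, p, dof, expected = stats.chi2_contingency(observed)
    print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
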

Consider Communication and Dissemination of Findings

The assessment and evaluation staff can make recommendations about where to publish the results of
your assessment studies. Since our activities are conducted at the request of a faculty member or
investigator, the information gathered belongs to that person. It is the investigator's choice as to
whether to convey information to oversight authorities (internal or external to the Institute) or to
publicize findings. Assessment results are not used in any way for faculty or staff evaluation.

Assessment and evaluation staff can assist and/or collaborate with faculty members and investigators
in the preparation and presentation of written reports, conference presentations, posters, and
manuscripts for publication. The degree of involvement can range from simple suggestions to co-
authorship. With the permission of the instructors and investigators, TLL can disseminate results about
educational and technological innovations both within and outside the Institute through written reports,
journal articles, and presentations.

Glossary of Terms
Educational assessment: The collecting, synthesizing, and interpreting of data
whose findings aid pedagogical decision making. Areas of assessment include
student performance, instructional strategies, educational technologies, and
learning environment.

Experiment: A study undertaken in which the researcher has control over some
of the conditions in which the study takes place and control over (some aspects
of) the independent variables being studied. Random assignment of subjects to
control and experimental groups is usually thought of as a necessary criterion of
a true experiment.

External validity: The extent to which the findings of a study are relevant to
subjects and settings beyond those in the study. Another term for
generalizability.

Formative assessment: A type of assessment conducted during the course of
program implementation whose primary purpose is to provide information to
improve the program under study.4

Internal validity: The extent to which the results of a study (usually an
experiment) can be attributed to the treatment rather than to flaws in the
research design; in other words, the degree to which one can draw valid
conclusions about the causal effects of one variable on another.3

Program evaluation: The systematic investigation of the process and
outcomes of an educational program or policy.4

Qualitative research: Research that examines phenomena primarily through
words and tends to focus on dynamics, meaning, and context. Qualitative
research usually uses observation, interviewing, and document reviews to collect
data.4

Quantitative research: Research that examines phenomena that can be
expressed numerically and analyzed statistically.4

Quasi-experiment: A type of research design for conducting studies in field or
real-life situations where the researcher may be able to manipulate some
independent variables but cannot randomly assign subjects to control and
experimental groups.3

Reliability: The extent to which scores obtained on a measure are reproducible
in repeated administrations.2

Summative evaluation: A study conducted at the end of a program (or a
phase of a program) to determine the extent to which anticipated outcomes were
produced. Summative evaluation is intended to provide information about the
worth of the program.4

Threats to validity: Conditions other than the program [treatment] that could
be responsible for observed net outcomes; conditions that typically occur in
quasi-experiments and, unless controlled, limit confidence that findings are due
solely to the program. Threats to validity include selection, attrition, outside
events or history, instrumentation, maturation, statistical regression, and
testing.4

Triangulation: Using multiple methods and/or data sources to study the same
phenomenon; qualitative researchers frequently use triangulation to verify their
data.4

Validity: The extent to which appropriate inferences and decisions can be made
based on the data collected from a measure.5
