Test, Measurement & Evaluation

Test, Measurement, Assessment,
and Evaluation in Education

"Anything not understood in
more than one way is not
understood at all."

by: Dr. Bob Kizlik

Test, measurement, assessment, and evaluation
mean very different things, and yet most of my
students, and some of us, are unable to adequately
explain the differences.
We test knowledge, we measure distance, we
assess learning, and we evaluate results in terms of
some set of criteria. These four terms are certainly
connected, but it is useful to think of them as
separate but connected ideas and processes.

BASIC CONCEPT
A test is an instrument or systematic procedure
designed to measure the quality, ability, skill, or
knowledge of students by giving a set of questions in a
uniform manner. Since a test is a form of assessment,
tests also answer the question "How does an individual
student perform?"
Testing is a method used to measure the level of
achievement or performance of the learners.

Measurement
Measurement refers to the process by which the attributes or
dimensions of some physical object are determined.
When we measure, we generally use some standard
instrument to determine them.
Measurement is also a process of obtaining a numerical description of the
degree to which an individual possesses a particular
characteristic. Measurement answers the question
"How much?"

Assessment
Assessment is a process by which information is obtained relative
to some known objective or goal.
Assessment is a broad term that includes testing. A
test is a special form of assessment. Tests are
assessments made under contrived circumstances
especially so that they may be administered.
In other words, all tests are assessments, but not all
assessments are tests. We test at the end of a lesson or
unit. We assess progress at the end of a school year
through testing.
Assessment refers to the process of gathering,
describing, or quantifying information about
student performance. It includes paper-and-pencil
tests, extended responses (for example, essays), and
performance assessments, which are usually referred to as
authentic assessment tasks (for example, a presentation
of research work).
A test or assessment yields information relative to
an objective or goal. In that sense, we test or assess to
determine whether or not an objective or goal has been
attained.

Evaluation
Evaluation refers to the process of examining the
performance of students. It also determines whether
or not the student has met the lesson's instructional
objectives.
Evaluation is perhaps the most complex and least understood of the
terms. Inherent in the idea of evaluation is "value."
When we evaluate, we engage in
some process that is designed to provide information
that will help us make a judgment about a given
situation.

Types of Measurement

There are two ways of interpreting student performance in relation to classroom
instruction: norm-referenced tests and criterion-referenced tests.

A norm-referenced test is a test designed to measure the performance of a student compared with
other students. Each individual is compared with other examinees and assigned a score, usually
expressed as a percentile, a grade-equivalent score, or a stanine. Student achievement is
reported for broad skill areas, although some norm-referenced tests do report student achievement
for individual skills.

The purpose is to rank each student with respect to the achievement of others in broad
areas of knowledge and to discriminate high and low achievers.

A criterion-referenced test is a test designed to measure the performance of students with
respect to some particular criterion or standard. Each individual is compared with a preset
standard for acceptable achievement. The performance of the other examinees is irrelevant. A
student's score is usually expressed as a percentage, and student achievement is reported for
individual skills.

The purpose is to determine whether each student has achieved specific skills or
concepts, and to find out how much students know before instruction begins and after it has
finished.

Other terms less often used for criterion-referenced are objective-referenced,
domain-referenced, content-referenced, and universe-referenced.
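
To make the contrast concrete, the sketch below interprets one raw score both ways. It is an illustration only: the class scores, the 40-item test length, and the 75 percent mastery cutoff are made-up assumptions, and the function names are hypothetical. The percentile rank uses one common convention (the percent of the group scoring below, plus half of those tied).

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: where does this score fall within the group?"""
    below = sum(1 for s in group_scores if s < score)
    tied = sum(1 for s in group_scores if s == score)
    return 100.0 * (below + 0.5 * tied) / len(group_scores)


def percent_correct(score, total_items):
    """Criterion-referenced view: how much of the test was answered correctly?"""
    return 100.0 * score / total_items


def has_mastered(score, total_items, cutoff_percent=75.0):
    """Compare against a preset standard; other examinees are irrelevant.

    The 75 percent cutoff is an assumed standard, not one from the text.
    """
    return percent_correct(score, total_items) >= cutoff_percent


# Made-up raw scores for ten students on a 40-item test.
class_scores = [12, 15, 18, 18, 20, 22, 25, 27, 28, 30]
student_score = 25

print(percentile_rank(student_score, class_scores))  # 65.0
print(percent_correct(student_score, 40))            # 62.5
print(has_mastered(student_score, 40))               # False
```

The same score of 25 places the student above most of the class, yet falls short of the assumed mastery standard, which is why the two interpretations answer different questions.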

Robert L. Linn and Norman E. Gronlund (1995) pointed out the common
characteristics and differences of norm-referenced tests and criterion-referenced tests.

Common Characteristics of Norm-Referenced Tests and Criterion-Referenced Tests
1. Both require specification of the achievement domain to be measured.
2. Both require a relevant and representative sample of test items.
3. Both use the same types of test items.
4. Both use the same rules for item writing (except for item difficulty).
5. Both are judged by the same qualities of goodness (validity and reliability).
6. Both are useful in educational assessment.

Difference between Norm-Referenced Tests and Criterion-Referenced Tests

Types of Assessment

Assessments can be classified in many different
ways. The most important distinctions are:
1. formative and summative;
2. objective and subjective;
3. referencing (criterion-referenced,
norm-referenced, and ipsative); and
4. informal and formal.

Formative and Summative
Formative assessment is generally carried out
throughout a course or project. Formative assessment,
also referred to as educative assessment, is used to
aid learning, providing feedback on a student's work,
and would not necessarily be used for grading
purposes.
Summative assessment is generally carried out at the
end of a course or project. In an educational setting,
summative assessments are typically used to assign
students a course grade.

Objective and subjective

Objective assessment is a form of questioning which
has a single correct answer.

Subjective assessment is a form of questioning which
may have more than one correct answer (or more than
one way of expressing the correct answer).

Objective and subjective tests
Tests (either summative or formative) can be objective
or subjective. An objective test is a form of questioning which
has a single correct answer. A subjective test is a form of
questioning which may have more than one correct answer
(or more than one way of expressing the correct answer).
There are various types of objective and subjective
questions. Objective question types include true/false,
multiple-choice, multiple-response, and matching
questions. Subjective questions include extended-response
questions and essays. Objective testing is becoming more
popular due to the increased use of online assessment
(e-assessment), since this form of questioning is well suited to
computerization.
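
Because each objective item has a single correct answer, scoring can be reduced to a comparison against an answer key. The sketch below shows one possible way to do this; the answer key, the responses, and the function name are hypothetical and serve only as an illustration.

```python
# Hypothetical answer key: item number -> the single correct answer.
ANSWER_KEY = {1: "T", 2: "F", 3: "B", 4: "D", 5: "A"}


def score_objective_test(responses, answer_key):
    """Return (number correct, list of missed item numbers).

    A blank or missing response counts as incorrect.
    """
    missed = [
        item
        for item, correct in answer_key.items()
        if responses.get(item, "").strip().upper() != correct
    ]
    return len(answer_key) - len(missed), missed


# Made-up responses from one student; item 5 was left blank.
responses = {1: "t", 2: "F", 3: "C", 4: "D"}
score, missed = score_objective_test(responses, ANSWER_KEY)
print(score, missed)  # 3 [3, 5]
```

A subjective item such as an essay cannot be scored this way, since more than one response may be acceptable and a rater's judgment is required.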

Bases of comparison (Referencing)
Criterion-referenced assessment, typically using a
criterion-referenced test, as the name implies, occurs
when candidates are measured against defined (and
objective) criteria.
Norm-referenced assessment (colloquially known
as "grading on the curve"), typically using a norm-
referenced test, is not measured against defined
criteria. This type of assessment is relative to the
student body undertaking the assessment. It is
effectively a way of comparing students.
Ipsative assessment is self-comparison, either in the
same domain over time or relative to other
domains within the same student.
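
As a small illustration of ipsative referencing over time, the sketch below compares a student's latest score with the mean of that same student's earlier scores. The scores and the function name are made up, and using the mean of earlier scores as the baseline is just one assumed choice.

```python
def ipsative_change(previous_scores, current_score):
    """Difference between the current score and the student's own earlier average."""
    baseline = sum(previous_scores) / len(previous_scores)
    return current_score - baseline


# Made-up quiz scores for a single student in one domain.
earlier_quizzes = [60, 64, 68]
print(ipsative_change(earlier_quizzes, 72))  # 8.0
```

No other student and no external standard enters the calculation; the only reference point is the student's own record.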

Informal and formal

Assessment can be either formal or informal. Formal
assessment usually involves a written document,
such as a test, quiz, or paper. Formal assessment is
given a numerical score or grade based on student
performance, whereas informal assessment does not
contribute to a student's final grade. Informal assessment usually occurs
in a more casual manner, including observation,
inventories, checklists, rating scales, rubrics,
performance and portfolio assessments, participation,
peer and self-evaluation, and discussion.

MODES OF ASSESSMENT

A. Traditional Assessment
1) Assessment in which students typically select an
answer or recall information to complete the
assessment. Tests may be standardized or teacher-made,
and may be multiple-choice, fill-in-the-blank,
true-false, or matching type.
2) Indirect measures of assessment, since the test items
are designed to represent competence by extracting
knowledge and skills from their real-life context.
3) Items on standardized instruments tend to test only the
domain of knowledge and skill, to avoid ambiguity for the
test takers.
4) One-time measures that rely on a single correct answer
to each item. There is limited potential for traditional
tests to measure higher-order thinking skills.

B. Performance Assessment
1) Assessment in which students are asked to perform real-world tasks that
demonstrate meaningful application of essential knowledge and skills (Jon
Mueller).
2) Direct measures of student performance, because tasks are designed to
incorporate contexts, problems, and solution strategies that students would
use in real life.
3) Designed as ill-structured challenges, since the goal is to help students
prepare for the complex ambiguities of life.
4) Focus on processes and rationales. There is no single correct answer; instead,
students are led to craft polished, thorough, and justifiable responses,
performances, and products.
5) Involve long-range projects, exhibits, and performances that are linked to the
curriculum.
6) The teacher is an important collaborator in creating tasks, as well as in developing
guidelines for scoring and interpretation.

C. Portfolio Assessment
1) A portfolio is a collection of a student's work specifically selected to tell a particular
story about the student.
2) A portfolio is not a pile of student work that accumulates over a semester or
year.
3) A portfolio contains a purposefully selected subset of student work.

TYPES OF EVALUATION

There are four types of evaluation in terms of their functional role in
relation to classroom instruction: placement evaluation,
diagnostic evaluation, formative evaluation, and summative evaluation.

A. Placement Evaluation is concerned with the entry performance of the
student. The purpose of placement evaluation is to determine the
prerequisite skills, the degree of mastery of the course objectives, and the best
mode of learning.

B. Diagnostic Evaluation is a type of evaluation given before instruction. It
aims to identify the strengths and weaknesses of the students regarding the
topics to be discussed. The purposes of diagnostic evaluation are:
1) To determine the level of competence of the students
2) To identify the students who already have knowledge about the lesson
3) To determine the causes of learning problems and formulate a plan for
remedial action

C. Formative Evaluation is a type of evaluation used to monitor the
learning progress of the students during or after instruction.
Purposes of formative evaluation:
1) To provide immediate feedback to both student and teacher
regarding the successes and failures of learning
2) To identify the learning errors that are in need of correction
3) To provide information to the teacher for modifying instruction and
for improving learning and instruction

D. Summative Evaluation is a type of evaluation usually given at
the end of a course or unit. Purposes of summative evaluation:
1) To determine the extent to which the instructional objectives have
been met;
2) To certify student mastery of the intended outcomes and to
assign grades;
3) To provide information for judging the appropriateness of the
instructional objectives;
4) To determine the effectiveness of instruction.

Evaluation

Evaluation is the process of gathering and interpreting evidence regarding the
problems and progress of individuals in achieving desirable
educational goals.

Chief Purpose of Evaluation
The improvement of the individual learner

Other Purposes of Evaluation
To maintain standards
To select students
To motivate learning
To guide learning
To furnish instruction
To appraise educational instrumentalities

Functions of Evaluation
Prediction
Diagnosis
Research

Areas of Educational Evaluation
Achievement
Aptitude
Interest
Personality

A well-defined system of evaluation enables one to:
Clarify goals
Check upon each phase of development
Diagnose learning difficulties
Plan carefully for remediation

Evaluation & the Teaching-Learning Process
Teaching, learning, and evaluation are three
interdependent aspects of the educative process
(Gronlund, 1981). This interdependence is clearly seen when
the main purpose of instruction is conceived in terms of
helping pupils achieve a set of learning outcomes, which
include changes in the intellectual, emotional, or physical
domains. Instructional objectives, or in other words
desired changes in the pupils, are brought about by
planned learning activities, and pupils' progress is evaluated
by tests and other devices.
This integration of evaluation into the teaching-learning
process can be seen in the following stages of the
process:

Setting instructional objectives
Determining pupil variables that can affect instruction
Providing instructional activities that are relevant and
necessary to achieve the desired learning outcomes
Determining the extent to which desired outcomes are
achieved

Principles of Educational Evaluation
Evaluation must be based on previously accepted
educational objectives.
Evaluation should be a continuous, comprehensive, and
cumulative process.
Evaluation should recognize that the total individual
personality is involved in learning.
Evaluation should be democratic and cooperative.
Evaluation should be positive and action-directed.
Evaluation should give the pupil the opportunity to become
increasingly independent in self-appraisal and
self-direction.
Evaluation should include all significant evidence from
every possible source.
Evaluation should take into consideration the limitations
of the particular educational situation.

The Key to Effective Testing

Objectives: The specific statements of the aim of the instruction; they
should express what the students should be able to do or know as a
result of taking the course; the objectives should indicate the
cognitive level and psychomotor level of expected performance.

Instruction: It consists of all the elements of the curriculum designed
to teach the subject, including the lesson plans, study guides, and
reading and homework assignments; the instruction should
correspond directly to the objectives.

Assessment: The process of gathering, describing, or quantifying
information about the performance of the learner; testing
components of the subject; the weight given to different subject
matter areas on the test should match the objectives as well as
the emphasis given to each subject area during instruction.

Evaluation: Examining the performance of students and comparing
and judging its quality; determining whether or not the learner has
met the objectives of the lesson and the extent of understanding.
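
The Assessment entry above says that the weight of each subject area on the test should match the emphasis it received during instruction. One possible way to apply this, sketched below, is to allocate test items in proportion to instructional time. Treating hours of instruction as the measure of emphasis is an assumption, and the topic names, hours, and function name are hypothetical.

```python
def allocate_items(hours_by_topic, total_items):
    """Split total_items across topics in proportion to instructional hours."""
    total_hours = sum(hours_by_topic.values())
    exact = {t: total_items * h / total_hours for t, h in hours_by_topic.items()}
    counts = {t: int(x) for t, x in exact.items()}  # round each share down
    leftover = total_items - sum(counts.values())
    # Give any remaining items to the topics with the largest fractional shares.
    by_fraction = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for topic in by_fraction[:leftover]:
        counts[topic] += 1
    return counts


# Made-up instructional hours for a three-topic unit and a 20-item test.
hours = {"Fractions": 6, "Decimals": 3, "Percent": 3}
print(allocate_items(hours, 20))  # {'Fractions': 10, 'Decimals': 5, 'Percent': 5}
```

The topic taught for half of the unit receives half of the items, so the test reflects the same emphasis as the instruction.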

INSTRUCTIONAL OBJECTIVES
Instructional objectives play a very important role in the
instructional process and the evaluation process. They serve as guides
for teaching and learning, communicate the intent of the instruction to
others, and provide guidelines for assessing the learning of the
students.

Instructional objectives, also known as behavioral objectives or
learning objectives, are statements that clearly describe an
anticipated learning outcome.

Characteristics of well-written and useful instructional
objectives
1) describe a learning outcome
2) be student-oriented: focus on the learner, not on the teacher
3) be observable or describe an observable product
4) be sequentially appropriate
5) be attainable within a reasonable amount of time
6) be developmentally appropriate

BLOOM'S TAXONOMY OF EDUCATIONAL OBJECTIVES

1) COGNITIVE DOMAIN calls for outcomes of mental
activity such as memorizing, reading, problem
solving, analyzing, synthesizing, and drawing
conclusions.
2) AFFECTIVE DOMAIN refers to a person's
awareness and internalization of objects and
stimulation; it focuses on emotions.
3) PSYCHOMOTOR DOMAIN focuses on the physical
and kinaesthetic skills of the learner. This domain is
characterized by the progressive levels of
behaviors from observation to mastery of physical
skills.

Bloom's Cognitive Taxonomy

Bloom identified six levels within the cognitive domain, from simple recall of facts as
the lowest level, through increasingly more complex and abstract mental levels, to the
highest level, which is classified as evaluation. Verb samples for stating specific
learning outcomes that represent intellectual activity on each level are presented here.

1. Knowledge - recognizes students' ability to use rote memorization and recall facts.
Verb samples: define, name, recognize, repeat, list, label, memorize, select, cite,
reproduce, and state.
Test questions focus on the identification and recall of information.

2. Comprehension - involves students' ability to read subject matter, extrapolate and
interpret important information, and put others' ideas into their own words.
Verb samples: describe, classify, explain, discuss, express, identify, translate, restate,
review, give examples, interpret, summarize.
Test questions focus on the use of facts, rules, and principles.

3. Application - students take new concepts and apply them to another situation.
Verb samples: construct, arrange, compute, discover, show, relate, produce, prepare,
predict, solve, dramatize, and interpret.
Test questions focus on applying facts or principles.

4. Analysis - students have the ability to take new information and
break it down into parts to differentiate between them.
Verb samples: determine, differentiate, distinguish, estimate, point
out, discriminate, categorize, compare, criticize, examine, experiment,
and debate.
Test questions focus on separation of a whole into components
and parts.

5. Synthesis - students are able to take various types of information and
form a whole, creating a pattern where one did not previously exist.
Verb samples: assemble, compose, create, formulate, plan, prepare,
design, reorganize, propose, set up.
Test questions focus on combining ideas to form a new whole.

6. Evaluation - involves students' ability to look at someone else's
ideas or principles and see the worth of the work and the value of the
conclusion.
Verb samples: conclude, justify, criticize, assess, judge, predict, rate,
evaluate, select, choose, support, compare, argue, appraise.
Test questions focus on developing opinions, judgments, or
decisions.
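
The verb samples above can be treated as a lookup table for checking which cognitive level an objective is likely written at. The sketch below uses only a shortened subset of the verbs listed for each level, and the function names are hypothetical. It assumes the objective begins with its action verb, which will not always hold.

```python
# Shortened subset of the verb samples listed above, keyed by cognitive level.
BLOOM_VERBS = {
    "Knowledge": ["define", "name", "list", "label", "select", "state"],
    "Comprehension": ["describe", "classify", "explain", "summarize", "interpret"],
    "Application": ["construct", "compute", "solve", "predict", "interpret"],
    "Analysis": ["differentiate", "distinguish", "categorize", "compare", "criticize"],
    "Synthesis": ["compose", "create", "formulate", "design", "propose"],
    "Evaluation": ["justify", "judge", "assess", "compare", "criticize", "predict", "select"],
}


def levels_for_verb(verb):
    """Return every level whose verb list contains the given verb."""
    verb = verb.lower()
    return [level for level, verbs in BLOOM_VERBS.items() if verb in verbs]


def levels_for_objective(objective):
    """Look up the first word of an objective, assumed to be its action verb."""
    return levels_for_verb(objective.split()[0])


print(levels_for_objective("Compute the mean of five test scores."))  # ['Application']
print(levels_for_verb("interpret"))  # ['Comprehension', 'Application']
```

Several verbs appear under more than one level, so the verb alone narrows the level without settling it; the task the student actually performs decides the classification.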

Krathwohl's Affective Taxonomy refers to a person's awareness and
internalization of objects and stimulation.

Anderson and Krathwohl (2001) revised Bloom's original taxonomy by
combining both the cognitive process and knowledge dimensions. The levels
are listed from lowest to highest.

1. Receiving - listens to ideas
Verb samples: identify, select, give, and listen to ideas

2. Responding - answers questions about ideas
Verb samples: read, select, tell, write, assist, present

3. Valuing - thinks about how to take advantage of ideas; able to explain
them well
Verb samples: explain, follow, initiate, justify, and propose

4. Organizing - commits to using ideas; incorporates them into activity
Verb samples: prepare, follow, explain, relate, synthesize, integrate, join,
and generalize

5. Characterizing - incorporates ideas completely into practice; recognized by the use
of them
Verb samples: solve, verify, propose, modify, practice, and qualify

Psychomotor Domain
This domain is characterized by the progressive
levels of behaviors from observation to mastery of
physical skills. The levels are listed from lowest to highest.

1) Observing-active mental attending of a physical
event.
2) Imitating- attempted copying of a physical behavior.
3) Practicing-trying a specific physical activity over
and over.
4) Adapting-fine tuning, making minor adjustment in
the physical activity in order to perfect it.

Criteria to Consider when Constructing a Good Test

A. VALIDITY is the degree to which the test measures what it is intended to measure. It is
the usefulness of the test for a given purpose. A valid test is always reliable.

B. RELIABILITY refers to the consistency of scores obtained by the same person
when retested using the same instrument or one that is parallel to it.

C. ADMINISTRABILITY: the test should be administered with ease, clarity, and uniformity
so that the scores obtained are comparable. Uniformity can be obtained by setting the
time limit and oral instructions.

D. SCORABILITY: the test should be easy to score: directions for scoring are
clear, the scoring key is simple, and provisions for answer sheets are made.

E. ECONOMY: the test should be given in the cheapest way, which means that answer
sheets must be provided so that the test can be given from time to time.

F. ADEQUACY: the test should contain a wide sampling of items to determine the
educational outcomes or abilities, so that the resulting scores are representative of
the total performance in the areas measured.

G. AUTHENTICITY: the test should simulate real-life situations.
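
Of the criteria above, reliability (B) is the one most directly expressed as a number. One common estimate is the test-retest approach: give the same test twice to the same students and compute the Pearson correlation between the two sets of scores. The sketch below is an illustration under that assumption; the scores are made up and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


# Made-up scores for five students on two administrations of the same test.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 12, 16, 17, 21]

reliability = pearson_r(first_administration, second_administration)
print(round(reliability, 2))  # roughly 0.98
```

A coefficient close to 1 means students kept nearly the same relative standing on both occasions, which is what consistency of scores refers to.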
