New Ed 9 Module 2


Module 2: The Purpose of Testing


In your first module, you studied the basic concepts and terminology in
assessment. This module deals with the purpose of testing. In classroom
assessment, there are three basic questions to consider: why we test, what we test
and how we test. This module will help you answer the first question.


At the end of this module, you are expected to:

1. discuss the what, how and why of testing; and

2. illustrate mismatches between what is being tested and how it is being tested.


Activity 1

In the classroom, decisions are constantly being made. As a teacher you may have to
decide such things as the following:
John, Don, Marie, and Jeri are ready to advance to level 7 in reading, but Chris and Linda
are not.
Mark receives an A; Mary receives a C.
Ed has difficulty discriminating between long and short vowel sounds.
My teaching method is not effective for this group of students.
Arthur needs help developing social skills.
Mike’s attitude towards school has improved.
Mary should be moved to a higher reading group.
Mrs. Morrison’s class is better at math concepts than my class.

Ed 9 Assessment of Learning 1, First Semester 2021-2022 14

Donna is a “slow learner.”

On what basis do teachers make decisions such as these?


To make instructional decisions, some teachers rely solely on personal judgment,
others solely on measurement data and many others combine measurement data with
judgment or subjective data. Which approach is best?
Antitest advocates suggest that testing should be done away with. Yet, decisions will
still have to be made. Teachers, as human beings, are subject to good and bad days,
biases, student and parent pressure, faulty perceptions, and a variety of other influences. In
other words, relying solely on a teacher’s judgment means relying on a subjective decision-making
process. Naturally, no teacher would intentionally make a “wrong” decision about a
student. However, all of us make mistakes.
Tests represent an attempt to provide objective data that can be used with
subjective impressions to make better, more defensible decisions. This is the purpose
of testing, or why we test. Tests are not subject to the ups and downs or other influences
that affect teachers. Subjective impressions can often place in perspective important
aspects of a decision making problem for which no objective data exist. Thus, it seems
likely that a combination of subjective judgments and objective data will result in more
appropriate rather than less appropriate decisions. Reliance on measurement data alone
can even prove to be detrimental to decision making. Although such data are objective, we
must remember that they are only estimates of a student’s behavior. Test data are never
100% accurate!
In summary, we test to provide objective information, which we combine with our subjective,
commonsense impressions to make better educational decisions. However, suggesting that
combining objective measurement data and subjective impressions results in better
educational decisions assumes that:
1. Measurement data are valid; and
2. The teacher or individual interpreting such data understands the uses and limitations
of such data.
Unfortunately, too often one or both of these assumptions are violated in the decision-making
process. Obviously, when this happens, the resultant educational decisions may be invalid.
Teachers are accountable not only for reporting test data to students, parents, principals,
counselors and other stakeholders but also for interpreting test data and being fully aware of the
uses and limitations of such data. One of the first steps toward acquiring the ability to
interpret test data is to understand the different types of educational decisions that are made
from measurement data.

What are the types of educational decisions every classroom teacher must make?
1. Instructional decisions – are the nuts-and-bolts types of decisions made by all
classroom teachers. In fact, these are the most frequently made decisions in
education. Examples of such decisions include deciding to spend more time in
math class on addition with regrouping, to skip the review you planned before the
test, or to stick to your instructional plan.

Because educational decisions at classroom levels have a way of affecting decisions
at higher levels, it is important that these types of decisions be sound ones. Deciding
that more time should be spent on addition with regrouping, when it doesn’t need to
be, wastes valuable instructional time and may have very noticeable effects on
classroom management. Students may get turned off, tune you out, or act up. Such
problems may not be confined to the classroom. A student’s motivation, interaction
with peers and even their lifelong ambitions can be affected by these seemingly small
misjudgments at the classroom level.

2. Grading decisions – educational decisions based on grades are also made by the
classroom teacher, but much less frequently than instructional decisions.

For most students, grading decisions are probably the most influential
decisions made about them during their school years. All students are familiar
with the effects grades have on them, their peers and their parents. Given all the
attention and seriousness afforded to grades today, it is advisable for teachers to
invest extra care and time in these important decisions, because they may be
called on to defend their decisions more frequently in the future. The teacher who is
most likely to be able to defend the grades that are assigned will be the teacher who:
a. adheres to an acceptable grading policy;
b. uses data obtained through multiple, appropriate measurement instruments; and
c. knows the uses and limitations of such data.
3. Diagnostic decisions – are those made about a student’s strengths and
weaknesses and the reasons for them. For example, a teacher may notice in test
results that Ryan can successfully subtract four-digit numbers, but not if carrying is
involved. Given this information, the teacher decides that Ryan does not fully
understand the carrying process. The teacher has made a diagnostic decision based,
at least in part, on information yielded by an informal, teacher-made test. Such
decisions can also be made with the help of a standardized test. Because these
decisions are of considerable importance, we believe that objective test data should
always be used along with the teacher’s subjective judgment.

Activity 2
Important question for reflection: Was there an instance in your life as a student when you
asked your teacher to defend or justify the grade he or she had given you? Share your experience
in the space below.

Thus far, we have considered the purpose of testing, or why we test. Next, let’s consider two
other aspects of educational measurement: how we measure and what we measure.
Please read the case below.
A Pinch of Salt

Jean-Pierre, the master French chef, was watching Marcel, who was Jean-Pierre’s best
student, do a flawless job of preparing the master’s hollandaise sauce. Suddenly, Jean-Pierre
began pummeling Marcel with his fists. “Fool!” he shouted. “I said a pinch of salt, not
a pound!” Jean-Pierre was furious. He threatened to pour the sauce over Marcel’s head, but
before he could, Marcel indignantly emptied the whole salt container into the pot.
“There, you old goat, I only added a pinch to begin with, but now there is a pound of salt in
the sauce, and I’m going to make you eat it!”
Startled by his student’s response, Jean-Pierre regained his composure. “All right, all right.
So you didn’t add a pound, but you certainly added more than a pinch!”
Still upset, Marcel shouted, “Are you senile? You were watching me all the time and you saw
me add only one pinch!” Marcel pressed his right thumb and index finger together an inch
from the master’s nose to emphasize his point.
“Aha, you see! There you have it!” Jean-Pierre said. “That is not the way to measure a pinch.
Only the tips of the thumb and index finger make contact when you measure a pinch!”
Marcel looked at the difference between his idea of a “pinch” of salt and the master’s.
Indeed, there was quite a difference. Marcel’s finger and thumb made contact not just at the
fingertips, but all the way down to the knuckle. At Jean-Pierre’s request, they both deposited
a “pinch” of salt on the table. Marcel’s pinch contained four or five times as much salt as
Jean-Pierre’s.
Who is correct? Is Marcel’s pinch too much? Is Jean-Pierre’s pinch too little? Whose method
would you use to measure a pinch of salt? Perhaps relying on an established standard will help.
Webster’s New Collegiate Dictionary defines a pinch this way: “As much as may be taken
between the finger and the thumb.” If Webster’s is our standard of comparison, we may
conclude that both Jean-Pierre and Marcel are correct. Yet, we see that Marcel’s pinch
contains a lot more salt than Jean-Pierre’s. It seems that we have a problem. Until Marcel and
Jean-Pierre decide on who is correct and adopt a more specific or standard definition of a
“pinch”, they may never resolve their argument. Furthermore, if we were to try to match the
recipes they develop for their culinary masterpieces, we might never succeed unless we know
which measuring method to use.
This measurement problem resulted from the lack of a clear, unambiguous method of
measurement. Who is to say Jean-Pierre’s method is better than Marcel’s? In Marcel’s eyes, his method
is correct and Jean-Pierre’s is not. According to Webster’s, both are correct. A clear and
unambiguous method of measuring a “pinch” would resolve the problem. It seems
reasonable to suggest that any time measurement procedures—that is, how we measure—
are somewhat subjective and lack specificity, similar interpretive problems may arise.
“Pinching” in the Classroom
Mr. Walsh assigns his history grades based entirely on his monthly tests and a
comprehensive final examination. He takes attendance, comments on homework
assignments, encourages classroom participation, and tries to help his students develop
positive attitudes toward history, but none of these is considered in assigning grades. Mr.
Carter, another history teacher, assigns grades in the following manner:
Monthly test 20%
Comprehensive finals 20%
Homework 10%
Attendance 20%
Class participation 15%
Attitude 15%

Both teachers assign roughly the same number of A’s, B’s, C’s, D’s, and F’s each
semester. Does an A in Mr. Walsh’s class mean the same thing as an A in Mr. Carter’s
class? Obviously, Mr. Walsh “pinches” more heavily when it comes to test data than does
Mr. Carter (100% versus 40%). On the other hand, Mr. Carter “pinches” more heavily on
attendance and participation in assigning grades than does Mr. Walsh (35% versus 0%).
Which method is correct? In interpreting final grades earned in each class, would you say
the students who earned A’s in either class are likely to do equally well on a history test
constructed by a third teacher?
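The effect of the two weighting schemes can be made concrete with a small calculation. The sketch below is illustrative only: Mr. Carter's weights come from the list above, but the 50/50 split between monthly tests and the final for Mr. Walsh is an assumption (the text says only that his grades rest entirely on those two components), and the student's scores are invented.

```python
# Mr. Carter's weights are taken from the module. The 50/50 split for
# Mr. Walsh is an ASSUMPTION; the text gives no split between his two components.
WALSH_WEIGHTS = {"monthly_tests": 0.50, "final_exam": 0.50}
CARTER_WEIGHTS = {
    "monthly_tests": 0.20,
    "final_exam": 0.20,
    "homework": 0.10,
    "attendance": 0.20,
    "participation": 0.15,
    "attitude": 0.15,
}


def final_grade(scores, weights):
    """Return the weighted average of the components named in `weights`."""
    return sum(scores[component] * weight for component, weight in weights.items())


# Hypothetical student (invented scores on a 0-100 scale):
# modest on tests, strong on everything else.
student = {
    "monthly_tests": 70,
    "final_exam": 72,
    "homework": 95,
    "attendance": 100,
    "participation": 95,
    "attitude": 95,
}

walsh_grade = final_grade(student, WALSH_WEIGHTS)
carter_grade = final_grade(student, CARTER_WEIGHTS)
print(f"Mr. Walsh's scheme:  {walsh_grade:.1f}")   # 71.0
print(f"Mr. Carter's scheme: {carter_grade:.1f}")  # 86.4
```

With these invented scores, the same student earns 71.0 under Mr. Walsh's scheme and 86.4 under Mr. Carter's. The two final grades summarize different things even though the underlying performance is identical.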
Should Mr. Walsh “pinch,” that is, assign grades, more like Mr. Carter, or should Mr.
Carter assign grades more like Mr. Walsh? Obviously, there are no easy answers to these
questions. The differences in how Jean-Pierre and Marcel measure a “pinch” of salt result
from the lack of a clear, unambiguous definition of a “pinch.” Similarly, the differences in how
Mr. Walsh and Mr. Carter measure learning in history result from the lack of a clear,
unambiguous definition of what constitutes a final grade. As a result, their final grades
represent somewhat different aspects of their students’ performance. The way they “pinch”
or how they measure differs.
For Jean-Pierre and Marcel, there is a relatively easy way out of the disagreement. Only their
method of measurement, or how they measure, is in question, not what they are measuring. It
would be easy to develop a small container, a common standard, that could then be used to
uniformly measure “pinches” of salt. In the classroom, the task is more difficult for Mr.
Walsh and Mr. Carter. What is in question is not only their measurement method, or
how they weigh components of a grade, but what they are measuring.
Much of what we measure or attempt to measure in the classroom is not clearly
defined. For example, think of how many different ways you have heard learning,
intelligence, or adjustment defined. Yet, we constantly attempt to assess or measure these
traits. Furthermore, the methods we use to measure these often ill-defined or undefined

Only when both what to measure and how to measure have been considered,
specified, and clearly defined can we hope to eliminate the problems involved in
measuring and interpreting classroom information. The task is formidable. We cannot
hope to solve it entirely, but we can minimize the subjectivity, the inaccuracy, and the more
common interpretative errors often inherent in classroom measurement. Achieving these
goals will go far towards helping us “pinch” properly.
Sound measurement practice will benefit you professionally. Moreover, it will benefit
most those unwitting and captive receptors of measurement practice, your students.
The examples presented are intended to alert you to two general problems encountered in
classroom measurement:
1. defining what it is that you want to measure; and
2. determining how to measure whatever it is that you are measuring.

Defining what to measure may, at first glance, not appear like much of a problem, but
consider the following example:
Mrs. Norton taught first-year math in a small private high school comprised of high-achieving
students. She prided herself on her tests, which stressed the ability to apply math concepts
to real-life situations. She did this by constructing fairly elaborate word problems. When she
moved to another part of the state, she went to work at an inner-city high school teaching an
introductory math skills course. Because she assumed that her new students were not
nearly as “sharp” as her private school students, she “toned down” her tests by substituting
simpler computations in her word problems. Even with the substitution, 29 of her 31 students
failed this “easier” test. Dejected, she substituted even simpler computations into her next
test, only to have 30 out of 31 fail. She concluded that her students “totally lack even basic
math skills,” and applied for a transfer.

Activity 3
Think about these questions:
Do you agree with Mrs. Norton’s conclusion? If so, why? If not, why not?


Are there any possible alternative conclusions?


What might she have done differently?


We disagree with Mrs. Norton. Whereas it may be that some of her students lack or are
weak in basic math skills, there seems to be little conclusive evidence that all, or even
most, lack these skills. There may be another explanation for her students’ poor performance.
Her tests were originally designed to measure the ability of her high-achieving private-school
students to apply their skills to real-life situations. This is what she wanted to measure.
In her public-school skill class, what Mrs. Norton wanted to measure was a bit different.
What she wanted to measure was not the ability to apply skills (at least not at first) but
whether the skills were ever acquired. How she measured must also be considered. Her
tests consisted of fairly elaborate word problems. They may have been tests of reading
ability as much as tests of math applications. Their reading level may have been
appropriate for her “bright” students but too advanced for her new students. Since her tests
measured skill application and reading ability rather than skill acquisition, how Mrs. Norton
measured also was not appropriate. Assuming you are convinced it is important to define
what you are measuring, let’s look in more detail at what we mean by determining how to measure.
Mrs. Norton got into trouble mainly because she wasn’t sure what she was measuring. At
least by giving a written test she had the right idea about one aspect of how to measure
basic math skills. Or did she? How else could basic math skills be measured? An oral test,
you say? But in a class of 31 students? Any other ideas? What about a questionnaire
without any math problems, just questions about math skills to which students respond yes
or no to indicate whether they believe they have acquired a certain skill? How about simply
observing your students in the process of completing a math problem? Or how about
requiring your students to complete a practical project requiring math skills?
The techniques mentioned are all possible measurement methods, some for measuring
basic math skills, others for measuring attitudes toward math. Questionnaires, oral
responses, observation, and projects are common methods of measurement. These and a
variety of performance assessment techniques, including portfolios, will be discussed later.
For now, simply be aware that there are alternatives to written tests and that how we
measure is often determined by what we measure. Let’s return to Mrs. Norton and the
most commonly used form of measurement, the written test.
Types of Written Tests
1. Verbal – emphasizes reading, writing or speaking.
2. Non-verbal – does not require reading, writing or speaking ability. Such tests are composed of
numerals or drawings.
3. Objective – refers to the scoring of tests. When two or more scorers can easily agree
on whether an answer is correct or incorrect, the test is an objective one. Ex. True-
false, multiple choice, matching type
4. Subjective – also refers to the scoring of tests. When it is difficult for two scorers to
agree on whether an answer is correct or incorrect, the test is a subjective one. Ex.
Essay tests
5. Teacher-made tests – entirely constructed by teachers for use in the classroom.
6. Standardized tests – constructed by measurement experts over a period of years.
They are designed to measure broad, national objectives and have a uniform set of
instructions that are adhered to during each administration. Most also have tables of
norms, to which a student’s performance may be compared to determine where the
student stands in relation to a national sample of students at the same grade or age level.
7. Power test – it has liberal time limits that allow each student to attempt each item.
Items tend to be difficult.
8. Speed test – it has time limits so strict that no one is expected to complete all items.
Items tend to be easy.
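The objective/subjective distinction in items 3 and 4 rests on how easily two scorers agree. One simple way to put a number on that is percent agreement, sketched below. This is a minimal illustration, not a method the module prescribes: the function name and the scorers' marks are hypothetical, and more refined agreement statistics exist.

```python
def percent_agreement(scorer_a, scorer_b):
    """Percentage of answers on which two scorers gave the same mark."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("Both scorers must mark the same non-empty set of answers.")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100 * matches / len(scorer_a)


# Hypothetical marks (1 = correct, 0 = incorrect) for five answers.
# Multiple-choice items: both scorers follow the same key, so they agree.
mc_scorer_a = [1, 0, 1, 1, 0]
mc_scorer_b = [1, 0, 1, 1, 0]

# Essay items: each scorer applies personal judgment, so marks diverge.
essay_scorer_a = [1, 0, 1, 1, 0]
essay_scorer_b = [0, 0, 1, 0, 1]

print(percent_agreement(mc_scorer_a, mc_scorer_b))        # 100.0
print(percent_agreement(essay_scorer_a, essay_scorer_b))  # 40.0
```

In this invented example, the multiple-choice marks agree on every answer while the essay marks agree on only two of five, which is the pattern the definitions above describe.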
The test Mrs. Norton used was heavily verbal, which means that it relied almost exclusively
on words to ask questions. Although the answers to her questions may have been numerical,
the questions themselves were word problems. While there is probably no better way to
measure basic math skills than through a written test, was the written test Mrs. Norton used
the best type of written test? We would say no, and suggest that her test should have
looked more like the following instead of consisting only of word problems.
1.) 1431 – 467 =
2.) 798 – 581 =
3.) 125 x 7 =
4.) 21 + 11 =
The advantages of a basic math skills test with items similar in format to the above test items
include the following:
1. reading ability is eliminated;
2. basic math skills are measured directly; and
3. more items can be included in the same amount of test time.
All these points are important. The first two help ensure that the test measures what it is
supposed to measure. The last improves the reliability, or the consistency of the scores it
yields over time.
Furthermore, inspecting a student’s written work on such a test can help you to diagnose
errors in the process used by students to arrive at an answer. So, you see, there can be
different types of written tests. The point of our discussion about Mrs. Norton is simply that
how you measure must always match what you measure. Whether you give word
problems, use number formats, or provide real world examples depends on whether you are
measuring problem-solving ability, knowledge of facts, process, application and so on.


In this module, we have considered the importance of knowing what we want to measure
and how we want to measure. It is also important to note that determining what and how to
measure may not be as simple as it appears. However, both considerations are vitally
important in classroom measurement, because failing to be clear about them is likely to
result in invalid measurement.
This module has introduced you to why we test, what we test, and how we test. Its major
points are as follows:
1. The purpose of testing is to collect objective information that may be used in
conjunction with subjective information to make better educational decisions.
2. In our age of increasing demands for accountability, it has become imperative that
teachers be able to understand and demonstrate the role that objective test data can
play in educational decision making.
3. Classroom teachers are responsible for the bulk of educational decision making,
such as everyday instructional decisions, grading decisions, and diagnostic
decisions. Such decisions are often based, or ought to be based, on information
obtained from teacher-made tests.
4. Other, less frequent kinds of educational decisions are usually made by
administrators or specialists other than the classroom teacher. These include
decisions about selection, placement, counseling and guidance, programs and
curriculum, and administration. Such decisions are usually based on information
obtained from standardized tests, and increasingly from high-stakes tests.
5. Measurement problems may be expected any time testing procedures lack definition
or specificity, or when we fail to clearly specify the trait we are measuring.

6. Specifying or defining what we want to measure often determines how the trait
should be measured.

Activity 4
Using Mrs. Norton’s problem as a guide, cite at least three examples in any
subject area that illustrate mismatches between what is being tested and how it is
being tested.



1. Explain the importance of the alignment of the why, what and how of testing.



De Guzman and Adamos (2015) Assessment of Learning 1. Adriana Publishing Company, Inc. Philippines.

Kubiszyn and Borich (2003) Educational Testing and Measurement: Classroom Application and Practice. 7th Edition. John Wiley and Sons (Asia) Pte. Ltd.

Rico (2011) Assessment of Students’ Learning: A Practical Approach. Anvil Publishing, Inc. Philippines.
