Educ. 105 Assessment of Learning 1 Module 4


MODULE 4 EDUC 105-Assessment of Learning 1

LESSON 7 Measures of Central Tendency and Dispersion/Variability

Objectives:
1. Explain the meaning and function of the measures of central tendency and measures of
dispersion/variability
2. Distinguish among the measures of central tendency and measures of variability/dispersion
3. Explain the meaning of normal and skewed score distribution

INTRODUCTION

A measure of central tendency is a single value that attempts to describe a set of data (like scores) by identifying
the central position within that set of data or scores. As such, measures of central tendency are sometimes called
measures of central location. Central tendency refers to the center of a distribution of observations. Where do scores tend
to congregate? In a test of 100 items, where are most of the scores? Do they tend to group around the mean score of 50
or 80?
There are three measures of central tendency: the mean, the median and the mode. Perhaps you are most familiar
with the mean (often called the average), but the median and the mode are also measures of central tendency.
Is there such a thing as a best measure of central tendency?

If the measures of central tendency indicate where scores congregate, the measures of variability indicate how
spread out a group of scores is, that is, how varied the scores are or how far they are from the mean. Common measures of
dispersion or variability are the range, interquartile range, variance and standard deviation.

7.1 The Measures of Central Tendency


The mean, mode and median are all valid measures of central tendency, but under different conditions one
measure becomes more appropriate than the others. For example, if the set contains extremely high or extremely low scores,
the median is a better measure of central tendency, since the mean is affected by such extreme scores.

The Mean (Arithmetic)


The mean (or average or arithmetic mean) is the most popular and most well-known measure of central
tendency. The mean is equal to the sum of all the values in the data set divided by the number of values in the data set.
For example, 10 students in a Graduate School class got the following scores in a 100-item test: 70, 72, 75, 77, 78, 80,
84, 87, 90, 92. The mean score of the group of 10 students is the sum of all their scores divided by 10. The mean,
therefore, is 805/10 = 80.5, which is the average score of the group. There are 6 scores below the average score
(mean) of the group (70, 72, 75, 77, 78 and 80) and there are 4 scores above the average score (mean) of the group (84,
87, 90 and 92).
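
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the variable names are chosen here for illustration.

```python
# Scores of the 10 students from the example above.
scores = [70, 72, 75, 77, 78, 80, 84, 87, 90, 92]

# Mean = sum of all values divided by the number of values.
mean = sum(scores) / len(scores)

# Count how many scores fall below and above the mean.
below = [s for s in scores if s < mean]
above = [s for s in scores if s > mean]

print(sum(scores), mean)        # 805 80.5
print(len(below), len(above))   # 6 4
```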

When Not to Use the Mean


The mean has one main disadvantage. It is particularly susceptible to the influence of outliers. These are values that are
unusual compared to the rest of the data set by being especially small or large in numerical value. For example, consider
the scores of 10 Grade 12 students in a 100-item Statistics test below:

Student   1    2    3    4    5    6    7    8    9    10
Score     5    38   56   60   67   70   73   78   79   95

The mean score for these ten Grade 12 students is 62.1. However, inspecting the raw data suggests that
this mean score may not be the best way to accurately reflect the score of the typical Grade 12 student, as most
students have scores in the 56 to 79 range. The mean is being skewed by the extremely low score (5) and the extremely high score (95).
Therefore, in this situation, we would like to have a better measure of central tendency. As we will find out later,
the median would be a better measure of central tendency in this situation.
Median
The median is the middle score for a set of scores arranged from lowest to highest. Unlike the mean, the median is less affected by
extremely low and extremely high scores. How do we find the median? Suppose we have the following data:
65 55 89 56 35 14 56 55 87 45 92

To determine the median, first we have to rearrange the scores into order of magnitude (from smallest to largest).

14 35 45 55 55 56 56 65 87 89 92

Our median is the score at the middle of the distribution, in this case 56. There are 5
scores before it and 5 scores after it. This works fine when you have an odd number of scores, but what happens when
you have an even number of scores? What if you had 10 scores like the scores below?
65 55 89 56 35 14 56 55 87 45
Arrange the data in order of magnitude (smallest to largest): 14 35 45 55 55 56 56 65 87 89. Then take the middle two scores (55
and 56) and compute the average of the two scores. The median is 55.5. This gives us a more reliable picture of the
tendency of the scores. There are indeed scores of 55 and 56 in the score distribution.
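
The two cases (odd and even number of scores) can be sketched as a small Python function. The function name `median` is chosen here for illustration; Python's standard library also offers `statistics.median`, which behaves the same way.

```python
def median(scores):
    """Return the middle score, or the average of the two middle scores."""
    ordered = sorted(scores)      # arrange from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: a single middle score
        return ordered[mid]
    # even count: average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_set = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_set = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]

print(median(odd_set))    # 56
print(median(even_set))   # 55.5
```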

Mode
The mode is the most frequent score in our data set. On a histogram or bar chart it is represented by the highest bar. If
the data are counts of the number of times each option is chosen in a multiple-choice test, you can, therefore, consider
the mode as being the most popular option. Study the score distribution given below:
14 35 45 55 55 56 56 65 87 89

There are two most frequent scores, 55 and 56. So we have a score distribution with two modes, hence a bimodal
distribution.
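
A short Python sketch can find every mode in a score distribution, including the bimodal case above. The helper name `modes` is hypothetical.

```python
from collections import Counter

def modes(scores):
    """Return all scores that occur with the highest frequency."""
    counts = Counter(scores)              # score -> number of occurrences
    highest = max(counts.values())
    return sorted(s for s, c in counts.items() if c == highest)

scores = [14, 35, 45, 55, 55, 56, 56, 65, 87, 89]
print(modes(scores))   # [55, 56]
```

A result with two values signals a bimodal distribution; a result with one value signals a single mode.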

7.2 Normal and Skewed Distributions


The score distribution of a sample has a "normal distribution" when most of the values are aggregated around the
mean and the number of values decreases as you move below or above the mean: the bar graph of frequencies of a
"normally distributed" sample will look like a bell curve.

• If the mean is equal to the median and the median is equal to the mode, the score distribution shows a perfectly
normal distribution. This is illustrated by the perfect bell shape or normal curve shown in Figure 13.
• If the mean is less than the median and the mode, the score distribution is a negatively skewed distribution. See
Figure 14. In a negatively skewed distribution, the scores tend to congregate at the upper end of the score
distribution.
• If the mean is greater than the median and the mode, the score distribution is a positively skewed distribution. See
Figure 15. In a positively skewed distribution, the scores tend to congregate at the lower end of the score
distribution.
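
The comparison of mean and median in the rules above can be sketched in Python. This is only a rule-of-thumb check, not a formal skewness statistic; the function name and the three score sets are hypothetical examples.

```python
from statistics import mean, median

def describe_skew(scores):
    """Classify a distribution by comparing its mean with its median."""
    m, med = mean(scores), median(scores)
    if m < med:
        return "negatively skewed"   # scores congregate at the upper end
    if m > med:
        return "positively skewed"   # scores congregate at the lower end
    return "symmetric"

# Hypothetical score sets, made up for illustration only.
mostly_high = [40, 85, 88, 90, 92, 95]
mostly_low = [10, 12, 15, 18, 20, 60]
balanced = [60, 70, 80, 90, 100]

print(describe_skew(mostly_high))   # negatively skewed
print(describe_skew(mostly_low))    # positively skewed
print(describe_skew(balanced))      # symmetric
```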

If scores tend to be high because the teacher taught very well and students are highly motivated to learn, the score
distribution tends to be negatively skewed, i.e. most scores cluster at the high end. On the other hand, when the teacher does
not teach well and students are poorly motivated, the score distribution tends to be positively skewed, which means that
scores tend to be low. So which score distribution should we work for?

7.3 Outcome-based Teaching-Learning and Score Distribution


If teachers teach in accordance with the principles of outcome-based teaching-learning and so align content and
assessment with the intended learning outcomes and re-teach till mastery what has not been understood as
revealed by the formative assessment process, then student scores in the assessment phase of the lesson will tend to
congregate on the higher end of the score distribution. The score distribution will be negatively skewed.

On the other hand, if what teachers teach and assess is not aligned with the intended learning outcomes, the
opposite will be true. The score distribution will be positively skewed, which means that scores tend to congregate on the
lower end of the score distribution.

7.4 Measures of Dispersion or Variability

If the measures of central tendency indicate where scores congregate, the measures of variability indicate how spread
out a group of scores is or how varied the scores are. Common measures of dispersion or variability are the range, variance
and standard deviation.

What is variability?
Variability refers to how "spread out" a group of scores is. The terms variability, spread, and dispersion are synonyms
and refer to how spread out a distribution is. Here are two score distributions:

A: 5, 5, 5, 5, 6, 6, 6, 6, 6, 6 (Mean is 5.6)

B: 1, 3, 4, 5, 5, 6, 7, 8, 8, 9 (Mean is 5.6)

The two score distributions have equal mean scores, and yet their scores vary to different degrees. Score distribution A shows
scores that are less varied than score distribution B. That is what we mean by variability or dispersion. If we
study both score distributions, assuming that the highest possible score in the quiz is 10, we can say that Groups A and B
are equal in terms of mean, but the scores in Group A are more similar to one another and closer to the mean, while the
scores in Group B are more varied. In fact, the lowest score in Group B (1) is extremely low compared to the lowest score
in Group A (5), and the highest score in Group B (9) is much higher than the highest score in Group A (6).

To see more clearly what we mean by spread out, consider the graphs in Figure 1. These graphs represent the scores on
two quizzes. The mean score for each quiz is 7.0. Despite the equality of means, you can see that the distributions are
quite different. Specifically, the scores on Quiz 1 are more densely packed and those on Quiz 2 are more spread out. The
differences among students were much greater on Quiz 2 than on Quiz 1.
http://onlinestatbook.com/2/summarizing_distributions/variability.html

Range
The range is the simplest measure of variability. The range is simply the highest score minus the lowest score.
Let's take an example. What is the range of the following group of scores: 10, 2, 5, 6, 7, 3, 4? The highest
number is 10, and the lowest number is 2, so 10 - 2 = 8. The range is 8.

Here are other examples:


Here is a set of scores in a test: 99, 45, 23, 67, 45, 91, 82, 78, 62, 51. What is the range? The highest number is 99 and
the lowest number is 23, so 99 - 23 equals 76; the range is 76. Here is another set of scores: 40, 40, 42, 50, 53, 56, 67, 68,
70, 89. What is the range? 89 minus 40 equals 49. The range is 49. The set of scores with a range of 76 is more varied or
more spread out than the set of scores with a range of 49.
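
The three range examples can be reproduced with a one-line Python function. The name `score_range` is chosen here for illustration (it avoids shadowing Python's built-in `range`).

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

first = [10, 2, 5, 6, 7, 3, 4]
second = [99, 45, 23, 67, 45, 91, 82, 78, 62, 51]
third = [40, 40, 42, 50, 53, 56, 67, 68, 70, 89]

print(score_range(first))    # 8
print(score_range(second))   # 76
print(score_range(third))    # 49
```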

Variance
Variability can also be defined in terms of how close the scores in the distribution are to the middle of the distribution.
Using the mean as the measure of the middle of the distribution, the variance is defined as the average squared difference
of the scores from the mean. The data from Quiz 1 are shown in Table 6. The mean score is 7.0. Therefore, the column
"Deviation from Mean" contains the score minus 7. The column "Squared Deviation" is simply the previous column
squared.

Table 6. Calculation of Variance for Quiz 1 Scores

Score (X)        Deviation from Mean (X - 7)    Squared Deviation
9                9 - 7 = 2                      2² = 4
9                9 - 7 = 2                      2² = 4
9                9 - 7 = 2                      2² = 4
8                8 - 7 = 1                      1² = 1
8                1                              1
8                1                              1
8                1                              1
7                0                              0
7                0                              0
7                0                              0
7                0                              0
7                0                              0
6                -1                             1
6                -1                             1
6                -1                             1
6                -1                             1
6                -1                             1
6                -1                             1
5                -2                             4
5                -2                             4
Means
140 ÷ 20 = 7     [10 + (-10)] ÷ 20 = 0          30 ÷ 20 = 1.5

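One thing that is important to notice is that the mean deviation from the mean is 0. This will always be the case.
The mean of the squared deviations is 1.5. Therefore, the variance is 1.5. The formula for the variance is:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

where $\sigma^2$ is the variance, $\mu$ is the mean and $N$ is the number of scores.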
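
The Quiz 1 calculation can be reproduced column by column in Python. This is a minimal sketch that divides by the number of scores, as in the table; the variable names are chosen here for illustration.

```python
# Quiz 1 scores from Table 6.
quiz1 = [9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5]

mean = sum(quiz1) / len(quiz1)               # 140 / 20
deviations = [x - mean for x in quiz1]       # "Deviation from Mean" column
squared = [d ** 2 for d in deviations]       # "Squared Deviation" column
variance = sum(squared) / len(quiz1)         # mean of the squared deviations

print(mean)              # 7.0
print(sum(deviations))   # 0.0
print(variance)          # 1.5
```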

Standard Deviation
To calculate the standard deviation of a set of numbers:
1. Work out the Mean (the simple average of the numbers).
2. Then for each number: subtract the Mean and square the result.
3. Then work out the mean of those squared differences.
4. Take the square root of that and we are done!

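The Formula Explained

The steps above correspond to the formula for the (population) standard deviation:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$

First, let us have some example values to work on: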
Example: Sam has 20 rose bushes.
The number of flowers on each bush is 9, 2, 5, 4, 12, 7, 8, 11,
9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4

Let's solve for the Standard Deviation.


Step 1. Work out the mean
In the formula above, μ (the Greek letter "mu") is the Mean
of all our values.
Example: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
The mean is: (9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
So: μ = 7

Step 2. Then for each number: subtract the Mean and square the result.
This is the part of the formula that says:

$$(x_i - \mu)^2$$

So what is $x_i$? They are the individual x values 9, 2, 5, 4, 12, 7, etc. In other words x₁ = 9, x₂ = 2, x₃ = 5, etc.
So it says "for each value, subtract the mean and square the result," like this:

Example (continued):

(9 - 7)² = (2)² = 4
(2 - 7)² = (-5)² = 25
(5 - 7)² = (-2)² = 4
(4 - 7)² = (-3)² = 9
(12 - 7)² = (5)² = 25
(7 - 7)² = (0)² = 0
(8 - 7)² = (1)² = 1
...etc...
And we get these results:
4, 25, 4, 9, 25, 0, 1, 16, 4, 16, 0, 9, 25, 4, 9, 9, 4, 1, 4, 9
Step 3. Then work out the mean of those squared differences. To work out the mean, add up all the values then divide
by how many.
First add up all the values from the previous step.
But how do we say "add them all up" in mathematics? We use "Sigma": Σ
The handy Sigma Notation says to sum up as many terms as we want.
We want to add up all the values from 1 to N, where N = 20 in our case because there are 20 values:

$$\sum_{i=1}^{N} (x_i - \mu)^2$$

Example (continued):

$$\sum_{i=1}^{20} (x_i - 7)^2$$

Which means: sum all values from (x₁ - 7)² to (x_N - 7)².
We already calculated (x₁ - 7)² = 4 etc. in the previous step, so just sum them up:
4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9 = 178
But that isn't the mean yet; we need to divide by how many, which is done by multiplying by 1/N (the same as dividing
by N):

$$\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

Example (continued):

Mean of squared differences = (1/20) x 178 = 8.9


(Note: this value is called the "Variance")

Step 4. Take the square root of that:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}$$

Example (concluded):

σ = √8.9 = 2.983...
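
The four steps for the rose bush example can be traced in Python. This is a minimal sketch of the population calculation (dividing by N); the variable names are chosen here for illustration.

```python
import math

# Number of flowers on each of Sam's 20 rose bushes.
flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

mu = sum(flowers) / len(flowers)                   # Step 1: the mean
squared_diffs = [(x - mu) ** 2 for x in flowers]   # Step 2: squared differences
variance = sum(squared_diffs) / len(flowers)       # Step 3: mean of the squares
sigma = math.sqrt(variance)                        # Step 4: square root

print(mu)                   # 7.0
print(sum(squared_diffs))   # 178.0
print(variance)             # 8.9
print(round(sigma, 3))      # 2.983
```

The standard library function `statistics.pstdev(flowers)` performs the same population calculation in one call.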

Sample Standard Deviation


But sometimes our data are only a sample of the whole population.
Example: Sam has 20 rose bushes, but only counted the flowers on 6 of them!
The "population" is all 20 rose bushes, and the "sample" is the 6 bushes that Sam counted among the 20.
Let us say Sam's flower counts are: 9, 2, 5, 4, 12, 7. We can still estimate the Standard Deviation.
But when we use the sample as an estimate of the whole population, the Standard Deviation formula changes to this:
The formula for Sample Standard Deviation:

$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$

The important change is "N-1" instead of "N" (which is called "Bessel's correction").
The symbols also change to reflect that we are working on a sample instead of the whole population:
• The mean is now x̄ (for sample mean) instead of μ (the population mean),
• And the answer is s (for Sample Standard Deviation) instead of σ.
But that does not affect the calculations. Only N-1 instead of N changes the calculations.

Here are the steps in calculating the sample standard Deviation:


Step 1. Work out the mean
Example 2: Using sampled values 9, 2, 5, 4, 12, 7
The mean is (9+2+5+4+12+7) / 6 = 39/6 = 6.5
So: x̄ = 6.5

Step 2. Then for each number: subtract the Mean and square the result
Example 2 (continued):
(9 - 6.5)² = (2.5)² = 6.25
(2 - 6.5)² = (-4.5)² = 20.25
(5 - 6.5)² = (-1.5)² = 2.25
(4 - 6.5)² = (-2.5)² = 6.25
(12 - 6.5)² = (5.5)² = 30.25
(7 - 6.5)² = (0.5)² = 0.25
Step 3. Then work out the mean of those squared differences. To work out the mean, add up all the values then divide
by how many.
But hang on... we are calculating the Sample Standard Deviation, so instead of dividing by how many (N), we will divide
by N-1

Example 2 (continued):
Sum = 6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 = 65.5
Divide by N-1: (1/5) x 65.5 = 13.1
(This value is called the "Sample Variance")

Step 4. Take the square root of that:
Example 2 (concluded):

s = √13.1 = 3.619...
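
The sample calculation differs from the population one only in the divisor. A minimal Python sketch, with variable names chosen here for illustration:

```python
import math

# Flower counts on the 6 bushes Sam actually sampled.
sample = [9, 2, 5, 4, 12, 7]
n = len(sample)

x_bar = sum(sample) / n                                  # Step 1: sample mean
sum_of_squares = sum((x - x_bar) ** 2 for x in sample)   # Step 2: squared differences, summed
sample_variance = sum_of_squares / (n - 1)               # Step 3: divide by N-1 (Bessel's correction)
s = math.sqrt(sample_variance)                           # Step 4: square root

print(x_bar)             # 6.5
print(sum_of_squares)    # 65.5
print(sample_variance)   # 13.1
print(round(s, 3))       # 3.619
```

The standard library function `statistics.stdev(sample)` also divides by N-1 and gives the same result.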
7.5 Comparing
When we used the whole population, we got: Mean = 7, Standard Deviation = 2.983...
When we used the sample, we got: Sample Mean = 6.5, Sample Standard Deviation = 3.619...
Our Sample Mean was wrong by 7%, and our Sample Standard Deviation was wrong by 21%.

Why Take a Sample?


Mostly because it is easier and cheaper. Imagine you want to know what the whole university thinks. You can't ask
thousands of people, so instead you ask maybe only 300 people. Samuel Johnson once said, "You don't have to eat the
whole ox to know that the meat is tough."

(Source: https://www.mathsisfun.com/data/standard-deviation-formulas.html)

7.6 More Notes on Standard Deviation

The standard deviation is simply the square root of the variance. The standard
deviation is an especially useful measure of variability when the distribution is normal or approximately normal because
the proportion of the distribution within a given number of standard deviations from the mean can be calculated.

For example, approximately 68% of the distribution is within one standard deviation of the mean and approximately 95% of the
distribution is within two standard deviations of the mean. Therefore, if you had a normal distribution with a mean of
50 and a standard deviation of 10, then about 68% of the distribution would be between 50 - 10 = 40 and 50 + 10 = 60. Similarly,
about 95% of the distribution would be between 50 - 2 x 10 = 30 and 50 + 2 x 10 = 70.
The symbol for the population standard deviation is σ.
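
The 68% and 95% figures can be checked numerically with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

def share_within(mean, sd, k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    low, high = mean - k * sd, mean + k * sd
    return low, high, dist.cdf(high) - dist.cdf(low)

low1, high1, p1 = share_within(50, 10, 1)
low2, high2, p2 = share_within(50, 10, 2)

print(low1, high1, round(p1, 4))   # 40 60 0.6827
print(low2, high2, round(p2, 4))   # 30 70 0.9545
```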

Figure 18 shows two normal distributions. The first distribution (bold line) has a mean of 40 and a standard deviation of 5; the
other distribution has a mean of 60 and a standard deviation of 10. For the first distribution (bold line), 68% of the distribution
is between 35 and 45; for the other distribution, 68% is between 50 and 70.

http://onlinestatbook.com/2/summarizing_distributions/variability.html
Figure 18. Normal distributions with standard deviations of 5 and 10.
Standard deviation is a measure of dispersion: the more dispersed the data, the less consistent the data are. A lower
standard deviation means that the data are more clustered around the mean and hence the data set is more consistent.
You need to read your calculator instructions to see what notation your calculator uses for the standard deviation.
An example: standard deviation for a data set in which each value has a frequency of 1. Using the following data: 10 15 13 25 22 53 47

We found the mean to be x̄ = 26.4285714. You should also see from the same calculation that the (sample) standard deviation is
SD = 16.98879182.
(2009 ASU School of Mathematical & Statistical Sciences and Terri L. Miller, retrieved 1-15-19)

7.7 Interpretation of Standard Deviation


Let us use the standard deviation to compare two data sets and to interpret how
consistent the data are. The lower the standard deviation, the more consistent the data are.
Example: Two bowlers, Katie and Mike, have the scores given below:
Katie's Scores 189 146 200 241 231
Mike's Scores 235 201 217 168 186
Both sets of data have a mean (x̄) = 201.4.
Does this mean they are equivalent bowlers?
No; consider the standard deviations. Katie has a standard deviation of SD = 37.6470 and Mike has a standard deviation
of SD = 26.1017. Since Mike has a smaller standard deviation, he is a more consistent bowler than Katie, i.e. Mike's scores
are more likely to fall close to 201.4.
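
The bowlers' figures can be reproduced with Python's `statistics` module. Note that the standard deviations quoted above match the sample formula (dividing by N-1), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

katie = [189, 146, 200, 241, 231]
mike = [235, 201, 217, 168, 186]

# Both bowlers have the same mean score.
print(mean(katie), mean(mike))      # 201.4 201.4

# Sample standard deviations (N-1 in the denominator).
sd_katie = stdev(katie)
sd_mike = stdev(mike)
print(round(sd_katie, 4))           # 37.647
print(round(sd_mike, 4))            # 26.1017

# The smaller standard deviation marks the more consistent bowler.
more_consistent = "Mike" if sd_mike < sd_katie else "Katie"
print(more_consistent)              # Mike
```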

Let's presume that Katie's and Mike's scores are scores in a long test:
Katie's Scores - 189 146 200 241 231
Mike's Scores - 235 201 217 168 186
If you compute the mean for both sets of scores, you get 201.4. The SD for Katie's scores is 37.6470 while that of Mike's is
26.1017. Mike's scores indicate greater consistency than those of Katie. This means that Mike's performance is more
predictable than Katie's, even though, on average, the two perform equally well.

(Source: 2009 ASU School of Mathematical & Statistical Sciences and Terri L. Miller, retrieved 1-25-19)

LESSON 8 Grading Systems and the Grading System of the Department of Education

Objectives:
1. Distinguish between norm-referenced and criterion-referenced grading; cumulative and averaging
grading system
2. Compute grades of students in various grade levels observing DepEd guidelines

INTRODUCTION

Assessment of student performance is essentially knowing how the student is progressing in a course (and,
incidentally, how a teacher is also performing with respect to the teaching process). The first step in assessment is, of
course, testing (either by some pencil-and-paper objective test or by some performance-based testing procedure) followed
by a decision to grade the performance of the student. Grading, therefore, is the next step after testing. Over the course
of several years, grading systems have evolved in different school systems all over the world. In the American
system, for instance, grades are expressed in terms of letters, A, B, B+, B-, C, C-, D, or what is referred to as a seven-point
system. In Philippine colleges and universities, the letters are replaced with numerical values: 1, 1.25, 1.50, 1.75, 2.0,
2.5, 3.0 and 4.0, or an eight-point system. In basic education, grades are expressed as percentages (of accomplishment)
such as 80% or 75%. With the implementation of the K to 12 Basic Education curriculum, however, students'
performance is expressed in terms of level of proficiency. Regardless of the grading system adopted, it is clear that there
is a need to convert raw score values into the corresponding standard grading system. This chapter is
concerned with the underlying philosophy and mechanics of converting raw score values into standard grading formats.
Norm-Referenced Grading
The most commonly used grading system falls under the category of norm-referenced grading. Norm-referenced
grading refers to a grading system where a student's grade is placed in relation to the performance of a group. Thus, in
this system, a grade of 80 means that the student performed as well as or better than 80% of the class (or group). At first
glance, there appears to be no problem with this type of grading system as it simply describes the performance of a
student with reference to a particular group of learners. The following example shows some of the difficulties associated
with norm-referenced grading.

Example: Consider the following two sets of scores in an English 1 class for two sections of ten students each:
A = {30, 40, 50, 55, 60, 65, 70, 75, 80, 85}
B = {60, 65, 70, 75, 80, 85, 90, 90, 95, 100}

In the first class, the student who got a raw score of 75 would get a grade of 80% while in the second class, the
same grade of 80% would correspond to a raw score of 90. Indeed, if the test used for the two classes is the same, it
would be a rather "unfair" system of grading. A wise student would opt to enroll in class A since it is easier to get higher
grades in that class than in the other class (class B).
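
The equivalency problem can be made concrete with a short Python sketch. It assumes, as in the definition above, that a norm-referenced grade is the percentage of the class scoring at or below the student's raw score; the function name is hypothetical.

```python
def norm_referenced_grade(score, class_scores):
    """Percentage of the class whose raw score is at or below the given score."""
    at_or_below = sum(1 for s in class_scores if s <= score)
    return 100 * at_or_below / len(class_scores)

class_a = [30, 40, 50, 55, 60, 65, 70, 75, 80, 85]
class_b = [60, 65, 70, 75, 80, 85, 90, 90, 95, 100]

print(norm_referenced_grade(75, class_a))   # 80.0
print(norm_referenced_grade(90, class_b))   # 80.0
print(norm_referenced_grade(75, class_b))   # 40.0
```

The same raw score of 75 earns a grade of 80 in class A but only 40 in class B, which is the unfairness the example describes.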
The previous example illustrates one difficulty with using a norm-referenced grading system. This problem is called the
problem of equivalency. Does a grade of 80 in one class represent the same achievement level as a grade of 80 in
another class of the same subject? This problem is similar to the problem of trying to compare a Valedictorian from
some remote rural high school with a Valedictorian from some very popular University in the urban area. Does one
expect the same level of competence for these two valedictorians?

As we have seen, norm-referenced grading systems are based on a pre-established formula regarding the
percentage or ratio of students within a whole class who will be assigned each grade or mark. It is therefore known in
advance what percent of the students would pass or fail a given course. For this reason, many opponents of
norm-referenced grading aver that such a grading system does not advance the cause of education and contradicts the
principle of individual differences.
In norm-referenced grading, the students, while they may work individually, are actually in competition to
achieve a standard of performance that will classify them into the desired grade range. It essentially promotes
competition among students or pupils in the same class. A student or pupil who happens to enroll in a class of gifted
students in Mathematics will find that the norm-referenced grading system is rather worrisome. For example, a
teacher may establish a grading policy whereby the top 15 percent of students will receive a mark of excellent or
outstanding, which in a class of 100 enrolled students will be 15 persons. Such a grading policy is illustrated below:
1.0     Excellent       = Top 15% of the class
1.50    Good            = Next 15% of the class
2.0     Average/Fair    = Next 45% of the class
3.0     Poor/Pass       = Next 15% of the class
5.0     Failure         = Bottom 10% of the class
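
A grading policy of this kind can be sketched in Python by ranking the class and cutting it at the stated percentages. The function name and the class of 100 scores are hypothetical, and the handling of tied scores (kept in their original order) is an assumption made here for illustration.

```python
def curve_grades(scores):
    """Assign grades by rank using the policy's fixed shares of the class."""
    # (percent of class, grade), from the top of the class down.
    policy = [(15, "1.0"), (15, "1.50"), (45, "2.0"), (15, "3.0"), (10, "5.0")]
    n = len(scores)
    # Student indices ordered from highest to lowest score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    grades = [None] * n
    start, cumulative = 0, 0
    for percent, grade in policy:
        cumulative += percent
        end = cumulative * n // 100      # rank at which this band stops
        for i in ranked[start:end]:
            grades[i] = grade
        start = end
    return grades

class_scores = list(range(1, 101))       # hypothetical class of 100 students
grades = curve_grades(class_scores)
print(grades.count("1.0"), grades.count("2.0"), grades.count("5.0"))   # 15 45 10
```

However well or poorly the class performs, the counts in each band stay the same, which is the defining feature of this policy.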

The underlying assumption in norm-referenced grading is that the students have abilities (as reflected in their
raw scores) that obey the normal distribution. The objective is to find out the best performers in this group.
Norm-referenced systems are most often used for screening selected student populations in conditions where it is known that
not all students can advance due to limitations such as available places, jobs, or other controlling factors. For example, in
the Philippine setting, since not all high school students can actually advance to college or university level because of
financial constraints, the norm-referenced grading system can be applied.

Example: In a class of 100 students, the mean score in a test is 70 with a standard deviation of 5. Construct a
norm-referenced grading table that has seven grade scales and such that students scoring within plus or minus
one standard deviation of the mean receive an average grade. Solution: The following intervals of raw scores and
grade equivalents are computed:
Raw Score    Grade Equivalent    Percentage
Below 55     Fail                1%
55-60        Marginal Pass       4%
61-65        Pass                11%
66-75        Average             68%
76-80        Above Average       11%
81-85        Very Good           4%
Above 85     Excellent           1%
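
For comparison, the share of students that a normal curve would place in each band can be computed with `statistics.NormalDist`. This is a sketch under the assumption that scores are exactly normal with mean 70 and standard deviation 5 and that the bands are cut at 55, 60, 65, 75, 80 and 85; such theoretical shares can differ from the percentages a teacher chooses to allocate in a grading table.

```python
from statistics import NormalDist

# Assumption: raw scores follow a normal distribution (mean 70, SD 5).
scores = NormalDist(mu=70, sigma=5)

# Cut points at 3, 2 and 1 standard deviations below and above the mean.
cuts = [55, 60, 65, 75, 80, 85]
labels = ["Fail", "Marginal Pass", "Pass", "Average",
          "Above Average", "Very Good", "Excellent"]

# Cumulative proportion at each cut, padded with 0 and 1 at the two ends.
cdf = [0.0] + [scores.cdf(c) for c in cuts] + [1.0]
shares = [upper - lower for lower, upper in zip(cdf, cdf[1:])]

for label, share in zip(labels, shares):
    print(f"{label:<14} {share:6.1%}")
```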
Only a few of the teachers who use norm-referenced grading apply it with complete consistency. When a
teacher is faced with a particularly bright class, most of the time, he does not penalize good students for having the bad
luck to enroll in a class with a cohort of other very capable students even if the grading system says he should fail a
certain percentage of the class. On the other hand, it is also unlikely that a teacher would reduce the mean grade for a
class when he observes a large proportion of poor performing students just to save them from failure. A serious problem
with norm-referenced grading is that, no matter what the class level of knowledge and ability, and no matter how much
they learn, a predictable proportion of students will receive each grade. Since its essential purpose is to sort students
into categories based on relative performance, norm-referenced grading and evaluation is often used to weed out
students for limited places in selective educational programs.

Norm-referenced grading indeed promotes competition to the extent that students would rather not help fellow
students because by doing so, the mean of the class would be raised and consequently it would be more difficult to get
higher grades. Similarly, students would do everything (legal) to pull down the scores of everyone else in order to lower
the mean and thus assure themselves of higher grades on the curve.

A more subtle problem with norm-referenced grading is that a strict correspondence between the evaluation
methods used and the course instructional goals is not necessary to yield the required grade distribution. The specific
learning objectives of norm-referenced classes are often kept hidden, in part out of concern that instruction not "give
away" the test or the teacher's priorities, since this might tend to skew the curve. Since norm-referenced grading is
replete with problems, what alternatives have been devised for grading the students?

Criterion-Referenced Grading

Criterion-referenced grading systems are based on a fixed criterion measure. There is a fixed target and the
students must achieve that target in order to obtain a passing grade in a course regardless of how the other students in
the class perform. The scale does not change regardless of the quality, or lack thereof, of the students. For example, in a
class of 100 students using the table below, no one might get a grade of excellent if no one scores 98 or above (or 85 or above,
depending on the criterion used). There is no fixed percentage of students who are expected to get the various grades in
the criterion-referenced grading system.

Grade    Description    Score       Or
1.0      Excellent      98-100      85-100
1.5      Good           88-97       80-84
2.0      Fair           75-87       70-79
3.0      Poor/Pass      65-74       60-69
5.0      Failure        Below 65    Below 60
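
A criterion-referenced scale is simply a fixed lookup from score to grade. The Python sketch below uses the first set of cut-offs in the table (98, 88, 75, 65); the function name and the sample class scores are hypothetical.

```python
def criterion_grade(score):
    """Grade a score against fixed cut-offs, independent of classmates."""
    if score >= 98:
        return "1.0 Excellent"
    if score >= 88:
        return "1.5 Good"
    if score >= 75:
        return "2.0 Fair"
    if score >= 65:
        return "3.0 Poor/Pass"
    return "5.0 Failure"

# Hypothetical class scores, made up for illustration only.
class_scores = [97, 90, 80, 70, 60]
grades = [criterion_grade(s) for s in class_scores]
print(grades)
# ['1.5 Good', '1.5 Good', '2.0 Fair', '3.0 Poor/Pass', '5.0 Failure']

# No one reaches 98, so no one is graded Excellent.
print(any(g.startswith("1.0") for g in grades))   # False
```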

Criterion-referenced systems are often used in situations where the teachers are agreed on the meaning of a
"standard of performance" in a subject but the quality of the students is unknown or uneven; where the work involves
student collaboration or teamwork, and where there is no external driving factor such as needing to systematically
reduce a pool of eligible students.

Note that in a criterion-referenced grading system, students can help a fellow student in group work without
necessarily worrying about lowering their own grade in that course. This is because the criterion-referenced grading system
does not require the mean (of the class) as basis for distributing grades among the students. It is therefore an ideal
system to use in collaborative group work. When students are evaluated based on predefined criteria, they are freed to
collaborate with one another and with the instructor. With criterion-referenced grading, a rich learning environment is
to everyone's advantage, so students are rewarded for finding ways to help each other, and for contributing to class and
small group discussions.

Since the criterion measure used in criterion-referenced grading is a measure that ultimately rests with the
teacher, it is logical to ask: What prevents teachers who use criterion-referenced grading from setting the performance
criteria so low that everyone can pass with ease? There are a variety of measures used to prevent this situation from ever
happening in the grading system. First, the criterion should not be based on only one teacher's opinion or standard. It
should be collaboratively arrived at. A group of teachers teaching the same subject must set the criterion together.
Second, once the criterion is established, it must be made public and open to public scrutiny so that it does not become
arbitrary and subject to the whims and caprices of the teacher.
Four Questions in Grading

Marilla D. Svinicki (2007) of the Center for Teaching Effectiveness of the University of Texas at Austin poses
four intriguing questions relative to grading. We share these questions here in this section and the corresponding
opinion of Ms. Svinicki for your own reflection:
1. Should grades reflect absolute achievement level or achievement relative to others in the same class?
2. Should grades reflect achievement only or nonacademic components such as attitude, speed and diligence?
3. Should grades report status achieved or amount of growth?
4. How can several grades on diverse skills combine to give a single mark?

What Should Go into a Student's Grade

The grading system an instructor selects reflects his or her educational philosophy. There are no right or wrong systems,
only systems which accomplish different objectives. The following are questions which an instructor may want to answer
when choosing what will go into a student's grade.

1. Should grades reflect absolute achievement level or achievement relative to others in the same class?

This is often referred to as the controversy between norm-referenced and criterion-referenced grading. In norm-referenced
grading systems the letter grade a student receives is based on his or her standing in a class. A certain
percentage of those at the top receive A's, a specified percent of the next highest grades receive B's and so on. Thus, an
outside person, looking at the grades, can decide which student in that group performed best under those circumstances.
Such a system also takes into account circumstances beyond the students' control which might adversely affect grades,
such as poor teaching, bad tests or unexpected problems arising for the entire class. Presumably, these would affect all
the students equally, so all performance would drop but the relative standing would stay the same.
On the other hand, under such a system, an outside evaluator has little additional information about what a student
actually knows since that will vary with the class. A student who has learned an average amount in a class of geniuses will
probably know more than a student who is average in a class of low ability. Unless the instructor provides more
information than just the grade, the external user of the grade is poorly informed.

The system also assumes sufficient variability among student performances that the difference in learning
between them justifies giving different grades. This may be true in large beginning classes, but is a shaky assumption
where the student population is homogeneous such as in upper division classes.
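
The norm-referenced idea described above can be illustrated with a short Python sketch. The quotas, names and scores below are hypothetical and are not taken from the module; the point is only that grades depend on rank, so a drop that affects the whole class equally leaves every letter grade unchanged.

```python
def norm_referenced_grades(scores, quotas):
    """Assign letter grades by class standing.

    scores: dict of student name -> raw score
    quotas: list of (letter, fraction of class), highest grade first
    Ties are broken by the order in which students were entered.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    grades, start, cumulative = {}, 0, 0.0
    for i, (letter, fraction) in enumerate(quotas):
        cumulative += fraction
        # The last band takes whoever is left after rounding.
        end = n if i == len(quotas) - 1 else round(cumulative * n)
        for name in ranked[start:end]:
            grades[name] = letter
        start = end
    return grades


# Hypothetical quotas: top 20% get A, next 30% B, next 30% C, the rest D.
QUOTAS = [("A", 0.2), ("B", 0.3), ("C", 0.3), ("D", 0.2)]
scores = {"Ana": 95, "Ben": 91, "Cara": 88, "Dan": 84, "Ela": 80,
          "Fe": 77, "Gio": 73, "Hana": 70, "Ian": 66, "Joy": 60}

grades = norm_referenced_grades(scores, QUOTAS)

# A bad test that costs every student 15 points leaves the ranking intact.
harder_test = {name: score - 15 for name, score in scores.items()}
unchanged = norm_referenced_grades(harder_test, QUOTAS) == grades
```

Because only the ranking matters, the same letter grade can stand for very different amounts of learning in different classes, which is the weakness noted in the text.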

The other most common grading system is the criterion referenced system. In this case the instructor sets a
standard of performance against which the students' actual performance is measured. All students achieving a given
level receive the grade assigned to that level regardless of how many in the class receive the same grade. An outside
evaluator, looking at the grade, knows only that the student has reached a certain level or set of objectives. The
usefulness of that information to the outsider will depend on how much information he or she is given on what behavior
is represented by that grade. The grade, however, will always mean the same thing and will not vary from class to class.
A possible problem with this is that outside factors such as those discussed under norm-referenced grading might
influence the entire class and performance may drop. In such a case all the students would receive lower grades unless
the instructor made special allowances for the circumstances.
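
A criterion-referenced scheme can be sketched in the same way. The cutoffs below are hypothetical, chosen only for illustration; the module does not prescribe them. Each score is compared with a fixed standard, so any number of students may earn the same grade, and a drop that affects the whole class lowers the grades.

```python
# Hypothetical cutoffs (minimum score, letter); not taken from the module.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def criterion_referenced_grade(score, cutoffs=CUTOFFS, failing="F"):
    """Return the letter for the highest cutoff that the score reaches."""
    for minimum, letter in cutoffs:
        if score >= minimum:
            return letter
    return failing


scores = {"Ana": 95, "Ben": 91, "Cara": 88}
grades = {name: criterion_referenced_grade(s) for name, s in scores.items()}

# If outside factors cost everyone 15 points, every grade falls.
lowered = {name: criterion_referenced_grade(s - 15) for name, s in scores.items()}
```

Here two students receive an A because both reached the standard, and all three grades fall when the whole class loses 15 points, unless the instructor makes a special allowance.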

A second problem is that criterion-referenced grading does not provide "selection" information. There is no way
to tell from the grading who the "best" students are, only that certain students have achieved certain levels. Whether
one views this as positive or negative will depend on one's individual philosophy.
An advantage of this system is that the criteria for various grades are known from the beginning. This allows the
student to take some responsibility for the level at which he or she is going to perform. Although this might result in
some students working below their potential, it usually inspires students to work for a high grade. The instructor is then
faced with the dilemma of a lot of students receiving high grades. Some people view this as a problem.

A positive aspect of this foreknowledge is that much of the uncertainty which often accompanies grading for
students is eliminated. Since they can plot their own progress toward the desired grade, the students have little
uncertainty about where they stand. With the competency-based or outcome-based teaching-learning
observed in Philippine schools, the criterion-referenced system is the one used in the country.
2. Should grades reflect achievement only or non-academic components such as attitude, speed and diligence?

It is a very common practice to incorporate such things as turning in assignments on time into the overall grade
in a course, primarily because the need to motivate students to get their work done is a real problem for instructors.
Also, it may be appropriate to the selection function of grading that such values as timeliness and diligence be reflected
in the grades. External users of the grades may be interpreting the mark to include such factors as attitude and
compliance in addition to competence in the material.

The primary problem with such inclusion is that it makes grades even more ambiguous than they already are. It
is very difficult to assess these nebulous traits accurately or consistently. Instructors must use real caution when
incorporating such value judgments into final grade assignment. Two steps instructors should take are (1) to make
students aware of this possibility well in advance of graded assignments and (2) to make clear what behavior is included
in such qualities as prompt completion of work and neatness or completeness. In short, non-academic components such
as attitude, speed and diligence may be reflected in the students' grades, provided the students are informed in advance
and these qualities are well understood.

3. Should grades report status achieved or amount of growth?

This is a particularly difficult question to answer. In many beginning classes, the background of the students is so
varied that some students can achieve the end objectives with little or no trouble while others with weak backgrounds
will work twice as hard and still achieve only half as much. This dilemma results from the same problem as the previous
question, that is, the feeling that we should be rewarding or punishing effort or attitude as well as knowledge gained.

There are many problems with "growth" measures as a basis for change, most of them being related to
statistical artifacts. In some cases the ability to accurately measure entering and exiting levels is shaky enough to argue
against change as a basis for grading. Also, many courses are prerequisites to later courses and, therefore, are intended
to provide the foundation for those courses. "Growth" scores in this case would be disastrous.

Nevertheless, there is much to be said in favor of "growth" as a component in grading. We would like to
encourage hard work and effort and to acknowledge the existence of different abilities. Unfortunately, there is no easy
answer to this question. Each instructor must review his or her own philosophy and content to determine if such factors
are valid components of the grade.

4. How can several grades on diverse skills combine to give a single mark?

The basic answer is that they can't really. The results of instruction are so varied that the single mark is really a
"Rube Goldberg" (doing by unnecessarily complicated means what could be done simply) as far as
indicating what a student has achieved. It would be most desirable to be able to give multiple marks, one for each of the
variety of skills which are learned. There are, of course, many problems with such a proposal. It would complicate an
already complicated task. There might not be enough evidence to reliably grade any one skill. The "halo" effect of good
performance in one area could spill over into others. And finally, most outsiders are looking for only one overall
classification of each person so that they can choose the "best." Our system requires that we produce one mark.
Therefore, it is worth our while to see how that can be done even though currently the system does not lend itself to
any satisfactory answers.

Standardized Test Scoring

Test standardization is a process by which teacher or researcher-made tests are validated and item analyzed.
After a thorough process of validation, the test characteristics are established. These characteristics include: test
validity, test reliability, test difficulty level and other characteristics as previously discussed. Each standardized test uses
its own mathematical scoring system derived by the publisher and administrators, and these do not bear any
relationship to academic grading systems. Standardized tests are psychometric instruments whose scoring systems are
developed by norming the test using national samples of test-takers, centering the scoring formula to assure that the
likely score distribution describes a normal curve when graphed, and then using the resulting scoring system uniformly
in a manner resembling a criterion-referenced approach. If you are interested in understanding and interpreting the
scoring system of a specific standardized test, refer to the policies of the test's producers.
Cumulative and Averaging Systems of Grading

In the Philippines, there are two types of grading systems used: the averaging and the cumulative grading
systems. In the averaging system, the grade of a student on a particular grading period equals the average of the grades
obtained in the prior grading periods and the current grading period.

Example: Student's grades are:

80 - Prelim
90 - Midterm Grade
85 - Final

Final Grade = (80 + 90 + 85) / 3 = 85

85 is the final grade for the semester.
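
The averaging computation can be written as a small Python sketch. The function name is our own, for illustration only.

```python
def averaging_grade(period_grades):
    """Average of the grades from all grading periods up to the current one."""
    return sum(period_grades) / len(period_grades)


# Grades from the example: Prelim, Midterm, Final.
semester_grade = averaging_grade([80, 90, 85])  # (80 + 90 + 85) / 3
```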

The Department of Education makes use of the averaging grading system.

In the cumulative grading system, the grade of a student in a grading period equals his current grading period
grade which is assumed to have the cumulative effects of the previous grading periods.

Example: 80 - Prelim
90 - Midterm Grade
80 - Tentative Final Grade

Final Grade = 1/3 of Midterm Grade + 2/3 of Tentative Final Grade
= 1/3 of 90 + 2/3 of 80
= 30 + 53.33 = 83.33, or 83 when rounded
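
The cumulative computation can be sketched in the same way. The 1/3 and 2/3 weights are the ones used in the example; the function name is our own, and schools may use other weights.

```python
def cumulative_grade(previous_grade, tentative_grade):
    """1/3 of the previous period's grade plus 2/3 of the tentative current grade."""
    return (previous_grade + 2 * tentative_grade) / 3


# Values from the example: Midterm Grade 90, Tentative Final Grade 80.
final_grade = cumulative_grade(90, 80)
final_grade_rounded = round(final_grade)
```
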
In which grading system would there be more fluctuations observed in the students' grades? How do these
systems relate to either norm-referenced or criterion-referenced grading?

Policy Guidelines on Classroom Assessment for the K to 12 Basic Education, DepEd Order No. 8, s. 2015
Below are some of the highlights of the new K to 12 Grading System which was implemented starting SY
2015-2016. These are all lifted from DepEd Order No. 8, s. 2015.

Weights of the Components for the Different Grade Levels and Subjects

The student's grade is a function of three components: 1) written work, 2) performance tasks and 3) quarterly
assessment. The percentages vary across clusters of subjects. Languages, Araling Panlipunan (AP) and Edukasyon sa
Pagpapahalaga (EsP) belong to one cluster and have the same grade percentages for written work, performance tasks and
quarterly assessment. Science and Math are another cluster with the same component percentages. Music, Arts, Physical
Education and Health (MAPEH) make up the third cluster with the same component percentages. Among the three
components, performance tasks are given the largest percentages. This means that the emphasis on assessment is on
application of concepts learned.

Table 7 presents the weights of the components for the Senior High School subjects which are grouped into 1)
core subjects, 2) all other subjects (applied and specialization) and work immersion of the academic track, and 3) all other
subjects (applied and specialization) and work immersion/research/exhibit/performance. An analysis of the figures reveals
that among the components, performance tasks have the highest percentage contribution to the grade. This means that
DepEd's grading system consistently puts most emphasis on application of learned concepts and skills.
Steps in Grade Computation

Based on the same DepEd Order (8, s. 2015), here are the steps to follow in computing grades.

Table 9. Steps for Computing Grades

1. Get the total score for each component.

                          Learner’s Raw Score    Highest Possible Score
Written Work 1                    18                      20
Written Work 2                    22                      25
Written Work 3                    20                      20
Written Work 4                    17                      20
Written Work 5                    23                      25
Written Work 6                    26                      30
Written Work 7                    19                      20
Total                            145                     160

                          Learner’s Raw Score    Highest Possible Score
Performance Task 1                12                      15
Performance Task 2                13                      15
Performance Task 3                19                      25
Performance Task 4                15                      20
Performance Task 5                16                      20
Performance Task 6                25                      25
Total                            100                     120

                          Learner’s Raw Score    Highest Possible Score
Quarterly Assessment              40                      50

2. Divide the total raw score by the highest possible score, then multiply the quotient by 100%.

Percentage Score (PS) = (Total Raw Score / Highest Possible Score) x 100%

Written Work: PS = (145 / 160) x 100% = 90.63
PS of Written Work is 90.63

Performance Task: PS = (100 / 120) x 100% = 83.33
PS of Performance Task is 83.33

Quarterly Assessment: PS = (40 / 50) x 100% = 80.00
PS of Quarterly Assessment is 80.00

3. Convert Percentage Scores to Weighted Scores. Multiply the Percentage Score by the weight of the component
indicated in Tables 4 and 5.

Written Work for English Grade 4 is 30%
Weighted Score (WS) = 90.63 x 0.30
The weighted score of Written Work is 27.19

Performance Tasks for English Grade 4 is 50%
Weighted Score (WS) = 83.33 x 0.50
The weighted score of Performance Tasks is 41.67

Quarterly Assessment for English Grade 4 is 20%
Weighted Score (WS) = 80.00 x 0.20
The weighted score of Quarterly Assessment is 16.00

(The scores can be found in the sample class record on Table 6.)

4. Add the Weighted Scores of each component. The result will be the Initial Grade.

Component                 Weighted Score
Written Work                  27.19
Performance Tasks             41.67
Quarterly Assessment          16.00
TOTAL                         84.86

The Initial Grade is 84.86

5. Transmute the Initial Grade using the Transmutation Table in Appendix B.

The Initial Grade is 84.86
The Transmuted Grade is 90
The Quarterly Grade in English for the First Quarter is 90
This will be reflected in the Report Card
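
Steps 1 to 4 of the worked example above can be sketched in Python, using the scores and the Grade 4 English weights from the example. The function names are our own, and rounding each intermediate result to two decimal places (halves rounded up) is an assumption inferred from the figures shown in the example. Step 5 is left out because it requires the Transmutation Table in Appendix B, which is not reproduced in this section.

```python
from decimal import Decimal, ROUND_HALF_UP


def round2(value):
    """Round a Decimal to two places, halves up (assumed convention)."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def percentage_score(items):
    """Steps 1 and 2: total the scores, then PS = raw / highest possible x 100."""
    raw_total = sum(raw for raw, _ in items)
    highest_total = sum(highest for _, highest in items)
    return round2(Decimal(raw_total) / Decimal(highest_total) * 100)


def initial_grade(components, weights):
    """Steps 3 and 4: weight each PS, then add the weighted scores."""
    weighted = {name: round2(percentage_score(items) * weights[name])
                for name, items in components.items()}
    return weighted, sum(weighted.values(), Decimal("0"))


# (raw score, highest possible score) pairs from the worked example.
components = {
    "written_work": [(18, 20), (22, 25), (20, 20), (17, 20),
                     (23, 25), (26, 30), (19, 20)],
    "performance_tasks": [(12, 15), (13, 15), (19, 25),
                          (15, 20), (16, 20), (25, 25)],
    "quarterly_assessment": [(40, 50)],
}
# Weights for Grade 4 English as given in the example.
weights = {"written_work": Decimal("0.30"),
           "performance_tasks": Decimal("0.50"),
           "quarterly_assessment": Decimal("0.20")}

weighted_scores, grade = initial_grade(components, weights)
```

With these inputs the sketch reproduces the figures in the example: weighted scores of 27.19, 41.67 and 16.00, and an Initial Grade of 84.86.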

For MAPEH, individual grades are given to each area, namely Music, Arts, Physical Education, and Health. The Quarterly
Grade for MAPEH is the average of the quarterly grades in the four areas.

Quarterly Grade (QG) for MAPEH = (QG for Music + QG for Arts + QG for PE + QG for Health) / 4

Grade Computation
What follows is a description of how grades are computed based on DepEd Order 8, s. 2015.

For Kindergarten

There are no numerical grades in Kindergarten. Descriptions of the learner's progress in the various
learning areas are represented using checklists and student portfolios. These are presented to the parents at the end of
each quarter for discussion. Additional guidelines on the Kindergarten program will be issued.

For Grades 1-10


The average of the Quarterly Grades (QG) produces the Final Grade.

Final Grade by Learning Area = (1st Quarter Grade + 2nd Quarter Grade + 3rd Quarter Grade + 4th Quarter Grade) / 4

General Average = Sum of Final Grades of All Learning Areas / Total Number of Learning Areas in Grade Level

The Final Grade in each learning area and the General Average are reported as whole numbers. Table 10 shows
an example of the Final Grades of the different learning areas and General Average of Grade 4 students.
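
The Final Grade and General Average computations can be sketched in Python as follows. The quarterly grades below are hypothetical and are not the values in Table 10. The text says only that results are reported as whole numbers, so rounding halves up is an assumption here.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_whole(value):
    """Round to a whole number, halves up (assumed rounding rule)."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def final_grade(quarterly_grades):
    """Average of the quarterly grades in one learning area."""
    return to_whole(Decimal(sum(quarterly_grades)) / Decimal(len(quarterly_grades)))


def general_average(final_grades):
    """Sum of the final grades divided by the number of learning areas."""
    return to_whole(Decimal(sum(final_grades.values())) / Decimal(len(final_grades)))


# Hypothetical quarterly grades for four learning areas.
quarterly = {
    "Filipino": [80, 89, 86, 84],      # average 84.75
    "English": [89, 90, 92, 87],       # average 89.5
    "Mathematics": [82, 85, 83, 83],   # average 83.25
    "Science": [86, 87, 85, 84],       # average 85.5
}
finals = {area: final_grade(grades) for area, grades in quarterly.items()}
average = general_average(finals)
```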
How is the Learner’s Progress Reported?

The summary of learner progress is shown quarterly to parents and guardians through a parent-teacher
conference, in which the report card is discussed. The Grading Scale, with its corresponding descriptors, is in Table 12.
Remarks are given at the end of the grade level.

Using the sample class record in Table 14, Learner A received an Initial Grade of 84.86 in English for the first
quarter which, when transmuted to a grade of 90, is equivalent to Outstanding. Learner B received a transmuted grade of
88, which is equivalent to Very Satisfactory. Learner C received a grade of 71, which means that the learner Did Not Meet
Expectations in the First Quarter of Grade 4 English.
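
The mapping from a transmuted grade to its descriptor can be sketched in Python. Table 12 is not reproduced in this section, so the bands below are those of the grading scale in DepEd Order No. 8, s. 2015 and should be checked against the table. The three grades in the paragraph above (90, 88 and 71) are consistent with these bands.

```python
# Grading scale bands: (minimum transmuted grade, descriptor).
# Assumed from DepEd Order No. 8, s. 2015; verify against Table 12.
GRADING_SCALE = [
    (90, "Outstanding"),
    (85, "Very Satisfactory"),
    (80, "Satisfactory"),
    (75, "Fairly Satisfactory"),
]


def descriptor(transmuted_grade):
    """Return the descriptor for a transmuted grade."""
    for minimum, label in GRADING_SCALE:
        if transmuted_grade >= minimum:
            return label
    return "Did Not Meet Expectations"


learner_a = descriptor(90)
learner_b = descriptor(88)
learner_c = descriptor(71)
```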

When a learner’s raw scores are consistently below expectations in Written Works and Performance Tasks, the
learner’s parents or guardians must be informed not later than the fifth week of that quarter. This will enable them to
help and guide their child to improve and prepare for the Quarterly Assessment. A learner who receives a grade below 75
in any subject in a quarter must be given intervention through remediation and extra lessons from the teacher/s of that
subject.

Promotion and Retention at the end of the School Year

This is what DepEd Order 8, s. 2015 says.

A Final Grade of 75 or higher in all learning areas allows the student to be promoted to the next grade level. Table 13
specifies the guidelines to be followed for learner promotion and retention.
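
The basic promotion rule stated above can be sketched in Python. This covers only the rule that a Final Grade of 75 or higher is needed in all learning areas. The detailed guidelines in Table 13 are not reproduced here and are not modeled, and the sample grades are hypothetical.

```python
PASSING_GRADE = 75  # minimum Final Grade stated in the text


def areas_below_passing(final_grades):
    """Learning areas whose Final Grade is below 75."""
    return [area for area, grade in final_grades.items() if grade < PASSING_GRADE]


def is_promoted(final_grades):
    """True when every learning area has a Final Grade of 75 or higher."""
    return not areas_below_passing(final_grades)


# Hypothetical Final Grades for two learners.
learner_x = {"English": 90, "Mathematics": 83, "Science": 75}
learner_y = {"English": 78, "Mathematics": 71, "Science": 80}

x_promoted = is_promoted(learner_x)
y_promoted = is_promoted(learner_y)
y_failed_areas = areas_below_passing(learner_y)
```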
Alternative Grading System

Pass-Fail Systems. Other colleges and universities, faculties, schools and institutions in the Philippines use
pass-fail grading systems, especially when the student's work to be evaluated is highly subjective (as in the fine arts and
music), when there are no generally accepted standard gradations (as with independent studies), or when the critical
requirement is meeting a single satisfactory standard (as in some professional examinations and practicum).

Non-Graded Evaluations. While not yet practiced in Philippine schools and institutions, non-graded evaluations
do not assign numeric or letter grades as a matter of policy. This practice is usually based on a belief that grades introduce
an inappropriate and distracting element of competition into the learning process, or that they are not as meaningful as
measures of intellectual growth and development as are carefully crafted faculty evaluations. Many faculty, schools, and
institutions that follow a no-grade policy will, if requested, produce grades or convert their student evaluations into
formulae acceptable to authorities who require traditional measures of performance.

The process of deciding on a grading system is a very complex one. The problems faced by a teacher who tries to
design a system which will be accurate and fair are common to any manager attempting to evaluate those for whom he
or she is responsible. The problems of teachers and students with regard to grading are almost identical to those of
administrators and faculty with regard to evaluation for promotion and tenure. The need for completeness and
objectivity felt by teachers and administrators must be balanced against the need for fairness and clarity felt by students
and faculty in their respective situations. The fact that the faculty member finds himself or herself in both the position of
evaluator and evaluated should help to make him
