EBS 234 Assessment in Basic Schools
Definition of Terms:
The general public uses the terms assessment, test, measurement and evaluation interchangeably, but in educational measurement these terms have distinct meanings.
Assessment: The process of obtaining information that is used for making decisions about
students, curricula and programs, and educational policy. It includes the full range of
procedures used to gain information about student learning. These procedures may be formal
(pencil and paper tests) or informal (observations). Certain concepts and terms are associated with assessment; these are explained below.
Test: A task or series of tasks, which are used to measure specific traits or attributes in people.
In educational settings, tests include paper and pencil instruments, which contain questions
that students and pupils respond to. The responses provided to the questions help the test
giver to obtain an estimate of the specific trait being measured. It answers the question, 'How well does the individual perform?'
Measurement: The process of assigning (giving) numbers to the attributes or traits possessed by persons in such a way that the numbers describe the degree to which the person possesses the attribute or trait. It
is limited to the quantitative descriptions of students. It answers the question, ‘How much?’
Assessment may also involve qualitative descriptions. For example, a teacher may judge a student's writing as exceptionally good for his grade placement.
Evaluation: The process of obtaining information about a person, programme or process and making a value judgement about its effectiveness or worth. Evaluating student achievement means we are judging the quality of the student's achievement.
Formative evaluation is the process of judging the worth of teaching and learning while instruction is still in progress. It is carried out through means such as oral questions, home assignments and short tests or quizzes. The main purpose is to provide feedback to both the teacher and the learner about progress being made and not to grade the student.
Summative evaluation is the process of judging the worth of teaching and learning at the end of the instructional period. It seeks to determine to what extent the broad objectives of teaching and learning have been attained. In other words, it is a judgement about the quality of students' achievement after instruction or a learning period has ended.
Note that proper use of assessment procedures requires that the user is aware of the limitations of each technique.
PURPOSES OF ASSESSMENT
Assessment provides information for decisions about students, curricula and programs, and educational policy. Decisions about students include:
1. Instructional decisions
2. Selection decisions
3. Placement decisions
SCALES OF MEASUREMENT
The type of data obtained or collected determines the appropriate measurement scale used. There are 4 types of measurement scales: Nominal, Ordinal, Interval, and Ratio.
A nominal scale uses numbers simply as identifiers or names (categorical data). For example, assigning the number 6 to the name Kwame Nkrumah merely identifies him; the number carries no quantitative meaning.
An ordinal scale does not only group subjects (data) but also ranks them in some order, putting subjects in order from highest to lowest, from most to least. For example, the heights of 5 students can be ranked from 1st to 5th.
NOTE: Though ordinal scales indicate that some subjects are higher or better than others, they do not indicate how much higher or better. The intervals between the ranks are therefore not equal.
An interval scale groups and ranks subjects (data) and, in addition, has equal intervals between adjacent scores or points (unlike the ordinal scale). Again, it has no true zero point: a zero score here does not mean a complete absence of the attribute being measured.
A ratio scale has all the characteristics of an interval scale and, in addition, a true zero point. Height and weight are measured on ratio scales.
ASSESSMENT FOR LEARNING
For sound assessment, the following conditions should be met:
Clear statements of the intended learning should be provided.
A public and defensible reference point for making judgements should be available.
Multiple sources of evidence should be considered.
Pupils should engage in self- and peer assessment and decide (often with the help of the teacher, particularly in the early stages) what their next learning will be.
Characteristics of assessment for learning
1. It is embedded in a view of teaching and learning of which it is an essential part.
2. It involves sharing learning goals with pupils and giving feedback that is timely and descriptive.
3. It aims to help pupils to know and to recognise the standards for which they are aiming.
4. It involves both teacher and pupils reviewing and reflecting on assessment data.
5. It is recognised as central to classroom practice and part of effective planning for teachers.
6. It takes account of the importance of learner motivation.
Characteristics
Continuous assessment is cumulative
The final grade awarded a student at the end of the term or year is an accumulation of all
the attainments throughout the term or year.
Continuous assessment is comprehensive
Opportunities are provided for the assessment of the total personality of the student.
Continuous assessment is diagnostic
Continuous assessment involves a constant and continual monitoring of a student’s
performance and achievement. This process enables each student’s strengths and
weaknesses to be identified.
Continuous assessment is formative
Continuous assessment allows for immediate and constant feedback to be provided to the
student on his performance.
Continuous assessment is guidance-oriented
Information gathered about the student is used to guide and counsel him or her on educational and personal development.
Continuous assessment is systematic
It operates according to a carefully planned sequence, with assessments scheduled in advance.
2. It enables the classroom teacher as well as the school administration to be actively and more
meaningfully involved in the assessment of the students throughout the period of teaching
and learning.
3. It enables the measurement of the three important domains in the taxonomy of educational objectives, that is, the cognitive, affective and psychomotor domains.
4. It helps to minimise the students’ fears and anxieties about failure in the examinations.
6. Constant feedback is given and this provides the groundwork for teachers to engage in
diagnostic teaching.
8. Parents are provided with better and clearer pictures of their wards’ performance and
achievement in school over a period of time and learning experience.
4. Continuous assessment, especially in the first and second cycle levels, means less
dependence on an external examining body. This implies that the uniformity that goes with
external written examinations, in the form of standard test items and scoring, is reduced to
some extent. The fate of the individual student lies more in the hands of the classroom
teacher. This situation generates fears, doubts and apprehensions in the minds of the public
about the degree of fairness in assessing the achievement of students.
5. In the first and second cycle institutions, certificates obtained are based on performances and
achievements in external examinations in Ghana. This situation enables the certificates to
have credibility, since efforts are made to maintain standards across years and test items.
However, with the continuous assessment, if schools award certificates based on the
attainments of their own students, standards will vary from school to school as well as
certificates. The credibility of certificates becomes doubtful in most cases.
6. Another problem is that of supervision. Continuous assessment requires co-operation and co-
ordination at different levels. Close supervision is needed at all levels. Unfortunately,
supervisors in most cases who are heads of institutions are already laden with loads of work.
They are therefore not effective in their supervisory roles.
The three assessments give a total score of 100, which is scaled down to 30% as the internal
mark for each pupil. The end of term examination is given 70%.
At the end of the junior and senior secondary schools, all the scores a pupil obtains are scaled to
30% and forwarded to the WAEC where 70% is obtained for external assessment.
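A minimal sketch of the scaling just described follows (Python; the function names and sample scores are my own illustrations, not an official formula):

```python
# Sketch of scaling a continuous-assessment total (out of 100) down to the
# 30% internal component and adding an external mark already out of 70.

def scale_internal(ca_total_out_of_100: float) -> float:
    """Scale a continuous-assessment total out of 100 down to 30."""
    return ca_total_out_of_100 / 100 * 30

def final_mark(ca_total_out_of_100: float, external_out_of_70: float) -> float:
    """Combine the scaled internal mark with the external mark."""
    return scale_internal(ca_total_out_of_100) + external_out_of_70

# A pupil with 85/100 in continuous assessment and 56/70 externally:
print(final_mark(85, 56))  # 25.5 + 56 = 81.5
```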
For the policy to be successful, teachers are expected to perform the following roles.
1. The teacher must accept the philosophy of continuous assessment.
2. The teacher needs to be knowledgeable about continuous assessment.
3. At the beginning of each academic year and term (or semester), the teacher must make a
timetable for the assessments to be made.
4. The teacher must break the learning programme of the period of instruction into smaller,
specific and well-defined units.
5. The teacher must assess the learning outcomes and performances at the end of each unit of
instruction.
6. The teacher must spread the assessment over all areas of student’s behaviour. These are the
cognitive, affective and psychomotor domains.
7. The teacher must formulate measurable, specific and attainable instructional objectives for
each unit for instruction. This helps him to make his teaching more effective and meaningful.
8. The teacher must provide constant feedback. Class assignments and exercises, projects, tests and home work must be promptly scored and returned to the students.
9. The teacher must record all the assessment of the student in all the areas of learning and
instruction in the appropriate records. This must be done promptly at the end of each
measurement. The records must be well kept and maintained.
10. The teacher must be involved in remedial and individualised teaching.
11. The teacher must also engage in guidance and counselling. He must identify the weaknesses
and strengths of students in the various areas of learning. He should then use the information
to guide and counsel the student for his full personal development and growth as well as
preparing the student for his future career.
12. The teacher must engage in constant evaluation of himself and of the continuous assessment
programme. The scores obtained from the various assessments should be used to measure his
own performance and the effectiveness of his methods and techniques. He must also evaluate
the success of the programme regularly to identify the lapses and improve upon them.
SCHOOL-BASED ASSESSMENT (SBA)
School-Based Assessment (SBA) is a mode of internal assessment carried out in schools. It is in use in Ghana as part of the new Educational Reforms starting September 2008.
SBA is a very effective system for teaching and learning if carried out properly.
The new SBA system is designed to provide schools with an internal assessment system that will:
Standardize the practice of internal school-based assessment in all schools in the country.
Provide reduced assessment tasks for each of the primary school subjects.
Provide teachers with guidelines for constructing assessment items/questions and other
assessment tasks.
Introduce standards of achievement in each subject and in each class of the school
system.
Provide guidance in marking and grading of test items/questions and other assessment
tasks.
Introduce a system of moderation that will ensure accuracy and reliability of teachers’
marks.
Provide teachers with advice on how to conduct remedial instruction on difficult areas of the syllabus in order to improve pupil performance.
The marks for the SBA should together constitute the School-Based Assessment component, marked out of 60 per cent. The emphasis is to improve students' learning by encouraging them to perform at their best.
End-of-month tests
Project
The SBA system will consist of 12 assessments a year instead of the 33 assessments in
the previous continuous assessment system. This will mean a reduction of about 64% in the assessment workload on teachers.
Task 1 will be administered as an individual test coming at the end of the first month of the term. The equivalent of Task 1 will be Task 5 (the first individual test in the second term). Task 2 will be administered as a Group Exercise and will consist of two or three instructional objectives that the teacher considers difficult to teach and learn. The selected objectives could also be those objectives considered very important and which therefore need pupils to put in more practice. Task 2 will be administered at the end of the second month in the term.
Task 3 (also Tasks 7 and 11 for the second and third terms respectively) will also be administered as an individual test under the supervision of the class teacher at the end of the third month of the term.
Task 4 (and also Task 8 and Task 12) will be a project to be undertaken throughout the
term and submitted at the end of the term. Schools will be supplied with 9 project topics
divided into three topics for each term. A pupil is expected to select one project topic for
each term. Projects for the second term will be undertaken by teams of pupils as Group
Projects. Projects are intended to encourage pupils to apply knowledge and skills
acquired in the term to write an analytic or investigative paper, write a poem (as may be required in English and Ghanaian Languages), or use science and mathematics to solve a problem.
Apart from the SBA, teachers are expected to use class exercises and home work as processes for
continually evaluating pupils’ class performance, and as a means for encouraging improvements
in learning performance.
End-of-Term Examination
The end-of-term examination is a summative assessment system and should consist of a sample
of the knowledge and skills pupils have acquired in the term. The end-of-term test for Term 3
should be composed of items/questions based on the specific objectives studied over the three
terms, using a different weighting system such as to reflect the importance of the work done in
each term in appropriate proportions. For example, a teacher may build an end-of-Term 3 test in such a way that it would consist of 20% of the objectives studied in Term 1, 20% of the objectives studied in Term 2, and 60% of the objectives studied in Term 3.
The new SBA system is important for raising pupils’ school performance. For this reason, the 60
marks for the SBA will be scaled to 50 in schools. The total marks for the end of term test will
also be scaled to 50 before adding the SBA marks and end-of-term examination marks to
determine pupils’ end of term results. The SBA and the end-of-term test marks will hence be
combined in equal proportions of 50:50. The equal proportions will affect only assessment in the
school system. It will not affect the SBA mark proportion of 30% used by WAEC for the conduct of external examinations.
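A small sketch of the 50:50 combination described above (Python; the sample marks and function name are illustrative assumptions):

```python
# Hedged sketch of combining SBA and end-of-term marks in equal proportions.
# Only the 60 -> 50 scaling, the exam -> 50 scaling and the 50:50 weighting
# come from the text above; the sample marks are invented.

def end_of_term_result(sba_out_of_60: float, exam_score: float, exam_max: float) -> float:
    sba_scaled = sba_out_of_60 / 60 * 50      # SBA marked out of 60, scaled to 50
    exam_scaled = exam_score / exam_max * 50  # end-of-term exam scaled to 50
    return sba_scaled + exam_scaled

# Example: SBA of 48/60 and an exam score of 70/100
print(end_of_term_result(48, 70, 100))  # 40.0 + 35.0 = 75.0
```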
To improve assessment and grading and also introduce uniformity in schools, it is recommended
that schools adopt the following grade boundaries for assigning grades:
The grading system presented above shows the letter grade system and equivalent grade
boundaries. In assigning grades to pupils’ test results, or any form of evaluation, you may apply
the above grade boundaries and the descriptors. The descriptors (Excellent, Very Good etc)
indicate the meaning of each grade. For instance, the grade boundary for “Excellent” consists of
scores between 80 - 100. Writing “80%” for instance, without writing the meaning of the grade,
or the descriptor for the grade i.e. “Excellent”, does not provide the pupil with enough
information to evaluate his/her performance in the assessment. You therefore have to write the
meaning of the grade alongside the score you write. Apart from the score and the grade
descriptor, it will be important also to write a short diagnosis of the points the pupil should
consider in order to do better in future tests etc. Comments such as the following may also be
Keep it up
Has improved
Could do better
Hardworking
Note that the grade boundaries above are also referred to as grade cut-off scores. When you
adopt a fixed cut-off score grading system as in this example, you are using the criterion-
referenced grading system. By this system a pupil must make a specified score to earn the
appropriate grade. This system of grading challenges pupils to study harder to earn better grades.
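A minimal sketch of criterion-referenced (fixed cut-off) grading follows. Only the 80-100 "Excellent" boundary appears in the text above; the remaining cut-offs in the sketch are illustrative assumptions, not the official scheme.

```python
# Criterion-referenced grading sketch: a pupil must reach a fixed cut-off
# score to earn each grade. Boundaries below 80 are assumed for illustration.

GRADE_BOUNDARIES = [
    (80, "A", "Excellent"),   # 80-100, from the text above
    (70, "B", "Very Good"),   # assumed
    (60, "C", "Good"),        # assumed
    (50, "D", "Credit"),      # assumed
    (0,  "E", "Fail"),        # assumed
]

def assign_grade(score: float) -> str:
    """Return the score together with its letter grade and descriptor."""
    for cutoff, letter, descriptor in GRADE_BOUNDARIES:
        if score >= cutoff:
            return f"{score} - {letter} ({descriptor})"
    return f"{score} - ungraded"

print(assign_grade(83))  # 83 - A (Excellent)
print(assign_grade(47))  # 47 - E (Fail)
```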
Educational Goals
An educational goal is a very general statement of what students will know and be
able to do.
They are those human activities which can be acquired through learning and experience.
In the school system, educational goals are listed as defining the mission of the
system.
Some examples of educational goals require that students will be able to:
Educational Outcome
In very simple terms, educational outcomes are the products or end results of learning
experiences.
ii. Understanding
v. General skills
vi. Attitudes
vii. Interests
ix. Adjustment
These outcomes, similar to educational goals, are broad. Each outcome can thus take several forms.
Learning Outcome
Learning outcomes reflect a nation's concern with the level of knowledge acquisition at the time a programme of study is completed. They are statements of what students are able to demonstrate at the end of instruction to show that they have learned what was expected of them. Example: 'By the end of the lesson, students should be able to define assessment.'
Behavioural objectives: Statements that specify what observable performance the learner should be able to exhibit after instruction.
Learning objectives: These specify what the teacher would like the students to do, value, or feel at the completion of an instructional segment.
1. Learning objectives make the general planning for an assessment procedure easier by clarifying the intended outcomes of instruction.
3. Evaluating an existing assessment instrument becomes easier when specific outcomes are
known.
4. They help to judge the content relevance of an assessment procedure. Specific learning outcomes fall under three domains, namely:
Cognitive
Affective
Psychomotor
Cognitive domain objectives: produce outcomes that focus on knowledge and intellectual abilities.
Affective domain objectives: produce outcomes that focus on feelings, interests, attitudes, dispositions and values.
Psychomotor domain objectives: produce outcomes that focus on motor skills and perceptual
processes.
i. Knowledge. This involves the recall of specific facts, methods and processes. It is the lowest level of learning outcome in the cognitive domain. Verbs such as define, identify, label etc. call for the student's knowledge.
ii. Comprehension. It is the ability to grasp the meaning of material. It is shown by translating material from one form to another, interpreting it, and estimating trends.
iii. Application. This refers to the ability to use learned material in new and concrete
situations. This includes the application of such things as rules, methods, concepts, principles, laws and theories.
iv. Analysis. This is the ability to break down material into its component parts so that its organizational structure may be understood. This includes the identification of the parts, analysis of the relationships between parts, etc. Illustrative verbs include break down, differentiate, distinguish.
v. Synthesis. This refers to the ability to put parts together to form a new whole. This may involve producing a unique communication, a plan of operations, or a set of abstract relations.
vi. Evaluation. This is the ability to judge the value of material (e.g. novel, poem, and
research report) for a given purpose. The judgments are based on definite criteria.
1. Remember: Retrieve relevant knowledge from long-term memory.
Recognizing (identifying)
Recalling (retrieving)
2. Understand: Construct meaning from instructional messages, including oral, written and graphic communication.
Interpreting, exemplifying, classifying, summarizing, inferring, comparing, explaining
3. Apply: Carry out or use a procedure in a given situation.
Executing (carrying out)
Implementing (using)
4. Analyse: Break material into its constituent parts and determine how the parts relate to one another and to an overall structure or purpose.
Differentiating (discriminating)
Organizing (structuring)
Attributing (deconstructing)
5. Evaluate: Make judgments based on criteria and standards.
Checking (monitoring)
Critiquing (judging)
6. Create: Put elements together to form a coherent whole, or reorganize elements into a new pattern or structure.
Generating (hypothesizing)
Planning (designing)
Producing (constructing)
Continuation Of Taxonomies Of Educational Objectives
Quellmalz (1985) also proposed a cognitive taxonomy, which has five categories. These are:
i. Recall. This requires that students recognize or remember key facts, definitions, concepts, rules and principles.
ii. Analysis. Students divide a whole into component elements, e.g. What are the parts of this story?
iii. Comparison. Students recognize or explain similarities and differences, e.g. How was this story like the last one?
iv. Inference. Students are either given a generalization and required to recognize its supporting evidence or details, or given the details and required to come up with the generalization, e.g. What might be a good title for this story?
v. Evaluation. Students judge the quality, credibility or worth of something using established criteria.
This was developed by David Krathwohl, Benjamin Bloom and Masia in 1964. They classified affective outcomes into five categories.
i. Receiving (attending): This refers to the student's/pupil's willingness to attend to particular stimuli; it relates closely to the classroom activities. Illustrative verbs that are used include asks, chooses, follows, listens.
ii. Responding: At this level, the student/pupil does not only attend to particular stimuli but also reacts to it in some way, e.g. completes assigned homework, obeys school rules and regulations. Illustrative verbs that are used include answers, assists, complies, performs.
iii. Valuing: It is concerned with the value, worth or importance a student/pupil attaches to a particular object, phenomenon or behaviour. Examples of instructional objectives include: shows concern for the welfare of others, appreciates the role of science in everyday life. Illustrative verbs include completes, describes, initiates, justifies.
iv. Organization: It is the ability to bring together different values, resolving conflicts between them, and beginning to build an internally consistent value system. Examples of instructional objectives include: understands and accepts his/her own strengths and weaknesses. Illustrative verbs include arranges, combines, organizes.
v. Characterization by a value or value complex: This is the highest level of the affective domain. At this level, the individual student/pupil has a value system that has controlled his/her behaviour for a sufficiently long time for him/her to have developed a characteristic lifestyle. Illustrative verbs include acts, displays, practises.
Simpson (1972) and Harrow (1972) developed categories in this domain. Simpson produced 7 categories.
Simpson's categories
1. Perception. This is the lowest level. It is the ability to use the sense organs to obtain cues that guide motor activity. For example, relating the sound of drums to the appropriate dance steps. Illustrative verbs include chooses, detects, relates.
2. Set. This refers to readiness to take a particular type of action. For example, a goalkeeper taking position to save a penalty kick in a soccer game. Illustrative verbs include begin, move, react, show.
3. Guided response. It involves the early stages in learning a complex skill. For example, starting a car while beginning to learn how to drive. Illustrative verbs include copies, follows, reproduces, traces.
4. Mechanism. At this level, learned responses have become habitual and the movements are performed with confidence and proficiency. For example, typing, operating a video recorder. Illustrative verbs include sketch, fix, fasten, dissect, assemble.
5. Complex overt response. This involves the skilful performance of motor acts requiring complex movement patterns, performed quickly, smoothly and accurately. For example, skilfully driving a car in traffic.
6. Adaptation. This is the ability to modify well-developed skills to fit special requirements or situations. For example, modifying piano rhythms to suit local songs. Illustrative verbs include adapt, alter, change, reorganize.
7. Origination. This is the highest level. It involves the ability to create new movement patterns to fit a particular situation or problem. Creativity and originality are emphasized. For example, designing new computer software or creating a new musical dance. Illustrative verbs include arrange, create, design, originate.
Harrow's categories
1. Reflex movements are actions elicited without learning in response to some stimuli.
2. Basic fundamental movements are inherent movement patterns which are formed by combining reflex movements and are the basis for complex skilled movements. Examples are walking, running and jumping.
3. Perceptual abilities refer to the interpretation of various stimuli that enables one to make adjustments to the environment. Examples are coordinated movements such as catching a ball.
4. Physical activities require endurance, strength, vigor, and agility, which produce a sound,
efficiently functioning body. Examples are: all activities which require a) strenuous effort for
long periods of time; b) muscular exertion; c) a quick, wide range of motion at the hip joints;
5. Skilled movements are the result of the acquisition of a degree of efficiency when
performing a complex task. Examples are: all skilled activities obvious in sports, recreation,
and dance.
6. Nondiscursive communication is communication through body movements such as gestures and facial expressions, efficiently executed in skilled dance movements and choreographies.
UNIT 3
VALIDITY
Nitko (1996, p. 36) defined validity as the “soundness of the interpretations and use of students’
assessment results”. Validity emphasizes the interpretations and use of the results and not the
test instrument.
Evidence needs to be provided that the interpretations and use are appropriate.
NATURE OF VALIDITY
In using the term validity in relation to testing and assessment, five points have to be
borne in mind.
Validity refers to the appropriateness of the interpretations of the results of an assessment procedure for a group of individuals. It does not refer to the procedure or instrument itself.
Validity is a matter of degree. Assessment results may have high, moderate or low validity.
Results have different degrees of validity for different purposes and for different situations.
Validity involves an overall evaluative judgment. Several types of validity evidence should be studied and combined in making the judgment.
There are four principles that help a test user/giver to decide the degree to which his/her assessment results are valid.
1. The interpretations (meanings) given to students' assessment results are valid only to the degree that evidence can be produced to support their appropriateness.
2. The uses made of assessment results are valid only to the degree that evidence can be produced to support their appropriateness and correctness.
3. The interpretations and uses of assessment results are valid only when the educational and social values implied by them are appropriate.
4. The interpretations and uses made of assessment results are valid only when the
consequences of these interpretations and uses are consistent with appropriate values.
1. Content-related evidence
This type of evidence refers to the content representativeness and relevance of tasks
or items on an instrument. The tasks must match the content and outcomes defined in the test user's domain definition, particularly when standardized tests are used. Three questions guide the judgment:
i. How well do the assessment tasks represent the domain of important content?
ii. How well do the assessment tasks represent the curriculum as defined?
iii. How well do the assessment tasks reflect current thinking about what should be taught
and assessed?
To obtain answers for the questions, a description of the curriculum and content to be learned (or
learned) is obtained. Each assessment task is checked to see if it matches important content and
learning outcomes. Each assessment task is rated for its relevance, importance, accuracy and
meaningfulness.
The match may also be checked against the table of specification.
2. Criterion-related evidence
This type of evidence pertains to the empirical technique of studying the relationship
between the test scores or some other measures (predictors) and some independent
external measures (criteria) such as intelligence test scores and university grade point
average.
Criterion-related evidence answers the question, 'How well can the results of an assessment be used to infer or predict an individual's standing on one or more outcomes other than the assessment procedure itself?' The outcome is called the criterion.
There are two types of criterion-related evidence. These are concurrent validity and
predictive validity.
Concurrent validity evidence refers to the extent to which individuals' current status on a criterion can be estimated from their current performance on the assessment.
For concurrent validity, data are collected at approximately the same time and the purpose is to substitute the assessment result for the score of a related variable, e.g. using a class test score in place of a standardized test score obtained at about the same time.
For predictive validity, data are collected at different times. Scores on the predictor
variable are collected prior to the scores on the criterion variable. The purpose is to
predict the future performance on a criterion variable, e.g. using WASSCE results to predict university grade point average. The degree of relationship is found by correlating the assessment result and the criterion. The correlation coefficient is a statistical index that quantifies the degree of relationship between the scores from one assessment and the scores from another. This coefficient is often called the validity coefficient and takes values from –1.0 to +1.0.
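As a sketch of how a validity coefficient is obtained, the Python fragment below correlates invented predictor scores with invented criterion scores; all data and variable names are assumptions for illustration only.

```python
# Computing a validity coefficient: correlate assessment (predictor) scores
# with criterion scores. All data below are invented for illustration.
from statistics import correlation  # available from Python 3.10

predictor = [65, 70, 58, 80, 74, 61]        # e.g. entrance examination scores
criterion = [2.8, 3.1, 2.5, 3.6, 3.2, 2.6]  # e.g. later grade point averages

r = correlation(predictor, criterion)  # Pearson r, between -1.0 and +1.0
print(round(r, 2))
```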
3. Construct-related evidence: This type of evidence refers to how well the assessment results can be interpreted as measuring a psychological construct such as intelligence, creativity or anxiety.
1. Unclear directions. Validity is reduced if students do not clearly understand how to respond to the items, how to record the responses, or how much time is available.
2. Reading vocabulary and sentence structure that are too difficult tend to reduce validity. The test then measures reading ability rather than the intended learning outcome.
3. Ambiguous statements in assessment tasks and items. This confuses students and makes misinterpretation likely, which lowers validity.
4. Inadequate time limits. This does not provide students with enough time to respond and thus lowers validity.
5. Inappropriate level of difficulty of the test items. Items that are too easy or too difficult do not discriminate among students and this lowers validity.
6. Poorly constructed test items. These items may provide unintentional clues which may
cause students to perform above their actual level of achievement. This lowers validity.
7. Test items being inappropriate for the outcomes being measured lowers validity.
8. Test being too short. If a test is too short, it does not provide a representative sample of the behaviours of interest, and this lowers validity.
9. Improper arrangement of items. Placing difficult items early in the test may throw some students off and cause them to become unstable, thereby performing below their level of achievement.
10. Identifiable pattern of answers. Placing the answers to tests like multiple-choice and true/false types in an easily identifiable pattern enables students to guess the correct answers more easily and this lowers validity.
11. Cheating. When students cheat by copying answers or helping their friends with answers to items, their scores no longer represent their own achievement, and this lowers validity.
12. Unreliable scoring. Inconsistent scoring of test items, especially essay tests, lowers reliability and in turn lowers validity.
13. Student emotional disturbances. These interfere with their performance thus reducing
validity.
14. Fear of the assessment situation. Students can be frightened by the assessment situation and are unable to perform normally. This reduces their actual level of performance and lowers validity.
RELIABILITY
Definition
Reliability is the degree of consistency of assessment results: the extent to which the same results are obtained when the same tasks are completed on different occasions, or when different but equivalent tasks are completed on the same or different occasions.
NATURE OF RELIABILITY
Reliability refers to the results obtained with an assessment instrument and not to the
instrument itself.
Reliability is expressed as a reliability coefficient, defined as a correlation coefficient that indicates the degree of relationship between two sets of scores intended to be measures of the same characteristic. It ranges from 0.0 to 1.0.
Definition of terms:
Observed score (X): The actual score a student obtains on a single administration of an assessment.
Error score (E): The portion of the observed score caused by chance factors unrelated to the characteristic being measured.
True score (T): The difference between the obtained and the error scores. It is the portion of the observed score that is not affected by error. An estimate of the true score of a student is the mean score obtained after repeated assessments under the same conditions.
These are related by the equation X = T + E.
Reliability can be defined theoretically as the ratio of the true score variance to the observed score variance: r_xx = s_t² / s_x², where s_t is the standard deviation of the true scores and s_x the standard deviation of the obtained scores.
The standard error of measurement (SEM) is given by SEM = s_x √(1 − r_xx). For example, given that r_xx = 0.8 and s_x = 4.0, SEM = 4.0 × √0.2 ≈ 1.8.
The SEM estimates the amount by which a student's obtained score is likely to deviate from his/her true score. For example, SEM = 4 indicates that a student's obtained score lies within about 4 points of the true score. An obtained score of 75 therefore suggests a true score between 71 and 79; the interval 71-79 provides a confidence band for interpreting the obtained score. A small standard error of measurement indicates high reliability, providing greater confidence that the obtained score is close to the true score.
Reliability coefficient: a correlation coefficient that indicates the degree of relationship between two sets of scores intended to be measures of the same characteristic (e.g. correlation between scores assigned by two different raters or scores obtained from administration of two forms of a test).
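The small sketch below computes the SEM and a confidence band from the quantities defined above, using the text's own values r_xx = 0.8 and s_x = 4.0 (Python; illustrative only):

```python
# SEM = s_x * sqrt(1 - r_xx); a confidence band is score +/- SEM.
import math

def sem(s_x: float, r_xx: float) -> float:
    return s_x * math.sqrt(1 - r_xx)

def confidence_band(obtained: float, s_x: float, r_xx: float) -> tuple:
    e = sem(s_x, r_xx)
    return (obtained - e, obtained + e)

print(round(sem(4.0, 0.8), 2))        # 1.79
print(confidence_band(75, 4.0, 0.8))  # about (73.2, 76.8)
```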
1. Test-retest method. With this method, the same test is given to a group of students twice on
different occasions ranging from several minutes to years. The scores on the two
administrations are correlated (compared) and the result is the estimate of the reliability of
the test. The time interval should be reasonable, neither too short nor too long. This is a measure of the stability of scores over time.
2. Equivalent forms method. Two test forms, which are alternate or parallel with the same
content and level of difficulty for each item, are administered to the same group of students.
The forms may be given on the same or nearly the same occasion, or a time interval may elapse before the second form is given. The scores on the two administrations are correlated to obtain the estimate of reliability (equivalence).
3. Split-half method. This is a measure of internal consistency. A single test is given to the
students. The test is then divided into two halves for scoring. The two scores for each
student are correlated to obtain the estimate of reliability. The test can be split into two
halves in several ways. These include using (i) odd-even numbered items, and (ii) first half-
second half.
4. Inter-rater reliability. Two raters (scorers) each score a student’s paper. The two scores for
all the students are correlated. This estimate of reliability is called scorer reliability or inter-
rater reliability. It is an index of the extent to which the raters were consistent in rating the
same students.
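As an illustration of the split-half method, the sketch below correlates odd- and even-half scores and then applies the standard Spearman-Brown correction (a step the text above does not name) to estimate full-test reliability; all scores are invented.

```python
# Split-half reliability sketch: correlate the two half-test scores, then
# step up to full length with the Spearman-Brown formula. Data are invented.
from statistics import correlation

odd_half  = [8, 6, 9, 5, 7, 4]   # each student's total on odd-numbered items
even_half = [7, 6, 8, 5, 8, 3]   # the same students' totals on even items

r_half = correlation(odd_half, even_half)
r_full = (2 * r_half) / (1 + r_half)   # Spearman-Brown correction
print(round(r_full, 2))
```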
1. Test length. Longer tests give more reliable scores. A test consisting of 40 items will give a more reliable score than a test consisting of 20 items.
2. Group variability. The more heterogeneous the group, the higher the reliability. The wider the spread of scores, the higher the reliability estimate.
3. Difficulty of items. Too difficult or too easy items produce little variation in the test scores, and this restricted spread lowers reliability.
4. Scoring objectivity. Subjectively scored items result in lower reliability. More objectively scored assessment results are more reliable. For subjectively-scored items, multiple markers
are preferred.
5. Speed. Tests, where most students do not complete the items due to inadequate allocation of
time, result in lower reliability. Sufficient time should be provided to students to respond to
the items.
6. Sole marking. Using multiple markers improves the reliability of the assessment results. A single person grading may lead to low reliability, especially of essay tests, term papers and projects.
7. Test administration conditions. Where testing conditions do not follow the stated regulations and practices, students' scores may not represent their actual level of performance and this tends to reduce reliability. In the test-retest method, for example, differing conditions on the two occasions lower the reliability estimate.
Definition
Achievement tests are tests that measure the extent of present knowledge and skills. In achievement testing, test takers are given the opportunity to demonstrate their acquired knowledge and skills. Achievement tests may be standardized or teacher-made.
The major difference between these two types of tests is that standardized achievement tests
are carefully constructed by test experts with specific directions for administering and scoring
the tests. This makes it possible for standardized achievement tests to be administered to students in different schools and classrooms under the same conditions.
In standardized tests, specific instructions are provided for test administration and scoring.
Directions are so precise and uniform that the procedures are standard for different users
of the test.
The test items are developed by test experts and specialists who follow well-formulated procedures and specifications.
The tests often have high quality. Reliability is often over 0.90.
They use test norms which are based on national samples of students in the classes/forms for which the test is intended.
Equivalent and comparable forms of the test are usually provided and administered.
A test manual is available as a guide for test administration and scoring. It provides
information for evaluating the test for technical quality and interpretation and use of the
results.
Teacher-made tests, by contrast, are constructed by classroom teachers for specific uses in their own classrooms.
The quality of the test is often unknown but usually lower than standardized tests.
There are eight (8) steps in the construction of a good classroom test.
Step 1. Determine the purpose of the test. The basic question to answer is, "Why am I testing?" The test must be related to the instructional objectives.
The teacher has to answer other questions such as ‘Why is the test being given at this
time in the course?’, ‘Who will take the test?’, ‘Have the test takers been informed?’,
Step 2. Determine the item format to use. Objective-type tests include multiple-choice, short-answer, matching and true and false.
The choice of format must be appropriate for testing particular topics and objectives. Mehrens and Lehmann (1991) mentioned 8 factors to consider in the choice of the item format:
(1) the purpose of the test, (2) the time available to prepare and score the test,
(3) the number of students to be tested, (4) skill to be tested, (5) difficulty desired, (6) physical
facilities like reproduction materials, (7) age of pupils, (8) skills in writing the different types of
items.
Step 3. Determine what is to be tested.
The teacher asks himself or herself the question, ‘What is it that I wish to measure?’. The
teacher has to determine what chapters or units the test will cover as well as what knowledge,
skills and attitudes to measure. A test plan called table of specifications or blue print must be
made. The specification table matches the course content with the instructional objectives.
To prepare the table, specific topics and sub-topics covered during the instructional period
are listed. The major course objectives are also specified and the instructional objectives
defined. The total number of test items is then distributed among the course content and the objectives, as in the example below.
Content \ Behaviour: Knowledge | Comprehension | Application | Analysis | Synthesis | Evaluation | Total
Sets 1 1 1 3
Indices 1 1 1 3
Angles 1 1 1 3
Polygons 2 1 1 4
Factorization 2 1 3
Number plane 1 1 1 1 4
Total 4 7 5 3 1 20
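Since the flattened table above does not fix exactly which behaviour column each figure belongs to, the sketch below assumes a cell placement for illustration and simply checks that the planned total of 20 items is preserved.

```python
# Checking a table of specifications. Cell placements are assumed (the table
# above does not fix them); only the row totals and the overall total of 20
# items come from the text.

spec = {
    "Sets":          {"Knowledge": 1, "Comprehension": 1, "Application": 1},
    "Indices":       {"Knowledge": 1, "Comprehension": 1, "Application": 1},
    "Angles":        {"Comprehension": 1, "Application": 1, "Analysis": 1},
    "Polygons":      {"Comprehension": 2, "Application": 1, "Analysis": 1},
    "Factorization": {"Comprehension": 2, "Analysis": 1},
    "Number plane":  {"Knowledge": 1, "Comprehension": 1,
                      "Application": 1, "Synthesis": 1},
}

total_items = sum(sum(cells.values()) for cells in spec.values())
print(total_items)  # should equal the planned 20 items
```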
1. It makes sure that justice is done to all the topics covered in the course.
5. It helps students to determine the content and behavioural areas where they have difficulty.
Teachers can also determine areas where the class has difficulty.
In writing the individual items (questions), the following general guidelines must be considered.
1. Keep the table of specifications before you and continually refer to it as you write the
items.
3. Formulate well-defined items that are not vague or ambiguous, written in simple and clear language.
8. The task to be performed and the type of answers required should be clearly defined.
10. Write the items and the key as soon as possible after the material has been taught.
12. Write the items in advance of the test date to permit reviews and editing.
Carefully examine each item at least a week after writing the item.
Items that are ambiguous, those poorly constructed, and items that do not match the intended outcomes should be revised or discarded.
Check the length of the test (i.e. the number of items) against the purpose, the kinds of test items used and the time available.
Prepare a scoring key or marking scheme while the items are fresh in your mind.
Give clear and concise directions for the entire test as well as sections of the test.
Directions must include number of items to respond to, how the answers will be written,
where the answers will be written, amount of time available, and credits for orderly presentation of work.
Before administration, the test should be evaluated against five criteria: clarity, validity, practicality, efficiency and fairness.
Practicality: Can the students, under the conditions and time available, complete it effectively?
Efficiency: Is this the best way to test for the desired knowledge, skill or attitude?
There are two major types of test items. These are the essay-type tests and the objective-type tests.
An objective test requires a respondent to provide a brief response which is usually not more than a sentence long. The tests normally consist of a large number of items and the responses are scored objectively.
TYPES OF OBJECTIVE TESTS
The selection type. Examples are multiple-choice type, true and false type and matching
type.
The supply type. Examples are completion, fill-in-the-blanks and short-answer.
MULTIPLE-CHOICE TESTS
Description
Examples:
Single correct response
A. 0.394
B. 0.0394
C. 0.0393
D. 0.039
In which one of the following sites would you, as a community health worker, advise a community to dispose of refuse?
A. A compost pit
B. An abandoned well
C. An incinerator
D. An uncultivated land
I. Arrest hemorrhage
II. Bath the patient
III. Immobilize injured bone
A. I only
B. II only
C. I and II
D. I and III
E. I, II and III
1. The central issue of the item should be in the stem. It should be concise, easy to read and
understand.
The following are examples of poor and good items
Poor Good
Example
In constructing multiple-choice test items, options to an item should be….
A. arranged in horizontally.
B. copied directly from class notes or textbooks.
C. must have a discernible pattern of responses.
D. homogeneous in content.
Poor
Which is the best definition of a contour line?
A. A line on a map joining places of equal barometric pressure.
B. A line on a map joining places of equal earthquake intensity.
C. A line on a map joining places of equal height.
D. A line on a map joining places of equal mean temperature.
E. A line on a map joining places of equal rainfall.
Good
A line on a map joining places of equal pressure is called an
A. isobar
B. isobront
C. isochasm
D. isogeotherm
E. isotherm
7. Specific determiners which are clues to the best/correct option should be avoided.
Poor Good
The first woman cosmonaut is a The first woman to go into space is a/an
A. American A. American
B. Englishman B. British
C. Irish C. French
D. Italian D. Italian
E. Russian E. Russian
In the poor example, the article, a, gives a clue that the correct option is
Russian. In addition, it is only Russians who use the term, cosmonaut. Also
Englishman does not belong to the group.
8. Vary the placement of the correct options. No discernible pattern of the correct/best
responses should be noticed.
9. Sentences should not be copied from textbooks, or from others’ (colleagues, friends etc) past
test items. Original items should be made. This builds capacity in item writing.
For example:
In constructing multiple-choice test items, options to an item should be
A. arranged horizontally.
B. copied from textbooks.
C. heterogeneous in content.
D. homogeneous in content.
10. Items measuring opinions should not be included. One option should clearly be correct or
the best.
Poor Good
The best Ghanaian medical doctor is The Ghanaian medical doctor famous
for his work on the sickle-cell disease is
A. Charlotte Gardiner A. F. I. D. Konotey-Ahulu
B. F. I. D. Konotey-Ahulu B. F. O. Acheampong
C. Mary Grant . C. K. G. Korsah
D. Mohamed Mustafa D. M. K. Mustafa
11. The responses must be itemized vertically and not horizontally.
Poor
In constructing multiple-choice test items, options to an item should be
A. arranged horizontally. B. copied directly from class notes or textbooks. C. must have a discernible pattern of responses. D. homogeneous in content.
Good
In constructing multiple-choice test items, options to an item should be
A. arranged horizontally
B. copied from textbooks.
C. heterogeneous in content.
D. homogeneous in content.
13. The responses must be parallel in form and in grammatical agreement with the stem, i.e. options should be about the same length.
Poor
In constructing multiple-choice test items, options to an item should be
A. arranged horizontally.
B. copied directly from class notes or textbooks.
C. in a discernible pattern of responses for easy identification.
D. homogeneous in content.
Good
In constructing multiple-choice test items, options to an item should be
A. arranged horizontally.
B. copied from textbooks.
C. heterogeneous in content.
D. homogeneous in content.
14. Each option must be distinct. Overlapping alternatives should be avoided.
Poor Good
In a healthy adult, the liver weighs about In a healthy adult, the liver
weighs between
A. 3.0kg A. 6.5 – 7.5kg
B. 2.5kg B. 4.5 – 6.0kg
C. 2.0kg C. 3.0 – 4.0kg
D. 1.5kg D. 1.0 – 2.5kg
15. Avoid using “all of the above” as an option but "None of the above” can be used sparingly.
It should be used only when an item is of the 'correct answer' type and not the 'best answer'
type.
Poor Good
The following are local signs and In administering intramuscular
symptoms of inflammation except injection, the needle is inserted into
the muscle at an angle of
A. rashes. A. 30°.
B. redness. B. 45°.
C. restoration of function. C. 60°.
D. sleeplessness. D. 90°.
E. None of the above. E. None of the above.
In the poor example, there are other signs and symptoms not included whereas in the good
example there is one and only one answer.
16. Stems and options should be stated positively. However, a negative stem could be used
sparingly and the word not should be emphasized either by underlining it or writing it in
capital form.
An example is:
Which of these insects has NOT been incriminated to transmit diseases?
A. Bed-bug
B. Blackfly
C. Body louse
D. Housefly
E. Tsetsefly
17. Create independent items. The answer to one item should not depend on the knowledge of
the answer to a previous item.
For example:
Item 1. The perimeter of a rectangular field is 60 metres. If one side is 20 metres
long, what is the width of the field?
A. 10 metres
B. 20 metres
C. 30 metres
D. 40 metres
E. 60 metres
Item 2. Find the length of the diagonal of the rectangular field in item 1 above.
A. 10.0 metres
B. 20.0 metres
C. 22.4 metres
D. 30.6 metres
E. 40.0 metres
18. The expected response should not be put at the beginning of the stem.
Poor
…………..…printing devices transmit output to a printer via radio waves.
a. Infrared
b. Laser
c. Bluetooth
d. Large Format
Good
What printing device transmits output to a printer via radio waves?
A. Bluetooth
B. Cartridge
C. Infrared
D. Laser
19. Check each item to make sure that there is only one correct or best response to the item.
Poor
Amsterdam is the capital city of _______.
A. Holland
B. Hungary
C. Luxemburg
D. Netherlands
Both Holland and Netherlands are acceptable answers, so the item has more than one correct response.
20. Be consistent in the number of options used. Four or five options are good for higher
education students.
21. Read through all items carefully to ensure that the answer to one question is not revealed in
another.
Example:
Q6. What do you use to test for sugar in urine?
A. Albustix
B. Clinitest
C. Ketostix
D. Uristix
Q20. This can also be used to test for sugar in urine if clinitest is
not available.
A. Albustix
B. Ictostix
C. Ketostix
D. Uristix
UNIT 7
TRUE AND FALSE TESTS
Description
A true and false test consists of a statement to be marked true or false. A respondent is expected to demonstrate his command of the material by indicating whether the given statement is true or false.
Types
1. Simple True-False: This consists of a single statement to be judged True or False.
Example:
Sir Gordon Guggisberg was the governor who built the Takoradi Harbour. True or False
2. Compound True-False: This consists of two choices, True and False, plus a conditional completion response.
Example: A nurse who values equality demonstrates honesty to patients. True or False.
3. Multiple True-False: This consists of a stem with three, four or five options, each of which the respondent marks True or False.
Example: The factors that reduce the reliability of classroom tests include: (each listed option would be marked True or False)
For Simple, Compound and Multiple types, statements must be definitely true or definitely
false.
Words like some, most, often, many, may are usually associated with true statements. All,
always, never, none are associated with false statements. These words must therefore be
avoided.
For the simple true-false type, approximately half (50%) of the total number of items should be false, because it is easier to construct statements that are true, and a student who marks everything 'true' should not score above chance. Avoid trick items that lead students into giving a wrong answer.
Poor: A patient took one tablet of a prescribed medicine and was healed in 24 hours.
Poor: Akropong Teacher Training College, built in 1900, is the first teacher training
institution in Ghana
State each item positively. A negative item could however be used, with the negative word 'not' emphasized by underlining or writing in capital letters. Double negatives should be avoided.
Examples: (1) Abedi Pele is the best Ghanaian footballer. True or False
True or False
Item 1 is ambiguous because best is relative while the trick in item 2 is the spelling of Aidoo.
Good: Dr. Kwame Nkrumah was the first President of Ghana. True or False.
Arrange the items such that the correct responses do not form a discernible pattern like TTTTFFFF or TFTFTF.
To avoid scoring problems, let students write the correct options in full.
Double-barreled statements should be avoided. These statements have one part true and one
part false.
Poor: The Bond of 1844, signed by Governor Commander Hill, declared the Northern Territories a British protectorate. True or False
The Bond was signed by Commander Hill (true) but did not achieve the stated purpose (false).
Avoid the use of unfamiliar vocabulary.
Poor: According to some politicians, the raison d’etre for capital punishment is retribution.
Good: According to some politicians, the justification for the existence of capital
punishment is retribution.
Avoid using extreme words such as all, no, always, never, the very most and the very least.
MATCHING-TYPE TESTS
Description
The matching type of objective test consists of two columns. The respondent is expected to
match (associate) an item in Column A with a choice in Column B on the basis of a well-defined
relationship.
Column A contains the premises and Column B contains the responses or options.
Example.
Match the vitamins in Column A with the diseases and conditions which a lack of the vitamin causes in Column B.
Column A Column B
1. Vitamin A a. Beriberi
2. Vitamin C b. Kwashiorkor
3. Vitamin D c. Pellagra
d. Poor eyesight
e. Rickets
f. Scurvy
1. Do not use perfect matching. Have more responses than premises. There should be at least two more responses than premises.
2. Arrange premises and responses alphabetically or sequentially. This reduces the amount of
unnecessary searching on the part of the person who knows the answer.
3. Column A (premises) should contain the list of longer phrases. The shorter items should be placed in Column B as responses.
4. Limit the number of items in each set. For each set, the number of premises should not be
more than six per set with the responses not more than ten.
Poor:
A B
1. The Battle of Dodowa a. 1824
d. 1826
e. Lord Listowel
f. Congo
Good:
Instruction: Select a river from list B to complete the description in list A. Write the letter of the answer in the space provided.
A B
d. Ubangi
e. Volta
f. Zambezi
6. Provide complete directions. Instructions should clearly show what the rules are and also the basis on which the matching is to be done.
8. Avoid clues (specific determiners) which indirectly reveal the correct option.
SHORT-ANSWER TESTS
Description
This type of objective test is also known as the supply, completion, or fill-in-the-blanks type.
Examples:
1. Scoring is easy.
4. Minimizes guessing.
5. They are best suited for measuring lower-level behaviours, especially knowledge, comprehension and application.
8. They make cheating more difficult and reduce its incidence as compared to multiple-choice tests.
1. They are difficult to construct so that the desired response is clearly indicated.
5. Difficult to score since more than one answer may have to be considered.
1. Keep the number of missing words or blank spaces low. Preferably use one blank per item.
2. Use original statements that are carefully constructed. Statements should not be copied directly from textbooks or class notes.
beginning.
Items should be so clearly written that the type of response required is clearly
recognized.
Poor: A specific disease in which acute glomerular damage occurs following distant infections, particularly with certain streptococci, and usually affects children and young adults, and whose clinical picture is commonly one of a dramatic onset of symptoms, is ___________.
Good: The disease in which acute glomerular damage occurs following distant infections is ___________________.
Missing words must be important ones. Avoid omitting trivial words to trick the
student. Only test for important facts and knowledge.
Poor: The ___ of the June 4, 1979 revolution in Ghana was Flt. Lt. J. J.
Rawlings.
Specify the degree of precision and the units of expression required in computational
problems.
Aim at providing items that belong to the correct answer type and not the best answer type.
Good: Radios and tape recorders are regarded as ______________ audio-visual aids.
Keep all blanks the same length, and in a column to the right of the question.
Example:
Description:
An essay type test is a test that gives freedom to the respondent to compose his own response
using his own words. The tests consist of relatively few items but each item demands an
extended response.
Types Of Essay-Tests
1. The restricted response type limits the respondent to a specified length and scope of the
response. For example, 'In not more than 200 words, discuss the causes of the 1948 riots.'
2. The extended response type does not limit the student in the form and scope of the
answer. For example, 'Discuss the factors that led to the overthrow of Dr. Kwame Nkrumah's government in Ghana in 1966.'
2. The items should be based on novel situations. Be original. Do not copy directly from
textbooks or colleagues/others’ past test items.
3. Test items should require the students to show adequate command of essential knowledge.
The items must be restricted to the measuring of higher mental processes such as
application, analysis, synthesis and evaluation.
(b) analysis:
A Form 1 student girl was severely and unfairly punished. Describe the feelings
such treatment aroused in her.
(c) synthesis:
You are the financial secretary of a society aimed at raising money to build a
fish pond in your community. Plan and describe a promotional campaign for
raising the money.
(d) evaluation:
Evaluate the function of the United Nations Organization as a promoter of
world peace.
4. The length of the response and the difficulty level of items should be adapted to
the maturity level of students (age and educational level).
7. Prepare a scoring key (marking scheme) at the time the item is prepared.
8. Establish a framework and specify the limits of the problem so that the student
knows exactly what to do.
9. Present the student with a problem which is carefully worded so that only ONE
interpretation is possible. The questions/items must not be ambiguous or vague.
10. Indicate the value of the question and the time to be spent in answering it.
11. Structure the test item such that it will elicit the type of behaviour you really want to
measure.
12. The test items must be based on the instructional objectives for each content unit.
13. Give preference to a large number of items that require brief answers.
14. Statements and sub-questions for each item should be clearly related.
15. Avoid words such as what, list and who as much as possible in essay-type tests.
Essay tests can be scored by using the analytic scoring rubrics (also known as the point-score
method) or holistic scoring rubrics (also called global-quality scaling or rating method).
Analytic Scoring
In analytic scoring, the main elements of the answer are identified and points awarded to each
element. This works best on restricted response essays.
Holistic Scoring
In holistic scoring, the major points expected in the answer are noted and the response is judged as a whole. Five (sometimes 4) levels of quality are described and marks awarded to each level. E.g.
A. Excellent 26 - 30
B. Very good 21 - 25
C. Good 16 - 20
D. Fairly good 11 - 15
E. Fail Below 11
Each response is read for a general impression of its adequacy as compared to the standard. The
general impression is then transformed into a numerical score.
• A: Excellent (26-30)
Gives an introduction
Discusses five reasons very well/in depth
Very few grammatical errors/expression
Gives conclusion
• B: Very Good (21-25)
Gives an introduction
Discusses five reasons but not too well or discusses four reasons very well
Few grammatical errors/expression
Gives conclusion
• C: Good (16-20)
Gives an introduction
Discusses five/four reasons but not in depth
OR discusses three reasons very well
Many grammatical errors/expressions
No conclusion
• D: Fairly Good (11-15)
No introduction
Discusses three reasons but not in depth
Many grammatical errors/expressions
No conclusion
• E: Fail (Below 11)
No introduction
Discusses one/two reasons but not in depth
Many grammatical errors/expressions
No conclusion
1. Prepare a form of scoring guide, either an analytic scoring rubric or a holistic scoring rubric.
2. Score (mark) tests without knowing whose paper is being scored. This reduces the halo effect. Different forms of identification could be used instead of names.
3. Grade the responses item by item and not script by script. Score all responses to each item
before going to the next item. This reduces the carryover effect. The carryover effect
occurs when the mark for a question is influenced by the performance on the previous
question.
4. Keep scores of previously graded items out of sight when evaluating the rest of the items.
5. Periodically rescore previously scored papers.
6. Before starting to score each set of items, the scripts should be shuffled.
7. Score the essay test when you are physically sound, mentally alert and in an
environment with very little or no distraction.
8. Constantly follow the scoring guide as you score. This reduces rater drift, which is the tendency either to stop paying attention to the scoring guide over time or to interpret it differently as time passes.
9. Score a particular question on all papers at one sitting. Break when fatigue sets in.
10. Arrange for an independent scoring of the responses or at least a sample of them where
grading decision is crucial.
11. Comments could be provided and errors corrected on the scripts for class tests to facilitate
learning.
12. Avoid being influenced by the first few papers read. These could make you either too harsh
or too lenient.
13. The mechanics of writing such as correct grammar usage, paragraphing, flow of
expression, quality of handwriting, orderly presentation of material and spelling
should be judged separately from the content.
UNIT 9
ADMINISTRATION OF CLASSROOM TESTS
1. Prepare students for the test. The following information is essential to students’ maximum
performance.
When the test will be given (date and time).
Under what conditions it will be given (timed or take-home, number of items, open
book or closed book, place of test).
The content areas it will cover (study questions or a list of learning targets).
Emphasis or weighting of content areas (value in points).
The kinds of items on the test (objective-types or essay-type tests).
How the assessment will be scored and graded.
The importance of the results of the test.
2. Students must be made aware of the rules and regulations covering the conduct of the test.
Penalties for malpractice such as cheating should be clearly spelt out and clearly adhered to.
3. Avoid giving tests immediately before or after a long vacation, holidays or other important
events where all students are actively involved physically or psychologically/emotionally.
4. Avoid giving tests when students would normally be doing something pleasant e.g. having
lunch etc.
5. The sitting arrangement must allow enough space so that pupils will not copy each other's work.
6. Adequate ventilation and lighting are expected in the testing room.
7. Provision must be made for extra answer sheets and writing materials.
8. Pupils should start the test promptly and stop on time.
9. Announcements must be made about the time at regular intervals. Time left for the
completion of the test should be written on the board where practicable.
10. Invigilators are expected to stand at a point where they can view all students. They should once in a while move among the pupils to check on malpractice. Such movements should not disturb the pupils. Invigilators must be vigilant.
11. Invigilators should not be allowed to read novels, newspapers, grade papers or receive calls
on mobile phones.
12. Threatening behaviours should be avoided by the invigilators. Speeches like ‘If
you don't write fast, you will fail’ are threatening. Pupils should be made to feel at ease.
13. The testing environment should be free from distractions. Interruptions within and outside
the classroom should be reduced. It is helpful to hang a “Do not DISTURB – TESTING IN
PROGRESS” sign at the door.
14. Test anxiety should be minimized.
15. Do not talk unnecessarily before letting students start working.
16. Avoid giving hints to students who ask about individual items. Where an item is ambiguous,
it should be clarified for the entire group.
17. Expect and prepare for emergencies. Emergencies might include shortages of answer
booklets, question papers, power outages, illness etc.
According to Amedehe and Asamoah-Gyimah (2003), item analysis usually includes the following:
Item difficulty
Item discrimination
Distracters analysis
Item bias
Item difficulty
According to Amedehe and Asamoah-Gyimah (2003), item difficulty is the percentage of students who answer each test item correctly. It is calculated by dividing the number of students who answer the item correctly by the total number of examinees. Mathematically, P = R / T, where R = the number of students who answer the item correctly and T = the total number of examinees.
This means that the smaller the difficulty index, the more difficult the item; the greater the difficulty index, the less difficult the item.
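A one-line computation makes the index concrete (Python; the counts are invented for illustration):

```python
# Difficulty index P = R / T from the definition above.

def difficulty_index(r_correct: int, t_examinees: int) -> float:
    return r_correct / t_examinees

# Example: 18 of 30 students answer the item correctly
print(difficulty_index(18, 30))  # 0.6 -> a moderately easy item
```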
The table below shows the difficulty index of 30 items with 30 students.
Item difficulty
Item discrimination
Distracters Analysis
The quality of the items depends partly on the effective functioning of the
distracters selected by the examinees.
By inspection of how options were selected by examinees, a good distracter
should attract at least one examinee especially from the lower group.
A good distracter must be plausible enough to attract the unknowledgeable examinees (Amedehe & Asamoah-Gyimah, 2003).
The function of the distracters is to determine whether examinees really know the
correct answer to the item.
Examples:
1. Ideal
2. Ambiguous alternative
Options Upper Group Lower Group
A 1 4
B* 10 5
C 9 5
D 0 6
Options B and C attracted the upper group almost equally. Option C, as well as the test item, should be checked for ambiguity.
3. Miskeyed Item
Options Upper Group Lower Group
A 13 7
B 6 6
C 0 3
D* 1 4
The majority of the upper group selected option A. Option A might be the correct response rather than D; the marking key should be rechecked.
4. Poor distractor
Options Upper Group Lower Group
A 2 6
B* 12 6
C 0 0
D 6 8
Option C attracted no student. It is a poor distracter and has to be replaced.
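The checks illustrated in the tables above can be sketched in code. The function below is my own construction, not from the text: it flags a distracter no one chose and any option that attracts the upper group at least as strongly as the key.

```python
# Distracter-analysis sketch: option counts for the upper and lower groups,
# with the keyed option marked. Rules and names are my own illustration.

def analyse_item(upper: dict, lower: dict, key: str) -> list:
    flags = []
    for opt in upper:
        if opt == key:
            continue
        if upper[opt] + lower[opt] == 0:
            flags.append(f"{opt}: poor distracter (attracted no one)")
        elif upper[opt] >= upper[key]:
            flags.append(f"{opt}: check for miskey or ambiguity")
    return flags

# The "poor distracter" example from the text above
upper = {"A": 2, "B": 12, "C": 0, "D": 6}
lower = {"A": 6, "B": 6, "C": 0, "D": 8}
print(analyse_item(upper, lower, key="B"))  # flags option C
```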
UNIT 10
INTERPRETATION OF TEST SCORES
Scores obtained in classroom quizzes, tests and examinations are known as raw scores.
They give very little information about the performance or achievement of a student.
For example if Ayisha obtained 48 in a test, it is difficult to know her level of
performance unless more information is provided.
Such types of information include;
Maximum score/best score
mean or median score
the variability of the group
the difficulty level of the items
the number of test questions and the amount of time allowed for the test.
To interpret and obtain meaning from the scores, they need to be referenced or
transformed into other scores.
There are two popular ways of interpreting test scores so the meaning can be derived from the
scores. These are:
1. Norm-referenced Interpretation
2. Criterion-referenced Interpretation
NORM-REFERENCED INTERPRETATION
These describe test scores or performance in terms of a student’s position in a
reference group that has been assessed.
In other words, it compares an individual's performance with that of others in the group who have taken the same test.
The reference group is called the norm group.
In the earlier example, Ayisha’s score of 48 can be compared with the mean score for
the class.
If the mean score is 40, then one could say that Ayisha’s performance was above the
mean/average.
If the median score is 40, then one could also say that Ayisha’s performance could be
placed in the upper half of the class.
The score of 40 can appropriately be called the norm and the class that provided the
mean or median of 40 is called the norm group.
Types of norm-referenced scores
The following are the most popular norm-referenced scores;
1. Class raw score ranks. Raw scores in a class are often ordered from the highest
score (1st position) to the lowest score (last position). The ranks tell about how a student
performs compared with the others in the group.
2. Percentile and percentile ranks. A percentile is a point in a distribution below
which a certain percentage of the scores fall while a percentile rank is a person’s relative
position such that a given percentage of scores fall below the score obtained. If a raw
score of 48 is the 60th percentile, it means that a student who obtains 48 in a test, has done
better than 60 percent of all those in the group that took the test.
3. Standard scores (Z-scores and T-scores). The Z-score is based on the normal distribution, with a mean of 0 and a standard deviation of 1. Raw scores are transformed to Z-scores using the formula Z = (X − X̄) / S, where X is the raw score, X̄ is the group mean and S the group standard deviation.
Negative values show that performance is below average
Positive values mean that performance is above average.
T-scores are based on Z-scores and use the formula; T = 50 + 10Z. Scores above
50 show above average performance and scores below 50 show below average
performance.
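Using the formulas above, a short sketch (Python) converts Ayisha's raw score of 48 with the class mean of 40 from the earlier example; the standard deviation of 8 is an assumed value for illustration.

```python
# Z = (X - mean) / s and T = 50 + 10Z, from the formulas above.
# The standard deviation of 8 is assumed for illustration.

def z_score(x: float, mean: float, sd: float) -> float:
    return (x - mean) / sd

def t_score(x: float, mean: float, sd: float) -> float:
    return 50 + 10 * z_score(x, mean, sd)

print(z_score(48, 40, 8))  # 1.0 -> above average
print(t_score(48, 40, 8))  # 60.0
```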
4. Stanines (Standard Nine). These are derived scores based on the normal
distribution with a mean of 5 and standard deviation of 2. It uses the integers, 1 – 9. The
percentage of scores at each stanine is: 9 (top 4%), 8 (next 7%), 7 (next 12%), 6 (next 17%), 5 (next 20%), 4 (next 17%), 3 (next 12%), 2 (next 7%) and 1 (lowest 4%), as shown in the table below.
Stanine (Grade): 1 2 3 4 5 6 7 8 9
Percentage of scores: 4 7 12 17 20 17 12 7 4
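A sketch of turning a percentile rank into a stanine, using the cumulative bands implied by the table above (the helper function is my own, for illustration):

```python
# Map a percentile rank to a stanine using cumulative upper bounds derived
# from the bands above: 4, 7, 12, 17, 20, 17, 12, 7, 4 percent.

STANINE_UPPER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96, 100]

def stanine(percentile_rank: float) -> int:
    for grade, upper in enumerate(STANINE_UPPER_BOUNDS, start=1):
        if percentile_rank <= upper:
            return grade
    return 9

print(stanine(50))  # 5 -> the middle 20% band
print(stanine(97))  # 9 -> the top 4%
```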
CRITERION-REFERENCED INTERPRETATION
These describe test scores or performance in terms of the kinds of tasks a person with a
given score can do.
The performance can be compared to a pre-established standard or criterion.
For example a student may be able to solve 8 problems out of 10 concerning fractions.
A level of performance can be established at 6.
The criterion or standard can be used as a competency/mastery score so that students who
have obtained scores that are greater than 6 are termed competent or have mastered skills
in a particular domain.
Criterion-referenced interpretations generally indicate what an individual can or cannot do with respect to a specified domain of knowledge, attitudes or skills.
1. Percent correct scores. This is the percentage of items that a student got correct. For
example if a student obtained 8 marks out of 10, the percent correct is 80.
2. Competency scores. These are cut-off scores set to match acceptable performance. Students
who obtained the cut-off scores are believed to have achieved a required level of
competency. Cut-off scores should not be arbitrarily set. There should be a support or basis
for them.
3. Quality ratings. This is the quality level at which a student performs a task. For
example, a student can be rated as A for outstanding, B+ for excellent etc.
4. Speed of performance scores. These indicate the amount of time a student uses to
complete a task or the number of tasks completed within a specified time. For example, a
student may type 30 words in a minute or an athlete may run 100 meters in 11.5 seconds.
Notation: P30 = 60. Sixty (60) is the score below which 30% of the scores lie in a specific group after the scores have been arranged sequentially. This means that a student who obtains a score of 60 has done better than 30% of the members in the group.
P75 = 50. Fifty (50) is the score below which 75% of the scores lie in a specific
group after the scores have been arranged sequentially. This means that a student
who obtains a score of 50 has done better than 75% of the members in the specific
group.
A score in one group may be a different percentile in another group.
For example, in Mathematics Quiz 1, a student with a score of 15 may be at P90 in the Arts
class but the same score may put the student at P85 in the Home Economics class.
P50 is the same as the median. P25 is the first quartile and P75 is the third
quartile.
Percentile Ranks: The percentage of cases falling below a given point on the measurement scale. It is the position, on a scale of 100, at which an individual's score lies.
Notation: PR of 60 = 75. Seventy-five is the position for a score of 60 when the distribution
is divided into 100 parts. This means that a student who obtains a score of 60 has
75% of the scores falling below him/her in the group.
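Finally, the percentile-rank definition above translates directly into code (Python; the list of scores is invented for illustration):

```python
# Percentile rank: the percentage of scores in the group falling below
# a given score. The group of scores below is invented for illustration.

def percentile_rank(score: float, group: list) -> float:
    below = sum(1 for s in group if s < score)
    return below / len(group) * 100

scores = [35, 42, 48, 50, 55, 58, 60, 62, 70, 75]
print(percentile_rank(60, scores))  # 60.0 in this particular group
```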