4023 Epy 410 Educational Measurement and Evaluation Notes

MASENO UNIVERSITY
SCHOOL OF EDUCATION
DEPARTMENT OF EDUCATIONAL PSYCHOLOGY
Introduction to the Course
This is the EPY 410: Educational Measurement and Evaluation module. It is a fourth-year, first-semester course. We assume
that you have already taken EPY 110, EPY 311 and EPY 310, all of which made several mentions of measurement and
evaluation aspects in psychological testing.
As you read through this module, you will be introduced to the terminology used in measurement and evaluation, the
importance of measurement and evaluation, types of measurement and evaluation, construction of tests and their
administration. You will also learn how to prepare a frequency table from raw data, measures of central tendency,
measures of dispersion/variability, measures of relationship, and prediction of outcomes based on students’ scores.
This module has six major topics, and each topic has several sub-topics. Before proceeding to a new section, make sure
that you have thoroughly understood each preceding sub-section. Each sub-section presents self-check tests meant to help
you assess your level of understanding. The score earned should tell you the progress you have made in internalizing the
information. It is our sincere hope that you will find the module easy to understand and informative. Should you have any
comments or compliments, feel free to share them. Be reminded that this is the easiest course.
Aim
Module EPY 410 aims at equipping you with knowledge and skills in test measurement and test evaluation, and in the various
ways of preparing and interpreting tests.
Objectives
By the end of the course, you should be able to:
i. Define various statistical concepts and explain their importance in educational measurement and
evaluation.
ii. Explain and construct different types of tests.
iii. Tabulate and depict sets of data for both ungrouped and grouped distributions.
iv. Explain and compute measures of central tendency, variability and relationship.
v. Explain regression analysis and interpret the standard error of estimate.
vi. Explain and compute the validity and reliability of a test.
COURSE CONTENT
Topic 1: Tests Measurement and Evaluation
Section 1: Introduction
Section 2: Measurement, evaluation and assessment
Section 3: Purposes of Measurement and Evaluation
Section 4: Tests and Examinations
Section 5: Construction of Tests
Section 6: Test Scoring
Section 7: Test/Examination Administration and Examination Cheating
Topic 2: Frequency distributions and graphic presentations
Section 1: Statistical Concepts in Tests and Measurement
Section 2: Frequency distributions and graphical presentation
Section 3: Stated and real class limits
Section 4: Histogram
Section 5: Frequency polygons and curves
Section 6: Skewness and kurtosis of a distribution
Topic 3: Measures of central tendency
Section 1: The mode
Section 2: The median
Section 3: The mean
Section 4: Mean, mode and median compared
Topic 4: Measures of Dispersion
Section 1: Range
Section 2: Variance
Section 3: Standard deviation
Section 4: Interquartile range/deviation
Section 5: Percentiles
Topic 5: Measures of Correlation and Regression Analysis
Section 1: The concept of correlation analysis
Section 2: Scatter diagram; a graphical presentation of the measures of relationship
Section 3: Spearman and Pearson correlation techniques of determining relationships
Section 4: Regression Analysis
Topic 6: Test validity and reliability
Section 1: Validity
Section 2: Reliability
Section 3: Item Analysis
References
SYMBOLS
Σ – Sum of
f – Frequencies
N or n – Number of observations (scores)
Mo – Mode
Md – Median
Welcome to EPY 410 Educational Measurement and Evaluation Module
TOPIC 1
TESTS MEASUREMENT AND EVALUATION
1.0 Introduction
In this topic, you will learn types of evaluation, types of tests and examinations, construction of tests, scoring
of tests and test administration.
1.1 Objectives
By the end of the topic, you should be able to:
• Define the terms measurement, evaluation and assessment.
• State and explain the different types of evaluation and assessment.
• Explain the purposes of measurement and evaluation.
• Describe various types of tests and examinations.
• Explain factors to consider when constructing and scoring a test.
• Explain the causes, methods and effects of examination cheating.
EDUCATIONAL TESTS, MEASUREMENTS AND EVALUATION
MEASUREMENT, EVALUATION AND ASSESSMENT

Definitions of terms
• Measurement – the process of assigning a quantitative (numerical) value to a student’s attainment in a given area
of learning e.g. 64%.
• Evaluation – the process of assigning a qualitative value to a student’s attainment in a given area of learning
e.g. C+.
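The difference between the two terms can be illustrated with a short Python sketch: the percentage is the measurement, and the letter grade derived from it is the evaluation. The grade boundaries and the function name `evaluate` are hypothetical, chosen only so that 64% maps to C+ as in the example above. They are not an official grading scale.

```python
# Hypothetical grade boundaries: (lowest percentage in the band, letter grade).
# These values are illustrative assumptions, not an official grading scale.
GRADE_BANDS = [
    (80, "A"),
    (75, "A-"),
    (70, "B+"),
    (65, "B"),
    (60, "C+"),
    (50, "C"),
    (40, "D"),
    (0, "E"),
]


def evaluate(percent_score):
    """Turn a measurement (a percentage) into an evaluation (a letter grade)."""
    if not 0 <= percent_score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, grade in GRADE_BANDS:
        if percent_score >= lower_bound:
            return grade


measurement = 64                     # quantitative value
evaluation = evaluate(measurement)   # qualitative value
print(measurement, "->", evaluation)
```

With these assumed boundaries, the same numerical score always leads to the same qualitative judgement.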
Types of Evaluation
There are three types

1. Formative evaluation
2. Summative evaluation
3. Assessment
Formative Evaluation
• It is the progressive assessment of the success with which a program is being implemented. It shows whether learning
objectives are being achieved.
• It is done with a small group of people to "test run" various aspects of instructional materials.
• It is typically conducted during the development or improvement of a program and it is conducted more than once.
• The purpose of formative evaluation is to validate or ensure that the goals of the instruction are being achieved and
to improve the instruction, if necessary, by means of identification and subsequent remediation of problematic
aspects.
• Formative evaluation is research-oriented.
• Formative evaluation provides information on the product's efficacy (its ability to do what it was designed to do).
Summative Evaluation
• Summative evaluation is a method of judging the worth of a program at the end of the program activities. The focus
is on the outcome.
• It is typically quantitative and uses numeric scores or letter grades to assess learner achievement.
• It is action-oriented. That is, on the basis of the findings, the programme can be adopted entirely, modified or
abandoned altogether.
Assessment
• It is the process by which the quality of an individual’s work or performance is judged.
• It is carried out through observation of pupils at work or by various kinds of tests given periodically.
• When practised as an ongoing process, such assessment is called continuous assessment.
In a group of five, discuss with specific examples from your school settings the different types of
evaluations carried out.
Types of Assessment
1. Normative Assessment/Testing
• It is also called Norm-referenced assessment/test. It is where the quality of the grade depends on the average
(norms) performance i.e. an individual’s score is judged in relation to how good the overall performance is or
was.
• It is not measured against defined criteria but is relative to the student body undertaking the assessment i.e. it
will tell you how a child compares to similar children on a given set of skills and knowledge.
• The IQ test is the best known example of norm-referenced assessment. Many entrance tests (to prestigious
schools or universities) are norm-referenced e.g. KCPE or KCSE.
• It is a way of comparing students implying that standards may vary from year to year, depending on the quality
of the cohort.
Advantages
i. It does not enforce any expectation of what all students should know or be able to do other than what students can
actually demonstrate.
ii. Present levels of performance and inequity are taken as fact but not as defects to be removed by a redesigned
system.
iii. Aims of student performance are not raised every year until all are proficient. Scores are not required to show
continuous improvement.
Limitations
(a) It cannot measure the progress of the population as a whole, only where individuals fall within the whole.
(b) It does not specify what an individual should possess to demonstrate mastery of the skill being tested, but rather relies
on the set norm.
(c) It sets benchmarks around items of varying difficulty without considering the ability level or age of the
examinees.
(d) The difficulty level of the items that determine the pass levels varies from year to year.
2. Criterion Assessment
• It is where a decision is made as to whether a pupil has actually achieved a specified level of learning, regardless of
the performance of other pupils.
• Here, the criterion or level of achievement which warrants a mastery of certain skills is set in advance. It is not
flexible.
• Criterion-referenced assessment is often, but not always, used to establish a person’s competence in doing
something e.g. the driving test, when learner drivers are measured against a range of explicit criteria.
• It does not tell where the person stands in a population of persons who have taken the test; that is the role of norm-referenced assessment.
• Most criterion-referenced tests involve a cut score, where the examinee passes if their score exceeds the cut
score and fails if it does not (often called a mastery test).
• However, not all criterion-referenced tests have a cut score, and the score can simply refer to a person's standing
on the subject domain.
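The cut-score rule described above can be sketched in a few lines of Python. The function name, the candidate names and the cut score of 50 are hypothetical. Following the wording above, a candidate passes only if the score exceeds the cut score.

```python
def mastery_decision(score, cut_score):
    """Return "pass" if the score exceeds the cut score, otherwise "fail"."""
    return "pass" if score > cut_score else "fail"


# Hypothetical candidates and cut score, for illustration only.
cut_score = 50
scores = {"Akinyi": 72, "Baraka": 50, "Chebet": 49}

decisions = {name: mastery_decision(score, cut_score) for name, score in scores.items()}
print(decisions)
```

Each decision depends only on the candidate's own score and the preset criterion; the scores of the other candidates play no part.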
Advantages
i. Many criterion-referenced tests are high-stakes tests, since the results of the test have serious implications for the
individual examinee.
ii. Criterion-referenced tests are standards-based assessments in which students are assessed with regard to set
standards that define what they "should" know.
Limitations
(a) They can be described as "you lose a lot if you fail to pass", e.g. licensure testing, where the test must be passed in
order to progress.
(b) Some tests set a standard that 50 to 80 percent of students fail at the outset. This is a higher failure rate
than is possible under the norm-referenced definition, in which 50 percent fall below average.
3. Diagnostic Assessment
It is the process of finding out the exact nature of a person’s problems or difficulties. In education, the aim is to give
relevant remedial teaching to those who need it.
What is your major teaching subject? Have you ever made a diagnostic
assessment of your pupils in the subject? What were your major findings?
PURPOSES OF MEASUREMENT AND EVALUATION
The primary purpose of assessment is to improve student learning.
1. To identify areas of weakness in learning.

2. Helps build a shared understanding of the progress made by pupils in order to provide pointers for further
development
3. Provide feedback to students, staff and parents/guardians on pupils’ progress and achievements.
4. Timely feedback improves motivation and achievement for the learner.
5. To grade students for purposes of promotion to next level.
6. Acts as a quality assurance mechanism for both internal and external systems i.e. tells whether objectives are being
achieved.
7. To appraise the effectiveness of a teaching method or methods.
8. To measure specific abilities e.g. IQ, vocabulary, creativity etc.
9. To provide information for effective educational and vocational counselling.
TESTS AND EXAMINATIONS
Test - Is a set of questions to which an examinee has to respond.
Examination - Is a set of tests in various areas to which an examinee has to respond.
Types of Examinations
A. Internal Examination
It is usually prepared and marked by the teacher in charge of the subject in question.
Advantages
i. Questions asked are based on the work covered in class and are therefore learner friendly.
ii. The language and format used in setting the questions are familiar to the learners hence learners experience less
stress compared to external examinations.
Disadvantages
i. The results may not be a true reflection of the learners’ ability since the teacher tends to be subjective in his/her
evaluation of the learners’ performance.
ii. The teacher may set questions only on what has been covered in class, so syllabus coverage may be poor.
iii. Tends to be highly subjective since the setter (teacher) sets questions based on certain preferences.
B. External examination
It is prepared and marked by a person or body of experts not responsible for teaching the subject being
examined (the subject in question).
Advantages
i. It gives a more objective assessment of the learner since the examiners are unknown to the examinee.
ii. There is good syllabus coverage since both the teacher and the learner cannot guess the examinable areas.
iii. Because examinees’ abilities are scored objectively across the population, higher institutions of learning and potential
employers prefer selection on this basis.
Disadvantages
i. It undermines the importance of learning and education since teaching often becomes examination-oriented.
ii. It encourages cramming of facts rather than application of learned material.
iii. It increases emotional stress due to excessive concern about examination results.
TESTS
A test refers to a standard set of questions to be answered in order to obtain a sample of an individual’s behavior
or attributes.
Tests can also involve a series of tasks to be performed.
A useful test measures accurately some property or behavior.
Classification of tests
Tests can be classified by:
i. how they are administered
ii. how they are scored
iii. what sort of response they emphasize
iv. what type of response students must make
v. the nature of the group being compared
• Individual and Group Tests
Individual tests are those administered to one individual at a time. Most individual tests are given orally and
require the examiner’s constant attention. This is because the examiner is interested not only in the verbal response
but also in the non-verbal response.
Tests administered to large groups are referred to as group tests.
• Objective and Subjective Tests
The two differ in terms of their scoring.
Objective tests have a standard scoring key. This implies that no matter who marks the test, the score will be the
same. Examples are multiple-choice tests and true-false tests.
Subjective tests have no standard scoring key, so the score may vary with the marker. An example is the essay question.
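Scoring against a standard key can be sketched in Python as follows. The item numbers, key and responses are hypothetical; the point is only that the score depends on the key and the responses, not on who does the marking.

```python
def score_objective_test(answer_key, responses):
    """Count the responses that match the scoring key exactly."""
    return sum(
        1
        for item, correct_answer in answer_key.items()
        if responses.get(item) == correct_answer
    )


# Hypothetical scoring key and one candidate's responses (item number -> answer).
answer_key = {1: "C", 2: "A", 3: "True", 4: "B"}
responses = {1: "C", 2: "D", 3: "True", 4: "B"}

print(score_objective_test(answer_key, responses))  # 3 of the 4 items match the key
```

An unanswered item simply does not match the key and earns no mark. No comparable function can be written for an essay without first agreeing on a marking scheme.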
• Power and Speed Tests
Power tests have generous time limits. This implies that students can attempt all the items within the given
time. Items in power tests, however, are difficult or may vary in difficulty. A power test is meant to test how much
knowledge a student has.
Speed tests have severe time limits. This implies that it is not easy to complete all the items. Items on speed tests
are quite easy, and so few errors are likely to be made.
A speed test is meant to test how quickly students can answer questions within a given time limit.
• Performance and Paper-and-Pencil Tests
Performance tests require students to perform tasks rather than answer questions. These are normally administered
individually.
In paper-and-pencil tests, students are asked to write down their answers. These are often given in large group situations.
• Teacher-Made and Standardized Tests
Teacher-made tests are constructed by teachers to be used within their own classrooms.
Their effectiveness depends on the skills of the teacher and the knowledge of test construction he/she possesses. An
example is the examination given here at the university.
Standardized tests are constructed by test specialists working with curriculum experts and teachers. They are
standardized in that results from different classes and schools may be compared. An example is the KCSE examination.
Comparison between Teacher-Made and Standardized Tests
Specificity of objectives
   Teacher-made: Objectives are specific to the needs of students in a given classroom.
   Standardized: Objectives are general to the needs of students in most classrooms.
Content
   Teacher-made: Content may come from any area of the curriculum; items can be added, modified or eliminated as desired.
   Standardized: Only the most common areas of the curriculum are tested. Items are fixed and not modifiable.
Rules for administration and scoring
   Teacher-made: Determined by the teacher and can be adapted to the particular needs of the students.
   Standardized: Determined by the test publishers; those administering the test follow a test manual that is provided to them.
Norms
   Teacher-made: The norms may be developed by the teacher.
   Standardized: Norms are provided by the publishers to compare class performance.
Evaluation
   Teacher-made: Test quality is assessed by the teacher.
   Standardized: Information on the quality of the examination is provided by the publisher.
• Norm-referenced and criterion-referenced tests
Norm-referenced tests measure the performance of test takers against that of another group of test takers (the norm group).
They compare a person’s knowledge and skills to the knowledge or skills of the norm group. Examples are the
standardized tests which are done in Kenya.
Criterion-referenced tests measure whether the test taker has met the program objectives. They compare a person’s
knowledge or skills against predetermined learning goals.
Other students’ performance in the group is not taken into consideration e.g. when you sit the end-of-semester exam, if
you get 41 it is an automatic D. It won’t matter whether you were first in the class.
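The two interpretations can be contrasted with a small Python sketch. The class scores and the fixed standard of 50 are hypothetical, and the function names are introduced here only for illustration. The same raw score of 41 is judged once against a preset standard and once against the rest of the group.

```python
def criterion_referenced(score, cut_score):
    """Judge the score against a fixed standard, ignoring classmates."""
    return "meets standard" if score > cut_score else "below standard"


def norm_referenced_rank(score, group_scores):
    """Position of the score within the group (1 = highest score)."""
    return 1 + sum(1 for other in group_scores if other > score)


# Hypothetical class in which 41 happens to be the top score.
class_scores = [41, 38, 35, 30, 27]
student_score = 41

print(criterion_referenced(student_score, cut_score=50))
print(norm_referenced_rank(student_score, class_scores))
```

Under the criterion-referenced reading the student falls below the assumed standard, even though the norm-referenced reading places the same student first in the class.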
Comparison between the two
Purpose
   Criterion-referenced tests:
   - To determine whether each student has acquired specific skills, knowledge or concepts
   - To find out how much students know before and after instruction
   - To measure whether curriculum goals have been met
   Norm-referenced tests (more like a standardized test):
   - To rank students with reference to the achievement of others in broad areas of knowledge
   - To discriminate between high and low achievers e.g. IQ tests, personality tests
Content
   Criterion-referenced tests:
   - Measure specific skills which make up a designated curriculum. Each skill is expressed as an instructional objective.
   Norm-referenced tests:
   - Measure broad skills sampled from a variety of textbooks and syllabi and the judgment of curriculum experts
Characteristics
   Criterion-referenced tests:
   - Each skill is tested by at least four items in order to obtain an adequate sample of students’ performance and
     minimize the effect of guessing.
   - The items that test any given skill are parallel in difficulty.
   Norm-referenced tests:
   - Each skill is tested by fewer than four items.
   - Items vary in difficulty.
   - The items selected discriminate between high and low achievers e.g. a very low score on an IQ test may lead to a
     classification of intellectual disability.
Methods of test administration
   Criterion-referenced tests:
   - Do not necessitate a standard administration
   Norm-referenced tests:
   - Administered in a standardized format
   - Testing conditions should be similar for all test takers
Score interpretation
   Criterion-referenced tests:
   - Each individual is compared with a preset standard, and the score is directly related to the acquisition of
     curriculum objectives.
   - Test scores are reported in categories or ranges.
   - A student’s score is expressed as a percentage.
   - A student’s achievement is reported for individual skills.
   Norm-referenced tests:
   - Each individual is compared with other examinees based on performance i.e. the mean is computed for all the test
     takers and this is considered the average.
   - A student’s achievement is reported for broad skill areas.
6 CONSTRUCTION OF TESTS
Qualities of a Good Test

1. Validity
A good test should measure what it is supposed to measure i.e. it should measure specific objective(s) of the test
set. A test that is set in a language that is not understandable is invalid.
2. Reliability
A good test should yield the same results on a re-test on the same group of learners under similar conditions.
3. Practicality /Usability
A test is said to be practical or usable if it can be readily used by the teacher in everyday classroom conditions.
A test that requires too much material to produce, or whose marking scheme is hard to prepare, is of little practical use.
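One common way of putting a number on the test and re-test idea of reliability is to correlate the scores from the two sittings. The Python sketch below uses the Pearson correlation coefficient for this purpose; the two lists of scores are hypothetical, and this is only one of several possible reliability estimates, not necessarily the one this module develops later.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined when all scores in a list are equal")
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical scores of the same five learners on a test and on a re-test.
first_sitting = [45, 60, 72, 38, 55]
second_sitting = [48, 58, 75, 40, 53]

print(round(pearson_r(first_sitting, second_sitting), 2))
```

A coefficient close to 1 suggests that learners kept roughly the same relative positions on the re-test, which is what a reliable test is expected to show.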
Suggest how a teacher can ensure that a test is valid, reliable
and usable.
Factors to consider when constructing a Test
1. Specification of objectives
The vocabulary used should elicit the kind of responses required from the candidates.
2. Content
The examiner should ensure that questions set cover all topics taught/covered in class.
3. Emphasized content areas.

Some content areas/topics should be given more emphasis than others depending on the time spent covering them and
the total number of questions usually set from such topics.
4. Ability level of students

Questions set should be able to differentiate between bright, average and weak pupils.
5. Specification for types of domains to be measured.

Questions set should include cognitive, affective and psychomotor domains.
6. Specification of the cognitive domain to be measured.

These include (Bloom’s taxonomy):
a. Knowledge – ability to recall facts.
b. Comprehension – ability to retell a story or given information in one’s own words.
c. Application – ability to use newly learnt facts in novel situations.
d. Analysis – ability to break down material into its component parts.
e. Synthesis – ability to put component parts together to form a new whole e.g. narrating a story based on a series of
pictures.
f. Evaluation – ability to judge the value or worth of a given piece of information.
7. Specification Table or Grid Matrix or Test Matrix.
It shows the number of questions from a certain content area. It also shows the cognitive domain to test and the
number of items to be set from each cognitive domain.
A Test Matrix for ----------------- Test for a Std ------ Class

CONTENT COGNITIVE DOMAINS
AREAS Know Compr Appl Analy Synth Eval Total
TOTAL Grand Total
Importance of specification table

a) Helps to improve the content validity i.e. gives a balanced test.
b) Helps a teacher not to concentrate on a particular domain of objectives
c) Helps in accountability of education i.e. how correct or valid a test measurement is.
Prepare a test matrix in your area of specialization. Does it meet the above standards?
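A test matrix of the kind described above can be sketched in Python. The content areas and item counts below are hypothetical (a 20-item mathematics test); only the column labels are taken from the matrix layout shown above. The sketch computes the row totals, the column totals and the grand total, then prints the grid.

```python
# Column labels follow the matrix header above.
COGNITIVE_LEVELS = ["Know", "Compr", "Appl", "Analy", "Synth", "Eval"]

# Hypothetical content areas with the number of items planned at each level.
test_matrix = {
    "Fractions": [2, 2, 2, 1, 0, 1],
    "Decimals": [2, 1, 2, 1, 1, 0],
    "Geometry": [1, 1, 1, 1, 0, 1],
}


def row_totals(matrix):
    """Number of items planned for each content area."""
    return {area: sum(counts) for area, counts in matrix.items()}


def column_totals(matrix):
    """Number of items planned at each cognitive level."""
    return [sum(column) for column in zip(*matrix.values())]


def print_matrix(matrix):
    """Print the grid with a Total column and a TOTAL row."""
    header = "".join(f"{level:>7}" for level in COGNITIVE_LEVELS)
    print(f"{'CONTENT AREA':<14}{header}{'Total':>7}")
    for area, counts in matrix.items():
        cells = "".join(f"{count:>7}" for count in counts)
        print(f"{area:<14}{cells}{sum(counts):>7}")
    totals = column_totals(matrix)
    cells = "".join(f"{total:>7}" for total in totals)
    print(f"{'TOTAL':<14}{cells}{sum(totals):>7}")


print_matrix(test_matrix)
```

Reading across a row shows how much weight a content area receives; reading down a column shows whether the test leans too heavily on one cognitive level.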
8. Format of test items

A test could be oral or written. A written test is better than an oral one since it also tests a learner’s understanding of the
concept being tested. The examiner also needs to decide beforehand whether essay or objective test items will be
used.
Essay items are preferred when testing the higher cognitive objectives, while objective items are suitable when testing
for knowledge and comprehension.
9. Number of test items

The number of test items to be included in the test must be clearly stated. However, this depends on:
(i) Time allocated for the test.
(ii) Types of items chosen i.e. objective or essay.
(iii) Complexity of test items and the thought processes involved.
10. Specification of time limits

Time given for a particular test depends on the mental processes involved and the kind of item format used. For a
multiple-choice item, 45-60 seconds is recommended; a complicated mathematics problem or complex reading
selection may require 4-5 minutes, while vocabulary items may take 10-15 seconds.
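The recommended times per item can be turned into a rough estimate of total testing time. In the Python sketch below, the per-item ranges are the ones quoted above, while the category names and the item counts of the example test are hypothetical.

```python
# Recommended time per item in seconds, as (low, high), from the ranges quoted above.
SECONDS_PER_ITEM = {
    "multiple_choice": (45, 60),
    "complex_problem": (4 * 60, 5 * 60),
    "vocabulary": (10, 15),
}


def estimate_minutes(item_counts):
    """Return (minimum, maximum) testing time in minutes for the given item counts."""
    low = sum(SECONDS_PER_ITEM[kind][0] * count for kind, count in item_counts.items())
    high = sum(SECONDS_PER_ITEM[kind][1] * count for kind, count in item_counts.items())
    return low / 60, high / 60


# Hypothetical test: 20 multiple-choice items, 3 complex problems, 10 vocabulary items.
planned_test = {"multiple_choice": 20, "complex_problem": 3, "vocabulary": 10}
low_minutes, high_minutes = estimate_minutes(planned_test)
print(f"Allow between {low_minutes:.1f} and {high_minutes:.1f} minutes")
```

For this example the estimate is between about 28.7 and 37.5 minutes, which the examiner would then round to a convenient time limit.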
11. Writing the test items

The examiner should have a thorough grasp of the subject matter dealt with in the test. The setter’s qualifications
should be indicated. A single writer may be assigned a particular area, or several writers may be assigned to one cell of the test matrix.
1.7.1 Construction of Objective Test Items
A. Completion Test (Filling in Blanks)
• A completion test requires recall and thinking ability. In this type of test, sentences are presented from which certain
words or phrases have been omitted.
• To construct completion items, the following suggestions should be considered.
i. Instructions should be brief and clear.
ii. Rephrase textbook sentences or paragraphs to avoid rote memorization.
iii. Do not have too many blanks in a short sentence. Blanks should be placed either at the beginning, near the
end, or at the end of a statement.
iv. Blanks should be of standard length to avoid clues about the length of the completing word.
v. Always specify in what unit or value a numerical answer should be given.
vi. Use phrases rather than words to avoid ambiguous responses/answers and allow objective marking.
vii. Guard against clues that may give away the answers by ensuring that completions do not depend on textbook
expressions or grammatical form.
viii. Avoid long and winding statements as they tend to lose meaning and confuse pupils unless well framed.
B. Matching Item Tests
• This consists of two columns, the premises (problems to be answered) and the responses (answers). The examinee
needs to make some association between each premise and a response.
• The following suggestions need to be taken into consideration when constructing matching items.
i. Do not have too many items on the list. A minimum of 5 and a maximum of 7 is preferred.
ii. There should be more responses than premises in order to reduce correct matching by a process of
elimination.
iii. Materials selected should be from the same subject so that a given premise has several possible matches in
the responses.
iv. Names should be arranged in alphabetical order, and dates and numbers in sequence.
This saves the examinees’ time.
v. Watch for irrelevant but revealing associations (clues) which may give away the matching, such as singulars and
plurals.
Prepare a matching item test based on the following information: African countries against their
heads of government.
C. True-False Items
Yes/No; Right/Wrong; + (Plus) or – (Minus) or Positive/Negative can also be used in the place of true/false. To construct
true/false items, consider the following suggestions:
i. Place the symbol “T” and “F” before each question. This will save time when marking.
ii. The number of true statements should equal the number of false statements.
iii. When arranging the items, avoid any form of pattern of true and false answers.
iv. Do not use words which will provide clues or hints, as this may give away the answer.
v. Use statements which are absolutely true or false and avoid items which express opinions or which are
trivial/tricky.
vi. Avoid the use of double negatives and single negatives should be used sparingly. However, if they must be
used, they should be underlined, capitalized or italicized.
vii. Do not lift statements/quotations from textbooks since they encourage rote memory and turn out ambiguous
when interpreted out of context.
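Two of the suggestions above, an equal number of true and false statements and no pattern in the answers, can be checked mechanically once the answer key is written down. The Python sketch below is a minimal illustration; the helper names and the 10-item key are hypothetical, and "pattern" is narrowed here to two simple signs, long streaks and strict alternation.

```python
def is_balanced(key):
    """True when the key has as many true ("T") as false ("F") answers."""
    return key.count("T") == key.count("F")


def longest_run(key):
    """Length of the longest streak of identical consecutive answers."""
    longest = current = 0
    previous = None
    for answer in key:
        current = current + 1 if answer == previous else 1
        longest = max(longest, current)
        previous = answer
    return longest


def is_alternating(key):
    """True when the answers strictly alternate (T, F, T, F, ...)."""
    return len(key) > 2 and all(a != b for a, b in zip(key, key[1:]))


# Hypothetical 10-item key: balanced, but the strict alternation is itself a pattern.
key = list("TFTFTFTFTF")
print(is_balanced(key), longest_run(key), is_alternating(key))
```

A key that is balanced can still be guessable, so both checks are worth running before the items are finally ordered.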
Construct 10 True-False item test for your class taking into account the above suggestions.
D. Multiple-Choice or Best-Answer Items
A multiple-choice test item consists of two parts, the stem and a list of suggested answers.
• The stem: Contains the statement, question, phrase or word i.e. the problem part. The stem may be stated as a
direct question or as an incomplete statement.
• A list of suggested answers: The correct answer is called the key while the incorrect responses are called
distracters or foils.
Types of multiple choice questions
a) The correct-answer variety.

Where out of the options, only one is absolutely correct e.g. Which of the following is the largest town in Kenya?
A) Mombasa B) Kisumu C) Nairobi D) Nakuru
b) The best-answer variety.
Consists of a stem followed by two or more suggested responses that are correct, appropriate in varying degrees,
or downright wrong (the examinee responds with an opinion) e.g. Which of the following is the leading foreign
exchange earner of Kenya?
A) Coffee B) Horticultural products C) Tourism D) Soda ash
c) The multiple-response variety.
Is where a number of clearly correct answers exist and the examinee is instructed to mark all the correct responses
e.g. Which of the following are not capital cities in Africa? Mark the correct responses. A) Mogadishu B) Dar es
Salaam C) Lagos D) Ouagadougou
d) The incomplete-statement variety.

Is where a portion of the stem is incomplete rather than a direct question e.g. The capital city of the Republic of
South Africa is ____________________.
e) The negative variety.
It is where the examinee is to mark the response that does not correctly answer the question i.e. the least
satisfactory answer e.g. Three of the following are major agricultural towns in
Kenya. Which one is not? A) Bungoma B) Eldoret C) Kitale D) Kericho
f) The substitution variety
It is where samples of originally well written prose or poetry are systematically altered to include errors in
punctuation, spelling, word usage and similar conventions. Selected words or phrases in these rewritten passages
are underlined and identified by a number. Several possible substitutions for each critical phrase are provided and
the examinee is asked to select the phrase (original or alternative) that provides the best expression e.g. Mr(1)
Wangila has been the Principal(2) of WUCST(3) since the inception of the college(4).
(Professor, Doctor, Vice Chancellor, WUST, MMUST, Campus, University, University college)
g) The incomplete-alternatives variety

Is where incomplete or coded alternatives are used e.g. Which of the following is the fourth colour in the rainbow?
A) Y B) G C) V D) B
h) The combined-response variety

Consists of an item stem followed by several responses, one or more of which may be correct. The examinee is to
choose the set of code letters or numerals which designates the correct responses. This variety tests mastery of
sets of facts and the complex organization and comparative evaluation of facts or concepts e.g. Below are political
parties in Kenya. (i) PNU (ii) ODM-K (iii) ODM (iv) GNU (v) KANU.
Which of the following combinations have Kenya’s past and current heads of state been
associated with? A) (i) and (iii) B) (i) and (v) C) (iii) and (v) D) (ii) and (iv)
List several national examinations done in Kenya. For each of the listed examination,
describe the types of test item used.
Suggestions for Constructing Multiple-Choice Items
i. Select problems which present a real problem to the examinees and call for critical thinking.
ii. Select distracters which are attractive and plausible so that weak students are more likely to select them.
iii. There should be only one key, and no unintentional help/clue should be given.
iv. The stem should be clear, and responses should not borrow phrases from the stem.
v. Avoid the use of negatives, but if they must be used, they should be underlined, capitalized or italicized.
vi. The key and the distracters should be of more or less equal length and should be short.
vii. Avoid making the correct answers to the items appear in a fixed pattern.
viii. Avoid the use of "none of the above" or "all of the above". If they must be used, make sure they work as plausible distracters.
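One simple way to keep the key from falling into a fixed position is to shuffle the options of each item and record where the key lands. The Python sketch below is a minimal illustration using the standard `random` module; the function name is hypothetical, the item is the "largest town in Kenya" example used earlier, and the fixed seed is there only so that the run can be repeated.

```python
import random

OPTION_LETTERS = "ABCDEFGH"


def shuffle_options(options, key, rng):
    """Return the options in random order together with the letter of the key."""
    shuffled = list(options)          # copy, so the original order is left untouched
    rng.shuffle(shuffled)
    key_letter = OPTION_LETTERS[shuffled.index(key)]
    return shuffled, key_letter


rng = random.Random(2024)             # fixed seed only to make the example repeatable
options = ["Mombasa", "Kisumu", "Nairobi", "Nakuru"]
shuffled, key_letter = shuffle_options(options, "Nairobi", rng)

for letter, option in zip(OPTION_LETTERS, shuffled):
    print(f"{letter}) {option}")
print("Key:", key_letter)
```

Shuffling each item independently makes it unlikely that the key positions form a pattern, although the final key should still be inspected by the examiner.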
E. Maps, Diagrammatic and Pictorial Test Items
Look for past paper questions and make a list of errors made therein. Suggest how the
question should have been set.
These are questions that require interpretation, recognition of parts or features etc. The following should be considered
when designing such test items.
i. Maps, pictures and diagrams must be simple and clear.

ii. Do not shade pictures as they tend to be complicated beyond recognition.
iii. Those with poor drawing skills should trace or use actual /real pictures, maps or diagrams.
iv. Descriptive titles should be given to maps, pictures and diagrams and where necessary they should be framed.
Draw the map of Kenya and construct at least five (5) questions based on the
drawing.
CONSTRUCTION OF ESSAY QUESTIONS
When to use essay questions.
1. When the group is small and the test is not to be re-used.
• Do not remind the candidates of the time left too frequently. This can be done after 1 hr or so, or after they complete
one section of the paper.
• The examination timetable should be released at least one week in advance to enable students to prepare
adequately.
EXAMINATION CHEATING
It means to act dishonestly or unfairly in order to win an advantage or profit. It means to deceive and involves
dishonest tricks in order to pass exams.
Methods used.
• Impersonation- sitting an exam on behalf of somebody.
• Gaining access to exam papers or confidential material or information related to the exam prior to sitting of the
exam.
• Deliberate attempt to obtain or pass information concerning the exam when it is in progress. Information may
be obtained from fellow students, invigilators, teachers or smuggled materials, whispering or “flashing” answers.
Occasionally it involves seeking to go for a call of nature only to refer to information concealed somewhere.
• In practical subjects, teachers helping to set up equipment or offering answers, or over-scoring by teachers-in-charge.
• Use of mobile phones to text the answers to a candidate before or during the exam.
• Writing on the shirt sleeves, petticoats, desks or the thighs particularly by female university students.
Causes of Cheating
• Academic weakness of some of the students/teachers.
• Euphoria attached to exam results: good grades are a source of pride to self, families and institutions.
• Need to excel due to stiff competition.
• Corruption and lack of transparency especially those charged with the responsibility of handling exam materials.
• Cheating as an easy way out. Quest for knowledge has seemingly lost meaning.
• Lack of commitment among students especially the lazy ones who don’t take studies seriously.
• Congested curriculum and the belief that some subjects are difficult or impossible to pass.
• Uncertainty of employment among some course graduates leading to enrolment in others which may be
demanding.
• Nature of examinations e.g. practicals. There is the temptation to look at one’s neighbor’s work.
• Traditional way of delivering lectures, with exams taking the same pattern. This makes it easy to guess and cheat.
Effects of cheating
• Diminished credibility of the examination as a measure of one's ability, and of the examiner(s). Those who cheat cannot be compared in any way with those who do not.
• Loss of confidence in those charged with the handling of exams.
• Promotion to higher grade of education or training of the wrong people-who in turn perpetuate the practice.
• Kills teachers’ morale especially those hard working ones.
• Kills morale of the hard working and honest students.
• Cause misunderstanding between the cheats and honest candidates especially when no action is taken against
such.
• May often lead to result cancellation of the cheats with a doomed and painful future.
• Leads to repeating, suspension or expulsion from college, causing more stress.
• Innocent students may suffer where results for a centre are cancelled.
• Compromises the education standards. Possible employers and other institutions doubt the authenticity of their
academic credentials.
• Lead to criminal prosecution for the culprits and their accomplices and loss of job(s).
NB: Cheating in exams is just an aspect of moral decadence of the society. It is a manifestation of a sick society, devoid
of a working culture and whose moral fiber has degenerated to irredeemable levels.
“Truly, truly, I say to you, he who does not enter the sheep fold by the door, but climbs in by another way, that man is a
thief and a robber; but he who enters by the door is the shepherd of the sheep.”
Learning Outcomes
You have finished topic 1. The learning outcomes are listed below. Place a (√) in the column which reflects your
understanding.
No. Learning Outcome Agree Disagree

1 I can define the term
2 I can explain the
3 I can explain the
4 I can discuss
If for whatever reason you have ticked “disagree” on any of the statements, go back to the relevant section before you proceed.
However, if you have ticked “agree” on all the statements, you can proceed to the subsequent section.
STATISTICS
The term statistics is derived from the Latin word known as status or the Italian word known as statista both
meaning a political state.
The origin of statistics was due to the administrative requirements of a state. It was required that the state’s
resources be collected and analyzed for the purpose of planning and finance as well as equitable distribution.
The earliest form of statistical data was limited to the census of population and property.
Definition
According to Boddington, statistics is the science of estimates and probabilities.
Statistics is the science of counting (Bowley, A.L.).
Statistics comprises the collection, tabulation, presentation and analysis of an aggregate of facts collected in a
methodical manner without bias which is related to a predetermined purpose (Sutcliffe, W.G. in his book
“Elementary Statistical Methods”).
Statistics is the act of playing with numbers and statisticians are those who play with numbers.
Educationists define statistics as the subject that describes various methods of keeping educational data in an
organized manner.
Psychologists define statistics as a branch of science that explains data’s qualitative nature and then draws
inferences after analyzing it.
Limitations of Statistics
Statistics studies only quantitative data. Therefore traits like truth, wisdom, poverty, weakness, etc. cannot
be analyzed by statistical methods.
Statistics does not study individuals as single entities but as groups.
Laws of statistics are applicable to the average only.
Terminologies used in statistics
1. Variable
• A variable is something that exists in more than one amount or in more than one form. Examples: height, gender, weight, etc.
2. Discrete and continuous variables
• Due to individual differences, the psychological traits that we measure vary a great deal. Therefore, the results that we get through observation and testing are known as variables.
• Discrete variables are treated as whole numbers without breaking them into fractions or decimals, e.g. the student population in a classroom or the number of children in a family.
• Continuous variables can be treated as whole numbers as well as fractions or decimals. Examples are height, weight, age, etc.
Note: In education it is only continuous variables that are used.
Population and Sample

A population consists of all members of some specified group.
A population consists of all elements in any given universe.
Examples,
All (800) fourth year students in the School of Education, Maseno University, in the year 2020.
A sample is a subset / representative of a population.
Example,
The 200 fourth year students in the School of Education, Maseno University in the year 2020 could be a
representative of the whole class of fourth years.
Investigators are always interested in a population. However, populations are often so large that not all the
members can be measured.
For that reason statisticians resort to measuring a sample that is small enough to be manageable but large enough
to be a representative of the whole population.
Parameters and Statistics
A parameter is some numerical (number) or nominal (name) characteristic of a population.
Example,
The mean IQ score of all fourth year students in the School of Education, Maseno University, is a parameter.
A statistic is some numerical (number) or nominal (name) characteristic of a sample.
Example,
The mean IQ score of 200 fourth year students in the School of Education is a statistic and so is the observation
that all are females.
Note;
A parameter is constant, it does not change unless the population itself changes.
Score
A score is an interval which extends from 0.5 units below to 0.5 units above the face value of the numerical figure.
For example a score of 45 includes all those values, which extend from 44.5 to 45.5.
Raw Scores / Data:
Raw scores are scores that have not been analysed or classified. They are an essential material for statistical
process, however they do not reveal anything until they are classified and analysed statistically.
Ungrouped Data:
When raw scores are collected from a small sample (N<30), classification in this case is not necessary and
therefore, calculations are done directly from the raw scores e.g;
10, 12, 20, 12, 13, 17, 15, 8, 6, 16 (N= 10).
Grouped Data:
When scores are collected from a large sample (N >30), classification is required to reduce the data into what is
manageable.
Range: This is the difference between the maximum score and the minimum score in any given data. It is calculated
by subtracting the minimum score from the maximum score, e.g. 90 - 10 = 80.
Class – Interval:
The Range fails to give a clear picture of the distribution of scores or frequency. So, it is divided into several
equal sub-ranges known as class-intervals.
Size of C.I:
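The number of scores included in one particular C.I. is known as the size of the class-interval. Example: the C.I. 10 - 14
contains 5 scores (10, 11, 12, 13, 14), so its size is 5; likewise the C.I. 5 - 9 has a size of 5.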
Number of C.I:
This is the total number of C.I. in any given frequency distribution.

No. of C.I. = (Range / Size of C.I.) + 1
Mid – point of C.I:

The mid-point of a class interval is calculated by the formula:
Mid-point = (Max. Limit + Min. Limit) / 2
Example: for the C.I. 90 - 94, (90 + 94) / 2 = 184 / 2 = 92
Exact Limit of C.I:
To find the exact limit of a class interval (C.I), subtract .5 from the minimum limit of the C.I and add .5 to the
maximum limit of the C.I. e.g.; exact limit of C.I, 3 – 5 will be 2.5 – 5.5.
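As a rough illustration of the mid-point and exact-limit rules described above, the following Python sketch applies them to the two examples given in the text. The function names midpoint and exact_limits are hypothetical and are used only for this example, which assumes whole-number scores.

```python
def midpoint(lower, upper):
    """Mid-point of a class interval: (max limit + min limit) / 2."""
    return (lower + upper) / 2


def exact_limits(lower, upper):
    """Exact limits: subtract 0.5 from the lower limit, add 0.5 to the upper limit."""
    return (lower - 0.5, upper + 0.5)


print(midpoint(90, 94))    # mid-point of the C.I. 90 - 94
print(exact_limits(3, 5))  # exact limits of the C.I. 3 - 5
```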
Frequency:
This is the number of times a score repeats itself in any given distribution.
Statistically, the number of times a score is repeated is called its frequency which is denoted by the symbol (f).
Example, 10, 12, 13, 10, 15, 17, 10, 19, 10. It means that 10 repeats itself 4 times.
Frequency Distribution
When the scores in a given set of data are arranged according to their size or magnitude, together with the number of
times each score occurs, the arrangement is known as a frequency distribution.
Series:
A well-organized form of C.I. is called series. They are of two types:-
1. Exclusive Series
Here the upper limit of the class interval is not considered in the same C.I. but in the next class interval. In
other words, we exclude the upper limit of the C.I.
Example,
C .I.
40 - 50
30 - 40
20 - 30
10 - 20
2. Inclusive Series
Here the upper limit of the C.I. is considered in the same C.I. In other words, both limits of the C.I. are
included in the same C.I.
Example,
40 – 49
30 – 39
20 – 29
10 – 19
Tallies:
Tallies denote the number of scores included in a particular group or C.I.
Frequency Distribution Table:

This is a table in which the total range of data is divided / classified into class intervals of such size as it makes
the table to be understood without distorting the overall character of the data.
TOPIC 2: DATA ORGANIZATION

- Before applying any statistical test in any data, the researcher has to organize the data in a way that facilitates
further analysis.
- Basically, there are two common ways of organizing or presenting data:-
1. Simple / Ungrouped Frequency Distribution Table for ungrouped data.
2. Grouped frequency distribution table for the grouped data.
Simple / Ungrouped Frequency Distribution Table
• This table is used when scores / data are arranged according to their size or magnitude.
• Here the scores are put in an ordered manner, usually in descending order, and counted according
to their frequency of occurrence.
Suppose a group of 12 students were administered a Continuous Assessment Test in Tests, Measurement and
Evaluation and their scores were given as indicated below;
22 25 30 35
30 28 32 31
27 30 31 29
The scores provided above can be arranged into a frequency distribution table.
The scores are arranged in descending order so that the highest score is placed at the top and the lowest score is
placed at the bottom.
The investigator carefully counts the number of times each score has occurred in the total distribution and this
number is written against that score in the column of frequency.
For example, a score of 35 has occurred once, hence, one has been written against this score in frequency (f)
columns.
A score of 30 has occurred three times, hence 3 has been written in the frequency column.
Table 1: Frequency distribution of the scores shown above
Scores f
35 1
34 0
33 0
32 1
31 2
30 3
29 1
28 1
27 1
26 0
25 1
24 0
23 0
22 1
N = 12
Properties of the Class Intervals

There are several characteristics of class intervals, which must be kept in mind by any researcher who is
converting the scores into the frequency distribution table.
1. The class intervals must be such that a single score must not belong to more than one class interval. In other
words, the given set of class intervals must be mutually exclusive.
2. All class intervals must be of the same size.
3. The class intervals should be continuous throughout the distribution. For example, some rows in Table 1 have a
frequency of 0, yet they are retained in the table.
It would be unwise to break the frequency distribution table into two at this point.
Not only this, it would cause further difficulties in calculation of statistics.
4. Ordinarily, the number of class intervals should not be fewer than 10 or more than 20. However, this is
not a must.
5. The class interval containing the highest score should be placed at the top. This rule is not rigidly
followed by some experts, but it is conventional and saves the trouble of reading and learning a new layout for each
frequency distribution table.
Steps in Constructing Ungrouped Distribution Data

1. Arrange the scores in descending order so that the highest score is placed at the top and the lowest score is
placed at the bottom (score column).
2. Count the number of times each score occurs in the distribution and write that number against that score in
the frequency column (f).
3. Add all the frequencies in the frequency column to give you N.
4. Check your answer to confirm that your N equals the number of cases given.
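The four steps above can be sketched in Python as follows. This is only an illustration using the 12 Continuous Assessment Test scores given earlier; the function name frequency_table is hypothetical, and the sketch assumes whole-number scores so that every score between the highest and the lowest can be listed, including those with a frequency of 0.

```python
from collections import Counter

# The 12 CAT scores from the example above
scores = [22, 25, 30, 35, 30, 28, 32, 31, 27, 30, 31, 29]


def frequency_table(scores):
    """Return (score, f) pairs from the highest score down to the lowest."""
    counts = Counter(scores)  # step 2: count how often each score occurs
    # step 1: descending order, keeping scores that did not occur (f = 0)
    return [(s, counts[s]) for s in range(max(scores), min(scores) - 1, -1)]


table = frequency_table(scores)
for score, f in table:
    print(score, f)

n = sum(f for _, f in table)  # step 3: add the frequencies to get N
print("N =", n)               # step 4: N should equal the number of cases
```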
Limitation of Ungrouped Distribution

- This method cannot be utilized in situations where the number of scores is large / huge.
Example,
Suppose the highest score is 100 and the lowest score 10 in a distribution. In such a situation, it will be highly
inconvenient to group the data according to the above method.
Therefore to deal with such a situation it is often convenient to group the scores into class intervals, in addition
to ordering them.
TOPIC 4: MEASURES OF CENTRAL TENDENCY

Introduction
- Measures of average are also known as measure of central tendency.
- The purpose is to provide a single numerical score which may describe or represent the entire distribution.
Example,
Let five (5) students team up to do a project. Since they stay in the same estate, but far apart from
one another, the students agree to meet at a friend's house, which according to them is the most
central place or point.
(Positions of the students' houses: 9 John, 10 Janes, 11 Janet, 12 Juma, 13 Jane)
Thus they all agree to be meeting at Janet’s house since she stays at the Central place according to their
measurements. Janet therefore tends to be the center for all or their representative.
However, note that the score / person at the centre may not necessarily be the centre as such but because he / she
is surrounded by the rest of the scores / persons, therefore takes charge / centre stage.
Definition
It is the tendency of the scores to concentrate or to bunch somewhere near the centre. It is that value which typifies
or best represents the whole distribution (Ross, C.C.).
Need for such measures
Measures of central tendency give a bird’s eye view of the huge mass of statistical data which ordinarily are not
easily interpretable.
The value of a measure of central tendency is that;
• It is an average which represents all the scores in the group.
• It is a score that helps to compare two or more groups in terms of typical performance (Garrett).
There are three measures of central tendency:-
Mean (M), Median (Md) and Mode (Mo)
The mean is usually referred to as “the average”.
MEAN
• The mean of a series of values is the quotient of the sum of the values by their number.
• Mean is the sum of a set of measurements divided by the number of measurements in the set.
• Mean is the sum of data divided by the number of subjects in the set.
• The mean of a distribution of scores is the value on the score scale corresponding to the sum of the scores
divided by their number or size of sample.
Computation
1. Ungrouped Data
Calculate mean from the following data;
7, 10, 8, 13, 11, 14, 9, 9, 13, 15
Formula for ungrouped data:
M = ∑X / N
where
M = Mean
∑X = Sum of the scores
N = Total number of cases or scores
Therefore, ∑X = 7 + 10 + 8 + 13 + 11 + 14 + 9 + 9 + 13 + 15 = 109
So, Mean = ∑X / N = 109 / 10 = 10.9
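The same calculation can be checked with a few lines of Python. This is a minimal sketch; the variable names are chosen only for illustration.

```python
scores = [7, 10, 8, 13, 11, 14, 9, 9, 13, 15]

total = sum(scores)   # ∑X
n = len(scores)       # N
mean = total / n      # M = ∑X / N

print(total, n, mean)
```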
2. Grouped Data
In the case of grouped data, mean is calculated using two methods.
- Long method
- Short / Assumed Mean method
Formula for the long method:
M = ∑fX / N
where
M = Mean
N = Total number of scores / cases
fX = Product of the mid-point (X) of a C.I. and its respective frequency (f)
Example;
Calculate the mean of the following scores using the long method.
CI f
100 – 109 5
90 – 99 9
80 – 89 14
70 – 79 19
60 – 69 21
50 – 59 30
40 – 49 25
30 – 39 15
20 – 29 10
10 – 19 8
0-9 6
N = 162
Procedure:-
With the data provided above, calculate the midpoints (X column) and multiply each by the corresponding frequency
(f column), thus creating the fX column as shown below.
CI f X (Mid-Points) fx
100 – 109 5 104.5 522.5
90 – 99 9 94.5 850.5
80 – 89 14 84.5 1,183
70 – 79 19 74.5 1,415.5
60 – 69 21 64.5 1,354.5
50 – 59 30 54.5 1,635
40 – 49 25 44.5 1,112.5
30 – 39 15 34.5 517.5
20 – 29 10 24.5 245
10 – 19 8 14.5 116
0-9 6 4.5 27
N = 162 ∑fX = 8,979
Therefore, Mean = ∑fX / N = 8,979 / 162 = 55.43
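The long method can be sketched in Python as shown below. The function name grouped_mean and the (lower limit, upper limit, frequency) layout are assumptions made for this illustration; the data are the class intervals from the example above.

```python
# (lower limit, upper limit, frequency) for each class interval
intervals = [
    (100, 109, 5), (90, 99, 9), (80, 89, 14), (70, 79, 19),
    (60, 69, 21), (50, 59, 30), (40, 49, 25), (30, 39, 15),
    (20, 29, 10), (10, 19, 8), (0, 9, 6),
]


def grouped_mean(intervals):
    """Long method: M = ∑fX / N, where X is the mid-point of each C.I."""
    n = sum(f for _, _, f in intervals)
    sum_fx = sum(f * (lower + upper) / 2 for lower, upper, f in intervals)
    return n, sum_fx, sum_fx / n


n, sum_fx, mean = grouped_mean(intervals)
print(n, sum_fx, round(mean, 2))
```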
Assumed Mean / Short Method
Introduction
The straightforward method, called the long method, gives an accurate result but often requires handling large
numbers with tedious calculations. Because of that, the “Assumed Mean” method has been devised for computing the
mean.
In calculating the mean by the short method, we “guess” or “assume” a mean and later apply a correction
to this assumed value (AM) in order to obtain the actual mean (M).
There is no set rule for assuming a mean. The best way is to take the midpoint of an interval somewhere near the
centre of the distribution; and if possible the midpoint of that interval which has the largest frequency (f).
Example,
Calculate the mean for the following frequency distribution using the short method.
Class interval (scores) Frequency (f)
100 – 109 5
90 – 99 9
80 – 89 14
70 – 79 19
60 – 69 21
50 – 59 30
40 – 49 25
30 – 39 15
20 – 29 10
10 – 19 8
0-9 6
N = 162
Formula
Mean = AM + C
Note that;
AM - Assumed Mean
C - Correction
Procedure
Calculate the Midpoints of each class interval and locate the assumed mean at the class interval with the highest
frequency.
- In the table provided above, the largest (f) is on interval 50-59, which also happens to be almost at the
centre of the distribution. Therefore, the Assumed Mean (AM) is taken at 54.5.
CI f X (Mid-Points)
100 – 109 5 104.5
90 – 99 9 94.5
80 – 89 14 84.5
70 – 79 19 74.5
60 – 69 21 64.5
50 – 59 30 54.5 (Assumed Mean)
40 – 49 25 44.5
30 – 39 15 34.5
20 – 29 10 24.5
10 – 19 8 14.5
0-9 6 4.5
The question of the Assumed Mean (AM) settled, we determine the correction which must be applied to the AM
in order to get Mean (M).
APPROACH 1
Steps
- We fill in the x’ column. Here we enter the deviations of the mid-point (MP) of each class interval from
the Assumed Mean (AM), in score units. Thus 64.5, the midpoint of 60-69, deviates from 54.5, the assumed
mean (AM), by 10 score units (one interval), and “10” is placed in the x’ column opposite 64.5, and so on.
- The x’ column completed, we compute the fx’ column. Here each x’ is multiplied or “weighted” by the
appropriate frequency (f). All fx’ on the intervals above (greater than) the AM are positive, and all fx’ on
intervals below (smaller than) the AM are negative, since the signs of the fx’ depend upon the signs of the x’.
- From the fx’ column the correction (C) is obtained.
- The sum of the positive values in the fx’ column is +1620, and the sum of the negative values is -1470.
- The algebraic sum is therefore +150, and +150 divided by 162 (N) gives 0.926 or 0.93, which is the
correction (C) in score units.
Scores M.P. f x’ fx’
100 – 109 104.5 5 50 250
90 – 99 94.5 9 40 360
80 – 89 84.5 14 30 420
70 – 79 74.5 19 20 380
60 – 69 64.5 21 10 210 (sum of positive fx’ = +1620)
50 – 59 54.5 30 0 0
40 – 49 44.5 25 -10 -250
30 – 39 34.5 15 -20 -300
20 – 29 24.5 10 -30 -300
10 – 19 14.5 8 -40 -320
0 – 9 4.5 6 -50 -300 (sum of negative fx’ = -1470)
N = 162 ∑fx’ = +150
- When the correction (C) is added to the Assumed Mean (AM), we get our Mean (M):
Assumed Mean + Correction = 54.5 + 0.93 = 55.43
Mn = 55.43
APPROACH II
Mean = A.M. + Ci (Assumed Mean + Correction x the length of the Class Interval)
Note:
Here the deviations (x’) are expressed in units of class interval, so the correction C = 15 / 162 = 0.093 is also in
interval units. If we multiply C (0.093) by i, the length of the interval (here 10), the result is Ci (0.93), the score
correction, or the correction in score units.
When 0.93 is added to 54.5, the Assumed Mean (AM), the result is the actual Mean, 55.43.
Procedure
Scores M.P. f x’ fx’
100 – 109 104.5 5 5 25
90 – 99 94.5 9 4 36
80 – 89 84.5 14 3 42
70 – 79 74.5 19 2 38
60 – 69 64.5 21 1 21 (sum of positive fx’ = +162)
50 – 59 54.5 30 0 0
40 – 49 44.5 25 -1 -25
30 – 39 34.5 15 -2 -30
20 – 29 24.5 10 -3 -30
10 – 19 14.5 8 -4 -32
0 – 9 4.5 6 -5 -30 (sum of negative fx’ = -147)
N = 162 ∑fx’ = +15
Mn = A.M. + Ci
C = ∑fx’ / N = 15 / 162 = 0.093
i = 10
Ci = 0.093 x 10 = 0.93
Mn = 54.5 + 0.93
Mn = 55.43
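The assumed mean (short) method of Approach II can be sketched in Python as follows. The function name assumed_mean_method is hypothetical; the sketch assumes equal class intervals of length i = 10 and takes the Assumed Mean at 54.5, as in the worked example.

```python
# (lower limit, upper limit, frequency) for each class interval
intervals = [
    (100, 109, 5), (90, 99, 9), (80, 89, 14), (70, 79, 19),
    (60, 69, 21), (50, 59, 30), (40, 49, 25), (30, 39, 15),
    (20, 29, 10), (10, 19, 8), (0, 9, 6),
]


def assumed_mean_method(intervals, assumed_mean, width):
    """Short method: Mean = AM + C * i, with C = ∑fx' / N in interval units."""
    n = sum(f for _, _, f in intervals)
    sum_fx_dev = 0.0
    for lower, upper, f in intervals:
        mid = (lower + upper) / 2
        x_dev = (mid - assumed_mean) / width  # deviation in units of class interval
        sum_fx_dev += f * x_dev
    correction = sum_fx_dev / n               # C
    mean = assumed_mean + correction * width  # AM + Ci
    return sum_fx_dev, correction, mean


sum_fx_dev, correction, mean = assumed_mean_method(intervals, 54.5, 10)
print(sum_fx_dev, round(correction, 3), round(mean, 2))
```

Both the long method and the short method should give the same mean, which is a useful check on the arithmetic.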
MEDIAN
Median is that point on the scale of scores above which 50% of the scores lie or fall and below which 50% of the
scores lie or fall.
The median occupies the middle position in the distribution of the scores.
According to Lindquist “Median” is that point in the scale of scores below which half of the scores (i.e. 50%) lie
and above which another half of the scores (50%) lie.
Median is that point which divides the whole distribution into two equal halves i.e; 50% above and 50% below.
For this reason it is called the balancing point of the distribution.
Calculating the median
Ungrouped data
Formula
Mdn = ((N + 1) / 2)th term
where Mdn = Median and N = Total number of scores
Calculate median for the following scores:-
23, 23, 22, 20, 19, 17, 16, 15, 15, 25, 18, 13, 28
Illustration
Solution:-
Re-arranging the given scores in ascending order gives us;
13, 15, 15, 16, 17, 18, 19, 20, 22, 23, 23, 25, 28
Now applying the formula:
Mdn = ((N + 1) / 2)th term
Mdn = ((13 + 1) / 2)th term
Mdn = (14 / 2)th term
Mdn = 7th term
Mdn = 19 (count from left to right to get the 7th term).
Hence, Mdn = 19
Calculation of the Median when Data are ungrouped:

When ungrouped scores / data or other measures are in place the median is the midpoint in the series.
Two situations arise in the computation of the median from ungrouped data:
(a) When N is odd.
(b) When N is even.
When N is odd:
Suppose we have the following integral “mental ages”: 7, 10, 8, 12, 9, 11, 7, calculated from seven performance
tests. If we arrange these seven scores in order of size, we get 7, 7, 8, (9), 10, 11, 12.
The median is 9.0, since 9.0 is the midpoint of the score which lies midway in the series.
Calculation:
There are three scores above, and three scores below 9, and since a score of 9 covers the interval 8.5 to 9.5, its
midpoint is 9.0.
This is the median.
When N is even:
Now if we drop the first score of 7, the series contains six scores:
7, 8, 9, 10, 11, 12
Therefore the median is 9.5.
Counting three scores from the beginning of the series, we complete score 9 (which is 8.5 to 9.5) to reach 9.5, the
upper limit of score 9.
In like manner, counting three scores from the end of the series, we move through score 10 (9.5 to 10.5), reaching
9.5, the lower limit of score 10.
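Formula:
Median = the ((N + 1) / 2)th measure in order of size
First case (N = 7):
Mdn = the (7 + 1) / 2 = 8 / 2 = 4th measure, counting from either end of the series, that is, 9.0 (the midpoint of 8.5 to 9.5).
Second case (N = 6):
Mdn = the (6 + 1) / 2 = 3.5th measure, that is, the point midway between the 3rd and 4th measures, which is 9.5.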
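The ((N + 1) / 2)th-measure rule for ungrouped data can be sketched in Python as follows. The function name ungrouped_median is hypothetical; when the position falls between two measures (N even), the sketch takes the point midway between them.

```python
import math


def ungrouped_median(scores):
    """Median as the ((N + 1) / 2)th measure in order of size."""
    ordered = sorted(scores)
    position = (len(ordered) + 1) / 2            # 1-based position in the series
    below = ordered[math.floor(position) - 1]    # measure at or just below the position
    above = ordered[math.ceil(position) - 1]     # measure at or just above the position
    return (below + above) / 2


print(ungrouped_median([7, 10, 8, 12, 9, 11, 7]))  # N is odd
print(ungrouped_median([7, 8, 9, 10, 11, 12]))     # N is even
print(ungrouped_median([23, 23, 22, 20, 19, 17, 16, 15, 15, 25, 18, 13, 28]))
```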
TOPIC 5: GROUPED DATA

Formula
Mdn = l + ((N/2 - f) / fm) x i
where
l = Exact lower limit of the class interval in which the median lies
N/2 = One-half the total number of scores
f = Sum of all the frequencies lying below the class interval in which the median lies
fm = Frequency within the interval in which the median lies
i = Length of the class interval
Illustration
Scores f X (MP)
100 – 109 5 104.5
90 – 99 9 94.5
80 – 89 14 84.5
70 – 79 19 74.5
60 – 69 21 64.5
50 – 59 30 54.5
40 – 49 25 44.5
30 – 39 15 34.5
20 – 29 10 24.5
10 – 19 8 14.5
0-9 6 4.5
N = 162
Mdn = l + ((N/2 - f) / fm) x i
N/2 = 162 / 2 = 81, so the median lies in the interval 50 – 59
l = 49.5
f = 6 + 8 + 10 + 15 + 25 = 64 (the frequencies below the interval 50 – 59)
fm = 30
i = 10
Mdn = 49.5 + ((81 - 64) / 30) x 10
Mdn = 49.5 + (17 / 30) x 10
Mdn = 49.5 + (0.567) x 10
Mdn = 49.5 + 5.67
Mdn = 55.17
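The grouped-data median formula can be sketched in Python as follows. The function name grouped_median is hypothetical; the sketch assumes whole-number class limits (so the exact lower limit is the lower limit minus 0.5) and equal class intervals.

```python
# (lower limit, upper limit, frequency) for each class interval
intervals = [
    (100, 109, 5), (90, 99, 9), (80, 89, 14), (70, 79, 19),
    (60, 69, 21), (50, 59, 30), (40, 49, 25), (30, 39, 15),
    (20, 29, 10), (10, 19, 8), (0, 9, 6),
]


def grouped_median(intervals, width):
    """Mdn = l + ((N/2 - f) / fm) * i for a grouped frequency distribution."""
    n = sum(f for _, _, f in intervals)
    half = n / 2
    cumulative = 0  # f: sum of the frequencies below the current interval
    for lower, upper, fm in sorted(intervals):  # work up from the lowest interval
        if cumulative + fm >= half:             # the median lies in this interval
            exact_lower = lower - 0.5           # l
            return exact_lower + (half - cumulative) / fm * width
        cumulative += fm
    return None  # only reached if no intervals are supplied


median = grouped_median(intervals, 10)
print(round(median, 2))
```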
MODE
Mode is the score which occurs with greatest frequency.
According to Crow and Crow, “The score in a given set of data that appears most frequently is called the mode.”
In a simple ungrouped data, mode is that single measure which occurs most frequently.
It is represented by (Mo).
Mode is the most common item of a series. It represents the most typical value of a series. A value which is in
fashion. One speaks of the average student, the most common game, the common man, or the typical problem, etc.
Computation
Ungrouped Data
Compute mode from the following data:
8, 9, 9, 13, 14, 17, 16, 17, 16, 18, 20, 17
From the scores given above, 17 is the most often recurring measure and therefore it is the crude mode.
Grouped Data
For grouped data, a simple formula for approximating the true mode, when the frequency distribution is symmetrical
or at least not badly skewed, is:-
Mode = 3 median – 2 mean
That is, mode equals three times the median minus two times the mean.
Example;
Mode = 3mdn - 2mn
= 3 (55.17) – 2 (55.43)
= 165.51 – 110.86
Mode = 54.65
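Both ways of finding the mode can be sketched in Python as follows. The crude mode is read off with collections.Counter, and the function name empirical_mode is a hypothetical label for the formula Mode = 3 median – 2 mean; the median and mean used are the values obtained earlier for the grouped distribution.

```python
from collections import Counter

# Ungrouped data: the crude mode is the most frequently occurring score
scores = [8, 9, 9, 13, 14, 17, 16, 17, 16, 18, 20, 17]
crude_mode, frequency = Counter(scores).most_common(1)[0]
print(crude_mode, frequency)


def empirical_mode(median, mean):
    """Approximate mode of a roughly symmetrical distribution: 3 Mdn - 2 Mn."""
    return 3 * median - 2 * mean


# Grouped data: median and mean computed earlier for the same distribution
mode = empirical_mode(55.17, 55.43)
print(round(mode, 2))
```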
APPLICATIONS OF THE MEASURES OF CENTRAL TENDENCY
WHEN TO USE:-
MEAN
- When the distribution is normal. Meaning that when some of the scores are missing, we should not use the
mean.
- When we are concerned with a representative score.
- When we want to gather correct and real information.
- When scores are scattered.
- When we have to compute the mode.
- When we have to know the exact mid-point.
MEDIAN
- When a quick and easily computed measure is desired.
- When distributions are badly skewed i.e; when one or more extreme measurements are at one side of the
distribution.
- When an incomplete distribution is given, i.e. some of the C.I. have no frequency.
- When we are not concerned with accuracy.
- When we have to know the exact mid-point.
MODE
- When the distribution is incomplete
- When the rough estimate of central tendency is required.
- When we want to know the fashion or the most recurring measure.
- When the quickest estimate of average is desired.
TOPIC 6: MEASURES OF VARIATION

Measures of variation are also known as measures of variability, measures of dispersion, and measures of spread.
According to Fisher, R.A. (1948), statistics is the study of variation. The statistician is greatly concerned with the
variation in the events of nature. This variation is of two types:-
1. Between and among variation and
2. Within variation.
1. Between or among variation is studied with the help of measures of central tendency that is; mean, median
and mode. This concept has been illustrated by the following example;
Groups Scores Means

A. 10, 12, 14, 16, 18 Mn = 14
B. 0, 0, 0, 0, 70 Mn = 14
C. 14, 14, 14, 14, 14 Mn = 14
The above illustration suggests that there is no difference among the three groups, since the mean is the same.
However, looking at the scores it is clear that the groups do differ. This difference is variation of the second
type.
2. Within variation among the groups. The sets of scores above indicate that there is no difference among the
three groups but there is variation within the groups. Group C is homogeneous while group B is heterogeneous
or the scores have spread across. It may therefore be said that group C has no variation while group B has
high variation. This homogeneity or within variation of the three groups can be studied with the help of
measures of variability.
It may be stated that measures of central tendency and measures of variability are complementary in studying
between and within variation of the groups and samples.
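The point made by groups A, B and C can be checked with a short Python sketch. It compares the mean of each group with its range, taken here as the maximum score minus the minimum score; the dictionary layout is an assumption made for this illustration.

```python
groups = {
    "A": [10, 12, 14, 16, 18],
    "B": [0, 0, 0, 0, 70],
    "C": [14, 14, 14, 14, 14],
}

# For each group: (mean, range), where range = maximum score - minimum score
summary = {
    name: (sum(scores) / len(scores), max(scores) - min(scores))
    for name, scores in groups.items()
}

for name, (mean, score_range) in summary.items():
    print(name, mean, score_range)
```

The three means are identical, while the ranges differ widely, which is why a measure of central tendency needs to be read together with a measure of variability.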
This characteristic of a distribution of scores is variously referred to as dispersion, spread, scatter, deviation,
homogeneity or heterogeneity, and variability or variation. Theoretically, it is also known as sampling error.
Measures of variability have been classified into four as follows:-
1. The range
2. The Quartile deviation or Q
3. The Mean deviation or average deviation MD/AD
4. The Standard Deviation (σ / SD) or Variance (σ2)
1. The Range
It is a very simple and quick measure of variability.
The range takes account only of the extreme scores in a given distribution. For this reason, it is a very
weak measure of variation.
Limitation of Range.
It does not indicate the variability of all sets of scores because it considers only the highest and lowest scores,
when other scores are not taken into consideration.
Illustration.
(1) 0, 5, 5, 5, 5, 55
(2) 0, 7, 10, 29, 35, 55
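The ranges of the two groups are as follows (1 is added here so that both end scores are counted):
R1 = (55 – 0) + 1 = 56
R2 = (55 – 0) + 1 = 56
Both groups have the same range, although the scores within them are distributed very differently.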
THE QUARTILE DEVIATION (Q)

The quartile deviation (Q) is one-half the scale distance between the 75th and 25th percentiles in a frequency
distribution. The 25th percentile or Q1 is the first quartile on the score scale, the point below which lie 25% of
the scores. The 75th percentile or Q3 is the third quartile on the score scale, the point below which lie 75% of the
scores.
Formula.
Q = (Q3 - Q1) / 2
To find Q, it is clear that we must first compute the 75th and 25th percentiles.
Formula for Q1,

Q1 = l + i (N/4 - Cumfi) / fq
where
l - is the exact lower limit of the interval in which the quartile falls.
i - is the length of the class interval.
Cumfi - is the cumulative frequency up to the class interval in which the quartile falls.
fq - is the frequency of the class interval in which the quartile falls.
Formula for Q3:
Q3 = l + i (3N/4 - Cumfi) / fq
Illustration
The following is a grouped frequency distribution;
C.I. (scores) f
195 – 199 1
190 – 194 2
185 – 189 4
180 – 184 5
175 – 179 8
170 – 174 10 (cumulative f = 30)
165 – 169 6
160 – 164 4
155 – 159 4 (cumulative f = 10)
150 – 154 2
145 – 149 3
140 – 144 1
N = 50
Quartile 1 (Q1)
Procedure;
- To find Q1, one has to calculate ¼ of 50 which is 12.5.
- Add the frequencies in the frequency column, starting from the lower end and going up to the point where 12.5 is
likely to fall.
- When the frequencies are added in order, the first 4 intervals (140 – 144 through 155 – 159) contain
10 scores. This takes us up to 159.5, the exact lower limit of the class interval 160 – 164.
- Q1 must fall in the next interval (160 – 164), which contains 4 scores in the frequency column.
Therefore;
Q1 = 159.5 + 5 (12.5 – 10) / 4
= 159.5 + 5 (2.5 / 4)
= 159.5 + 5 (0.625)
= 159.5 + 3.125
= 162.625
Q1 = 162.62 (the value used in the calculation of Q)
Quartile 3 (Q3)
- To find Q3, one has to calculate ¾ of 50 which is 37.5.
- Add the frequencies in the frequency column, starting from the lower end and going up to the point where 37.5 is
likely to fall.
- When the frequencies are added in order, the first 7 intervals (140 – 144 through 170 – 174) contain
30 scores. This takes us up to 174.5, the exact lower limit of the class interval 175 – 179.
- Q3 must fall in the next interval (175 – 179), which contains 8 scores in the frequency column.
Therefore;
Q3 = 174.5 + 5 (37.5 – 30) / 8
= 174.5 + 5 (7.5 / 8)
= 174.5 + 5 (0.9375)
= 174.5 + 4.6875
= 179.1875
Q3 = 179.19
Therefore, to calculate the quartile deviation (Q):
Q = (Q3 – Q1) / 2
Q = (179.19 – 162.62) / 2 = 16.57 / 2
Q = 8.28
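The quartile calculations above can be sketched in Python with a single helper. The function name quartile_point is hypothetical; it applies the formula l + i (fraction x N - Cumfi) / fq, with fraction = 1/4 for Q1 and 3/4 for Q3, and assumes whole-number class limits and equal class intervals of length 5.

```python
# (lower limit, upper limit, frequency) for each class interval
intervals = [
    (195, 199, 1), (190, 194, 2), (185, 189, 4), (180, 184, 5),
    (175, 179, 8), (170, 174, 10), (165, 169, 6), (160, 164, 4),
    (155, 159, 4), (150, 154, 2), (145, 149, 3), (140, 144, 1),
]


def quartile_point(intervals, width, fraction):
    """Point on the score scale below which the given fraction of scores lie."""
    n = sum(f for _, _, f in intervals)
    target = fraction * n                       # e.g. N/4 for Q1, 3N/4 for Q3
    cumulative = 0                              # Cumfi
    for lower, upper, fq in sorted(intervals):  # work up from the lowest interval
        if cumulative + fq >= target:           # the quartile falls in this interval
            return (lower - 0.5) + width * (target - cumulative) / fq
        cumulative += fq
    return None  # only reached if no intervals are supplied


q1 = quartile_point(intervals, 5, 0.25)
q3 = quartile_point(intervals, 5, 0.75)
q = (q3 - q1) / 2  # quartile deviation
print(q1, q3, round(q, 2))
```

Because the sketch does not round Q1 and Q3 before subtracting, its unrounded value of Q may differ very slightly in the third decimal place from a hand calculation that uses rounded quartiles.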
THE AVERAGE DEVIATION (AD) / MEAN DEVIATION (MD)

Un-grouped Data
The average deviation (AD) or Mean deviation (MD) is the mean of the deviations of all of the scores in a series
taken from their mean.
In averaging deviations to find the AD, no account of signs is taken, instead all the deviations whether plus or
minus are treated as positive.
Example:-
The mean of scores for 5 students is 10.
Name Scores
Jane 6
James 8
Janet 10
Mauren 12
Juma 14
Calculate the average deviation for the five students mentioned above.
Formula:
AD / MD = ∑|x’| / N
Procedure;
1. Create a column for the mean (x) as shown below.
2. Create a column for the deviation of each score from the mean (x’) as shown below.
NOTE: To get the deviation (x’), subtract the mean from the corresponding score.
Name score x x’
Jane 6 10 -4
James 8 10 -2
Janet 10 10 0
Mauren 12 10 +2
Juma 14 10 +4
12
NOTE that when summing up the scores in the x deviation column as indicated above, it gives +12
AD / MD = ∑ I x’I
N
=
12
5
AD / MD = 2.4
Grouped Data:
Below are scores for a grouped frequency distribution.
C.I.        f
195 – 199 1
190 – 194 2
185 – 189 4
180 – 184 5
175 – 179 8
170 – 174 10
165 – 169 6
160 – 164 4
155 – 159 4
150 – 154 2
145 – 149 3
140 – 144 1
N = 50
Mean = 170.80
Calculate the average deviation for the data provided above.

1. The first step is to calculate the mean for the data when the mean is not provided.
2. The second step is to create the x column for the midpoints of the class intervals, as shown below.
3. The third step is to create the x’ (deviation) column: the midpoint in the x column minus the mean, as
shown below.
4. The fourth step is to create the fx’ column: the frequency in the f column multiplied by the deviation in
the x’ column, as shown below.
5. The fifth step is to sum all the values in the fx’ column without regard to their signs, as shown below.
Formula:
AD or MD = ∑ |fx’| / N
Illustration;
C.I.        x (M.P.)   f     x’       fx’
195 – 199   197        1     26.20    26.20
190 – 194   192        2     21.20    42.40
185 – 189   187        4     16.20    64.80
180 – 184   182        5     11.20    56.00
175 – 179   177        8     6.20     49.60
170 – 174   172        10    1.20     12.00
165 – 169   167        6     -3.80    -22.80
160 – 164   162        4     -8.80    -35.20
155 – 159   157        4     -13.80   -55.20
150 – 154   152        2     -18.80   -37.60
145 – 149   147        3     -23.80   -71.40
140 – 144   142        1     -28.80   -28.80
N = 50                       ∑ |fx’| = 502.00
Therefore;
AD/MD = 502.00 / 50
AD/MD = 10.04
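The two average deviation calculations above can be repeated in a few lines of code. This is a minimal sketch; the function names mean_deviation and grouped_mean_deviation are our own labels and are not part of the module.

```python
def mean_deviation(scores):
    """Mean of the absolute deviations of raw scores from their mean."""
    mean = sum(scores) / len(scores)
    return sum(abs(x - mean) for x in scores) / len(scores)


def grouped_mean_deviation(midpoints, freqs):
    """Same measure for a frequency table: each |x'| is weighted by f."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / n


# Midpoints 142, 147, ..., 197 and their frequencies, lowest interval first.
midpoints = list(range(142, 198, 5))
freqs = [1, 3, 2, 4, 4, 6, 10, 8, 5, 4, 2, 1]

ad_ungrouped = mean_deviation([6, 8, 10, 12, 14])
ad_grouped = grouped_mean_deviation(midpoints, freqs)
print(round(ad_ungrouped, 2), round(ad_grouped, 2))
```

The first value corresponds to the five students (2.4) and the second to the grouped distribution (10.04). Note that abs() does the work of "taking no account of signs".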
THE STANDARD DEVIATION (SD/ σ)
The Standard Deviation is the most widely used measure of variability because;
1. It gives opportunity for further statistical calculations.
2. It varies least when calculated from sets of scores obtained from different samples drawn at random from
the same population.
3. It is less affected by sampling errors and is therefore more stable.
Operationally, the standard deviation is the square root of the arithmetic mean of the squared deviations
of scores taken from their mean. The conventional symbol for the Standard Deviation is the Greek letter
sigma (σ).
Ungrouped data:
Formula;
SD / σ = √( ∑ x’² / N )
Example:-
The mean of scores for 5 students is 10.
Name Scores
Jane 6
James 8
Janet 10
Mauren 12
Juma 14
Calculate the Standard Deviation for the five students mentioned above.
Procedure;
1. Create a column for the mean as shown below.
2. Create a column for the deviation (x’) as shown below.
3. Create a column for the squared deviation (x’²) as shown below.
4. Squaring removes the signs, so every squared deviation is positive whether the deviation was plus or minus.
NOTE: To get the squared deviation (x’²), square each score in the x’ column as shown below;
Name     Score   Mean   x’    x’²
Jane     6       10     -4    16
James    8       10     -2    4
Janet    10      10     0     0
Mauren   12      10     +2    4
Juma     14      10     +4    16
                 ∑ |x’| = 12   ∑ x’² = 40
NOTE that summing the scores in the x’² column as indicated above gives 40.
SD / σ = √( ∑ x’² / N )
σ = √( 40 / 5 ) = √8
σ = 2.83
Grouped Data:
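Formula;
SD / σ = √( ∑ x’fx’ / N )
where x’fx’ is the deviation multiplied by fx’, that is, the frequency multiplied by the squared deviation (fx’²).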
The following are scores of a grouped frequency distribution;
C.I. f
195 – 199 1
190 – 194 2
185 – 189 4
180 – 184 5
175 – 179 8
170 – 174 10
165 – 169 6
160 – 164 4
155 – 159 4
150 – 154 2
145 – 149 3
140 - 144 1
M = 170.80 N= 50
Calculate the Standard Deviation for the data provided above.

Procedure;
Step 1
- Calculate the mean for the frequency distribution given (when the mean is not given).
Step 2
- Create a column for the midpoints (x) as shown below.
Step 3
- Create a column for the deviation x’ (the difference between the score in the x column and the mean).
Step 4
- Create a column for fx’ (the score in the f column multiplied by the score in the x’ column) as shown
below.
Step 5
- Create a column for x’fx’ (the score in the x’ column multiplied by the score in the fx’ column) as shown
below.
Step 6
- Calculate the sum of all the scores in the x’fx’ column as indicated below. Every product in this column is
positive, because x’ and fx’ always carry the same sign.
Illustration
C.I.        x (M.P.)   f     x’       fx’      x’fx’
195 – 199   197        1     26.20    26.20    686.44
190 – 194   192        2     21.20    42.40    898.88
185 – 189   187        4     16.20    64.80    1049.76
180 – 184   182        5     11.20    56.00    627.20
175 – 179   177        8     6.20     49.60    307.52
170 – 174   172        10    1.20     12.00    14.40
165 – 169   167        6     -3.80    -22.80   86.64
160 – 164   162        4     -8.80    -35.20   309.76
155 – 159   157        4     -13.80   -55.20   761.76
150 – 154   152        2     -18.80   -37.60   706.88
145 – 149   147        3     -23.80   -71.40   1699.32
140 – 144   142        1     -28.80   -28.80   829.44
M = 170.80             N = 50         ∑ x’fx’ = 7978.00
SD / σ = √( ∑ x’fx’ / N )
       = √( 7978.00 / 50 )
       = √159.56
SD / σ = 12.63
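The standard deviation procedure for both ungrouped and grouped data can be sketched in code as follows. The function names are our own, and the division is by N (not N – 1) because that is the formula used in this module.

```python
import math


def std_dev(scores):
    """Square root of the mean of the squared deviations (divides by N)."""
    mean = sum(scores) / len(scores)
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / len(scores))


def grouped_std_dev(midpoints, freqs):
    """Same formula for a frequency table: each x' squared is weighted by f."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    return math.sqrt(sum_sq / n)


midpoints = list(range(142, 198, 5))          # 142, 147, ..., 197
freqs = [1, 3, 2, 4, 4, 6, 10, 8, 5, 4, 2, 1]

sd_ungrouped = std_dev([6, 8, 10, 12, 14])
sd_grouped = grouped_std_dev(midpoints, freqs)
print(round(sd_ungrouped, 2), round(sd_grouped, 2))
```

The two results correspond to the worked answers 2.83 and 12.63. For raw scores, Python's standard library function statistics.pstdev applies the same divide-by-N formula and can be used as a cross-check.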
WHEN TO USE THE VARIOUS MEASURES OF VARIABILITY
USE THE RANGE:-
- When the data are too scant or too widely scattered to justify the computation of a more precise measure
of variability.
- When knowledge of extreme scores or total spread is required.
USE THE Q:-
- When the median is the measure of central tendency.
- When there are scattered or extreme scores which would influence the SD disproportionately.
- When the concentration around the median is of primary interest.
USE THE AD:-
- When it is desired to weight all deviations from the mean according to their size.
- When extreme deviations would influence the SD unduly.
USE THE SD:-
- When the statistic having the greatest stability is required.
- When extreme deviations should exercise a proportionally greater effect on the variability.
- When the coefficient of correlation and other statistics are to be computed subsequently.
VARIANCE
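- Variance measures the variation or spread of the scores from the mean of a distribution. It is the mean of
the squared deviations from the mean, that is, the square of the standard deviation.
- Subtracting the mean from each score in a distribution determines how far each score spreads from the
mean.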
Illustration;
Suppose in a class test; Ann scores 6, Betty 8, Maureen 10, James 12 and John 14. The mean for the five scores
is 10. What will be the variance?
Procedure
1. Subtract the mean from each score in the distribution; (x – m)
2. Square the differences.
3. Get the sum of the squared variations: ∑ (x – m)²
4. Divide the sum of the squared variations by N.
V = ∑ (x’)² / N
Student   Score   Mean   Variation (x’)   (x’)²
Ann       6       10     -4               16
Betty     8       10     -2               4
Maureen   10      10     0                0
James     12      10     +2               4
John      14      10     +4               16
                         ∑ (x – m)² =     40
V = 40 / 5
  = 8
TOPIC 7: STANDARD SCORES

Introduction
Two common defects of mental measurement that are often pointed out by critics are:-
1. Lack of absolute zero point.
2. Absence of uniform scaling units.
Measurement experts have tried to meet both these objections in a number of ways. One method is that of
treating the mean of the group as the zero point and the standard deviation of the group as the unit of scaling.
This method enables us to convert any score into the standard score also known as Z-score or sigma score.
Illustration;
In a group of fifty students the mean of English test is 47 while the standard deviation is 6. Three students A, B
and C scored 35, 50 and 65 respectively. Calculate their standard scores.
Formula;
Z-score = (x – x̄) / σ
x = obtained score
x̄ = mean of the group
σ = Standard deviation
Thus
1. The standard score for A will be;
(35 – 47) / 6 = -12 / 6 = -2
2. The standard score for B will be;
(50 – 47) / 6 = 3 / 6 = +0.5
3. The standard score for C will be;
(65 – 47) / 6 = 18 / 6 = +3
Continuation:-
The Z-score (Sigma) or Standard Score
- In describing a score in a distribution, its deviation from the mean expressed in standard deviations units
is often more meaningful than the score itself.
- The unit of measurement is the standard deviation.
Z = (x – x̄) / σ
Where x = raw score
x̄ = Mean
σ = Standard deviation
x’ = (x – x̄) = score deviation from the mean
                 Example A   Example B
x (raw score)    76          67
x̄ (mean)         82          62
σ                4           5
Example A: z = (76 – 82) / 4 = -6 / 4 = -1.50
Example B: z = (67 – 62) / 5 = 5 / 5 = +1.00
- The raw score of 76 in example A may be expressed as a z-score of -1.50, indicating that 76 is 1.5
standard deviations below the mean.
- The score of 67 in example B may be expressed as a sigma score of +1.00, an indication that 67 is one
standard deviation above the mean.
- In comparing or averaging scores in distributions where total points may differ, the researcher using
raw scores may create a false impression of a basis for comparison.
- A z-score makes possible a realistic comparison of scores and may provide a basis for equal weighting of
the scores.
- On the sigma scale the mean of any distribution is converted to zero and the standard deviation is equal to
1.
- For example, a teacher wishes to determine a student’s equally weighted average (mean) achievements
on an algebra test and on an English test.
Subject   Test Score   Mean   Highest Possible Score   Standard Deviation
Algebra   40           47     60                       5
English   84           110    180                      20
- It is apparent that the mean of the two raw test scores would not provide a valid summary of the student’s
performance for the mean would be weighted overwhelmingly in favor of the English test score.
- The conversion of each test score to a sigma score makes them equally weighted and comparable for both
test scores have been expressed on a scale with a mean of zero and a standard deviation of one.
Z = (x – x̄) / σ
Algebra z-score = (40 – 47) / 5 = -7 / 5 = -1.40
English z-score = (84 – 110) / 20 = -26 / 20 = -1.30
- On an equally weighted basis, the performance of the student was fairly consistent: 1.40 standard
deviations below the mean in algebra and 1.30 standard deviations below the mean in English.
- Because the normal probability table describes the percentage of area lying between the mean and
successive deviation units under the normal curve, the use of sigma scores has many other useful
applications to hypothesis testing, determination of percentile ranks and probability judgments.
The T Score (T)
Formula;
T = 50 + 10 (x – x̄) / σ   or   T = 50 + 10z
- The z-score is the most frequently used standard score, but it is sometimes awkward to have negatives or
scores with decimals.
- Therefore, another version of a standard score, the T-score has been devised to avoid some confusion
resulting from negative z-score (below the mean) and also to eliminate decimal values.
- Multiplying the z-score by 10 and adding 50 results in a scale of positive whole number values.
- Using the scores in the previous example,
T = 50 + 10z:
Algebra T = 50 +10 (-1.40) = 50 + (-14) = 36
English T = 50 + 10 (-1.30) = 50 + (-13) = 37
- T-scores are always rounded to the nearest whole number. A z-score of +1.27 would be converted to a T-
score of 63.
T = 50 + 10 (+1.27) = 50 + (+12.70) = 62.70 = 63
- Convert the z-score just calculated for the person selected from the sample into T- scores.
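The conversions from raw score to z-score and from z-score to T-score can be written as two small functions. This is a sketch with names of our own choosing (z_score, t_score). Rounding is done by adding 0.5 and taking the floor, so that a value ending in exactly .5 is rounded up; Python's built-in round() would instead round such a value to the nearest even number.

```python
import math


def z_score(raw, mean, sd):
    """Deviation of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd


def t_score(z):
    """T = 50 + 10z, rounded to the nearest whole number (halves go up)."""
    return math.floor(50 + 10 * z + 0.5)


# Students A, B and C on the English test (mean 47, SD 6)
for raw in (35, 50, 65):
    print(raw, z_score(raw, 47, 6))

# The algebra and English scores from the weighting example
algebra_t = t_score(z_score(40, 47, 5))
english_t = t_score(z_score(84, 110, 20))
print(algebra_t, english_t)
```

For the three students the z-scores are -2, +0.5 and +3, and the two T-scores are 36 and 37, as in the worked examples.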
Percentile Rank
- The percentile rank is the point in a distribution below which a given percentage of scores fall.
- If the 80th percentile rank is a score of 65, 80% of the scores fall below 65.
- The median is the 50th percentile rank, for 50% of the scores fall below it.
- When N is small, the definition needs an added refinement.
- The percentile rank is the score in the distribution below which a given percentage of the scores falls,
plus one half the percentage of space occupied by the given score.
Scores: 50 , 47, 43, 39, 30
- On inspection, it is apparent that 43 is the median, or occupies the 50th percentile rank.
- Fifty percent of the scores should fall below it, but in fact only two scores out of five fall below 43.
- This indicates that 43 has a percentile rank of 40. But by adding the phrase “plus one half the percentage
of space occupied by the score,” the calculation is reconciled.
- 40% of scores fall below 43; each score occupies 20% of the total space, and one half of that is 10%.
40% + 10% = 50 (true percentile rank).
- When N is large, percentile ranks are rounded to the nearest whole number, ranging from the highest
percentile rank of 99 to the lowest of zero.
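Formula;
Percentile rank = 100 – (100RK – 50) / N
where RK is the rank position counted from the top and N is the number of cases.
Illustration;
- Jones is ranked 27th in his senior class of 139 students. What will be his percentile rank?
Percentile rank = 100 – (2700 – 50) / 139 = 100 – 19 = 81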
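The percentile rank formula can be tried out with the short sketch below. The function name percentile_rank is our own, and rank 1 is assumed to be the top position, as in the illustration of Jones.

```python
def percentile_rank(rank, n):
    """100 - (100 * rank - 50) / n, where rank 1 is the highest position."""
    return 100 - (100 * rank - 50) / n


jones = percentile_rank(27, 139)      # ranked 27th out of 139 students
median_score = percentile_rank(3, 5)  # the score 43 is 3rd of the 5 scores
print(round(jones), median_score)
```

Jones comes out at about 80.9, which rounds to 81. The score of 43 in the five-score example comes out at exactly 50, which is the "plus one half" refinement described above.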
TOPIC 8: THE NORMAL DISTRIBUTION
Definition
A normal distribution is one in which the majority of cases / scores are located in the middle of the scale while a
small number of cases are located at both extremes of the scale.
Normal Probability Curve

If we divide the area under each curve [the area between the curve and the X axis] by a line drawn
perpendicularly through the central high point to the base line, the two parts thus formed will be similar in shape
and very nearly equal in area. It is clear, therefore, that each figure exhibits almost perfect bilateral symmetry.
This bell-shaped figure is called the normal probability curve, or simply the normal curve, and is of great value
in mental measurement.
Characteristics of a Normal Curve
- A normal curve is always symmetrical, that is, the right half of the curve is a mirror image of the left
half.
- A normal curve is unimodal and the mode is always at the centre of the distribution. In such a case, the
median, mean and mode are numerically identical meaning that they fall at the centre of the distribution.
- A normal curve is asymptotic to the x-axis. Hence a normal curve never touches the baseline, no matter
how far the curve is stretched.
- In a normal curve, the highest ordinate is at the centre. All other ordinates on both sides of the
distribution are smaller.
- A normal curve is continuous
Applications of the Normal Probability Curve

The normal probability curve can be applied under the following cases:-
- To determine the percentage of cases in a normal distribution within given limits.
- To find the limits in any normal distribution which include a given percentage of the cases.
- To compare two distributions in terms of “overlapping”
- To determine the relative difficulty of test questions, problems and other test items.
- To separate a given group into sub-group according to capacity, when the trait is normally distributed.
MEASURING DIVERGENCE FROM NORMALITY

1. Skewness
A distribution is said to be "skewed" when the mean and the median fall at different points in the
distribution, and the balance (or center of gravity) is shifted to one side or the other, to the left or right
side. In a normal distribution, the mean equals the median exactly and the skewness is zero. The more nearly
the distribution approaches the normal form, the closer together are the mean and median, and the less the
skewness.
Negative skewness
Distributions are said to be skewed negatively or to the left when scores are massed at the high end of the
scale (the right end) and are spread out more gradually toward the low end (or left) as shown in the figure
below.
[Figure: negatively skewed curve, with the mean lying to the left of the median]
Negative skewness: to the left
Positive skewness
Distributions are skewed positively or to the right when scores are massed at the low (or left) end of the scale,
and are spread out gradually toward the high or right end as shown in the figure below.
[Figure: positively skewed curve, with the median lying to the left of the mean]
Positive skewness: to the right

Note that the mean is pulled more toward the skewed end of the distribution than is the median. In fact, the
greater the gap between mean and median, the greater the skewness. Moreover, when skewness is negative, the
mean lies to the left of the median; and when skewness is positive, the mean lies to the right of the median.
Formula;
Sk = 3 (mean – median) / σ
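The skewness formula can be applied to the grouped distribution used earlier in this module. The sketch below is illustrative only: the mean (170.80) and the standard deviation (12.63) are taken from the worked examples, while the median of 172.0 is our own assumption, obtained by interpolating within the interval 170–174 of the same frequency table.

```python
def pearson_skewness(mean, median, sd):
    """Sk = 3 (mean - median) / SD."""
    return 3 * (mean - median) / sd


# Mean and SD from the worked examples; median 172.0 is an assumed value
# (169.5 + 5 * (25 - 20) / 10 for the same 50-score table).
sk = pearson_skewness(170.80, 172.0, 12.63)
print(round(sk, 3))
```

The result is negative (about -0.285) because the mean lies to the left of the median, which indicates a slight negative skew.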
2. Kurtosis
The term "kurtosis" refers to the "peakedness" or flatness of a frequency distribution as compared with the
normal. A frequency distribution more peaked than the normal is said to be leptokurtic; one flatter than the
normal, platykurtic. The figure below shows a leptokurtic distribution and a platykurtic distribution plotted on
the same diagram around the same mean.
[Figure: Leptokurtic [A], normal or mesokurtic [B], and platykurtic [C] curves]
A normal curve (called mesokurtic) also has been drawn in on the above figure to bring out the contrast in the
figures, and to make comparison easier.
Formula;
Ku = Q / (P90 – P10)
where Q is the quartile deviation and P90 and P10 are the 90th and 10th percentiles.
TOPIC 9: MEASURES OF ASSOCIATION

Measures of association help to determine the relationship between variables.
CORRELATION
Correlation helps to establish the relationship between two or more variables.
It is classified into two types based on the direction of the relationship;
- Positive correlation
- Negative correlation
POSITIVE CORRELATION
If one variable increases, the corresponding variable also increases. For example, when the price of petrol
goes up, bus fares also go up.
NEGATIVE CORRELATION
If one variable increases, the corresponding variable decreases. For example, when the supply of maize in
the market increases, the price of maize flour in the supermarket falls.
DEGREE OF CORRELATION
The two extreme degrees of correlation are;
(a) Perfect positive correlation
(b) Perfect negative correlation
PERFECT POSITIVE CORRELATION
When one variable is perfectly and positively correlated with another, an increase or decrease in that
variable is matched by a corresponding increase or decrease in the other variable.
Example,
If the student who scores the highest marks in English also scores the highest marks in Kiswahili, and
every other student likewise holds the same position in both subjects, then there is a perfect positive
correlation.
PERFECT NEGATIVE CORRELATION
When one variable is perfectly and negatively correlated with another, an increase in that variable is
matched by a corresponding decrease in the other variable, and a decrease by a corresponding increase.
Example,
If the student who scores the highest marks in English scores the lowest marks in Kiswahili, and the
order of every other student is likewise reversed, there is a perfect negative correlation.
Spearman introduced the Rank Difference Method for calculating the correlation between two variables.
Rank Difference Method
Formula;
rho = 1 – 6 ∑ D² / [N (N² – 1)]
D – Represents the differences of ranks for the two variables.
N – Represents the number of scores / cases.
ILLUSTRATION;
The following are the scores of ten students in two examinations, X (Math) and Y (English).
X Y
78 84
36 54
98 36
25 60
75 36
80 54
25 92
62 36
36 62
44 68
Calculate the correlation coefficient for X and Y scores using Spearman’s rank difference method.
Procedure;
1. Have a column for X scores as shown below.
2. Have a column for the Y scores as shown below.
3. Have a column for ranking the scores in the X (R1) column as shown below.
4. Have a column for ranking the scores in the Y (R2) column as shown below.
5. Have a column for the rank differences (R1 – R2) = D
6. Have a column for D2
7. Get the summation of D2 (∑D2)
No.   Student   Math (X)   English (Y)   R1    R2    D      D²
1.    A         78         84            3.0   2.0   +1.0   1.00
2.    B         36         54            7.5   6.5   +1.0   1.00
3.    C         98         36            1.0   9.0   -8.0   64.00
4.    D         25         60            9.5   5.0   +4.5   20.25
5.    E         75         36            4.0   9.0   -5.0   25.00
6.    F         80         54            2.0   6.5   -4.5   20.25
7.    G         25         92            9.5   1.0   +8.5   72.25
8.    H         62         36            5.0   9.0   -4.0   16.00
9.    I         36         62            7.5   4.0   +3.5   12.25
10.   J         44         68            6.0   3.0   +3.0   9.00
                                               ∑D² = 241.00
Formula;
rho = 1 – 6 ∑ D² / [N (N² – 1)]
    = 1 – (6 × 241) / [10 (100 – 1)]
    = 1 – 1446 / [10 (99)]
    = 1 – 1446 / 990
    = 1 – 1.4606
rho = -.46
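The ranking and the rank difference formula can be checked with the sketch below. The function names average_ranks and spearman_rho are our own. Tied scores are given the average of the positions they occupy, exactly as in the table (for example, the two scores of 36 in Math share ranks 7 and 8, so each gets 7.5).

```python
def average_ranks(scores):
    """Rank from the highest score (rank 1) down; ties share the mean rank."""
    ordered = sorted(scores, reverse=True)
    ranks = {}
    for value in set(scores):
        first = ordered.index(value) + 1         # first position held by value
        count = ordered.count(value)             # how many scores are tied
        ranks[value] = first + (count - 1) / 2   # mean of the tied positions
    return [ranks[s] for s in scores]


def spearman_rho(x, y):
    """rho = 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2
                    for a, b in zip(average_ranks(x), average_ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


maths = [78, 36, 98, 25, 75, 80, 25, 62, 36, 44]
english = [84, 54, 36, 60, 36, 54, 92, 36, 62, 68]
rho = spearman_rho(maths, english)
print(round(rho, 2))
```

This reproduces rho = -0.46. Keep in mind that the simple formula is exact only when there are no tied ranks; with ties, as in this example, it is an approximation.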
Pearson’s Product Moment Correlation Method (r)

Another method used to calculate correlation co-efficient is Pearson’s Product Moment Correlation (r).
Formula;
Pearson’s r = ∑ (x – x̄) (y – ȳ) / [N (Sx) (Sy)]
r = Correlation co-efficient
∑ = Summation
x – x̄ = Difference between each score on the x test and the mean of the x test.
y – ȳ = Difference between each score on the y test and the mean of the y test.
N = Number of scores / cases.
Sx = Standard Deviation of x test.
Sy = Standard Deviation of y test.
Illustration;
The following are scores of two tests, X and Y.
X Y
50 60
60 80
70 90
80 70
90 100
Calculate the correlation co-efficient for the above tests.
Procedure;
1. Create a column for the X scores as shown below.
2. Calculate the mean for the X scores as shown below.
3. Create a column for the mean of the X scores as shown below.
4. Create a column for the difference between the scores on the X column and the mean as shown
below.
5. Create a column for the Y scores as shown below.
6. Calculate the mean for the Y scores as shown below.
7. Create a column for the mean of the Y scores as shown below.
8. Create a column for the difference between the scores on the Y column and the mean as shown
below.
9. Create a column for the product (x – x̄) (y – ȳ) as shown below.
x     x̄     x – x̄   y     ȳ     y – ȳ   (x – x̄) (y – ȳ)
50    70    -20     60    80    -20     400
60    70    -10     80    80    0       0
70    70    0       90    80    10      0
80    70    10      70    80    -10     -100
90    70    20      100   80    20      400
                                ∑ =     700
Pearson’s r = ∑ (x – x̄) (y – ȳ) / [N (Sx) (Sy)]
            = 700 / [N (Sx) (Sy)]
Note that the standard deviations for the X and Y scores must be calculated first.
SD for x (square each score in the (x – x̄) column and sum the squares, as shown below):
(-20) × (-20) = 400
(-10) × (-10) = 100
0 × 0 = 0
10 × 10 = 100
20 × 20 = 400
Total = 1,000
SD = √(1000 / 5)
   = √200
   = 14.14
Note that the standard deviation for the y scores should also be calculated. However, in the current case
the scores in the y – ȳ column (-20, 0, 10, -10, 20) have the same squares as the scores in the x – x̄
column, so their sum is also 1,000. It therefore means that the standard deviation for the y scores is the
same as the standard deviation for the x scores.
Sx = 14.14
Sy = 14.14
Substitute the various values in the formula given above.
Thus, r = 700 / [(5) (14.14) (14.14)]
        = 700 / (5 × 199.9396)
        = 700 / 999.698
r = 0.70
Pearson’s Product Moment Correlation Method, rxy Bivariate
Formula
rxy = [n∑xy – (∑x)(∑y)] / √{ [n∑x² – (∑x)²] [n∑y² – (∑y)²] }
rxy = Correlation coefficient
∑x = sum of all scores in the x variable
∑x² = summation of the scores in the x variable after being squared
∑y = sum of all scores in the y variable
∑y² = summation of the scores in the y variable after being squared
n = number of scores
∑xy = Summation of the products of xy scores
Illustration;
The following are scores of two tests, X and Y;
X Y
11 11
13 17
14 15
15 23
17 19
Procedure;
1. Sum of all the scores in the x column as shown below.
2. Sum of all the scores in the y column as shown below.
3. Sum of all the scores in the x column after being squared as shown below.
4. Sum of all the scores in the y column after being squared as shown below.
5. Sum of the products of scores in the x column with the corresponding scores in the y column as
shown below.
∑x = 11+13+14+15+17 = 70
∑y = 11+17+15+23+19 = 85
∑x² = 121+169+196+225+289 = 1000
∑y² = 121+289+225+529+361 = 1525
n = 5
∑xy = (11) (11) + (13) (17) + (14) (15) + (15) (23) + (17) (19)
    = 121+221+210+345+323 = 1220
rxy = [5(1220) – (70)(85)] / √{ [(5)(1000) – (70)²] [(5)(1525) – (85)²] }
    = (6100 – 5950) / √[ (5000 – 4900) (7625 – 7225) ]
    = 150 / √[ (100) (400) ]
    = 150 / √40000
    = 150 / 200
rxy = +.75 → The coefficient of correlation is positive and high.
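Both versions of Pearson's formula, the deviation method used in the first illustration and the raw-score (bivariate) method used in the second, can be sketched in code as follows. The function names pearson_deviation and pearson_raw are our own. The two formulas are algebraically equivalent, so either function can be used on either set of scores.

```python
import math


def pearson_deviation(x, y):
    """r = sum((x - mean_x)(y - mean_y)) / (N * Sx * Sy), with SDs using N."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return products / (n * sx * sy)


def pearson_raw(x, y):
    """r from the raw sums: n, sum x, sum y, sum x^2, sum y^2 and sum xy."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) *
                            (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r_first = pearson_deviation([50, 60, 70, 80, 90], [60, 80, 90, 70, 100])
r_second = pearson_raw([11, 13, 14, 15, 17], [11, 17, 15, 23, 19])
print(round(r_first, 2), round(r_second, 2))
```

The first illustration gives r = 0.70 and the second gives r = 0.75. Note the square root over the whole denominator in the raw-score method; without it the answer would be 150 / 40000 instead of 150 / 200.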
Interpretation of the correlation coefficient

Perfect positive correlation is +1.00; chance correlation is 0.
Perfect negative correlation is -1.00.
According to Rugg’s general principles for interpreting r:
- r is negligible or indifferent when it is between .15 and .20.
- r is low when it is between .20 and .35.
- r is marked when it is between .35 and .60.
- r is high when it is above .70.
USES OF THE CORRELATION COEFFICIENT
1. Prediction
It is used to predict the success a person will achieve in further studies or a career.
Example
Marks obtained by a student in KCSE can be correlated with marks obtained in college examinations
to predict the student's success in completing a university degree programme.
2. Reliability
It is used to test reliability. The correlation coefficient tells us immediately and precisely the extent to
which a test gives the same results on two successive applications to the same individuals.
3. Validity
Whether a test is worthy (valid) can be established through correlation. When a test is constructed, the
question being asked is, “What does it test?”
This question is answered by the magnitude of the test's correlation coefficient with various criteria.
4. Test construction
Whenever a test is constructed, there is always the question of whether each element of the test is
related to the other elements or to the test as a whole, and whether each element is related to the
criterion chosen.
These relationships are all examined through the correlation coefficient.
THE GRAPHICAL PRESENTATION OF UNGROUPED DATA
For data which are not grouped into a frequency distribution, we use the following common graphs or diagrams:
(a) Pictographs or Pictograms.
(b) Bar graphs or Bar Diagrams
(c) Circle or pie graphs/diagrams
(d) Line graphs.
Let us have an idea of all these four types of graphical representation.

PICTOGRAPHS OR PICTOGRAMS
Pictographs or pictograms are graphs or diagrams used for presenting ungrouped statistical data in pictorial
(picture-like) form. A picture is said to be worth more than 100 words spoken or written. Therefore the pictorial
representation of data is always considered better than its description in words and figures. Let us illustrate
this fact through an example:
Example 1
In a data collection process, it was found that there are 100 students in Class VI, 85 students in Class VII, 80
students in Class VIII, 90 students in Class IX and 70 students in Class X. Present this data first in tabular form
and then in pictorial form.
PRESENTATION OF DATA IN A TABULAR FORM

Class VI VII VIII IX X
Number of Students 100 85 80 90 70
PRESENTATION OF DATA IN THE PICTORIAL FORM

Step 1: Let us decide to represent a student with a picture (indicative of a student figure)
Step 2: For the sake of brevity and simplicity let us have a scale, a picture (of student) equal to 20 students in
number.
Following these steps, the pictorial presentation (pictograph) of the given data will be as under:
[Pictograph for Example 1; scale: one picture = 20 students, a partial picture counted as one]
Class   Number of pictures
VI      5
VII     5
VIII    4
IX      5
X       4
Example 2
At a parking place at New Delhi Railway Station, the following statistical data (about the number of cars from
different states) was collected and arranged in tabular form. Make a pictogram of this tabular data.
State Number of Cars

Delhi 140
Uttar Pradesh 60
Punjab 30
Haryana 100
Others 70
You can now very well imagine the merits and advantages of a pictograph. A mere glimpse of the pictograph
reveals that the largest number of cars parked at the Railway Station came from Delhi.
It was followed by Haryana and the other states. The smallest number of cars parked came from Punjab. In this
way valuable statistical information may be gathered easily, in an interesting and pleasing way, from a pictograph.
The pictograph showing the number of cars belonging to different states is presented below.
[Pictograph for Example 2; scale: one picture = 20 cars, a partial picture counted as one]
State           Number of pictures
Delhi           7
Uttar Pradesh   3
Punjab          2
Haryana         5
Others          4
However, there are some difficulties in the pictorial presentation of data, especially in choosing a suitable scale
(a picture for a given number of units) and in its comprehension. In the above two examples we have chosen the
figure of a student and the figure of a car, each representing a strength of 20. In both these pictographs, we may
easily notice the difficulties encountered in representing numbers not wholly divisible by 20, i.e. 30, 50, 70, 85,
90 etc. We have represented the strength of 85 students (in the first example) with four complete pictures and a
fraction of a picture (only the head). The same is the case with the incomplete pictures of cars. Here we have only
an approximation and not an exact pictorial measurement of the fractional values in the numerical data. This
difficulty can be somewhat removed in other forms of graphical representation of data, as will be noticed soon.
BAR GRAPHS OR BAR DIAGRAMS
Instead of using pictures we can use bars (rectangles of similar breadth) for the representation of numerical data.
This mode of presenting statistical data through bars is known as a bar graph or bar diagram. As an example, let
us try to have a bar graph of the tabulated data given in example 15.1. It may take the shape given in
Fig. 15.3.
How to Draw Bar Graph?
(a) Try to use a graph paper for drawing the bar graph.
(b) On one of the axes X or Y, try to plot numerical data by choosing a proper scale and have the other variable
like classes in this example, on the other axis. Here in this example, the numerical strength of students has
been plotted on the Y-axis.
Here the number of students in the different classes are thus represented by the bars (rectangles of similar
breadth) constructed along the X-axis.
WHAT CAN BE INFERRED FROM THE BAR GRAPH?
The bar graph just shown above may provide the following information in a quite simple and quick way.
- The strength of students in a particular class of the school, i.e. there are 70 students in class X.
- The class having the highest strength, i.e. class VI.
- The class having the lowest strength, i.e. class X.
- It also reveals that the strength of students decreases as we pass through classes VI to VIII. It increases
once again in class IX but soon falls again in class X.
- The relative strengths of the students studying in different classes of the school may also be judged
easily for one or the other type of comparison.
Example 13.3: Let us have another bar graph for further illustration. Can you think about various types of
information revealed to you just through its glimpse and useful interpretation?
We think you can easily infer from the above bar graph that the years of the highest and the lowest yields are
1997 and 1992 respectively, and that in comparison to the yield of the year 1992 there is approximately double
the production in the year 1995.
FIG.13.4 BAR GRAPH SHOWING WHEAT PRODUCTION DURING DIFFERENT YEARS.
CIRCLE GRAPH OR PIE DIAGRAMS
Circle or pie graphs/diagrams provide us an opportunity to represent statistical data through the figure of a circle
and its constituents, i.e. proportionate sub-divisions. These are specifically helpful in cases in which the
question of proportion is of much interest. To construct them requires a working knowledge of angle measurement
and percentages.
[...] the multiplication by a line graph. Here it would make no sense whatever to turn these weekly mastery
figures into a pie or bar chart. There is also no possibility of representing these through a pictograph. Similarly,
in the case of representing facts concerning the percentage of pets, there will be no sense in displaying them
through a line graph. Here the decision to represent them through a pie graph seems quite appropriate, as there
stands a whole of which the different figures concerning pet choices are collectively a part. Contrarily, in a case
showing concomitant changes occurring in one variable in relation to the changes introduced in the other, it is
always advisable to use a line graph as the mode of representation. In this way, while trying to determine how
best to display particular data, one must decide whether to graph the data, and if so, what kind of graph to use.
The process of the construction of a pie graph may be understood with the help of the example given below:
Example 13.4: 200 B.Ed. students of a College of Education were asked to give their options for participation
in one or the other type of co-curricular activity. The preference data was tabulated as under:
Activities           Debate   Dance   Music   Painting   Models   Excursion
Number of Students   42       36      36      12         6        68
Present this data through a pie diagram.
Solution: The steps for the construction of the required pie diagram may be outlined as below:
(a) A circle has the value of 2π (2 pi) radians, i.e. 360°. In the present example the total sample is 200, which
has to be represented through a complete circle of 360°.
(b) The various constituents of the collected data, i.e. the preferences for one or the other co-curricular
activity, may then be assigned their share of the circle in degrees, as computed below.
Debate: Number of students = 42 out of 200
Proportion out of 100 = 42/200 × 100 = 21%
Proportion out of 360° = 42/200 × 360° = 75.6°
Dance and Music (each): Number of students = 36 out of 200
Proportion out of 100 = 36/200 × 100 = 18%
Proportion out of 360° = 36/200 × 360° = 64.8°
Painting: Number of students = 12 out of 200
Proportion out of 100 = 12/200 × 100 = 6%
Proportion out of 360° = 12/200 × 360° = 21.6°
Modelling: Number of students = 6 out of 200
Proportion out of 100 = 6/200 × 100 = 3%
Proportion out of 360° = 6/200 × 360° = 10.8°
Excursion: Number of students = 68 out of 200
Proportion out of 100 = 68/200 × 100 = 34%
Proportion out of 360° = 68/200 × 360° = 122.4°
(c) Now all these above proportions 75.6°, 64.8°, 64.8°, 21.6°, 10.8° and 122.4° may be represented as the
different sectors of a whole circle with the help of the knowledge concerning measurement of angles.
(d) These may take the following final form for making the required pie diagram.
FIG. 13.5 PIE GRAPH SHOWING PREFERENCES OF B.ED STUDENTS FOR THE CO-
CURRICULAR ACTIVITIES.
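The arithmetic in steps (a) to (c) can be checked with a short script. What follows is a minimal Python sketch and not part of the original notes; the function name pie_angles is an illustrative assumption.

```python
def pie_angles(counts):
    """Return {category: (percentage, angle in degrees)} for a pie diagram."""
    total = sum(counts.values())
    result = {}
    for name, n in counts.items():
        percentage = n / total * 100   # share of the whole sample
        angle = n / total * 360        # share of the full circle (360 degrees)
        result[name] = (round(percentage, 1), round(angle, 1))
    return result


# Data from the worked example (200 B.Ed. students).
preferences = {
    "Debate": 42, "Dance": 36, "Music": 36,
    "Painting": 12, "Models": 6, "Excursion": 68,
}

angles = pie_angles(preferences)
for name, (pct, deg) in angles.items():
    print(f"{name:<10} {pct:>5}%  {deg:>6} degrees")
```

A useful check on any pie diagram is that the sector angles add up to 360° and the percentages to 100%.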
Example 5: A researcher collected data from 1000 people fond of pets and tabulated the findings as under:
Pets             Cats   Dogs   Snakes   Turtles   Fish   Parrots   Other birds
No. of People    180    320    50       50        90     180       130
Represent the above data through a pie graph.
Solution: Following the procedure suggested in the previous example, the diagram may take the following shape.
FIG. 13.6 PIE GRAPH SHOWING PEOPLE FOND OF DIFFERENT PETS
LINE GRAPHS
Line graphs are best used for describing the concomitant relationship between two variables, by plotting their
respective values on the X and Y axes of a graph paper (after choosing appropriate scales). Let us illustrate
this through examples.
Example 6: Science students of Class IX of a school collected data about the weather on a cold day in December by
recording the room temperature at various hours of the day, and obtained the following line graph of the results
of their survey.
FIG. 13.7 A LINE GRAPH SHOWING THE TEMPERATURE ON A DECEMBER DAY.
PROCESS OF CONSTRUCTION
(a) Here time in hours has been plotted on the X-axis and the corresponding temperatures in degrees centigrade
have been plotted on the Y-axis.
(b) Five small squares of the graph paper have been taken as equivalent to 1 hour on the X-axis and to 2 degrees
centigrade on the Y-axis.
(c) The facts, such as 2°C recorded at 8.00 a.m., 0.6°C recorded at 10.00 a.m., etc., have been plotted as points,
and these points have then been joined by continuous straight lines (see the placing of the points P, Q, R, S,
T, U, V, W as the intersection points of the paired data, and their joining).
WHAT CAN BE INFERRED FROM THE LINE GRAPH?
Line graphs like the one above can reveal many facts about the collected data, and consequently we may be able to
answer questions such as the following:
1. For which hours during the day did the students collect data?
2. What was the highest temperature of the day according to the graph?
3. Between what hours was the temperature increasing? Decreasing?
4. At about what time in the morning was the temperature about 10°C?
5. What do you predict the temperature might be at 5.00 p.m.: lower or higher than 16°C?
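Questions of this kind can also be answered directly from the paired data behind a line graph. The Python sketch below is only an illustration: the readings are invented for the purpose and are not the values plotted in Fig. 13.7, and the helper name estimate is an assumption. Reading a value between two plotted points is treated as linear interpolation, which is what joining the points by straight lines amounts to.

```python
# Hypothetical readings (hour on a 24-hour clock -> temperature in degrees C).
# These values are invented for illustration; they are NOT the data of Fig. 13.7.
readings = {8: 2, 10: 6, 12: 14, 14: 20, 16: 18, 18: 12, 20: 8, 22: 4}

hours = sorted(readings)

# Question 1: over which hours were data collected?
first_hour, last_hour = hours[0], hours[-1]

# Question 2: what was the highest temperature, and when?
peak_hour = max(readings, key=readings.get)
peak_temp = readings[peak_hour]

# Question 3: between which hours was the temperature rising or falling?
rising, falling = [], []
for start, end in zip(hours, hours[1:]):
    if readings[end] > readings[start]:
        rising.append((start, end))
    elif readings[end] < readings[start]:
        falling.append((start, end))


def estimate(hour):
    """Read a value off the line graph by linear interpolation between points."""
    for start, end in zip(hours, hours[1:]):
        if start <= hour <= end:
            fraction = (hour - start) / (end - start)
            return readings[start] + fraction * (readings[end] - readings[start])
    raise ValueError("hour lies outside the recorded range")


print(f"Data collected from {first_hour}:00 to {last_hour}:00")
print(f"Highest temperature: {peak_temp} degrees C at {peak_hour}:00")
print("Rising between:", rising)
print("Falling between:", falling)
print(f"Estimated temperature at 17:00: {estimate(17)} degrees C")
```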
Example 7: The line graph given in Fig. 13.8 depicts the mastery of multiplication facts by a particular student
in the course of learning. Here the time spent in weeks in mastering the multiplication facts is shown on the
X-axis and the achievement in terms of mastery (measured as the percentage of facts mastered) is shown on the
Y-axis, by choosing appropriate scales.
FIG. 13.8 LINE GRAPH SHOWING PROGRESS IN MASTERY OF THE MULTIPLICATION FACTS
CONCLUSION ABOUT CHOOSING A PARTICULAR GRAPHICAL MODE FOR THE REPRESENTATION OF UNGROUPED DATA
Each of the graphical modes described above (pictograph, pie graph and line graph) has its own merits and
limitations when used to represent a given set of ungrouped data on a particular occasion. Therefore, a considered
decision should always be made before employing a particular graphic mode in a particular situation. Take the
last example, that of representing the data concerning mastery of the multiplication facts.
A. THE GRAPHICAL PRESENTATION OF FREQUENCY DISTRIBUTION (GROUPED DATA)
There are four methods of representing a frequency distribution graphically:
1. The Histogram or Column Diagram.
2. The Frequency Polygon.
3. The Cumulative Frequency Graph.
4. The Cumulative Frequency Percentage Curve or Ogive.
Of these methods, we take up the two most common ones, namely the Histogram and the Frequency Polygon, for
discussion in the following pages:
1. The Histogram: A histogram or column diagram is essentially a bar graph of a frequency distribution. The
following points are to be kept in mind while constructing the histogram for a frequency distribution:
(a) The scores are taken in the form of actual class limits, such as 19.5-24.5, 24.5-29.5, etc.
(b) It is customary to take two extra intervals (classes) with zero frequency, one below and the other above the
given grouped intervals or classes. In the case of the frequency distribution given in Table 15.2, we can take
14.5-19.5 and 69.5-74.5 as the two required extra intervals.
(c) Now we take the actual lower limits of all the class intervals (including the extra intervals) and plot them
on the X-axis. The lower limit of the lowest interval (one of the extra intervals) is taken at the
intersecting point of the X-axis and the Y-axis.
(d) Frequencies of the distribution are plotted on the Y-axis.
(e) Each class or interval with its specific frequency is represented by a separate rectangle. The base of each
rectangle is the width of the interval (i) and the height is the respective frequency of that class or interval.
(f) It is not essential to project the sides of the rectangles down to the base line.
(g) Care should be taken to select appropriate units of representation along the X-axis and the Y-axis. Neither
axis should be too short or too long. "A good general rule for this purpose", as suggested by Garrett
(1971, p. 11), is to select X and Y units which will make the height of the figure approximately 75% of its
width.
FIG. 13.9 THE HISTOGRAM OF THE FREQUENCY DISTRIBUTION GIVEN IN TABLE 15.2
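Points (a) to (e) can be sketched in a few lines of Python. This is an illustrative sketch only. Table 15.2 is not reproduced in this section, so the frequencies below are hypothetical placeholders, and the stated classes 20-24 up to 65-69 are inferred from the actual limits and extra intervals quoted above. The function name histogram_rectangles is likewise an assumption.

```python
# Stated class intervals (width 5) running from 20-24 up to 65-69, as implied by
# the actual limits quoted in the text. The frequencies are HYPOTHETICAL
# placeholders, not the values of Table 15.2.
stated_classes = [(low, low + 4) for low in range(20, 70, 5)]
frequencies = [1, 2, 4, 6, 9, 10, 8, 5, 3, 2]


def histogram_rectangles(classes, freqs):
    """Return (actual lower limit, actual upper limit, height) for each bar."""
    width = classes[0][1] - classes[0][0] + 1          # size of the interval (i)
    rectangles = []
    # Extra interval below the lowest class, with zero frequency.
    lowest = classes[0][0] - 0.5
    rectangles.append((lowest - width, lowest, 0))
    for (low, high), f in zip(classes, freqs):
        rectangles.append((low - 0.5, high + 0.5, f))  # actual class limits
    # Extra interval above the highest class, with zero frequency.
    highest = classes[-1][1] + 0.5
    rectangles.append((highest, highest + width, 0))
    return rectangles


bars = histogram_rectangles(stated_classes, frequencies)
for left, right, height in bars:
    # Each row is one rectangle: its base (actual limits) and its height.
    print(f"{left:>5}-{right:<5} {'#' * height}")
```

The text printout stands in for the drawn rectangles; each row of '#' marks has a length equal to the frequency of that class.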
2. The Frequency Polygon: A frequency polygon is essentially a line graph for the graphical representation of
a frequency distribution. We can get a frequency polygon from a histogram if the mid-points of the upper
bases of the rectangles are connected. But it is not essential to plot a histogram first in order to draw a
frequency polygon. We can construct it directly from a given frequency distribution. The following points are
helpful in constructing a frequency polygon:
(a) As in the histogram, two extra intervals (classes), one above and the other below the given intervals, are taken.
(b) The mid-points of all the classes or intervals (including the two extra intervals) are calculated.
(c) The mid-points are marked along the X-axis and the corresponding frequencies are plotted along the Y-axis,
by choosing suitable scales on both axes.
(d) The various points obtained by plotting the mid-points and frequencies are joined by straight lines to give
the frequency polygon.
(e) For the approximate height of the figure and the selection of X and Y units, the rule emphasized earlier in
the case of the histogram should be followed.
FIG. 13.10 THE FREQUENCY POLYGON OF THE FREQUENCY DISTRIBUTION GIVEN IN TABLE 15.2
Scale: on the X-axis, 3 scores = 5 small squares = 0.5"; on the Y-axis, one unit of frequency = 3 small squares = 0.3"
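The points to be joined in a frequency polygon can be computed as in the Python sketch below, which follows points (a) to (d). It is an illustration under stated assumptions: the frequencies are hypothetical placeholders (Table 15.2 is not reproduced in this section), and the function name polygon_points is invented for the example.

```python
# Actual class limits from 19.5-24.5 up to 64.5-69.5 (interval size 5), as quoted
# in the text. The frequencies are HYPOTHETICAL placeholders, not Table 15.2.
lower_limits = [19.5 + 5 * k for k in range(10)]
frequencies = [1, 2, 4, 6, 9, 10, 8, 5, 3, 2]
size = 5


def polygon_points(lower_limits, freqs, size):
    """Return the (mid-point, frequency) pairs to be joined by straight lines."""
    midpoints = [low + size / 2 for low in lower_limits]
    # The extra intervals on either side carry zero frequency, so the polygon
    # starts and ends on the X-axis.
    xs = [midpoints[0] - size] + midpoints + [midpoints[-1] + size]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))


points = polygon_points(lower_limits, frequencies, size)
for x, y in points:
    print(f"mid-point {x:>4}  frequency {y}")
```

Joining consecutive pairs in this list with straight lines gives the polygon; the first and last pairs belong to the two extra intervals.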
COMPARISON BETWEEN THE HISTOGRAM AND THE FREQUENCY POLYGON
Although the histogram and the frequency polygon are both used for the graphic representation of a frequency
distribution and are alike in many respects, they differ on several points. Some of these differences are cited
below:
1. Whereas the histogram is essentially a bar graph of the given frequency distribution, the frequency polygon
is a line graph of the distribution.
2. In the frequency polygon, we assume the frequencies to be concentrated at the mid-points of the class intervals.
It merely shows the graphical relationship between mid-points and frequencies and is thus unable to
show the distribution of frequencies within each class interval. The histogram, by contrast, gives a very clear
as well as accurate picture of the relative proportions of frequency from interval to interval. A mere glance
at the figure answers such questions as:
(a) Which group or class interval has the largest or smallest frequency?
(b) Which pair of groups or class intervals has the same frequency?
(c) Which group has a frequency double that of another?
3. In comparing two or more distributions by plotting two or more graphs on the same axes, the frequency
polygon is more useful and practicable than the histogram, as in such cases the vertical and horizontal lines
of the histograms tend to coincide.
4. In comparison to the histogram, the frequency polygon gives a much better conception of the contour of the
distribution. From even a part of the polygon, it is easy to see the trend of the distribution, which a
histogram does not show.
QUESTIONS
1. Discuss in brief the different methods of organising and presenting statistical data.
2. What is a frequency distribution? How can you present data in the form of a frequency distribution?
Illustrate your answer with an example.
3. Tabulate the following 26 scores into a frequency distribution using an appropriate interval:
72, 75, 77, 67, 72, 81, 68, 65, 86, 73, 67, 82, 67, 70, 76, 70, 83, 71, 63, 72, 72, 61, 67, 84, 69, 64.
4. What is a histogram? How is it different from a frequency polygon?
5. Plot a histogram and a frequency polygon separately, on different axes, for the following distribution.
