Data Gathering Instruments
05/28/24 1
What are questionnaires?
Questionnaires are research tools through which people are asked
to respond to the same set of questions in a predetermined order.
Questionnaires should be used when they fit the objectives of the
research. Hence, in a case study, a questionnaire may be completely
inappropriate. There, you might want to construct an interview schedule
containing open-ended questions, adopting a descriptive approach.
But where the audience is relatively large, and where standardized
questions are needed, the questionnaire is ideal, and will allow, if
this is required, an analytical approach exploring relationships
between variables.
Of course, in many cases questionnaires will be only one tool used
in the general research effort.
ADVANTAGES
They are low cost in terms of both time and money.
The inflow of data is quick and from many people.
Respondents can complete the questionnaire at a time
and place that suits them.
Data analysis of closed questions is relatively simple,
and questions can be coded quickly.
Respondents’ anonymity can be assured.
There is a lack of interviewer bias.
Points to be considered when
writing individual questions
It is naïve to believe that standardized questions will always
receive standardized, rational responses.
Prejudicial language
Try to avoid language that is prejudicial or contains sexist,
disablist or racist stereotyping.
A question that annoys, irritates or insults a respondent may
affect the way they respond to questions that follow – if they
decide to complete them at all!
Imprecision
Avoid vague phrases such as ‘average’, ‘regularly’ and ‘a great
deal’ since they are likely to be interpreted in different ways by
different respondents.
Points to be considered when
writing individual questions
Leading questions
These suggest a possible answer and hence promote bias.
Questions such as ‘Why do you think the organization has been
successful in the past three years’ are leading because they are
making an assumption with which the respondent may not
necessarily agree.
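Double questions
These should be avoided because they are impossible to answer: the
respondent is asked two things at once but can give only one reply.
Assumptive questions
Avoid questions that make assumptions about people’s beliefs or
behaviours. For example, ‘How often do you drink alcohol?’ assumes
that the respondent drinks alcohol.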
Points to be considered when
writing individual questions (cont.)
Hypothetical questions
Try to avoid hypothetical questions such as: ‘Suppose you were
asked to …’
Knowledge
Make sure that the group targeted to answer the questions actually
has the knowledge to do so.
Sometimes it may be necessary to provide people with some
background information if the subject is quite technical.
Memory recall
People may have difficulty recalling what has occurred even
quite recently.
If, say, you are constructing some questions around recent
newsworthy events, then it would be appropriate to present
respondents with a list of such events before asking them
questions about them.
Classification questions
One type of question often required by a survey is the
classification question, dealing with, for example, the
sex, age, status, grade level, school, etc. of the
respondent.
These are important for providing the basis for
analysing associations between variables (for example,
a respondent’s gender and attitude towards sexual
harassment issues in the workplace).
These questions should be introduced gently, for example:
‘It will help us in further analysis if you would tell us a
little about yourself’.
Question content
Clearly, in writing questions issues such as validity need to be
borne in mind.
Hence, the content of the questionnaire needs to cover the
research issues that have been specified. A series of precise
steps must be followed:
The researcher has to be clear about the information required
and encode this accurately into a question.
The respondent must interpret the question in the way that the
researcher intended.
The respondent must construct an answer that contains
information that the researcher has requested.
The researcher must interpret the answer as the respondent had
intended it to be interpreted.
Open questions
Open questions have no predetermined response, and answers are
recorded in full.
Hence, the questionnaire must be designed in such a way that
respondents are able to provide such a response without the
restriction of lack of space.
The advantage of open questions is the potential for richness of
responses, some of which may not have been anticipated by the
researchers.
But the downside of open questions is that while they are easy to
answer they are also difficult to analyse.
At first sight much of the information gathered may seem varied
and difficult to categorize.
Generally, the solution to this is the use of coding and the
adoption of a coding frame.
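To illustrate what a coding frame does, the following is a minimal sketch that assigns codes to open responses by keyword and then tallies them. The code labels, keywords and sample responses are hypothetical, and simple keyword matching is only one assumed way of applying a frame; in practice coding is usually done by trained human coders.

```python
from collections import Counter

# Hypothetical coding frame: code label -> keywords that signal it.
CODING_FRAME = {
    "workload": ["workload", "too much work", "overtime"],
    "pay": ["salary", "pay", "wage"],
    "management": ["manager", "supervisor", "leadership"],
}


def code_response(text, frame=CODING_FRAME):
    """Return the sorted codes whose keywords appear in one open response."""
    lowered = text.lower()
    codes = [
        code
        for code, keywords in frame.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    # Responses that match nothing are kept visible rather than dropped.
    return sorted(codes) or ["uncoded"]


def tally_codes(responses, frame=CODING_FRAME):
    """Count how many responses received each code."""
    counts = Counter()
    for response in responses:
        counts.update(code_response(response, frame))
    return counts


# Made-up responses, for illustration only.
responses = [
    "The workload is heavy and my manager does not listen.",
    "My salary has not changed in three years.",
    "I enjoy the team.",
]
counts = tally_codes(responses)
```

Responses left as "uncoded" are a signal that the frame may need a new category, which is how unanticipated answers find their way into the analysis.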
Closed questions
A closed question is one to which the respondent is offered a set of
pre-designed replies such as ‘Yes/No’, ‘True or False’ or
multiple-choice responses, or is given the opportunity to choose from a
selection of numbers representing strength of feeling or attitude,
or degree of agreement.
Advantages
Easier to analyse. They also make it easier
Types of Close-ended Items
List questions: These provide the respondent with a list of responses, any of which
they can select. This approach avoids making the answering of a questionnaire a test
of memory.
Category questions: These are designed so that only one response is possible.
Ranking questions: These require the respondent to rank responses in order.
With this kind of question it is important to make the instructions for completing the
question clear and explicit.
Providing response categories
Scale questions: Scale or rating questions are used to measure a variable, and
comprise four types of scale: nominal, ordinal, interval and ratio.
A common type is the Likert scale, on which respondents are asked to indicate how
strongly they agree or disagree with a series of statements.
Other forms of scaling can also be used. The number of response categories, for
example, can be changed. Common formats are ‘True/False’ and ‘Yes/No’.
Another format is marking a point on a continuum:
Quick 1 2 3 4 5 6 7 8 9 10 Slow
Friendly 1 2 3 4 5 6 7 8 9 10 Discourteous
Informative 1 2 3 4 5 6 7 8 9 10 Confusing
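As a rough illustration of how Likert responses are turned into numbers for analysis, the sketch below maps agreement labels to codes 1 to 5 and averages them for one respondent. The five labels and the sample answers are assumptions for illustration. Averaging treats the ordinal codes as if they were interval data, which is a common convention rather than a requirement.

```python
# Assumed five-point agreement labels and their numeric codes.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def score_respondent(answers):
    """Return the mean numeric code across one respondent's answers."""
    values = [LIKERT[answer.strip().lower()] for answer in answers]
    return sum(values) / len(values)


# One hypothetical respondent answering four statements.
answers = ["Agree", "Strongly agree", "Neutral", "Agree"]
score = score_respondent(answers)
```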
PILOTING QUESTIONNAIRES
What should be considered during piloting?
Instructions given to respondents.
Style and wording of any accompanying letter.
Content of fact-sheet data, that is, respondents’ names, addresses, etc.
Formality or informality of the questionnaire in terms of tone,
presentation, etc.
Length of the questionnaire – if too long, is the response rate likely to
be reduced?
Sequence of questions.
Quality of individual questions in terms of whether they are
understood and answered in a way that was intended.
Scales and question format used, for example, Likert scales, Yes/No
responses, etc.
The validity and reliability of questions.
Validity
The validity of a questionnaire can be affected
by the wording of the questions it contains.
Poor sequencing of questions, or a confusing structure or design of the
questionnaire, can also threaten its validity.
Validity is reduced when the questionnaire does not cover the research
issues in terms of both content and detail:
parts of the research area are left uncovered (Zone of Neglect), or
some questions are irrelevant to the study (Zone of Invalidity).
Asking spurious, irrelevant questions increases the length of a
questionnaire, which, in turn, may reduce the number of responses.
If the response rate becomes too low, this may limit the
generalizability of the findings, and hence external validity.
Reliability
Reliability is a measure of consistency and can include measures of:
Stability (over time).
Equivalence (administering two versions of a test instrument to
the same people on the same day).
Inter-judge reliability.
Internal Consistency (Cronbach’s alpha)
The extent of this consistency is measured by a reliability
coefficient using a scale from 0.00 (very unreliable) to 1.00
(perfectly reliable).
In practice, a score of 0.9 is generally deemed to be acceptable.
Response biases
Social desirability
Acquiescence
Extremity
Halo and leniency
RESEARCH INTERVIEW
BAHIR DAR UNIVERSITY
Definition
The research interview has been defined as ‘a two-person
conversation initiated by the interviewer for the specific
purpose of obtaining research-relevant information, and
focused by him on content specified by research objectives of
systematic description, prediction, or explanation’ (Cannell and
Kahn, 1968:527).
Kvale (1996: 14) remarks that an interview is an interchange of
views between two or more people on a topic of mutual
interest; this view treats human interaction as central to knowledge
production and emphasizes the social situatedness of research
data.
It involves the gathering of data through direct verbal
interaction between individuals.
Conceptions of the interview
The first conception is that of a potential means of pure
information transfer and collection.
A second conception of the interview is that of a transaction
which inevitably has bias, which is to be recognized and
controlled.
According to this viewpoint, each participant in an interview
will define the situation in a particular way.
This fact can be best handled by building controls into the
research design, for example, by having a range of interviewers
with different biases.
Unavoidable problematic features
1. There are many factors which inevitably differ from one
interview to another, such as mutual trust, social distance and
the interviewer’s control.
2. The respondent may well feel uneasy and adopt avoidance
tactics if the questioning is too deep.
3. Both interviewer and respondent are bound to hold back part of
what it is in their power to state.
4. Many of the meanings which are clear to one will be relatively
opaque to the other, even when the intention is genuine
communication.
5. It is impossible, just as in everyday life, to bring every aspect of
the encounter within rational control.
PURPOSES
The research interview may serve three purposes.
First, it may be used as the principal means of gathering
information having direct bearing on the research objectives.
‘By providing access to what is “inside a person’s head”, [it]
makes it possible to measure what a person knows (knowledge
or information), what a person likes or dislikes (values and
preferences), and what a person thinks (attitudes and beliefs)’
(Tuckman, 1972).
Second, it may be used to test hypotheses or to suggest new
ones; or as an explanatory device to help identify variables and
relationships.
And third, the interview may be used in conjunction with other
methods in a research undertaking.
Types of Interview
Single or multiple sessions
Structured, Semi-structured vs. Non-structured
Individual vs. Group
Face-to-face vs. Telephone Interview
Stages of an interview investigation
1. Thematizing: Formulate the purpose of an investigation and
describe the concept of the topic to be investigated before the
interviews start. The why and what of the investigation should be
clarified before the question of how is posed.
2. Designing: Plan the design of the study, taking into consideration
all seven stages of the investigation, before the interviewing starts.
3. Interviewing: Conduct the interviews based on an interview guide
and with a reflective approach to the knowledge sought and the
interpersonal relation of the interview situation.
4. Transcribing: Prepare the interview material for analysis, which
commonly includes a transcription from oral speech to written text.
Stages of an interview investigation
5. Analysing: Decide, on the basis of the purpose and topic of the
investigation, and on the nature of the interview material,
which methods of analysis are appropriate for the interviews.
6. Verifying: Ascertain the generalizability, reliability, and
validity of the interview findings.
7. Reporting: Communicate the findings of the study and the
methods applied in a form that lives up to scientific criteria,
takes the ethical aspects of the investigation into consideration,
and that results in a readable product.
Types of Questions
OPENING QUESTIONS
CONTENT QUESTIONS
experience questions
behaviour questions
knowledge questions
sensory questions
PROBES
FINAL CLOSING QUESTIONS
Data from in-depth interviews
In-depth interviews are useful for learning about the perspectives
of individuals, as opposed to, for example, group norms of a
community, for which focus groups are more appropriate.
They are an effective qualitative method for getting people to talk
about their personal feelings, opinions, and experiences.
They are also an opportunity for us to gain insight into how
people interpret and order the world.
Interviews are also especially appropriate for addressing sensitive
topics that people might be reluctant to discuss in a group setting.
Forms of Interview Data
What form do interview data take?
Interview data consist of tape recordings,
typed transcripts of tape recordings, and
the interviewer’s notes (which may document
observations about the interview content, the
participant, and the context)
How are interview data used?
Typed transcripts are the most utilized form of interview data.
During the data analysis phase of the research, after data collection,
transcripts are coded according to participant responses to each
question and/or to the most salient themes emerging across the set of
interviews.
While data is still being collected, researchers use expanded
interview notes:
During interviews, to remind themselves of questions they need to
go back to, where they need more complete information, etc.
During debriefing sessions with other field staff and investigators
During transcription of interview recordings, to clarify and add
contextual details to what participants have said
How to Be an Effective Interviewer
Obtaining superior data requires that the interviewer be well prepared
and have highly developed rapport-building skills, social and
conversational skills specific to the capacity of interviewer, and facility
with techniques for effective questioning.
Steps to become comfortable with the interview process:
Be familiar with research documents. An effective interviewer knows
Important skills for interviewing
The interviewer’s skills have an important influence
on the comprehensiveness and complexity of the
information that participants provide.
The core skills required to establish positive
interviewer-participant dynamics are
rapport-building,
emphasizing the participant’s perspective, and
states
Important skills for interviewing
The interviewer must be able to:
lend a sympathetic ear without taking on a counseling
role;
encourage participants to elaborate on their answers
principle of participant-as-expert.
OBSERVATION
Bahir Dar University
OBSERVATION
“A systematic method of data collection that relies on
researchers’ ability to gather data through his or her
senses.” O’Leary (2004:170)
Observation is a research data gathering tool which affords the
Kinds of observation
The kind of observations available to the researcher lies on a continuum from
unstructured to structured.
A highly structured observation will know in advance what it is looking for
(i.e. pre-ordinate observation) and will have its observation categories worked
out in advance. It will already have its hypotheses decided and will use the
observational data to confirm or refute these hypotheses.
An unstructured observation will be far less clear on what it is looking for
and will therefore have to go into a situation and observe what is taking place
before deciding on its significance for the research.
Kinds of observation
A semi-structured observation will have an agenda of issues but will gather
data to illuminate these issues in a far less pre-determined or systematic
manner.
A semi-structured and, more particularly, an unstructured observation will be
hypothesis-generating rather than hypothesis-testing.
The semi-structured and unstructured observations will review observational
data before suggesting an explanation for the phenomena being observed.
Observation in Quantitative Study
Quantitative research tends to have a small field of focus,
fragmenting the observed into minute chunks that can
subsequently be aggregated into a variable.
Observation in Qualitative Study
Qualitative research, on the other hand, draws the researcher into
the phenomenological complexity of participants’ worlds;
here situations unfold, and connections, causes and correlations
can be observed as they occur over time.
Structured observation
A structured observation is very systematic and
enables the researcher to generate numerical data from
the observations.
Numerical data, in turn, facilitate comparisons between
settings and situations, and allow frequencies, patterns
and trends to be noted or calculated.
The observer adopts a passive, non-intrusive role,
merely noting down the incidence of the factors being
studied.
Observations are entered on an observational
schedule.
Types of structured observation
1. Event sampling
Event sampling, also known as a sign system, requires a tally
mark to be entered against each statement each time it is
observed.
The researcher will need to devise statements that yield the
data that answer the research questions.
2. Instantaneous sampling
To know the chronology of events, it is necessary to use
instantaneous sampling (also called time sampling).
Here the researcher enters what he or she observes at standard
intervals of time, e.g. every twenty seconds or every minute, into
the appropriate category on the schedule.
Types of structured observation
3. Interval recording
This method charts the chronology of events to some
extent and, like instantaneous sampling, requires the data
to be entered in the appropriate category at fixed
intervals.
However, instead of charting what is happening on the
instant, it charts what has happened during the preceding
interval.
So, for example, if recording were to take place every
thirty seconds, then the researcher would note down in
the appropriate category what had happened during the
preceding thirty seconds.
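The three recording methods described so far can be contrasted on the same stretch of observed time. The sketch below is illustrative only: the classroom episodes, the category names and the thirty-second interval are all hypothetical, and representing an observation as (start, end, category) episodes is an assumption made for the example.

```python
from collections import Counter

# Hypothetical classroom episodes: (start_second, end_second, category).
EPISODES = [
    (0, 25, "teacher talk"),
    (25, 40, "pupil question"),
    (40, 70, "teacher talk"),
    (70, 100, "group work"),
]


def event_sampling(episodes):
    """Tally one mark per category each time that category is observed."""
    return Counter(category for _, _, category in episodes)


def instantaneous_sampling(episodes, interval, duration):
    """Record what is happening exactly at each sampling instant."""
    record = {}
    for t in range(interval, duration + 1, interval):
        ongoing = [c for start, end, c in episodes if start <= t < end]
        record[t] = ongoing[0] if ongoing else None
    return record


def interval_recording(episodes, interval, duration):
    """Record everything that happened during each preceding interval."""
    record = {}
    for t in range(interval, duration + 1, interval):
        window_start = t - interval
        # An episode counts if it overlaps the window (window_start, t).
        record[t] = sorted(
            {c for start, end, c in episodes if start < t and end > window_start}
        )
    return record


tallies = event_sampling(EPISODES)
instants = instantaneous_sampling(EPISODES, interval=30, duration=90)
intervals = interval_recording(EPISODES, interval=30, duration=90)
```

With these made-up episodes, event sampling counts "teacher talk" twice but loses the order of events. Instantaneous sampling captures only the category in progress at 30, 60 and 90 seconds. Interval recording lists every category that occurred in each thirty-second window.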
Types of structured observation
4. Rating scales
In this method the researcher is asked to make some
judgement about the events being observed, and to enter
responses onto a rating scale.
Observed behaviour might be entered onto rating scales
by placing it on a continuum: 1 2 3 4 5
An observer might wish to enter a rating according to a
five-point scale of observed behaviour, for example:
1=not at all 2=very little 3=a little 4=a lot 5=a very great deal
Types of structured observation
5. Critical incidents
Critical incidents are particular events or occurrences that might typify or
illuminate a particular feature of a teacher’s behaviour or teaching style, for
example.
These are events that appear to the observer to have more interest than other
ones, and therefore warrant greater detail and recording than other events;
they have an important insight to offer.
There will be times when reliability as consistency in observations is not
always necessary.
For example, a student might only demonstrate a particular behaviour once,
but it is so important as not to be ruled out simply because it occurred once.
One only has to commit a single murder to be branded a murderer!
Sometimes one event can occur which reveals an extremely important
insight into a person or situation.
Naturalistic Observation
Observations are recorded in field notes; these can be written at
several levels. At the level of description they might include
• quick, fragmentary jottings of key words/ symbols;
• transcriptions and more detailed observations written out
fully;
• descriptions that, when assembled and written out, form a
comprehensive and comprehensible account of what has
happened;
• pen portraits of participants;
• reconstructions of conversations;
• descriptions of the physical settings of events;
• descriptions of events, behaviour and activities;
• description of the researcher’s activities and behaviour.
Naturalistic Observation
Lincoln and Guba (1985:273) suggest a variety of elements or types of
observations that include:
• ongoing notes, either verbatim or categorized in situ;
• logs or diaries of field experiences (similar to field notes though usually
written later);
• notes that are made on specific, predetermined themes;
• ‘chronologs’, where each separate behavioural episode is noted, together
with the time at which it occurred;
• context maps—maps, sketches, diagrams or some graphic display of the
context (usually physical) within which the observation takes place, such
graphics enabling movements to be charted;
• entries on predetermined schedules (including rating scales, checklists and
structured observation charts), using taxonomic or categoric systems, where
the categories derive from previous observational or interview data;
Naturalistic Observation
A useful set of guidelines for directing observations of specific
activities, events or scenes covers:
• Space: the physical setting;
• Actors: the people in the situation;
• Activities: the sets of related acts that are taking place;
• Objects: the artifacts and physical things that are there;
• Acts: the specific actions that participants are doing;
• Events: the sets of activities that are taking place;
• Time: the sequence of acts, activities and events;
• Goals: what people are trying to achieve;
• Feelings: what people feel and how they express this.
Naturalistic Observation
At the level of reflection, field notes might include (Bogdan and Biklen,
1992:122):
• reflections on the descriptions and analyses that have been done;
• reflections on the methods used in the observations and data collection
and analysis;
• ethical issues, tensions, problems and dilemmas;
• the reactions of the observer to what has been observed and recorded—
attitude, emotion, analysis etc.;
• points of clarification that have been and/or need to be made;
• possible lines of further inquiry
Lincoln and Guba (1985:327) indicate three main types of item that might be
included in a journal:
1. a daily schedule, including practical matters, e.g. logistics;
2. a personal diary, for reflection, speculation and catharsis;
3. notes on and a log of methodology.
TESTS as a Data
Gathering Tool
Bahir Dar University
Tests
McMillan and Schumacher (1997:245) state that a test is a
standard set of questions presented to each subject that
requires completion of cognitive tasks.
They further elaborate that responses or answers are
summarized to obtain a numerical value that represents a
characteristic of the subject.
They also underline that the cognitive task can focus on what
the person knows (achievement), is able to learn (ability or
aptitude), chooses or selects (interest, attitude or values) or is
able to do (skills).
Thus, we can infer that an educational researcher can use a
test to measure achievement, aptitude, attitude and skills so
as to gather relevant information about his or her subjects.
Parametric Tests
Parametric tests are designed to represent a wide population,
such as a country or an age group. They are published as
standardized tests, are commercially available, and have been
piloted on a large and representative sample of the whole population.
These tests come complete with backup data on reliability
and validity statistics.
Parametric tests enable the researcher to use statistics
applicable to interval and ratio levels of data.
Moreover, their data can be processed using the mean,
standard deviation, t-test, etc.
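As a small illustration of the statistics named above, the sketch below computes the mean, the standard deviation and an independent-samples t statistic for two groups of interval-level scores. The scores are invented, and the pooled-variance (Student) form of the t-test is an assumption, since the source does not say which variant is meant.

```python
from math import sqrt
from statistics import mean, stdev


def pooled_t(group_a, group_b):
    """Independent-samples t statistic using the pooled variance."""
    na, nb = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical test scores for two classes.
group_a = [62, 70, 68, 75, 80]
group_b = [55, 60, 65, 58, 62]

mean_a, mean_b = mean(group_a), mean(group_b)
sd_a, sd_b = stdev(group_a), stdev(group_b)
t_statistic = pooled_t(group_a, group_b)
```

The t statistic would then be compared with a critical value for na + nb - 2 degrees of freedom, a step left out here.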
Non-parametric
Non-parametric tests are designed for a given specific
population, such as a class or a small group in a school.
The researcher using these tests is confined to nominal and
ordinal levels of data.
These tests have less complicated computational statistics.
In most cases, they are prepared by classroom teachers.
They have the advantage of being tailored to particular
instructional, departmental and individual circumstances.
Non-parametric tests offer teachers a valuable opportunity
for quick, relevant and focused feedback on student
performance.
Unlike parametric tests, non-parametric tests are used
for small samples and make no assumptions about
how normal, even and regular the distribution of scores is.
Parametric Vs. Non-parametric
When the two are compared, parametric tests are more
powerful than non-parametric tests because:
Parametric tests are derived from standardized scores.
They help to compare sub-populations with a whole
population, for example to compare the results of one school
or local education authority with the whole country.
Hence, the researcher can prepare his/her own ‘home-made’
tests to fit the purpose, or employ a standardized test to
measure what he/she needs for a large number of students.
Nevertheless, the researcher should closely check what the
test intends to test.
Commercially Produced Tests
These are tests in the public domain which cover a vast range of topics and
which can be used for evaluative purposes.
These tests include diagnostic tests, aptitude tests, achievement tests,
norm-referenced tests, criterion-referenced tests, reading tests, readiness
tests, etc.
There are numerous reasons for using commercially published tests. They
are objective, piloted and refined.
Moreover, they are standardized across a named population. The tests
declare how reliable and valid they are and tend to be parametric tests.
Hence, they enable sophisticated statistics to be calculated.
They come complete with instructions and are quick to administer and mark.
Guides to the interpretation of the data are clearly included.
The golden rule for deciding to use a published test is that it must
demonstrate fitness for purpose. If it fails to demonstrate this, then tests
will have to be devised by the researcher.
The Researcher Produced Test
The researcher-produced test is a ‘home-grown’ test
which is tailored very tightly to the local and institutional
context, so that the purposes, objectives and content of the test
are deliberately fitted to the specific needs of the
researcher in a specific, given context.
In spite of its varied advantages, there are several important
considerations in devising a ‘home-grown’ test. It might be
time-consuming to devise, pilot, refine and then administer
the test.
Generally, when commercially produced tests are not
accessible, teacher-produced tests are the major alternative.
These teacher-produced tests are more akin to
non-parametric tests.
Aptitude Test
It is a kind of test used to predict future performance.
The results are used to make a prediction about performance
on some criterion like grades, teaching effectiveness,
certification or test scores prior to instruction, placement or
training.
The term aptitude refers to the predictive use of the scores
from a test, rather than the nature of the test items.
Some terms like intelligence or ability are used
interchangeably with aptitudes.
A national examination given to preparatory students to
predict their future performance in the university is a good
example.
Achievement Tests
Achievement tests have a more restricted coverage and they are
more closely tied to school subjects, and measure more recent
learning than aptitude tests.
The purpose of an achievement test is to measure what has been
learned rather than to predict future performance. Achievement
tests can be:
a. Diagnostic (identify weaknesses and strengths).
b. Survey batteries (test different content areas).
c. Norm-referenced (compare a result in relation to others).
d. Criterion-referenced (require criteria to be fulfilled).
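The difference between norm-referenced and criterion-referenced interpretation can be shown on a single score. This is a minimal sketch with an invented norm group and an assumed cutoff of 70. It uses the "percentage scoring strictly below" definition of percentile rank, which is one of several definitions in use.

```python
def percentile_rank(score, norm_group):
    """Norm-referenced: percentage of the norm group scoring below `score`."""
    below = sum(1 for other in norm_group if other < score)
    return 100 * below / len(norm_group)


def meets_criterion(score, cutoff):
    """Criterion-referenced: has the fixed standard been reached?"""
    return score >= cutoff


# Hypothetical norm group and student score.
norm_group = [45, 52, 58, 60, 63, 67, 70, 74, 81, 90]
student_score = 68

rank = percentile_rank(student_score, norm_group)
passed = meets_criterion(student_score, cutoff=70)
```

Here the same score of 68 is better than that of 60 per cent of the norm group, yet it does not meet the assumed criterion of 70, so the two interpretations can point in different directions.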
Performance Assessment
In the past few years, a new type of assessment has become
very popular as an alternative to traditional testing formats
that rely on written objective items.
Here the emphasis is on measuring student proficiency on
cognitive skills by directly observing how a student performs
the skill in an authentic context.
In this regard, a researcher can see the performance of his or her
students in a live presentation.
The data gathered from the actual performance can help
him/her to come to a conclusion about the students’
achievement.
Psychological / Non-cognitive Tests
Tests can also be cognitive or non-cognitive. Cognitive tests include aptitude and achievement tests.
Non-cognitive tests, on the other hand, cover traits such as interests, attitudes, self-concept, values,
personality and beliefs.
Wiersma (200:310) writes that personality inventories consider characteristics such as motivation,
attitudes and emotional adjustment as a whole.
There are two kinds of personality tests: projective and non-projective.
Projective tests use a word, a picture or some other stimulus to elicit an unstructured response, whereas
non-projective tests use paper-and-pencil tests that require the individual to respond.
Attitudes involve an individual’s feelings toward such things as ideas, procedures and social
institutions.
Attitude inventories used for research tend to be quite specific.
A school teacher, for instance, may wish to test the attitude of grade 9 students towards group
work in English periods.
For this purpose, the researcher can set a test to measure the attitude of his or her students in a specific way.
McMillan and Schumacher (1997:250) emphasize that these non-cognitive traits are important in
school success and that, at the same time, measuring them accurately is more difficult than assessing
cognitive traits or skills.
In testing, non-cognitive items are susceptible to faking. One form of this is social desirability, in
which subjects answer items in order to appear most normal or most socially desirable rather than
responding honestly. The other problem with non-cognitive tests is that their reliability is
generally lower than that of cognitive tests.
Similarly, non-cognitive tests do not have “right” answers like cognitive tests.
Therefore, a researcher needs to be cautious when using non-cognitive tests to gather
information, as the results can be affected for the varied reasons discussed above.
Reliability
McMillan and Schumacher (1997:239) state that reliability
refers to the consistency of measurement. In their words,
reliability refers to the extent to which the results are similar
over different forms of the same instrument or occasions of
data collection. They stress that reliability refers to the extent
to which measures are free from error.
If an instrument has little error, it is reliable, and if it has a
great amount of error, it is unreliable. Errors occur for
varied reasons. For instance, students may get correct
answers by guessing, or miss answers they know when they
are in a hurry or when the question is ambiguous.
McMillan and Schumacher (1997:239) explain that the
obtained score may be thought of as having two components: a
true score, which represents the actual knowledge or skill
level of the individual, and error, that is, sources of variability
unrelated to the intent of the instrument.
The reliability coefficient ranges from .00 to .99.
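The two-component description of the obtained score corresponds to the classical true-score model. As a sketch, writing X for the obtained score, T for the true score and E for the error (symbols chosen here for illustration, not taken from the source), and assuming that true score and error are uncorrelated:

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
r_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Under this reading, an instrument with little error variance has a coefficient close to 1, and one with a great amount of error variance has a coefficient close to 0.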
Empirical procedures for
estimating reliability
Several procedures are used to estimate reliability. All of
them have computational formulas that produce a reliability
coefficient. The common ones (Fraenkel and Wallen, 2000;
McMillan and Schumacher, 1997; Cohen, Manion and
Morrison, 2003) are summarized as follows.
1. Parallel forms or alternate forms
This procedure involves the use of two or more equivalent
forms of the test. The two forms are administered to a group
of individuals with a short time interval between the
administrations.
If the test is reliable, the patterns of scores for individuals
should be about the same for the two forms of the test.
There would be a high positive association between the
scores.
Although the questions are different, they should sample
the same content and they should be constructed separately
from each other.
05/28/24 60
Empirical procedures for
estimating reliability
2. Test-retest
In this procedure, the same test is administered
on two or more occasions to the same individuals.
Again, if the test is reliable, there will be a high
positive association between the scores.
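The "high positive association" in the parallel-forms and test-retest procedures is usually reported as a correlation coefficient. The following is a minimal sketch, assuming the Pearson correlation is the statistic used; the function name and the two score lists are hypothetical illustrations, not data from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical scores of five individuals on two administrations
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

retest_reliability = pearson_r(first_administration, second_administration)
print(round(retest_reliability, 2))
```

The same function could be applied to scores on two parallel forms instead of two occasions.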
05/28/24 61
Reliability
3. Internal-consistency methods
The methods discussed above require two
administrations or testing sessions. There
are also internal-consistency methods of
estimating reliability.
Cronbach alpha (alpha coefficient): A
formula developed by Cronbach (1951),
based on two or more parts of the test,
requires only one administration of the
test.
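A minimal sketch of the alpha coefficient, assuming the standard form alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response matrix are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one row per respondent, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across respondents
    columns = list(zip(*item_scores))
    sum_item_variances = sum(pvariance(col) for col in columns)
    # Variance of each respondent's total score
    totals = [sum(row) for row in item_scores]
    total_variance = pvariance(totals)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Hypothetical responses of five people to three rating items
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Population variances are used throughout; using sample variances for both the items and the totals would give the same ratio.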
05/28/24 62
Reliability
4. Split-half: In computing split-half reliability,
the test items are divided into two halves, with
the items of the two halves matched on content
and validity, and the halves are then scored
independently.
If the test is reliable, the scores on the two halves
have a high positive association.
The coefficient indicates the degree to which the
two halves of the test provide the same results,
and hence describes the internal consistency of the
test.
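The split-half idea can be sketched as follows. Two points are assumptions that go beyond the slides: the halves are formed by an odd-even split rather than by matching items on content, and the Spearman-Brown step-up, a commonly applied correction for the shortened test length, is added at the end. All names and data are hypothetical.

```python
import math

def _pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def split_half(item_scores):
    """Return (half-test correlation, Spearman-Brown full-test estimate)."""
    # Score the two halves independently: items 1, 3, ... versus 2, 4, ...
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    r_half = _pearson_r(first_half, second_half)
    # Spearman-Brown step-up estimates reliability of the full-length test
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Hypothetical right/wrong (1/0) answers of five students to four items
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
r_half, r_full = split_half(answers)
print(round(r_half, 4), round(r_full, 2))
```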
05/28/24 63
Reliability
Kuder-Richardson procedures: Two formulas for
estimating reliability, developed by Kuder and
Richardson (1937), require only one
administration of a test.
This is perhaps the most frequently employed
method for determining internal consistency. One
formula, KR-20, provides the mean of all possible
split-half coefficients.
The second formula, KR-21, may be substituted
for KR-20 if it can be assumed that item difficulty
levels are similar.
Formula KR-20 does not require the assumption
that all items are of equal difficulty.
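A minimal sketch of the two formulas for items scored right/wrong (1/0), assuming their usual textbook forms: KR-20 = k/(k-1) * (1 - sum(p*q) / variance of totals) and KR-21 = k/(k-1) * (1 - M(k-M) / (k * variance of totals)), where k is the number of items, p the proportion answering an item correctly, q = 1 - p, and M the mean total score. The function names and data are hypothetical.

```python
from statistics import mean, pvariance

def kr20(item_scores):
    """KR-20 for dichotomous (0/1) items; rows are respondents."""
    k = len(item_scores[0])
    n = len(item_scores)
    totals = [sum(row) for row in item_scores]
    # p = proportion correct on each item, q = 1 - p
    p_values = [sum(col) / n for col in zip(*item_scores)]
    sum_pq = sum(p * (1 - p) for p in p_values)
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))

def kr21(item_scores):
    """KR-21: uses only the mean and variance of total scores."""
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    m = mean(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * pvariance(totals)))

# Hypothetical right/wrong answers of five students to four items
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
kr20_value = kr20(answers)
kr21_value = kr21(answers)
print(round(kr20_value, 2), round(kr21_value, 2))
```

In this example the items differ in difficulty, so KR-21 comes out lower than KR-20, which illustrates why KR-21 is a substitute only when item difficulties are similar.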
05/28/24 64
Validity
Another essential characteristic of measurement
is validity: the extent to which an instrument
measures what it is supposed to measure.
Validity is an integrated evaluative judgment of
the degree to which empirical evidence and
theoretical rationales support the adequacy and
appropriateness of inferences and actions based
on test scores or other modes of assessment.
Validity deals with the question, “Does the
instrument measure the characteristic, trait, or
whatever, for which it was intended?”
Validity is assessed depending on the purpose,
population and environmental characteristics in
which measurement takes place. A test result can
therefore be valid in one situation and invalid in
another.
05/28/24 65
Evidence for establishing validity
Basically, there are two approaches to
determining the validity of an instrument.
1. Through a logical analysis of content or a
logical analysis of what would make up an
educational trait or characteristic. This is
essentially a judgmental analysis.
2. Through an empirical analysis, which uses
a criterion measure (the criterion being
some sort of standard or desired
outcome).
05/28/24 66
Evidence for establishing validity
To facilitate an understanding of the need
to gather appropriate evidence to establish
validity, three types of evidence are
identified.
A. Content-related evidence: This refers to
the extent to which the content of a test is
judged to be representative of some
appropriate universe or larger domain of
content.
05/28/24 67
Evidence for establishing validity
Criterion-related evidence: This indicates
whether the scores on an instrument predict
scores on a well-specified, predetermined criterion.
Criterion-related validity uses two types of
evidence: concurrent evidence and predictive
evidence.
Concurrent evidence uses a correlation coefficient
to describe the degree of relationship between two
measures of the same trait given at about the
same time.
Predictive evidence, by contrast, is obtained when the
criterion is assessed some time after the first measure.
Here too, a correlation coefficient is reported between
the two measures. The result can be used to
predict future conditions.
05/28/24 68
Construct related evidence
It is an interpretation or meaning that is
given to a set of scores from instruments
that assess a trait or theory that cannot be
measured directly, such as an
unobservable trait like intelligence,
creativity or anxiety.
In addition to the commonly known
types of evidence, Yalew (2006:232) includes face
validity, which refers to evidence given by
individuals who have expert knowledge
in the field related to the instruments.
05/28/24 69
Summary
Generally, to ensure validity in a test, it is
essential to check that the objectives of the test
are fairly addressed. Mager (1962) and Wiles and
Bondi (1984), quoted in Cohen, Manion and
Morrison (2003:323), underline that objectives
should be specific and be expressed with an
appropriate degree of precision. They should also:
a. represent intended learning outcomes.
b. identify the actual and observable behavior which
will demonstrate achievement.
c. include an active verb.
d. be unitary (focusing on one item per objective).
05/28/24 70
SCALE OF
MEASUREMENT
Bahir Dar University
05/28/24 71
Measurement Scales
Wiersma (2000:296) defines measurement as the assignment
of numerals to objects or events according to rules.
The numeral is a symbol such as 1,2 or 3 that is devoid of
either quantitative or qualitative meaning unless such
meaning is assigned by a rule.
Measurement is limiting the data of any phenomenon, substantial
or insubstantial, so that those data may be interpreted and,
ultimately, compared to an acceptable qualitative or
quantitative standard.
Thus, measurement helps us to infer information
from the numbers assigned.
There are four types of measurement scales: nominal,
ordinal, interval and ratio.
05/28/24 72
Nominal Scale
A nominal scale is the simplest form of measurement a
researcher can use. When using a nominal scale, researchers
simply assign numbers to different categories in order to show
difference.
For example, a researcher concerned with the variable of gender
might group data into two categories, male and female, and
assign the number “1” to females and the number “2” to males.
Nominal measurement is elemental and unrefined. To analyze
nominal data, we can use the mode as an indicator of the most
frequently occurring category within our data set.
In most cases, the advantage to assigning numbers to the
categories is to facilitate computer analysis.
05/28/24 73
Nominal data
Nominal data constitute a name value or category with no order or
ranking implied, e.g.:
sales departments,
occupational descriptors of employees,
schools in a woreda,
departments students belong to in a school,
sex of respondents, etc.
Thus, with nominal data, we build up a simple frequency count of
how often each nominal category occurs.
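The frequency count and the mode described for nominal data can be sketched as follows. The department labels are hypothetical sample data.

```python
from collections import Counter

# Hypothetical nominal data: the department of each respondent
departments = ["Sales", "HR", "Sales", "Finance", "Sales", "HR"]

# Simple frequency count of how often each category occurs
counts = Counter(departments)

# The mode is the most frequently occurring category
mode_category, mode_count = counts.most_common(1)[0]

# Percentages are the other summary that makes sense for nominal data
percentages = {cat: 100 * n / len(departments) for cat, n in counts.items()}

print(counts)
print(mode_category, mode_count)
```

Note that no mean or ranking is computed: the category labels carry no order, so only counts, percentages and the mode are meaningful.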
05/28/24 74
Ordinal Scale
An ordinal scale is one in which data may be
ordered in some way, from high to low or least to most.
For example, a researcher may rank-order student
scores on an English test from high to low.
Ordinal scales indicate relative standing among
individuals.
Consider a variable such as attitude toward school.
If one person indicates a highly favorable attitude
toward school and another individual indicates a
neutral attitude, we not only know that they are
different, but we also can order the individuals on
the degree of favorableness.
05/28/24 75
Ordinal data …
Ordinal data comprise an ordering or ranking of values, although
the intervals between the ranks are not intended to be equal (for
example, an attitude questionnaire).
A type of question that yields ordinal data is:
“How frequently do you practice the piano?”
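A short sketch of how answers to such a question might be handled. The five response options are an assumption made for illustration; the slides do not list them. Because the distances between options are not assumed equal, the sketch orders the answers and reports the median category rather than a mean.

```python
from statistics import median_low

# Hypothetical response options, listed from least to most frequent
OPTIONS = ["never", "rarely", "sometimes", "often", "daily"]
rank_of = {label: position for position, label in enumerate(OPTIONS)}

# Hypothetical answers from five respondents
answers = ["often", "never", "sometimes", "daily", "sometimes"]

# Ordering is meaningful for ordinal data
ordered_answers = sorted(answers, key=lambda a: rank_of[a])

# The median category summarizes relative standing without assuming
# equal intervals between the options
median_answer = OPTIONS[median_low(rank_of[a] for a in answers)]

print(ordered_answers)
print(median_answer)
```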
05/28/24 76
Interval Scale
An interval scale possesses all the characteristics of an ordinal
scale with one additional feature.
The distances between the points on the scale are equal.
Suppose we measure a variable such as IQ or performance on
an achievement test in reading or mathematics.
If three individuals have scores of 105, 110 and 115,
respectively, the difference of five points between the first two
individuals is considered equivalent to the difference of five
points between the second two individuals.
This gives not only difference and order but also a unit of equal
difference established in the measurement. This level of
measurement involves equal units between the intervals.
The other feature of this scale is that its zero point is
established arbitrarily (not a true zero).
05/28/24 77
Interval data
With quantifiable measures, numerical values are assigned
along an interval scale with equal intervals, but there is no zero
point where the trait being measured does not exist.
Another characteristic of interval data is that the difference
between a score of 14 and 15 would be the same as the
difference between a score of 91 and 92.
Hence, in contrast to ordinal data, the differences between
categories are identical.
Results from interval data often show an approximately normal
distribution of scores (e.g., those on an IQ test delivered as part
of a school’s aptitude assessment of students).
05/28/24 78
Ratio Scale
An interval scale that possesses an actual, or true, zero point is called a ratio
scale. For example, a scale designed to measure height would be a ratio
scale, because the zero point on the scale represents the absence of height
(that is, no height).
Ratio scales are almost never encountered in educational research, since
researchers rarely engage in measurement involving a true zero point.
Even on those occasions when a student receives a zero on a test of some
sort, this does not mean that whatever is being measured is totally absent
in the student.
What distinguishes the ratio scale from the other three scales is that the
ratio scale can express values in terms of multiples and fractional parts,
and the ratios are true ratios.
Ratio scales outside the physical sciences are relatively rare. And
whenever we cannot measure a phenomenon in terms of a ratio scale, we
must refrain from making comparisons such as, “This thing is three times
as great as that.”
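Temperature is a familiar illustration of why such comparisons need a true zero. It is not an example from the slides: the Celsius scale has an arbitrary zero (interval), while the Kelvin scale has a true zero (ratio). The readings below are hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin, which has a true zero."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0

# Differences are meaningful on an interval scale and are unchanged
# when the zero point moves
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# The ratio of the Celsius numbers suggests "twice as hot", but that
# depends entirely on where the arbitrary zero was placed
naive_ratio = warm_c / cool_c

# On the ratio scale the true ratio is only about 1.035
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c, round(difference_k, 2))
print(naive_ratio, round(true_ratio, 3))
```

The same caution applies to interval-level scores such as IQ: a score of 120 cannot be read as "one and a half times" a score of 80.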
05/28/24 79
Ratio data
Ratio data are a subset of interval data, and the scale is
again interval, but there is an absolute zero that
represents some meaning, for example, scores on an
achievement test.
If an employee, for example, undertakes a work-related
test and scores zero, this would be interpreted as a complete lack
of knowledge or ability in this subject.
This sort of classification scheme is important because it
influences the ways in which data are analysed and what
kind of statistical tests can be applied.
05/28/24 80
Why are these distinctions of
scales important?
First, the data convey different amounts of information.
Ratio scales provide more information than do interval
scales, interval more than ordinal, and ordinal more than nominal.
Hence, if possible, researchers should use the type of
measurement that will provide them with the maximum
amount of information needed to answer the research
question being investigated.
Second, some types of statistical procedures are inappropriate
for some of the scales.
The way in which the data in a research study are organized
dictates the use of certain types of statistical analysis, but not
others.
05/28/24 81
Summary
Generally, to achieve measurement, we must specify the processes
or operations that are going to be used to measure the phenomenon
or variable under study. Such a specification is called an operational
definition. In essence, an operational definition of a variable
describes how or by what means we are going to measure the
variable.
Nominal: Groups and labels data only; reports
frequencies or percentages.
Ordinal: Ranks data; uses numbers only to
indicate ranking.
Interval: Assumes that differences between scores
05/28/24 83
Definition
A document is a paper or other material thing
affording information, proof, or evidence of
anything.
Analysis is “a resolving or separating of a thing
into its elements or components; the tracing of
things to their source, and the general principles
underlying individual phenomena.”
Thus, document analysis is the act of getting a
certain piece of information or evidence from a
certain source and checking its authenticity and
accuracy by discovering its source or origin.
05/28/24 84
Definition
O’Leary (2004:177) characterizes documents and
document analysis in three ways:
First, the term ‘document’ can refer to more than
just paper, and can include photographs, works
of art, and even television programs. Often the
word ‘text’ is used to represent this range of
data.
Second, document analysis refers to these
‘texts’ as a primary data source – or data in its
own right.
Finally, document analysis refers to both data
collection method and a mode of analysis.
05/28/24 85
Introduction
Maykut and Morehouse (1994:71) say: “ the process of
qualitative data analysis takes many forms, but it is
fundamentally a non mathematical analytical procedure
that involves examining the meaning of peoples’ words
and actions.”
Best and Kahn (2003:248) said the following about
document analysis:
“The analysis is concerned with the explanation of the status
of some issue at a particular time or its development over a
period of time.”
So the work of document analysis ranges from discovering
the needed documents to the process of analyzing and
explaining them for the purpose of our research work.
05/28/24 86
The Significance of Document Analysis
Researchers can use these sources of data to gain
valuable insights, identify potential trends, and explain
how things got to be the way they are now (Mills,
2003:67).
For example, teachers can use documents like records of
attendance rates, dropout rates and suspension rates to
study their frequencies and come up with a certain kind
of solution.
There is also a possibility of suggesting and testing
hypotheses when we engage ourselves in the process of
analyzing documents (Majumdar, 2005:290).
05/28/24 87
The Significance of Document Analysis
Furthermore, document analysis is used in educational
projects to get evidence that will supplement or
strengthen the data that we have already gathered by
other data collection instruments like interviews and
observation.
It can also be used on its own when we are unable to
have access to the subjects of the research (Bell,
1993:67).
In general, through detailed observation and analysis of
documents, we can understand a past event, person, or
movement.
This in turn will give us a great insight into
and understanding of the bases of current issues
(McMillan and Schumacher, 43).
05/28/24 88
Types of Documents to be Analyzed
The types of document that we analyze are wide and
varied.
O’Leary (2004:178) has grouped the various documents
into five types:
1. Authoritative Sources:
These are documents whose authors or issuing authorities
try to be unbiased and objective.
They include censuses, surveys, books, journals,
independent inquiries and reports.
2. The Party Line:
These are documents which are issued by political
organizations with their own agenda.
When we analyze these documents we need to take the
interests of the author into account.
05/28/24 89
Types of Documents to be Analyzed
3. Personal Communication:
This includes letters, e-mails, memoirs, sketches, drawings,
photographs, diaries, memos, journals, etc., that are by their very
nature personal and subjective.
These are considered to be a rich source of information on their own
or in conjunction with interviews and/or observational data.
4. Multimedia:
This can refer to newspaper or magazine columns /articles, current
affairs shows, news reports, TV sitcoms, commercials, etc.
These types of texts could be used to examine, for example, how
often females are given writing and acting roles
in the above areas in relation to males.
5. Historical Documents:
These include an organization’s records, minutes, policy documents,
etc.
They could also include materials which have historical significance
for the researcher.
05/28/24 90
Documents to be studied in applied linguistics
Policy documents
Curriculum Materials
Textbooks
Syllabi
Lesson plans
Language Tests
Test Results
Diaries
Learning Journals
Portfolios
Meeting minutes
Reports
05/28/24 91