Data Collection and Sampling
Hitesh Mohapatra
Veer Surendra Sai University of Technology
All content following this page was uploaded by Hitesh Mohapatra on 23 January 2018.
UNIT III
Data Collection: Primary data, secondary data, processing and analysis of data, measurement of relationship, statistical measurement & significance; Random Sampling, Systematic Sampling, Stratified Sampling, Cluster Sampling, Multistage Sampling.
The task of data collection begins after a research project has been defined and the research design/plan chalked out. There are two types of data: primary data and secondary data. Primary data are collected afresh and for the first time, and are original in character. Secondary data have already been collected by someone else and have already passed through the statistical process.
SURVEY vs EXPERIMENT
1. Surveys are conducted in descriptive research; experiments are a part of experimental research.
2. Survey-type research usually has a larger sample; experimental studies need a smaller sample.
3. Surveys are concerned with describing, recording, analyzing and interpreting conditions that exist or existed: the variables that exist or have already occurred are selected and observed, with no manipulation of variables. Experimental research provides a systematic and logical method for answering questions: certain variables are carefully controlled or manipulated.
4. Surveys are appropriate in the social and behavioural sciences; experiments are an essential feature of the physical and natural sciences.
5. Surveys are examples of field research; experiments are examples of laboratory research.
6. Surveys are concerned with hypothesis formulation and testing; experiments provide a method for hypothesis testing.
7. Possible relations between data and unknowns can be studied through surveys; experiments are meant to determine such relationships.
1. It is an expensive method.
2. Information provided by this method is limited.
3. Sometimes observation is biased, and some people create obstacles to collecting data effectively.
Participant Observation: If the observer observes by making himself a member of the group he is observing, so that he can experience what the members of the group experience, the observation is called participant observation; otherwise it is non-participant observation.
Uncontrolled observation: If the observation takes place in the natural setting then it is called
uncontrolled observation
Controlled observation: If the observation takes place according to definite pre-arranged plans
involving experimental procedure then it is called controlled observation. In controlled
observation we use mechanical instruments as aids to accuracy and standardization.
(B) INTERVIEW METHOD:
This method involves the presentation of oral-verbal stimuli and reply in terms of oral-verbal responses. It can be used through personal interviews and, if possible, through telephone interviews.
(B1) Personal interview: It may be in form of direct personal investigation or indirect oral
investigation. In case of direct personal investigation the interviewer has to collect information
personally from the sources. In some cases it is impossible to contact directly the person
concerned. In such cases an indirect oral examination can be conducted under which the
interviewer has to cross examine other persons who are supposed to have knowledge about the
problem under investigation. Most of the commission/ committees appointed by govt. make use
of this method.
Structured interviews involve use of a set of predetermined questions and highly standardized
techniques of recordings
Focused interview is meant to focus attention on the given experience of the respondent and its effects. The main task of the interviewer is to confine the respondent to a discussion of issues with which he seeks conversance.
Clinical interview is concerned with feelings or motivations, or with the course of an individual's life experience. In a non-directive interview the interviewer encourages the respondent to talk about the given topic with a minimum of direct questioning.
Structured questionnaires are simple to administer and relatively inexpensive to analyse. The
provision of alternative replies, at times, helps to understand the meaning of the question clearly.
On the basis of the results obtained in pretest (testing before final use) operations from the use of
unstructured questionnaires, one can construct a structured questionnaire for use in the main study.
2. Question sequence: To make the questionnaire effective and to ensure quality to the replies
received, a researcher should pay attention to the question-sequence. A proper sequence of
questions reduces the chances of individual questions being misunderstood.
The question-sequence must be clear and smoothly moving. The first few questions are particularly important because they are likely to influence the attitude of the respondent and to secure his desired cooperation. The opening questions should be such as to arouse human interest.
The following type of questions should generally be avoided as opening questions in a
questionnaire:
1. questions that put too great a strain on the memory or intellect of the respondent;
2. questions of a personal character;
3. questions related to personal wealth, etc.
Knowing what information is desired, the researcher can rearrange the order of the questions
to fit the discussion in each particular case. Difficult questions must be relegated towards the end
so that even if the respondent decides not to answer such questions, information would have
already been obtained. The answer to a given question is a function not only of the question itself,
but of all previous questions.
Concerning the form of questions, we can talk about two principal forms, viz., the multiple-choice question and the open-end question. A question with only two possible answers ('Yes' or 'No') is called a 'closed question.'
Open-ended questions which are designed to permit a free response from the respondent rather
than one limited to certain stated alternatives are considered appropriate. Such questions give the
respondent considerable latitude in phrasing a reply. Getting the replies in respondent’s own words is
the major advantage of open-ended questions. But open-ended questions are more difficult to handle,
raising problems of interpretation, comparability and interviewer bias.
Researcher must use proper wordings of questions since reliable and meaningful returns depend
on it to a large extent. Simple words, which are familiar to all respondents should be employed. Words
with ambiguous meanings must be avoided. Similarly, danger words, catch-words or words with
emotional connotations should be avoided.
This method of data collection is very useful in extensive enquiries and can lead to fairly
reliable results. It is very expensive and is usually adopted in investigations conducted by
governmental agencies or by some big organisations.
Population census all over the world is conducted through this method.
DIFFERENCE BETWEEN QUESTIONNAIRES AND SCHEDULES
1. The questionnaire is sent through mail to informants to be answered; the schedule is filled out by the researcher or the enumerator, who can interpret questions when necessary.
2. Collecting data through a questionnaire is cheap; the schedule method is more expensive.
3. Non-response is high with questionnaires; it is very low with schedules.
4. With a questionnaire it is not always clear who replies; with a schedule the identity of the respondent is known.
5. Personal contact is not possible with a questionnaire; with a schedule direct personal contact is established with respondents.
6. A questionnaire is used only when respondents are literate and cooperative; with a schedule, information can be gathered even when the respondents are illiterate.
7. A wider and more representative distribution of the sample is possible with questionnaires; there is difficulty in sending enumerators over a relatively wide area.
8. The physical appearance of the questionnaire must be attractive; this is not so in the case of schedules, as they are filled in by enumerators and not by respondents.
2. Distributor or store audits: Distributor or store audits are performed by distributors as well as manufacturers through their salesmen at regular intervals. Distributors get the retail stores audited through salesmen and use such information to estimate market size, market share, seasonal purchasing patterns and so on. The data are obtained in such audits by observation. The advantage of this method is that it offers the most efficient way of evaluating the effect on sales of variations in different techniques of in-store promotion.
3. Pantry audits: Pantry audit technique is used to estimate consumption of the basket of goods
at the consumer level. The investigator collects an inventory of types, quantities and prices of
commodities consumed. Data are recorded from the examination of consumer’s pantry. The
objective is to find out what types of consumers buy certain products and certain brands.
A limitation of the pantry audit is that, at times, it may not be possible to identify consumers' preferences from the audit data alone.
4. Consumer panels: An extension of the pantry audit approach on a regular basis is known as
‘consumer panel’, where a set of consumers are arranged to come to an understanding to maintain
detailed daily records of their consumption and the same is made available to investigator on
demands. In other words, a consumer panel is essentially a sample of consumers who are interviewed
repeatedly over a period of time.
Initial interviews are conducted before the phenomenon takes place to record the attitude of the
consumer. A second set of interviews is carried out after the phenomenon has taken place to find out
the consequent changes occurred in the consumer’s attitude. Consumer panels have been used in the
area of consumer expenditure, public opinion and radio and TV listenership.
5. Use of mechanical devices: Mechanical devices are widely used to collect information by indirect means. The eye camera, pupilometric camera, psychogalvanometer, motion picture camera and audiometer are the principal devices so far developed and commonly used by modern big business houses.
Eye cameras are designed to record the focus of eyes of a respondent on a specific portion of a
sketch or diagram or written material. Such an information is useful in designing advertising material.
Pupilometric cameras record dilation of the pupil as a result of a visual stimulus. The extent of dilation
shows the degree of interest aroused by the stimulus. Psychogalvanometer is used for measuring the
extent of body excitement as a result of the visual stimulus. Motion picture cameras can be used to
record movement of body of a buyer while deciding to buy a consumer good from a shop or big store.
6. Projective techniques: Projective techniques (or indirect interviewing techniques) for the collection of data use the projections of respondents to infer underlying motives, urges, or intentions which the respondent either resists revealing or is unable to figure out himself. In supplying information, the respondent unconsciously tends to project his own attitudes or feelings onto the subject under study. Projective techniques are important in motivational research and in attitude surveys.
(i) Word association tests: These tests are used to extract information regarding such words which
have maximum association. This technique is frequently used in advertising research.
(ii) Sentence completion tests: These tests happen to be an extension of the technique of word
association tests. This technique permits the testing not only of words (as in case of word
association tests), but of ideas as well and thus, helps in developing hypotheses and in the
construction of questionnaires.
(iii) Story completion tests: Such tests are a step further wherein the researcher may contrive
stories instead of sentences and ask the informants to complete them. The respondent is given just
enough of story to focus his attention on a given subject and he is asked to supply a conclusion to
the story.
(iv) Verbal projection tests: These are the tests wherein the respondent is asked to comment on or
to explain what other people do. For example, why do people smoke? Answers may reveal the
respondent’s own motivations.
(v) Pictorial techniques: There are several pictorial techniques. The important ones are as follows:
(a) Thematic apperception test (T.A.T.): The TAT consists of a set of pictures that are
shown to respondents who are asked to describe what they think the pictures represent.
The replies of respondents constitute the basis for the investigator to draw inferences
about their personality structure, attitudes, etc.
(b) Rosenzweig test: This test uses a cartoon format wherein we have a series of cartoons
with words inserted in ‘balloons’ above. The respondent is asked to put his own words
in an empty balloon space provided for the purpose in the picture.
(c) Rorschach test: This test consists of ten cards having prints of inkblots. The design happens to be symmetrical but meaningless. The respondents are asked to describe what they perceive in such symmetrical inkblots.
(d) Holtzman Inkblot Test (HIT): This test consists of 45 inkblot cards which are based on
colour, movement, shading and other factors involved in inkblot perception. Only one
response per card is obtained from the subject (or the respondent) and the responses of a
subject are interpreted at three levels of form appropriateness. Form responses are
interpreted for knowing the accuracy (F) or inaccuracy (F–) of respondent’s percepts;
shading and colour for ascertaining his affectional and emotional needs.
(e) Tomkins-Horn picture arrangement test: This test is designed for group administration.
It consists of twenty-five plates, each containing three sketches that may be arranged in
different ways . Respondent is asked to arrange them in a sequence. The responses are
interpreted as providing evidence confirming certain norms, respondent’s attitudes, etc.
(vi) Play techniques: Subjects are asked to improvise a situation in which they have been assigned various roles. The researcher observes such traits as hostility, dominance, sympathy, prejudice or the absence of such traits. These techniques have been used for knowing the attitudes of young children through the manipulation of dolls.
(vii) Quizzes, tests and examinations: This is also a technique of extracting information regarding
specific ability of candidates indirectly. In this procedure both long and short questions are
framed to test analytical ability.
(viii) Sociometry: It is a technique for describing the social relationships among individuals in a
group. It attempts to describe attractions or repulsions between individuals by asking them to indicate
whom they would choose or reject in various situations.
7. Depth interviews: Depth interviews are those interviews that are designed to discover underlying
motives and desires and are often used in motivational research. Such interviews are held to explore
needs, desires and feelings of respondents. They aim to elicit unconscious as also other types of
material relating especially to personality dynamics and motivations. Depth interviews require great
skill of interviewer and involve considerable time.
8. Content-analysis: Content-analysis consists of analysing the contents of documentary materials such as books, magazines and newspapers, and the contents of all other verbal materials, whether spoken or printed. Since the 1950s, content-analysis has been qualitative analysis concerning the message of the existing documents. Content-analysis is measurement through proportion: it measures pervasiveness, which is sometimes an index of the intensity of the force.
The sources of unpublished data are many; they may be found in diaries, letters, unpublished
biographies and autobiographies and also may be available with scholars and research workers, trade
associations, labour bureaus and other public/ private individuals and organisations.
The researcher must be very careful in using secondary data and must scrutinise them minutely, because the secondary data may be unsuitable or inadequate in the context of the problem the researcher wants to study. Before using secondary data, the researcher must see that they possess the following characteristics:
1. Reliability of data: The reliability can be tested by finding out such things about the said data:
(a) Who collected the data?
(b) What were the sources of data?
(c) Were they collected by using proper methods?
(d) At what time were they collected?
(e) Was there any bias of the compiler?
(f) What level of accuracy was desired? Was it achieved?
2. Suitability of data: The data that are suitable for one enquiry may not be suitable in another
enquiry. If the available data are unsuitable, they should not be used by the researcher.
Researcher must scrutinise the definition of various terms and units of collection used at the time
of collecting the data from the primary source originally. The object, scope and nature of the
original enquiry must be studied. If the researcher finds differences in these, the data will remain
unsuitable.
3. Adequacy of data: If the level of accuracy achieved in data is found inadequate for the
purpose of the present enquiry, they will be considered as inadequate and should not be used by
the researcher. The data will also be inadequate, if they are related to an area which may be either
narrower or wider than the area of the present enquiry.
The already available data should be used by the researcher only when he finds them reliable, suitable and adequate. The most desirable approach to the selection of a method depends on the nature of the particular problem and on the time and resources (money and personnel) available, along with the desired degree of accuracy. It also depends upon the ability and experience of the researcher.
Guidelines for Constructing Questionnaire/Schedule
1. The researcher must keep in view the problem he is to study for it provides the starting
point for developing the Questionnaire/Schedule. He must be clear about the various
aspects of his research problem to be dealt with in the course of his research project.
2. Appropriate form of questions depends on the nature of information sought, the sampled
respondents and the kind of analysis intended. The researcher must decide whether to
use closed or open-ended question.
3. Rough draft of the Questionnaire/Schedule be prepared, giving due thought to the
appropriate sequence of putting questions.
4. Researcher must invariably re-examine, and in case of need may revise the rough draft
for a better one. Technical defects must be minutely scrutinised and removed.
5. Pilot study should be undertaken for pre-testing the questionnaire. The questionnaire
may be edited in the light of the results of the pilot study.
6. Questionnaire must contain simple but straight forward directions for the respondents so
that they may not feel any difficulty in answering the questions.
2. Editing: Editing of data is done to detect errors and omissions and to correct them when possible. It involves a careful scrutiny of the completed questionnaires and/or schedules. Editing is done to assure that the data are accurate, consistent with other facts gathered, and uniformly entered.
Field editing consists in the review of the reporting forms by the investigator, completing (translating or rewriting) what has been written in abbreviated and/or illegible form. This editing should be done soon after the interview. While doing field editing, the investigator must restrain himself and must not correct errors of omission by guessing what the informant would have said.
Central editing takes place after all forms or schedules have been completed and returned to the office. All forms should get a thorough editing by a single editor in a small study, and by a team of editors in the case of a large inquiry. Editors may correct obvious errors, such as an entry in the wrong place or an entry recorded in months when it should have been recorded in weeks. In the case of inappropriate or missing replies, the editor can sometimes determine the proper answer by reviewing the other information in the schedule. All wrong replies, which are quite obvious, must be dropped from the final results.
Dos and don'ts for editors:
(a) They should be familiar with the instructions given to the interviewers and coders, as well as with the editing instructions supplied to them for the purpose.
(b) While crossing out an original entry for one reason or another, they should draw just a single line through it so that it remains legible.
(c) They must make entries (if any) on the form in some distinctive colour, and in a standardised form.
(d) They should initial all answers which they change or supply.
(e) The editor's initials and the date of editing should be placed on each completed form or schedule.
3. Coding: It refers to the process of assigning numerals or other symbols to answers so that
responses can be put into a limited number of categories or classes. Such classes should be
appropriate to the research problem under consideration. They must also possess the
characteristic of exhaustiveness (i.e., there must be a class for every data item) and a specific
answer can be placed in one and only one cell in a given category set. Every class is defined in
terms of only one concept.
Coding is necessary for efficient analysis; through it, the several replies may be reduced to a small number of classes containing the critical information required for analysis. Coding decisions should usually be taken at the designing stage of the questionnaire. This makes it possible to precode the questionnaire choices, which in turn is helpful for computer tabulation, as one can key-punch straight from the original questionnaires. In the case of hand coding, we code in the margin with a coloured pencil; the other method is to transcribe the data from the questionnaire to a coding sheet. Coding errors should be eliminated or minimised.
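As a minimal sketch of precoding, a code book can map every possible reply to exactly one numeral, keeping the category set exhaustive (every item gets a class) and mutually exclusive. The replies and code values here are hypothetical:

```python
# Hypothetical code book: each reply maps to exactly one numeral,
# so the category set is exhaustive and mutually exclusive.
CODE_BOOK = {"yes": 1, "no": 2, "don't know": 3}

def code_response(reply):
    # A residual "other" class (code 9) keeps the set exhaustive:
    # every data item falls into some class.
    return CODE_BOOK.get(reply.strip().lower(), 9)

replies = ["Yes", "no", "Maybe", "DON'T KNOW"]
print([code_response(r) for r in replies])  # [1, 2, 9, 3]
```

Normalising case and whitespace before lookup reduces coding errors of the kind the text warns about.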
4. Classification: It is the process of arranging data in classes or groups on the basis of common characteristics.
Classification can be of the following two types:
(a) Classification according to attributes: Data are classified by common characteristics, which can be descriptive (such as literacy, honesty, etc.) or numerical (such as weight, height, etc.). Descriptive characteristics refer to qualitative phenomena; only their presence or absence can be noticed. Data obtained this way on the basis of certain attributes are known as statistics of attributes, and their classification is called classification according to attributes.
Such classification can be simple classification or manifold classification. In simple classification we consider one attribute and divide the universe into two classes: one class consisting of items possessing the given attribute, and the other of items which do not possess it. In manifold classification we consider two or more attributes simultaneously and divide the data into a number of classes (the total number of classes of the final order is given by 2^n, where n = number of attributes). Attributes are defined in such a manner that there is the least possibility of any doubt/ambiguity.
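The 2^n class count of manifold classification can be illustrated by enumerating every presence/absence combination. The attribute names below are hypothetical:

```python
from itertools import product

# Hypothetical attributes; each is either present (True) or absent (False).
attributes = ["literate", "employed", "urban"]

# Manifold classification: every combination of presence/absence forms
# one class, giving 2**n classes of the final order for n attributes.
classes = list(product([True, False], repeat=len(attributes)))
print(len(classes))  # 2**3 = 8 classes
```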
Step I: How many classes should there be, and what should be their magnitudes?
The objective should be to display the data in such a way as to make it meaningful for the analyst. Typically, we may have 5 to 15 classes. While determining class magnitudes, some statisticians adopt the following formula, suggested by H.A. Sturges, for determining the size of the class interval:
i = R/(1 + 3.3 log N)
where
i = size of the class interval;
R = range (i.e., the difference between the values of the largest and smallest items);
N = number of items to be grouped.
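Sturges' formula can be sketched directly in Python; the data set below is hypothetical:

```python
import math

def sturges_interval_size(values):
    """Size of class interval i = R / (1 + 3.3 * log10(N)) per H.A. Sturges."""
    n = len(values)                   # N = number of items to be grouped
    r = max(values) - min(values)     # R = range of the data
    return r / (1 + 3.3 * math.log10(n))

# Hypothetical data: 100 items spread over a range of 99.
data = list(range(1, 101))
print(round(sturges_interval_size(data), 2))  # 13.03
```

With N = 100 the denominator is 1 + 3.3 x 2 = 7.6, so the interval size is 99/7.6, about 13.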
In case one or two or very few items have very high or very low values, one may use what are
known as open-ended intervals in the overall frequency distribution.
Step II: How to choose class limits?
The mid-point of a class-interval and the actual average of items of that class interval should
remain as close to each other as possible. Class limits should be in multiples of 2, 5, 10, 20, 100
etc.
Class limits may generally be stated in any of the following forms:
Exclusive type class intervals: They are usually stated as follows:
10–20 (read as 10 and under 20)
20–30 (read as 20 and under 30)
30–40
40–50
Under exclusive type class intervals, the upper limit of a class interval is excluded
and items with values less than the upper limit are put in the given class interval.
Inclusive type class intervals:
11–20
21–30
31–40
41–50
In inclusive type class intervals the upper limit of a class interval is also included in the concerning class interval. Thus 20 will be put in the 11–20 class interval. The stated upper limit of the class interval 11–20 is 20, but the real limit is 20.99999.
When the phenomenon under consideration happens to be a discrete one (i.e., can be
measured and stated only in integers), then we should adopt inclusive type classification. But
when the phenomenon happens to be a continuous one capable of being measured in fractions as
well, we can use exclusive type class intervals.
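The two conventions can be sketched as binning functions (a class width of 10 is assumed, matching the examples above):

```python
def exclusive_class(x):
    """Exclusive type: 10-20 means 10 and under 20 (upper limit excluded)."""
    lower = (x // 10) * 10
    return f"{lower}-{lower + 10}"

def inclusive_class(x):
    """Inclusive type: 11-20 includes the upper limit, so 20 falls in 11-20."""
    upper = ((x - 1) // 10 + 1) * 10
    return f"{upper - 9}-{upper}"

print(exclusive_class(20))  # '20-30' (20 is excluded from 10-20)
print(inclusive_class(20))  # '11-20' (20 is included)
```

For discrete integer data the inclusive function never splits a value across classes, which is why the text recommends inclusive intervals for discrete phenomena.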
Step III: Find the frequency of each class.
This can be done either by tally sheets or by mechanical aids. Under the tally-sheet technique, the class groups are written on a sheet of paper (known as a tally sheet) and, for each item, a stroke (a small vertical line) is marked against the class group in which it falls. After every four small vertical lines in a class group, the fifth line for an item falling in the same group is drawn as a horizontal line through the said four lines. All this facilitates the counting of items in each of the class groups. An illustrative tally sheet is shown below:
Table: Tally Sheet for Determining the Number of 70 Families in Different Income Groups
[table rows lost in extraction; only the final row survives: "1601 and above — |||| || — 7"; Total: 70]
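The tally-sheet counting can be mimicked in code with a counter. The income figures and class limits below are hypothetical, chosen only to illustrate the mechanics:

```python
from collections import Counter

# Hypothetical monthly incomes for ten families.
incomes = [450, 820, 1200, 660, 1750, 910, 1620, 300, 1480, 2000]

def income_group(x):
    """Assign an income to a class group (hypothetical class limits)."""
    if x < 800:
        return "below 800"
    if x < 1600:
        return "800 and under 1600"
    return "1601 and above"

# Each item adds one "stroke" against its class group, as on a tally sheet.
tally = Counter(income_group(x) for x in incomes)
print(tally["1601 and above"])  # 3
```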
Alternatively, in large inquiries and surveys, class frequencies can be determined by machines.
5. Tabulation: The procedure of arranging data in a concise and logical order is called tabulation. Thus, tabulation is the process of summarising raw data and displaying it in compact form for further analysis.
Tabulation is essential because of the following reasons.
1. It conserves space and reduces explanatory and descriptive statement to a minimum.
2. It facilitates the process of comparison.
3. It facilitates the summation of items and the detection of errors and omissions.
4. It provides a basis for various statistical computations.
Tabulation can be done by hand or by mechanical or electronic devices. In large inquiries, we use
computer tabulation if other factors are favourable and facilities are available. The card sorting
method is flexible for hand tabulation. In this method the data are recorded on special cards of
convenient size and shape with a series of holes. Each hole stands for a code and when cards are
stacked, a needle passes through particular hole representing a particular code. These cards are then
separated and counted. Tabulation is classified as simple and complex tabulation. The former type of
tabulation gives information about one or more groups of independent questions, whereas the latter
type of tabulation shows the division of data in two or more categories.
ANALYSIS OF DATA
Analysis means the computation of certain indices or measures along with searching for patterns of
relationship that exist among the data groups. Analysis, in survey or experimental data, involves
estimating the values of unknown parameters of the population and testing of hypotheses for drawing
inferences. Analysis is categorised as descriptive analysis and inferential analysis (Inferential analysis
is called statistical analysis).
Descriptive analysis is largely the study of distributions of one variable. This study provides us with profiles of companies, work groups, persons and other subjects on any of a multitude of characteristics, such as size, composition, efficiency, preferences, etc. This sort of analysis may be in respect of one variable (unidimensional analysis), two variables (bivariate analysis) or more than two variables (multivariate analysis). We work out various measures that show the size and shape of a distribution, along with measures of the relationships between two or more variables.
Correlation analysis studies the joint variation of two or more variables for determining the amount of
correlation between two or more variables.
Causal analysis is concerned with the study of how one or more variables affect changes in another variable. It is the study of functional relationships existing between two or more variables. This analysis is also called regression analysis.
(a) Multiple regression analysis: This analysis is adopted when the researcher has one dependent
variable which is presumed to be a function of two or more independent variables. The objective
of this analysis is to make a prediction about the dependent variable based on its covariance with
all the concerned independent variables.
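A minimal sketch of multiple regression via ordinary least squares, assuming NumPy is available; the data are hypothetical and constructed so the true coefficients are known:

```python
import numpy as np

# Hypothetical data: one dependent variable y presumed to be a function
# of two independent variables x1 and x2 (here y = 1 + 2*x1 + 3*x2 exactly).
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([1.0, 0.0, 1.0, 2.0])
y = 1 + 2 * x1 + 3 * x2

# Design matrix with an intercept column; least squares recovers the
# coefficients used to predict y from the independent variables.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 6))  # approximately [1, 2, 3]
```

The fitted coefficients can then be used to predict the dependent variable for new values of x1 and x2, which is the stated objective of the analysis.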
(b) Multiple discriminant analysis: This analysis is appropriate when the researcher has a single
dependent variable that cannot be measured, but can be classified into two or more groups on the
basis of some attribute. It is used to predict an entity’s possibility of belonging to a particular
group based on several predictor variables.
(c) Multivariate analysis of variance (or multi-ANOVA): This analysis is an extension of two-way
ANOVA, wherein the ratio of among group variance to within group variance is worked out on a
set of variables.
(d) Canonical analysis: This analysis can be used in case of both measurable and non-
measurable variables for the purpose of simultaneously predicting a set of dependent variables
from their joint covariance with a set of independent variables.
(e) Inferential analysis is concerned with the various tests of significance for testing hypotheses in
order to determine with what validity data can be said to indicate some conclusion or conclusions.
It is also concerned with the estimation of population values. It is mainly on the basis of
inferential analysis that the task of interpretation (i.e., the task of drawing inferences and
conclusions) is performed.
When the population is based on two or more variables, we like to measure relationship between
variables. The measurement of relationship between the two or more variables can give us idea of
effect of one variable on the other.
Cross tabulation approach is useful when the data are in nominal form. We classify each variable
into two or more categories and then cross classify the variables in these sub-categories. Then we look
for interactions between them which may be symmetrical, reciprocal or asymmetrical.
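A minimal cross-tabulation sketch for nominal data, with hypothetical respondents and categories:

```python
from collections import Counter

# Hypothetical nominal data: gender and preferred product for ten respondents.
gender = ["M", "F", "F", "M", "F", "M", "M", "F", "F", "M"]
choice = ["A", "B", "A", "A", "B", "B", "A", "B", "A", "A"]

# Cross classify the two variables into sub-categories and count each cell.
table = Counter(zip(gender, choice))
print(table[("M", "A")])  # 4 respondents are male and chose A
```

Comparing cell counts across rows and columns is what reveals the symmetrical, reciprocal or asymmetrical interactions the text describes.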
A symmetrical relationship is one in which the two variables vary together. A reciprocal
relationship exists when the two variables mutually influence each other. Asymmetrical relationship
exists if one variable (independent variable) is responsible for another variable (dependent variable).
The correlation, found through this approach is not a very powerful form of statistical correlation
and we use some other methods when data happen to be either ordinal or interval or ratio data.
One such measure is Karl Pearson's coefficient of correlation, which in its simplified form is:
r = (ΣXiYi − n·X̄·Ȳ) / ( √(ΣXi² − n·X̄²) · √(ΣYi² − n·Ȳ²) )
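As a sketch, Pearson's simplified correlation formula can be computed term by term in Python; the data below are hypothetical:

```python
import math

def pearson_r(x, y):
    """r = (sum(xi*yi) - n*xbar*ybar) /
           (sqrt(sum(xi^2) - n*xbar^2) * sqrt(sum(yi^2) - n*ybar^2))"""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    num = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar
    den = (math.sqrt(sum(a * a for a in x) - n * xbar * xbar)
           * math.sqrt(sum(b * b for b in y) - n * ybar * ybar))
    return num / den

# Perfectly linearly related data should give r = 1.
print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 6))  # 1.0
```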
In addition to setting up inequalities and forming differences, if we can also form quotients, such data are ratio data. Examples: measurements of height, area, weight, etc.
(b) Ordinal scale: The lowest level of the ordered scale used is the ordinal scale. The ordinal
scale places events in order. Rank orders represent ordinal scales and are frequently used in
research relating to qualitative phenomena. A student’s rank in his graduation class involves the
use of an ordinal scale.
For instance, if Ram’s position in his class is 10 and Mohan’s position is 40, it cannot be said that
Ram’s position is four times as good as that of Mohan. Ordinal scales only permit the ranking of
items from highest to lowest. Ordinal measures have no absolute values, and the real differences
between adjacent ranks may not be equal. One person is higher or lower on the scale than
another, but more precise comparisons cannot be made.
Thus, the use of an ordinal scale implies a statement of ‘greater than’ or ‘less than’ (equality
statement is also acceptable) without our being able to state how much greater or less. Since the
numbers of this scale have only a rank meaning, the appropriate measure of central tendency is
the median. A percentile or quartile measure is used for measuring dispersion. Correlations are
restricted to various rank order methods. Measures of statistical significance are restricted to the
non-parametric methods.
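One such rank-order correlation method permitted for ordinal data is Spearman's rho; a minimal sketch with hypothetical judges' rankings (assuming no tied ranks):

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rank correlation coefficient (no tied ranks assumed):
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
    where d_i is the difference between the two ranks of item i.
    """
    n = len(rank_x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical ranks assigned by two judges to 5 contestants
judge1 = [1, 2, 3, 4, 5]
judge2 = [2, 1, 4, 3, 5]
print(spearman_rho(judge1, judge2))  # 0.8: high agreement between judges
```

Because only the ranks enter the formula, the result is meaningful even though the real differences between adjacent ranks may be unequal.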
(c) Interval scale: In the interval scale, the intervals are adjusted in terms of some rule for making the
units equal. The units are equal only in so far as one accepts the assumptions on which the rule is
based. Interval scales can have an arbitrary zero, but it is not possible to determine for them what
may be called an absolute zero.
The limitation of the interval scale is the lack of a true zero; it does not have the capacity to
measure the complete absence of a characteristic. The Fahrenheit scale is an example of an
interval scale and shows similarities in what one can and cannot do with it. One can say that an
increase in temperature from 30° to 40° involves the same increase in temperature as an increase
from 60° to 70°, but one cannot say that a temperature of 60° is twice as warm as a temperature
of 30°, because both numbers depend on a zero that is set arbitrarily at the temperature of the
freezing point of water. The ratio of the two temperatures, 30° and 60°, is meaningless because
zero is an arbitrary point.
Interval scales provide more powerful measurement than ordinal scales, for the interval scale also
incorporates the concept of equality of intervals. The mean is the appropriate measure of central
tendency, and the standard deviation is the measure of dispersion. The tests of statistical significance
are the 't' test and the 'F' test.
(d) Ratio scale: Ratio scales have an absolute or true zero of measurement. We can conceive of
an absolute zero of length and time. For example, the zero point on a centimeter scale indicates
the complete absence of length or height. But an absolute zero of temperature is theoretically
unobtainable. The number of minor traffic-rule violations and the number of incorrect letters in a
page of typescript represent scores on ratio scales. Both these scales have absolute zeros and as
such all minor traffic violations and all typing errors can be assumed to be equal in significance.
Ratio scale represents the actual amounts of variables. Measures of physical dimensions such
as weight, height, distance, etc. are examples. Generally, all statistical techniques are usable with
ratio scales and all manipulations that one can carry out with real numbers can also be carried out
with ratio scale values. Multiplication and division can be used with this scale but not with other
scales mentioned above.
Thus, proceeding from the nominal scale to the ratio scale, increasingly relevant information is
obtained. If the nature of the variables permits, the researcher should use the scale that provides
the most precise description. Researchers in the physical sciences have the advantage of describing
variables in ratio scale form, but the behavioural sciences are generally limited to describing variables
in interval scale form, a less precise type of measurement.
1. Validity
Validity indicates the degree to which an instrument measures what it is supposed to measure.
Validity is the extent to which differences found with a measuring instrument reflect true differences
among those being tested. In other words, we seek relevant evidence that confirms the answers we have
found with our measuring tool.
The three types of validity are: (i) Content validity; (ii) Criterion-related validity and (iii) Construct
validity.
(i) Content validity is the extent to which a measuring instrument provides adequate coverage of
the topic under study. If the instrument contains a representative sample of the universe, the
content validity is good. It can be determined by using a panel of persons who shall judge how
well the measuring instrument meets the standards.
(ii) Criterion-related validity relates to our ability to predict some outcome or estimate the existence
of some current condition. This form of validity reflects the success of measures used for some
empirical estimating purpose. The concerned criterion must possess the following qualities:
Relevance: (A criterion is relevant if it is defined in terms we judge to be the proper measure.)
Freedom from bias: (Freedom from bias is attained when the criterion gives each subject an equal
opportunity to score well.)
Reliability: (A reliable criterion is stable or reproducible.)
Availability: (The information specified by the criterion must be available.)
Criterion-related validity refers to (i) Predictive validity and (ii) Concurrent validity. The
former refers to the usefulness of a test in predicting some future performance whereas the latter
refers to the usefulness of a test in closely relating to other measures of known validity. Criterion-
related validity is expressed as the coefficient of correlation between test scores and some
measure of future performance or between test scores and scores on another measure of known
validity.
(iii) Construct validity is the most complex and abstract. A measure is said to possess construct
validity to the degree that it conforms to predicted correlations with other theoretical propositions.
Construct validity is the degree to which scores on a test can be accounted for by the explanatory
constructs of a sound theory. For determining construct validity, we associate a set of other
propositions with the results received from using our measurement instrument.
If the above-stated criteria and tests are met, we may state that our measuring instrument
is valid and will result in correct measurement; otherwise we have to look for more information.
2. Reliability
The test of reliability is another important test of sound measurement. A measuring instrument is
reliable if it provides consistent results. A reliable measuring instrument contributes to validity,
but a reliable instrument need not be valid. Reliability is not as valuable as validity,
but it is easier to assess. If the quality of reliability is satisfied by an instrument, then
while using it we can be confident that transient and situational factors are not interfering.
Two aspects of reliability viz., stability and equivalence deserve special mention. The stability
aspect is concerned with securing consistent results with repeated measurements of the same person
and with the same instrument. Degree of stability is determined by comparing the results of repeated
measurements. The equivalence aspect considers how much error may get introduced by different
investigators or different samples of the items being studied. A good way to test for the equivalence of
measurements by two investigators is to compare their observations of the same events.
Reliability can be improved in the following two ways:
(i) By standardising the conditions under which the measurement takes place. This will
improve the stability aspect.
(ii) By carefully designing directions for measurement with no variation from group to
group, by using trained and motivated persons to conduct the research, and also by
broadening the sample of items used. This will improve the equivalence aspect.
3. Practicality
The practicality characteristic of a measuring instrument can be judged in terms of economy, convenience
and interpretability. From the operational point of view, the measuring instrument should be economical,
convenient and interpretable. Economy consideration suggests that some trade-off is needed between the
ideal research project and that which the budget can afford. The length of measuring instrument is an
important area where economic pressures are quickly felt. Data-collection methods to be used are also
dependent at times upon economic factors.
Convenience test suggests that the measuring instrument should be easy to administer. Attention to the
proper layout of the measuring instrument is required. For instance, a questionnaire with clear instructions
is more effective and easier to complete.
Interpretability consideration is important when persons other than the designers of the test are to interpret
the results. The measuring instrument, in order to be interpretable, must be supplemented by (a) detailed
instructions for administering the test; (b) scoring keys; (c) evidence about reliability; and (d)
guides for using the test and for interpreting results.
4. Accuracy
Accuracy requires that the measurement scale be a true representation of the underlying characteristic
being observed. For example, measuring with an inch scale will provide values accurate only up to
one-eighth of an inch, whereas measuring with a centimetre scale will provide more accurate values.
Proportional allocation is considered most efficient and an optimal design when the cost of selecting an
item is equal for each stratum, there is no difference in within-stratum variances, and the purpose of
sampling happens to be to estimate the population value of some characteristic. But in case the purpose
happens to be to compare the differences among the strata, then equal sample selection from each stratum
would be more efficient even if the strata differ in sizes.
In cases where strata differ not only in size but also in variability, and it is considered reasonable to take
larger samples from the more variable strata and smaller samples from the less variable strata, we can then
account for both factors (differences in stratum size and differences in stratum variability) by using
disproportionate sampling.
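The two allocation rules described above (proportional allocation, and optimum allocation when strata differ in variability) can be sketched as follows; the stratum sizes and within-stratum standard deviations are hypothetical:

```python
def proportional_allocation(n, stratum_sizes):
    """Proportional allocation: n_h = n * N_h / N."""
    N = sum(stratum_sizes)
    return [round(n * Nh / N) for Nh in stratum_sizes]

def neyman_allocation(n, stratum_sizes, stratum_sds):
    """Optimum (Neyman) allocation when strata differ in variability:
    n_h = n * N_h * S_h / sum(N_k * S_k)."""
    total = sum(Nh * Sh for Nh, Sh in zip(stratum_sizes, stratum_sds))
    return [round(n * Nh * Sh / total)
            for Nh, Sh in zip(stratum_sizes, stratum_sds)]

# Hypothetical population of 8000 units split into three strata
sizes = [4000, 2400, 1600]   # stratum sizes N_h
sds   = [8.0, 4.0, 2.0]      # within-stratum standard deviations S_h

print(proportional_allocation(80, sizes))   # [40, 24, 16]
print(neyman_allocation(80, sizes, sds))    # [57, 17, 6]
```

Note how the more variable first stratum receives a much larger share of the sample under optimum allocation than under proportional allocation.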
(iii) Cluster sampling: If the total area of interest happens to be a big one, a convenient way in which a
sample can be taken is to divide the area into a number of smaller non-overlapping areas and then to
randomly select a number of these smaller areas (called clusters).
In cluster sampling the total population is divided into a number of relatively small
subdivisions which are themselves clusters of still smaller units and then some of these clusters
are randomly selected for inclusion in the overall sample. Suppose we want to estimate the
proportion of machine-parts in an inventory which are defective. Also assume that there are
20000 machine parts in the inventory at a given point of time, stored in 400 cases of 50 each.
Now, using cluster sampling, we would consider the 400 cases as clusters, randomly select
'n' cases and examine all the machine parts in each randomly selected case.
Cluster sampling reduces cost by concentrating surveys in selected clusters, but it is less
precise than random sampling. There is also not as much information in 'n' observations within a
cluster as there is in 'n' randomly drawn observations. Cluster sampling is used only
because of the economic advantage it possesses; estimates based on cluster samples are usually
more reliable per unit cost.
(iv) Multi-stage sampling: Multi-stage sampling is a further development of the principle of
cluster sampling. Suppose we want to investigate the working efficiency of nationalized banks in India
and we want to take a sample of a few banks. The first stage is to select large primary sampling units
such as states in a country. Then we select certain districts and interview all banks in the chosen
districts. This would represent a two-stage sampling design, with the ultimate sampling units
being clusters of districts.
If, instead of taking a census of all banks within the selected districts, we select certain towns
and interview all banks in the chosen towns, this would represent a three-stage sampling design.
If, instead of taking a census of all banks within the selected towns, we randomly sample banks
from each selected town, then it is a case of using a four-stage sampling plan. If we select
randomly at all stages, we have what is known as a 'multi-stage random sampling design'.
Multi-stage sampling is applied in big inquiries extending to a considerably large geographical
area, say, the entire country. It has two advantages:
(a) It is easier to administer than most single-stage designs because the sampling frame under multi-
stage sampling is developed in partial units.
(b) A large number of units can be sampled for a given cost under multi-stage sampling because of
sequential clustering, whereas this is not possible in most simple designs.
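A multi-stage random sampling design of the kind described above can be sketched as follows; the frame of states, districts, towns and banks is entirely hypothetical:

```python
import random

random.seed(7)  # fixed seed for a reproducible illustration

# Hypothetical four-level sampling frame: state -> district -> town -> banks
frame = {
    f"state{s}": {
        f"district{s}-{d}": {
            f"town{s}-{d}-{t}": [f"bank{s}-{d}-{t}-{b}" for b in range(6)]
            for t in range(4)
        }
        for d in range(3)
    }
    for s in range(5)
}

# Select randomly at every stage: states, then districts, then towns,
# then individual banks (the ultimate sampling units).
banks = []
for s in random.sample(sorted(frame), 2):
    for d in random.sample(sorted(frame[s]), 2):
        for t in random.sample(sorted(frame[s][d]), 2):
            banks += random.sample(frame[s][d][t], 3)

print(len(banks), "banks sampled")  # 2 states x 2 districts x 2 towns x 3 banks
```

The frame at each stage covers only the units selected at the previous stage, which is why multi-stage designs are easier to administer: a complete nationwide list of banks is never required.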