Measurement: Scaling, Reliability and Validity
GROUP MEMBERS:
AHMED JAN DAHRI (15-MBA-03)
ASGHAR ALI ARAIN (15-MBA-09)
1. Rating Scales
Dichotomous scale
Category scale
Likert scale
Numerical scales
Semantic differential scale
Itemized rating scale
Fixed or constant sum rating scale
Stapel scale
Graphic rating scale
Consensus scale
Dichotomous scale response options: Yes / No
Example 9.4
Example 9.8
Example 9.9
2. Ranking Scales
Ranking scales are used to tap preferences between two objects or among more than two objects or items; they are ordinal in nature.
2.1 Paired Comparison
The paired comparison scale is used when, among a small number of
objects, respondents are asked to choose between two objects at a time. This
helps to assess preferences. If, for instance, in the previous example, during
the paired comparisons, respondents consistently show a preference for
product one over products two, three, and four, the manager reliably
understands which product line demands his utmost attention. However, as
the number of objects to be compared increases, so does the number of
paired comparisons. Hence paired comparison is a good method if the
number of stimuli presented is small.
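The growth in the number of comparisons is easy to quantify: n objects require n(n-1)/2 pairs, so 4 products need only 6 comparisons, but 10 products already need 45. The short Python sketch below, using hypothetical product names and one respondent's made-up choices (not taken from the example), illustrates the pair count and a simple preference tally.

```python
from itertools import combinations

def number_of_pairs(n: int) -> int:
    """Number of paired comparisons needed for n objects: n(n-1)/2."""
    return n * (n - 1) // 2

products = ["Product 1", "Product 2", "Product 3", "Product 4"]  # hypothetical objects
pairs = list(combinations(products, 2))            # all distinct pairs to present
print(number_of_pairs(len(products)), len(pairs))  # 6 6

# Hypothetical choices of one respondent: the preferred object from each pair,
# in the same order that combinations() generates the pairs.
choices = ["Product 1", "Product 1", "Product 1", "Product 2", "Product 2", "Product 3"]

tally = {p: 0 for p in products}
for winner in choices:
    tally[winner] += 1
print(tally)  # Product 1 wins all 3 of its pairs, signalling the strongest preference
```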
Example 9.10
Rank the following magazines that you would like to subscribe to in order of
preference, assigning 1 to the most preferred choice and 5 to the least
preferred.
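To show how such forced-ranking responses might be summarized, here is a minimal sketch with hypothetical magazine names and made-up rankings from three respondents; the lower the mean rank, the more preferred the magazine. The names and numbers are purely illustrative, not taken from the example.

```python
# Hypothetical rankings: 1 = most preferred, 5 = least preferred
rankings = {
    "Respondent A": {"Magazine 1": 1, "Magazine 2": 3, "Magazine 3": 2, "Magazine 4": 5, "Magazine 5": 4},
    "Respondent B": {"Magazine 1": 2, "Magazine 2": 1, "Magazine 3": 3, "Magazine 4": 4, "Magazine 5": 5},
    "Respondent C": {"Magazine 1": 1, "Magazine 2": 2, "Magazine 3": 4, "Magazine 4": 3, "Magazine 5": 5},
}

magazines = list(next(iter(rankings.values())))
mean_rank = {
    m: sum(r[m] for r in rankings.values()) / len(rankings)
    for m in magazines
}

# Sort so the most preferred magazine (lowest mean rank) comes first
for magazine, rank in sorted(mean_rank.items(), key=lambda kv: kv[1]):
    print(f"{magazine}: mean rank {rank:.2f}")
```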
Example 9.11
Rating scales are used to measure most behavioral concepts. Ranking scales
are used to make comparisons or rank the variables that have been tapped on
a nominal scale.
3. Goodness of Measures
It is important to make sure that the instrument that we develop to measure a
particular concept is indeed accurately measuring the variable, and that, in
fact, we are actually measuring the concept that we set out to measure. This
ensures that in operationally defining perceptual and attitudinal variables,
we have not overlooked some important dimensions and elements or included
some irrelevant ones.
4. Reliability
The reliability of a measure indicates the extent to which it is without bias
(error free) and hence ensures consistent measurement across time and across
the various items in the instrument. In other words, the reliability of a
measure is an indication of the stability and consistency with which the
instrument measures the concept and helps to assess the goodness of a
measure.
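One common way to gauge the stability aspect mentioned above is test-retest reliability: administer the same instrument to the same respondents at two points in time and correlate the two sets of scores. The sketch below uses made-up scores purely for illustration.

```python
import numpy as np

# Hypothetical total scores of 8 respondents on the same instrument,
# administered at time 1 and again a few weeks later at time 2.
time1 = np.array([24, 31, 18, 27, 22, 35, 29, 20])
time2 = np.array([25, 30, 20, 26, 21, 34, 30, 19])

# Test-retest reliability is the correlation between the two administrations;
# values closer to 1 indicate a more stable (reliable) measure over time.
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability: {r:.3f}")
```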
4.1 Split-Half Reliability
Split-half reliability reflects the correlation between two halves of an
instrument. The estimates will vary depending on how the items in the
measure are split into two halves. Split-half reliabilities can be higher than
Cronbach's alpha only when the measure taps more than one underlying
response dimension and certain other conditions are met as well. Hence, in
almost all cases, Cronbach's alpha can be considered a perfectly adequate
index of the interitem consistency reliability.
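As a rough illustration of the two indices discussed here, the sketch below computes Cronbach's alpha and an odd/even split-half estimate (with the Spearman-Brown correction) on a made-up respondents-by-items score matrix. The data and the odd/even split are assumptions for illustration only; in practice these statistics are usually obtained from a statistics package.

```python
import numpy as np

def cronbachs_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def split_half_reliability(items: np.ndarray) -> float:
    """Split-half reliability (odd vs. even items) with Spearman-Brown correction."""
    half1 = items[:, 0::2].sum(axis=1)     # scores on odd-numbered items
    half2 = items[:, 1::2].sum(axis=1)     # scores on even-numbered items
    r = np.corrcoef(half1, half2)[0, 1]    # correlation between the two halves
    return 2 * r / (1 + r)                 # Spearman-Brown prophecy formula

# Hypothetical responses: 6 respondents answering 4 Likert-type items (1-5)
scores = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
])

print(f"Cronbach's alpha:       {cronbachs_alpha(scores):.3f}")
print(f"Split-half reliability: {split_half_reliability(scores):.3f}")
```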
5. Validity
Several types of validity tests are used to test the goodness of measures and
writers use different terms to denote them. For the sake of clarity, we may
group validity tests under three broad headings: content validity,
criterion-related validity, and construct validity.
5.1 Content Validity
Content validity ensures that the measure includes an adequate and
representative set of items that tap the concept. The more the scale items
represent the domain or universe of the concept being measured, the greater
the content validity. To put it differently, content validity is a function of how
well the dimensions and elements of a concept have been delineated.
Face validity is considered by some to be a basic and very minimal index of
content validity. Face validity indicates that the items that are intended to
measure a concept do, on the face of it, look like they measure the concept.