Validity


Psychological Assessment

VALIDITY
Prepared and Presented By: Ms. Giselle C. Honrado
Learning Objectives:
Explain the concept of validity.
Differentiate the types of validity.
Discuss the issues concerning validity, fairness in test
use, and test bias.
The Concept of Validity
Validity - a judgment or estimate of how well a test
measures what it purports to measure in a
particular context
Validity - a judgment based on evidence about the
appropriateness of inferences drawn from test
scores.
Inference - logical result or deduction
The Concept of Validity

Validity - a judgment of how useful the instrument
is for a particular purpose with a particular
population of people.
Validity - concerned with what an instrument
measures, how well it does so, and the extent to
which meaningful inferences can be made from the
instrument’s results
The Concept of Validity
Validation - the process of gathering and evaluating evidence
about validity.
Test developer and test user may each play a role in the validation
of a test for a specific purpose.
Test developer's responsibility: to supply validity evidence in
the test manual.
Test user's responsibility: to conduct their own validation studies
with their own group of test takers.
Validation studies - a test user compares the accuracy of a
measure with a gold standard (established) measure. (Source:
https://pubmed.ncbi.nlm.nih.gov)
The Concept of Validity
Local validation studies - may yield insights regarding a particular
population of test takers as compared to the norming sample described
in a test manual.
Local validation studies - necessary when the test user plans to alter in
some way the format, instructions, language, or content of the test.
Local validity tests demonstrate the correlation between two
variables (test scores and performance) across a large group of
individuals. This means that local validation studies require large
sample sizes. (Source:
https://www.criteriacorp.com/resources/glossary/local-validity-study)
The Concept of Validity
Three Categories:
1. Content validity - based on an evaluation of the subjects,
topics, or content covered by the items in the test.
2. Criterion-related validity - obtained by evaluating the
relationship of scores obtained on the test to scores on other
tests or measures.
The Concept of Validity
Three Categories:
3. Construct validity - arrived at by executing a comprehensive
analysis of:
how scores on the test relate to other test scores and
measures, and
how scores on the test can be understood within some
theoretical framework for understanding the construct
that the test was designed to measure.
The Concept of Validity
Trinitarian view
An approach that considers criterion-oriented (predictive),
content, and construct validity for the assessment of test
validity.
Construct validity as the "umbrella validity"
Explain why?
The Concept of Validity

The three conventionally listed aspects of validity—criterion-related,
content, and construct—are examined from a dual
perspective: aiding in the understanding of a construct and
establishing a basis for comparison between evaluations of the
validity of measurement and evaluations of the validity of a
hypothesis.
(Source: "On Trinitarian doctrines of validity" as retrieved from
https://psycnet.apa.org/record/1981-22475-001)
The Concept of Validity
There are many ways of approaching the process of test validation,
often referred to as strategies.
Content validation strategies, criterion-related validation
strategies, and construct validation strategies.
Trinitarian approaches to validity assessment are not mutually
exclusive: each of the three conceptions of validity provides
evidence that contributes to a judgment concerning the validity
of a test.
All three types of validity evidence contribute to a unified
picture of a test's validity.
The Concept of Validity
Ecological validity - refers to a judgment regarding how well a
test measures what it purports to measure at the time and
place that the variable being measured (typically a behavior,
cognition, or emotion) is actually emitted.
FACE
VALIDITY
relates more to what a test appears to measure to the person
being tested than to what the test actually measures
a judgment concerning how relevant the test items appear to
be
"on the face of it" = test could be said to have high face
validity
a test's lack of face validity could contribute to a lack of
confidence in the perceived effectiveness of the test
CONTENT
VALIDITY
Describes a judgment of how adequately a test samples
behavior representative of the universe of behavior that
the test was designed to sample. (Ex. knowledge and skills
required to do a good job)
When a test has content validity, the items on the test
represent the entire range of possible items the test
should cover. Individual test questions may be drawn
from a large pool of items that cover a broad range of
topics.
CONTENT
VALIDITY
Test developers strive to include key components of the
construct targeted for measurement, and exclude content
irrelevant to the construct targeted for measurement.
For example, if you decide to develop an instrument to
measure depression and survey the domain content,
you may find depression is composed of physical,
psychological, and cognitive factors
Q: How about an intelligence test?
CONTENT
VALIDITY
Educational achievement tests: proportion of material
covered by the test approximates the proportion of
material covered in the course.
Test blueprint - "structure" of the evaluation; a plan
regarding the types of information to be covered by
the items (the number of items tapping each area of
coverage, the organization of the items, etc.)
CONTENT
VALIDITY

Example of "blueprint": Bloom's taxonomy
Source: https://tips.uark.edu/using-blooms-taxonomy/
CONTENT
VALIDITY
Culture and the relativity of content validity
A history test considered valid in one classroom, at
one time, and in one place will not necessarily be
considered so in another classroom, at another
time, and in another place.
Politics: another factor that may well play a part in
perceptions and judgments concerning the validity
of tests and test items; "politically correct"
CONTENT
VALIDITY

Self-Esteem Test by Rosenberg (1965)
CRITERION-RELATED
VALIDITY
a judgment of how adequately a test score can be used
to infer an individual's most probable standing on some
measure of interest (the criterion)
indicates the effectiveness of an instrument in predicting
an individual’s performance on a specific criterion
CRITERION-RELATED
VALIDITY
A test is said to have criterion-related validity when the test
has demonstrated its effectiveness in predicting a criterion or
indicators of a construct - for example, when an employer hires new
employees based on normal hiring procedures like interviews,
education, and experience.
Two types: Concurrent validity and Predictive validity
Q: Give one example of each type.
CRITERION-RELATED
VALIDITY
What is a Criterion?
Broad meaning: a standard on which a judgment or
decision may be based.
for our discussion: the standard against which a test
or a test score is evaluated.
relevant, valid, and uncontaminated
CRITERION-RELATED
VALIDITY
What is a Criterion?
Relevant - it is pertinent or applicable to the matter at
hand.
Valid - for the intended purpose: if one test (X) is being used as the
criterion to validate a second test (Y), then evidence
should exist that test X is valid.
Uncontaminated - free of criterion contamination, a term
applied to a criterion measure that has been based, at least
in part, on predictor measures.
CRITERION-RELATED
VALIDITY
CONCURRENT VALIDITY
*is concerned with the relationship between an instrument's
results and another currently obtainable criterion
refers to the extent to which the results of a measure
correlate with the results of an established measure of the
same or a related underlying construct assessed within a
similar time frame (Source:
https://methods.sagepub.com/reference)
statements of concurrent validity indicate the extent to which
test scores may be used to estimate an individual's present
standing on a criterion.
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY
*examines relationship between an instrument’s result
collected now and a criterion collected in the future
the degree to which test scores accurately predict scores on
a criterion measure. Example: the degree to which college
admissions test scores predict college grade point average
(GPA) (Source: https://www.sciencedirect.com/)
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY
Intervening event: may take varied forms, such as training,
experience, therapy, medications, or simply the passage of
time.
how accurately scores on the test predict some criterion
measure
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY: Researchers must take the following into
consideration:
Base rate - extent to which a particular trait, behavior,
characteristic, or attribute exists in the population
Hit rate - the proportion of people a test accurately identifies as
possessing or exhibiting a particular trait, behavior, characteristic,
or attribute.
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY: Researchers must take the following into
consideration:
Miss rate - the proportion of people the test fails to identify as
having, or not having, a particular trait, behavior, characteristic, or
attribute
False Positive - A test result that indicates that a person has a
specific disease or condition when the person actually does not
have the disease or condition.
False Negative - A test result that indicates that a person does
not have a specific disease or condition when the person
actually does have the disease or condition.
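The four outcome counts above combine into the rates just defined. A minimal sketch in Python, using invented counts for illustration (the function name and numbers are not from the slides); hit rate is taken here as the overall proportion of correct classifications, true positives plus true negatives:

```python
def classification_rates(tp, fp, fn, tn):
    """Rates for a test used to classify people on some criterion.
    tp: test says 'has it' and person has it (true positive)
    fp: test says 'has it' but person does not (false positive)
    fn: test says 'does not have it' but person does (false negative)
    tn: test says 'does not have it' and person does not (true negative)
    """
    total = tp + fp + fn + tn
    return {
        "base_rate": (tp + fn) / total,        # how common the attribute is
        "hit_rate": (tp + tn) / total,         # correct classifications
        "miss_rate": (fp + fn) / total,        # incorrect classifications
        "false_positive_rate": fp / total,
        "false_negative_rate": fn / total,
    }

# Invented example: 100 applicants screened by a cut score on a test.
print(classification_rates(tp=40, fp=10, fn=5, tn=45))
```

Note that when the base rate is very low or very high, a test can achieve a high hit rate while adding little beyond simply predicting the more common outcome.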
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY
Validity coefficient - correlation coefficient that provides a
measure of the relationship between test scores and scores on
the criterion measure.
example: computing the correlation coefficient between a
score (or classification) on a psychodiagnostic test and the
criterion score (or classification) assigned by
psychodiagnosticians
Pearson r
Spearman rho
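As a sketch of how such a validity coefficient is computed, the Pearson r between test scores and criterion scores can be written directly from its definition (the scores below are invented for illustration):

```python
def pearson_r(x, y):
    """Pearson correlation between paired scores x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

# Invented data: test scores and a criterion measure (e.g., supervisor ratings).
test_scores = [12, 15, 11, 18, 14]
criterion = [2.8, 3.4, 2.5, 3.9, 3.0]
print(pearson_r(test_scores, criterion))
```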
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY
Validity coefficient -
Responsibility of the test developer: to report validation
data in the test manual

Responsibility of test users: to read carefully the description
of the validation study and then to evaluate the suitability of
the test for their specific purposes.
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY
Validity coefficient -
How high should a validity coefficient be for a user or a test
developer to infer that the test is valid?
There are no rules for determining the minimum acceptable
size of a validity coefficient.
Cronbach and Gleser (1965) cautioned against the
establishment of such rules.
They argued that validity coefficients need to be large
enough to enable the test user to make accurate decisions
within the unique context in which a test is being used.
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY

Incremental Validity - the degree to which an additional
predictor explains something about the criterion measure that is
not explained by predictors already in use.
the improvement obtained by adding a particular procedure
or technique to an existing combination of assessment
methods. In other words, incremental validity reflects the
value of each measure or piece of information to the process
and outcome of assessment. (APA Dictionary)
CRITERION-RELATED
VALIDITY
PREDICTIVE VALIDITY
Incremental Validity -
Hierarchical multiple regression analyses - assess
incremental validity to determine the contribution of one
measure to the prediction of the criterion after one or more
other variables have been entered into the analysis.
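For the two-predictor case, the R-squared gained by adding a second predictor can be computed from the three pairwise Pearson correlations, which sketches what a hierarchical regression step reports. The function names and data below are invented for illustration:

```python
def pearson_r(x, y):
    """Pearson correlation between paired scores x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def incremental_r2(criterion, pred1, pred2):
    """R-squared gained by adding pred2 to a model already using pred1
    (closed form for the two-predictor multiple correlation)."""
    r_y1 = pearson_r(criterion, pred1)
    r_y2 = pearson_r(criterion, pred2)
    r_12 = pearson_r(pred1, pred2)
    r2_full = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return r2_full - r_y1**2

# Invented data: pred2 carries information about the criterion that pred1 lacks.
criterion = [1, 2, 3, 4, 10]
pred1 = [1, 2, 3, 4, 5]
pred2 = [0, 0, 0, 0, 1]
print(incremental_r2(criterion, pred1, pred2))
```

If pred2 is redundant with pred1 (or unrelated to the criterion), the gain approaches zero, which is exactly the situation where the extra measure shows no incremental validity.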


CONSTRUCT
VALIDITY
a judgment about the appropriateness of inferences drawn
from test scores regarding individual standings on a variable
called a construct
Construct - an informed, scientific idea developed or
hypothesized to describe or explain behavior.
are unobservable, presupposed (underlying) traits that a
test developer may invoke to describe test behavior or
criterion performance.
has been viewed as the unifying concept for all validity
evidence.
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Evidence of homogeneity
Evidence of changes with age
Evidence of pretest-posttest changes
Evidence from distinct groups
Convergent evidence
Discriminant evidence
Factor analysis
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Evidence of homogeneity
how uniform a test is in measuring a single concept
the Pearson r could be used to correlate average subtest
scores with the average total test score
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Evidence of changes with age
ex. reading rate tends to increase dramatically year by
year (age sensitive)
Ex. introversion vs. extraversion based on Jung's
Analytical Psychology
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Evidence of pretest-posttest changes
test scores change as a result of some experience
between a pretest and a posttest
intervening experiences are responsible for changes in
test scores
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Evidence from distinct groups (different groups) - method of
contrasted groups
male vs. female
if a test is a valid measure of a particular construct, then
groups of people who would be presumed to differ with
respect to that construct should have correspondingly
different test scores
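The comparison of contrasted groups is often summarized with an effect size; Cohen's d (an assumed choice here, not named in the slides) expresses the difference between group means in pooled standard-deviation units. A minimal sketch with invented scores:

```python
def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups (pooled SD)."""
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / na, sum(group_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (nb - 1)
    pooled_sd = (((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)) ** 0.5
    return (mean_a - mean_b) / pooled_sd

# Invented scores on a hypothetical extraversion scale for two contrasted groups.
sales_staff = [10, 12, 14]
librarians = [4, 6, 8]
print(cohens_d(sales_staff, librarians))
```

A large d between groups presumed to differ on the construct is one piece of construct-validity evidence; a near-zero d would call the measure into question.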
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Convergent evidence
if scores on the test undergoing construct validation tend to
correlate highly in the predicted direction with scores on older,
more established, and already validated tests designed to
measure the same (or a similar/related) construct
Discriminant evidence
a validity coefficient showing little (a statistically insignificant)
relationship between test scores and/or other variables with
which scores on the test being construct-validated should not
theoretically be correlated
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Factor analysis
a shorthand term for a class of mathematical procedures
designed to identify factors or specific variables that are
typically attributes, characteristics, or dimensions on which
people may differ.
Exploratory factor analysis (EFA)
Confirmatory factor analysis (CFA)
Factor loading - "a sort of metaphor. Each test is thought of
as a vehicle carrying a certain amount of one or more
abilities" (Tyler, 1965).
CONSTRUCT
VALIDITY
Various techniques of construct validation:
Factor analysis
Exploratory factor analysis (EFA) - Exploratory factor analysis (EFA) is generally used
to discover the factor structure of a measure and to examine its internal reliability. EFA
is often recommended when researchers have no hypotheses about the nature of the
underlying factor structure of their measure.
Confirmatory factor analysis (CFA) - a tool that a researcher can use to test whether
the data fit a hypothesized structure of latent factors; used if there is a strong theory
about the structure (in contrast to EFA, which discovers the structure).
Reference: "Exploratory Factor Analysis: A Guide to Best Practice" retrieved from
https://journals.sagepub.com/doi/full/10.1177/0095798418771807
Validity, Bias, and
Fairness
TEST BIAS
a factor inherent in a test that systematically
prevents accurate, impartial measurement
"Bias" implies systematic variation.

Prevention during test development is the best cure
for test bias, though a procedure called estimated
true score transformations represents one of many
available post hoc remedies (Mueller, 1949; see also
Reynolds & Brown, 1984).
Validity, Bias, and
Fairness
Rating error - a judgment resulting from the intentional or unintentional
misuse of a rating scale.
Leniency error (Generosity error) - tendency to be lenient in
scoring, marking, and/or grading
Severity error - tendency to be overly harsh, strict, or negative
Central tendency error - the tendency of raters to avoid
extreme ratings and rate near the middle of the scale
Halo effect - describes the fact that, for some raters, some ratees
can do no wrong.
Validity, Bias, and
Fairness
TEST FAIRNESS
the extent to which a test is used in an impartial, just,
and equitable way.
Example: the norms used for most psychological tests
come from Western populations. This creates a cultural
bias in the norms used in the test.
Solution: Validation studies / Local validation
studies
A test should be both reliable and valid
Thank you for your attention &
participation. Get ready for a QUIZ.
