In 1949, Dr. Raymond B. Cattell published one of the first
objective measures of normal personality, the Sixteen
Personality Factor Questionnaire (16PF®). Since its original
publication, the 16PF Questionnaire has matured through four
revisions into a widely used and well-researched measure of
normal adult personality (Schuerger, 1992). The inventory is
administered worldwide, having been translated into over 40
languages. IPAT, the publisher of the 16PF Questionnaire, along
with licensed providers throughout the world, offers a number of
computerized interpretive reports for the inventory. An
expanded version of the 16PF instrument, the PsychEval
Personality Questionnaire, assesses traits in both the abnormal
and normal ranges of personality.
The 16PF Questionnaire comprises 16
Primary Factor scales and five Global Factor
scales developed via factor analysis of the
primary scales. Thus, the inventory provides a
two-tiered hierarchical system of personality
measurement; that is, the primary and global
scales measure the same personality domain but
at two levels of specificity.

When the 16PF Questionnaire was originally published in 1949, it was
the first test based on systematic scientific research into the basic
dimensions of human personality (Goldberg, 1993). Dissatisfied with
the approach of choosing a group of traits a priori and then
constructing a test to measure them, Raymond B. Cattell (1945) set
out to discover the fundamental building blocks of personality using
factor analysis. Cattell conceptualized the factor-analytically
discovered personality factors to be the basic elements of

Beginning with Cattell's original model, the 16PF Primary
Factors were intercorrelated. These relationships led to the
exploration of a higher-order factor structure and to the
discovery that small clusters of the primary scales comprise
"second-order" factors of personality. In the Fifth Edition, these
factors are termed global to better reflect the broad
personality domains that they represent.

The 16PF Questionnaire has been used effectively in a variety of settings,
including industrial and organizational, clinical and counseling, educational,
and research. These uses have resulted in a wide range of prediction
equations for criteria such as educational achievement, creativity,
leadership, interpersonal skills, marital adjustment, and psychological
adjustment as well as for dozens of occupational profiles (Cattell, Eber, &
Tatsuoka, 1970). The 16PF instrument has also been rated among the most
frequently administered and recommended personality questionnaires
(Piotrowski & Keller, 1989) and among those most often referenced in
research articles (Graham & Lilly, 1984).


Since its first publication in
1949, the test has
undergone four major
revisions. The latest edition,
the 16PF Fifth Edition
Questionnaire (1993), is the
main subject of this book.

In contrast to previous editions, the 16PF Fifth Edition Questionnaire
features simpler, updated language; a lower reading level; improved
psychometric characteristics; new response-style indices; easier hand
scoring; and updated norms.

In 2001, the test was restandardized on a stratified random sample of
more than 10,000 individuals, which reflects the 2000 U.S. Census
figures for sex, race, and age.
16 - PF TEST
The 16PF Fifth Edition contains 185 multiple-choice items that are
written at a fifth-grade reading level. It provides scores on 16
primary personality scales (one of which is a short reasoning-
ability scale, positioned by itself at the end of the test) and five
global (Big Five) scales. Three response-style scales are also
included to help in identifying unusual response patterns that may
affect the validity of scores. Each primary scale contains 10–15
items, and each item has a three-choice answer format, with the
middle choice being a question mark (?).
A distinguishing characteristic of 16PF items is that they tend to
sample a broad range of normal behavior by asking test takers
about their behavior in specific situations (rather than merely
asking how they would rate themselves on personality traits, as is
the practice of many other tests). The test includes a wide range
of item types, including items that ask about actual behavior:
• When I find myself in a boring situation, I usually “tune out” and
daydream about other things. (a. true; b. ?; c.

• In talking to a friend, I tend to: (a. let my feelings show; b. ?;
c. keep my feelings to myself)

• I hardly ever feel hurried or rushed as I
go about my daily tasks. (a. true: I don’t;
b. ?; c. false: I often feel rushed.)
RELIABILITY
Reliabilities for the 16PF Fifth Edition’s primary
and global scales are comparable to those of
other personality measures even though the
scales are fairly short (10–15 items). These
reliabilities are summarized in Rapid Reference
1.5. Internal consistency reliabilities (how highly
the items in a scale correlate with each other) for
the primary scales average .76 (ranging from .68
to .87 over the 16 scales) in the normative
sample of 10,261 individuals.
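Internal consistency is often summarized with Cronbach's alpha. The source does not say which coefficient was used for the figures above, so the following is only an illustrative sketch; the response data are made up and the function name is hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(item_scores[0])  # number of items in the scale
    # Variance of each item across respondents
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the total scale score across respondents
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Made-up responses: 5 respondents x 4 items, each item scored 0-2
responses = [
    [2, 2, 1, 2],
    [0, 1, 0, 0],
    [1, 1, 1, 2],
    [2, 1, 2, 2],
    [0, 0, 0, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha rises when items vary together, which is why short scales (10–15 items) can still reach acceptable values if their items are well correlated.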

Test-retest reliabilities (or estimates of the consistency of scores
over time) for a 2-week interval ranged from .69 to .87 with a
median of .80. Two-month test-retest reliabilities ranged from
.56 to .79 with a median of .69. The 16PF global scales have even
higher reliabilities; 2-week test-retest estimates ranged from .84
to .91 with a mean of .87, and 2-month test-retest estimates
ranged from .70 to .82 with a median of .80. Further information
can be found in the 16PF Fifth Edition Technical Manual (Conn &
Rieke, 1994).
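A test-retest estimate is typically the correlation between scores from two administrations of the same scale. As a minimal sketch, assuming a plain Pearson correlation and using invented sten scores for six examinees:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented scores for the same six examinees at two testing sessions
time1 = [3, 5, 6, 8, 4, 7]
time2 = [4, 5, 6, 7, 4, 8]
r = pearson_r(time1, time2)
print(round(r, 2))
```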


Because the 16PF dimensions were developed through factor analysis, construct validity
is provided by studies confirming its factor structure (e.g., Chernyshenko, Stark, & Chan,
2001; Conn & Rieke, 1994; Cattell & Krug, 1986; Gerbing & Tuley, 1991; Hofer, Horn, & Eber,
1997). Additionally, the factor structure has been confirmed in a range of languages (e.g.,
Italian: Barbaranelli & Caprara, 1996; French: Mogenet & Rolland, 1995; Japanese: Motegi,
1982; Spanish: Prieto, Gouveia, & Fernandez, 1996; and German: Schneewind & Graf, 1998).
16 - PF VALIDITY
An extensive body of research dating back a half century provides evidence of the test’s
applied validity—its utility in counseling, clinical, career development, personnel selection
and development, educational, and research settings. Profiles and prediction equations exist
for a wide range of criteria such as leadership, creativity, academic achievement,
conscientiousness, social skills, empathy, self-esteem, marital adjustment, power dynamics,
coping patterns, cognitive processing style, and dozens of occupational profiles (Cattell,
Eber, & Tatsuoka, 1992; Conn & Rieke, 1994; Guastello & Rieke, 1993; Kelly, 1999; Krug
& Johns, 1990; Russell & Karol, 2002; Schuerger & Watterson, 1998).
By the 1980s, the 16PF Questionnaire was ranked among the highest in the
number of research articles (Graham & Lilly, 1984, p. 234), and a recent estimate
places the number of references since 1974 at more than 2,000 publications
(Hofer & Eber, 2002). Since the 1960s, the test has been noted as a significant
instrument in professional practice. For example, a study by Piotrowski and
Keller (1989) found the 16PF Questionnaire to be the most recommended of
general personality questionnaires. Research also suggests that the test is
somewhat more powerful than other major questionnaires in predicting real-life
behavior. A recent study (Goldberg, in press) compared many popular
personality questionnaires in their ability to predict six behavioral clusters and
found that the 16PF dimensions had the highest predictive validity.
The 16PF Fifth Edition is designed to
be administered to adults (aged 16
and older), individually or in a group.
The test offers paper-and-pencil,
computer software, and online
administration formats.
Normative data for the 16PF
instrument are based on an age range of
16 to 82 years.
The adolescent version of the 16PF
instrument, the 16PF Adolescent
Personality Questionnaire (APQ), is
appropriate for ages 11-22.
IPAT offers an answer sheet for the 16PF Fifth Edition
that is compatible with both hand- and computer-
scoring options.

Hand Scoring
Sending to IPAT (Mail-In service)
Faxing to IPAT (OnFax service)
Using computer software (OnSite service)
Using online services
The test administrator is advised to take
time to establish a comfortable rapport
with examinees. With this in mind, the
administrator should give thoughtful
attention to examinees' questions and
should reinforce the test objectives by
telling examinees that, in the long run,
they will do the most good for
themselves by being frank and honest
in their self-description.
Test questions have a three-choice response format.
Except for the Factor B items, the middle response choice
is always a question mark (?). The 15 Factor B items,
which assess reasoning ability, are grouped together at
the end of the test booklet following the personality items.
This arrangement not only allows continuity in item
content but also enables separate assessment of
reasoning ability from that of personality in those
instances when this may be desirable.
The test is untimed, but examinees should be encouraged
to work at a steady pace. About 10 minutes into the
testing session, the administrator may want to
discourage examinees from agonizing over possible
responses by reiterating this caution included in the test
directions: "Remember, don't spend too much time
thinking over any one question. Give the first, natural
answer as it comes to you." Average test-completion time is
35-50 minutes by pencil and 25-35 minutes by computer.
The 16PF Fifth Edition can be administered via personal
computer using IPAT OnSite System software or online.
These systems feature item-by-item test administration
that allows examinees to change the previous answer,
and the capabilities of immediately scoring tests and
processing reports. Research has demonstrated that
scores obtained from computerized administration are
equivalent to scores obtained via paper-and-pencil
administration for untimed tests (Mead & Drasgow, 1993).
Testing materials include the Fifth Edition test booklet and the
corresponding answer sheet, which may be hand- or computer-scored.
The administrator may either read the instructions aloud or
request examinees to read the instructions silently, responding
to their questions as necessary.
Briefly, the instructions advise examinees not to make any
marks in the test booklet, which is reusable.
Examinees also are cautioned to avoid skipping any questions
and to choose the first response that comes to mind rather than
spending too much time on any single question.

Before starting the test, examinees are asked to
complete the grids for name and gender on the left-hand
side of the answer sheet. If confidentiality is
desired for tests to be computer-scored, the grid for
I.D. number should be completed in lieu of the name.
During testing, the administrator should check that examinees
are marking responses appropriately. Response circles must be
darkened completely with a No. 2 or softer lead pencil,
particularly if the test is to be computer-scored.
At the conclusion of testing, the administrator should review
each answer sheet to ensure that the name (or I.D. number)
and gender grids have been completed and that all items have
a single, clearly legible response. Examinees should be asked to
erase any extraneous marks, to fix incomplete erasures, to
complete missing answers, and to correct multiple answers to a
single item.
Hand Scoring
Hand scoring the test is a quick and simple procedure. The materials needed to
hand score include a set of four scoring keys, a norm table, and an Individual
Record Form. Before scoring, each answer sheet should be checked to ensure
that the test taker answered all items and gave only one response per item. Raw
scores are obtained by placing each scoring key over the answer sheet and
counting the items that are marked for each scale if they appear through holes
in the keys. The raw scores are then transformed to standard scores through the
use of the norm table. For an experienced scorer, the whole process takes only 6
or 7 minutes. Detailed hand-scoring instructions are provided in the test
administrator’s manual.
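The two hand-scoring steps (count keyed responses, then look up the standard score) can be sketched as follows. The scoring key, point values, and norm table below are entirely hypothetical placeholders, not the published 16PF keys or norms.

```python
# Hypothetical scoring key for one short scale:
# item number -> points awarded for each keyed response.
KEY = {3: {"a": 2, "b": 1}, 17: {"c": 2, "b": 1}, 42: {"a": 2, "b": 1}}

# Hypothetical norm table for that scale: raw score -> sten.
NORM_TABLE = {0: 2, 1: 3, 2: 4, 3: 5, 4: 6, 5: 8, 6: 9}

def raw_score(answers, key):
    """Sum the points for every keyed item; unkeyed or missing responses add 0."""
    return sum(key[item].get(answers.get(item), 0) for item in key)

def to_sten(raw, norm_table):
    """Convert a raw score to a standard score by table lookup."""
    return norm_table[raw]

# One examinee's answers to the three keyed items
answers = {3: "a", 17: "b", 42: "c"}
raw = raw_score(answers, KEY)      # 2 + 1 + 0
sten = to_sten(raw, NORM_TABLE)
print(raw, sten)
```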

Score Reporting On the Sten Scale
Scores on the test are presented on a 10-point scale called a “sten” or
standard-ten scale, with a mean of 5.5 and a standard deviation of 2.
Scores are based on representative and up-to-date norms. The current
standardization sample was released in 2002 and has data on over
10,000 persons who are representative of the 2000 U.S. census for sex,
race, and age. As illustrated by Figure 2.1, a person scoring 4 is at the
23rd percentile, and one scoring 7 is at the 77th percentile. Traditionally,
in interpreting scores for individuals, scores below 4 are considered low
and scores above 7 are considered high. In addition, some professionals
refer to scores of 4 as low average and scores of 7 as high average. The
scales are bipolar, and even though they are designated high or low, a
high score should not be considered good and a low score should not be
regarded as bad.
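The percentile figures quoted above follow from the sten definition (mean 5.5, standard deviation 2) if scores are assumed to be normally distributed. A small sketch, with hypothetical function names, that reproduces them:

```python
from math import erf, sqrt

STEN_MEAN = 5.5
STEN_SD = 2.0

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def sten_percentile(sten):
    """Percentile at the center of a sten, assuming normally distributed scores."""
    z = (sten - STEN_MEAN) / STEN_SD
    return 100 * normal_cdf(z)

for s in (4, 7):
    print(s, round(sten_percentile(s)))
```

A sten of 4 sits 0.75 standard deviations below the mean and a sten of 7 sits 0.75 above it, which is where the 23rd and 77th percentiles come from.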
All of the 16PF personality scales are bipolar—that is,
each end of each scale has a distinct definition and
meaning. Scores on the 16PF personality scales are
given in “standard-ten” or sten scores, which range
from 1 to 10, with a mean of 5.5 and a standard
deviation of 2. The sten-score ranges for the 16PF
scales are shown in Rapid Reference 3.2. A high score
on a scale is not regarded as good, and a low score is
not viewed as bad. Rather, a score toward either end
of the scale increases the likelihood that the trait
defined by the pole will be apparent and distinctive
in the client’s behavior. Whether that trait is
determined to have positive or negative effects
depends on the particular situation.
Developed in the 1950s and 1960s, the global scales represent the original
Big Five traits. Because they provide the interpreter with a brief summary
of an individual’s overall personality style, they serve as the framework for
organizing the more specific information provided by the primary scales.
An advantage of the global scales is that they are based on many more
items (40–50) than are the primary scales. Because the globals are more
reliable and robust than the primaries, more confidence can be placed in
their accuracy. A limitation of the global scales is that because they are
quite broad in meaning, they do not convey detailed information about
important nuances of an individual’s unique personality. If the primary
scores within a global score are all in the same direction, then the global
score is a good indicator of personality; however, if the global score is an
average of diverse or opposite primary scores, then it may disguise
important aspects of personality.
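One way to check whether a global score rests on consistent primaries is to look for low and high primary scores within the same global. The sketch below is an illustration only: it assumes the contributing primary stens have already been oriented so that a higher value always means more of the global trait, and it reuses the low (below 4) and high (above 7) conventions from the sten discussion. The function name and data are hypothetical.

```python
def primaries_conflict(aligned_stens, low_cut=4, high_cut=7):
    """True when the contributing primaries include both low and high scores."""
    has_low = any(s < low_cut for s in aligned_stens)
    has_high = any(s > high_cut for s in aligned_stens)
    return has_low and has_high

consistent = [8, 7, 9, 8]   # all primaries point the same way
mixed = [9, 2, 8, 3]        # opposite primaries that average to 5.5

print(primaries_conflict(consistent))
print(primaries_conflict(mixed), sum(mixed) / len(mixed))
```

The mixed profile averages to the scale midpoint, so the global score alone would look unremarkable even though every contributing primary is extreme.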
Scale Interpretation
Primary and Global Scales are bipolar.
High scores = right pole, plus sign (high Warmth, A+)
Low scores = left pole, minus sign (low Warmth, A-)
no bad or good, just strengths and weaknesses for
different situations.
more is not necessarily better.
ex. above-average Anxiety can increase motivation
and achievement while high Anxiety is disruptive of
Step 1: Consider Context of Assessment

settings, relationships, situation, history
special characteristics of the individual:
age, culture, biases, etc.
characteristic preferences of the
Step 2: Evaluate the Response Style (Validity) Indexes
Response Style Indices

The authors of the 16PF Fifth Edition
decided to address the issue of accuracy in
test-taking attitude by constructing 3
indicators: Impression Management (IM),
Acquiescence (ACQ), and Infrequency (INF).

Impression Management (IM)
bipolar (high scores = socially desirable responses, low
scores = reflecting socially undesirable responses)
95th percentile (raw score of 21 or higher) means socially
desirable responses.
5th percentile (raw score below 4) means socially
undesirable responses.

High IM scores have three possible interpretations: (a) the individual
may actually behave in highly socially desirable ways, in which case no
distortion is at work; (b) the responses may reflect a kind of
unconscious distortion in that they are consistent with the
individual’s self-image but not with his or her behavior; (c)
the individual may have deliberately presented him- or
herself as behaving in a highly socially desirable manner.

A low IM score suggests an unusual willingness to
admit undesirable attributes or behaviors. Such a
score can occur when a person is unusually self-
critical, discouraged, or under stress. In fact, an
extremely low score may be a “plea for help” (see
Karson et al., 1997).

Acquiescence (ACQ)
Answering "true" 70 or more times exceeds the 95th percentile.
A high score might indicate that the individual:
(a) misunderstood the item content,
(b) responded randomly,
(c) has an unclear self-image, or
(d) had a “yea-saying” response style.

Infrequency (INF)
All scored responses are middle (b = ?) answers.
A score at or above the 95th percentile (raw score of 7 or higher)
may indicate that the individual either:
(a) had trouble reading or comprehending the questions,
(b) responded randomly,
(c) experienced consistent indecisiveness about the a or c response
choices (ambiguous self-picture), or
(d) tried to avoid making the wrong impression by choosing the ? middle
answer rather than one of the more definitive or extreme answers.
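The three response-style checks can be sketched as a simple screening function. The cutoffs are read from the slides above (IM of 21 or higher, IM below 4, ACQ of 70 or higher, INF of 7 or higher); that reading of the slide notation is an assumption, and the function and flag names are hypothetical. A flag is a prompt to consider the listed explanations, not proof of an invalid protocol.

```python
# Cutoffs as read from the text; treat them as assumptions, not official values.
IM_HIGH = 21   # raw score at or above this is flagged as high IM
IM_LOW = 4     # raw score below this is flagged as low IM
ACQ_HIGH = 70  # number of "true" answers at or above this is flagged
INF_HIGH = 7   # raw score at or above this is flagged

def response_style_flags(im, acq, inf):
    """Return a list of response-style indices that fall outside the usual range."""
    flags = []
    if im >= IM_HIGH:
        flags.append("IM high")
    elif im < IM_LOW:
        flags.append("IM low")
    if acq >= ACQ_HIGH:
        flags.append("ACQ high")
    if inf >= INF_HIGH:
        flags.append("INF high")
    return flags

print(response_style_flags(im=22, acq=50, inf=2))
print(response_style_flags(im=10, acq=50, inf=2))
```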
Step 3: Evaluate the Global Scale Scores

The global scales are generally examined next because they give a
broad overview of the individual’s personality.
Step 4: Evaluate the Primary Scales in the Context of the Globals
Identify the Number of Extreme Primary Scores
2-7 extreme primary scores are within normal range
greater number of extreme scores = well-defined,
individualistic personality style.
few extreme scores = average or unclear self-picture on
certain traits.
extreme scores = strong, behavioral tendencies, difficulty of
shifting behavior (even if unideal or inappropriate).
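Counting extreme primary scores is easy to automate. This sketch assumes that "extreme" means a sten below 4 or above 7, following the low and high conventions described for the sten scale; the profile values are invented for illustration.

```python
def count_extreme(primary_stens, low_cut=4, high_cut=7):
    """Count primary scales whose sten falls below low_cut or above high_cut."""
    return sum(1 for s in primary_stens.values() if s < low_cut or s > high_cut)

# Invented sten profile keyed by primary factor letter
profile = {
    "A": 8, "B": 6, "C": 3, "E": 5, "F": 9, "G": 5, "H": 7, "I": 4,
    "L": 6, "M": 5, "N": 2, "O": 6, "Q1": 5, "Q2": 4, "Q3": 6, "Q4": 10,
}

n_extreme = count_extreme(profile)
in_typical_range = 2 <= n_extreme <= 7  # the 2-7 range given in the text
print(n_extreme, in_typical_range)
```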
Review the Primary Scales in the Context of the
Global Factor Scales
Note each primary scale that is extreme, above average, or
below average.
Primary scales provide a thorough picture of how an individual
functions in each domain or global scale.
The interpreter should identify the primary scales that are
raising or lowering the global score to identify the motivations
of an individual.

Step 5: Consider Scale Interactions, Prediction Equations,
Interpretive Report Content, and Comparison Profiles

Step 6: Integrate All Information in Relation to the Assessment Question
Scale Interactions
the interaction or combination of
scores on two scales that modifies the
meaning of both.
best thought of as hypotheses to be
explored rather than certainties.


