PERSONALITY AND JOB PERFORMANCE
Personality and Job Performance in
Financial Services Managers
Jesús F. Salgado* and Andrés Rumbo
This article presents research in which the Five Factor Model of personality was tested as a
predictor of job performance. 125 financial services managers who had enrolled in a potential
evaluation programme were given the NEO-FFI, a questionnaire designed for measuring the
Big Five. Job performance was assessed using nine rating scales and they were grouped into
two components: job problem-solving ability and job motivation. Also, one single scale for
measuring global job performance was used. The results show that Neuroticism and
Conscientiousness correlated with the two components and with the global measure of job
performance. Extraversion, Openness and Agreeableness are correlated with one facet or with
the global rating of job performance. Taken together, the results suggest that the Five Factor
Model is a valid predictor of job performance. The implications of the results for practice and
future research are discussed.
Traditionally, personality has been seen as a
variable with low validity for predicting job
performance, and some reviews of criterion-related
validity appear to confirm such a belief
(Ghiselli 1973; Guion and Gottier 1965; Schmitt,
Gooding, Noe and Kirsch 1984). Two reasons
might explain these findings. First, in the area of
personality, there have long been considerable
controversies (e.g. cross-situational consistency;
idiographic vs. nomothetic approach, etc.), and
there has never
been a universally accepted personality model.
Secondly, in the review articles (e.g. Ghiselli
1973; Guion and Gottier 1965; Schmitt et al.
1984), the validity coefficients were integrated
across variables, yielding a single validity
coefficient for all personality measures, thereby
masking the predictor–criterion construct
relationships (Hough, Eaton, Dunnette, Kamp
and McCloy 1990).
However, in the 1980s the five-factor model
(FFM) of personality was consolidated (Digman
1990), and it became a strong paradigm in the
area. Indeed, current research appears to
show that only five factors of personality
generalize across subjects, observers, variables,
factor-analytic algorithms and languages
(Borkenau 1992; John 1990). These factors have
received different names, but the most commonly used are:
Neuroticism or emotional stability (N),
Extraversion or surgency (E), Openness or
culture (O), Agreeableness (A) and
Conscientiousness (C). However, some authors
have criticized the five-factor model on different
grounds. For example, Block (1995) criticizes
the dependence of the model on the factor
analysis method, Eysenck (1992) maintains that
three factors are sufficient and, in the field of I/O
Psychology, the most relevant criticisms were
made by Hogan (1991) and Hough (1992). For
example, Hogan (1991) maintains that, in
organizational settings, it could be more
convenient to use specific or small factors than
variables as complex as the Big Five. Specific
factors could lead to a better prediction of job
performance. Hough (1992), in turn, argues
that the Big Five are very heterogeneous and
incomplete factors, and that additional factors
are necessary (e.g. locus of control and
achievement motivation).
© Blackwell Publishers Ltd 1997, 108 Cowley Road, Oxford OX4 1JF, UK and
350 Main Street, Malden, MA 02148, USA.
Recently, integrative research on the relation
between personality and job performance was
conducted using the `Big Five' model as a
framework of analysis (see Robertson 1993,
1994). The findings of the research carried out
suggest that the personality measures can be
predictors of job performance. Five studies using
meta-analytic techniques were carried out by
Barrick and Mount (1991), Hough et al. (1990),
Mount and Barrick (1995), Salgado (1997) and
Tett, Jackson and Rothstein (1991). The findings
of these quantitative reviews show that the most
relevant personality factors which predict job
performance are Conscientiousness and
Neuroticism, in this order. For example, in a
large-scale meta-analysis, Barrick and Mount
(1991) found that Conscientiousness (ρ =
0.22) is a consistently valid predictor for all
occupational groups and all criterion types, but
Neuroticism, Extraversion, Openness and
Agreeableness do not appear to be relevant
predictors of job performance, except for specific
occupations and criteria. In a partial update of
this study, Mount and Barrick (1995) found that
the validity for Conscientiousness was
underestimated in their prior meta-analysis, and
they suggest that a value of 0.31 is closer to its
true value. For their part, in a small-scale
meta-analysis, Tett et al. (1991) found that
Agreeableness is the most relevant, followed
by Neuroticism and Openness, and, with lower
validity, Extraversion and Conscientiousness.
Hough et al. (1990) found that, for job
proficiency criteria, adjustment (Emotional
Stability) and dependability (Conscientiousness)
show an observed (uncorrected) validity of 0.13,
and for training criteria the observed validity for
the same predictors is 0.16 and 0.11 respectively.
A characteristic of these reviews is that they
were conducted using only studies carried out in
the USA and Canada. Salgado (1997), including
only studies conducted in the European
Community, found that Neuroticism (ρ =
0.19) and Conscientiousness (ρ = 0.25) are
valid predictors across jobs and criteria.
Therefore, taking these reviews together, it
seems that Neuroticism (Emotional Stability)
and Conscientiousness generalize validity across
jobs, criteria, organizations and countries.
*Address for correspondence: Jesús F. Salgado, Dept. Psicología Social y Básica,
Universidad de Santiago de Compostela, 15706 Santiago de Compostela, Spain.
International Journal of Selection and Assessment, Volume 5, Number 2, April 1997.
More recently, after the meta-analyses by
Barrick and Mount (1991), Hough et al. (1990)
and Tett et al. (1991) were completed, some
single studies were conducted to check the
validity of the Five Factor Model. Cortina,
Doherty, Schmitt, Kaufman and Smith (1992)
evaluated the predictive validity of the Big Five
using the Minnesota Multiphasic Personality
Inventory (MMPI) and the Inwald Personality
Inventory (IPI) (Inwald, Knatz and Shusman
1983) as measures. They found that personality
showed a moderate validity across criteria for
Conscientiousness, Neuroticism and Agreeableness, but the evidence for Openness and
Extraversion was very small. However, those
measures did not add significantly to the
predictive efficiency of the Civil Service
Examination consisting of three tests: reading,
accuracy of observation and a written
examination. Similarly, Lillibridge and Williams
(1992) applied the five-factor model to predict
management potential using the Guilford–Zimmerman
Temperament Survey (GZTS)
(Guilford, Zimmerman and Guilford 1976) and
the Edwards Personal Preference Schedule (EPPS)
(Edwards 1959). The results of this study
showed that E and A were valid predictors for
management potential. A third study was
conducted by Salgado, Rumbo, Santamaría and
Losada (1995). In Salgado et al.'s study, the 16
factors of the 16PF were grouped for measuring
the `Big Five' and were correlated with several
measures or facets of job performance.
Neuroticism was significantly correlated with a
global measure of job performance, and
Extraversion, Openness and Agreeableness were
correlated with some facets of the criterion. For
their part, Van der Berg and Feij (1993) applied
three personality questionnaires for measuring
four factors: Emotional Stability, Extraversion,
Sensation Seeking and Achievement Motivation.
Their results showed that Emotional Stability
and Extraversion were significantly correlated
with a measure of self-appraised performance.
Taking these studies as a whole, the Big Five
appear to be valid predictors of job performance,
but the findings are not conclusive.
A characteristic of these studies is that
measures originally not developed to assess the
Big Five were clustered in these factors using
conceptual criteria. However, this method of
grouping the personality measures (obtained by
instruments that were not developed using the
Big Five model) into five factors has some
problems. For example, several researchers
arrived at different clusters for the same
questionnaires. In a factor analytical study of
the MMPI, Costa, Zonderman, McCrae and
William (1985) found that the MMPI only
provided measures of four factors, with
Conscientiousness excluded from the model. In
another study, correlating the MMPI and the Big
Five as they are measured by NEO-PI (Costa and
McCrae 1985), Costa, Busch, Zonderman and
McCrae (1986) found that the MMPI measured
Neuroticism, Extraversion, Openness and
Agreeableness, but not Conscientiousness. For
their part, Johnson, Butcher, Null and Johnson
(1984) found four factors but, in their study, the
Agreeableness factor was lacking. More recently,
Cortina et al. (1992) found that MMPI assessed
Neuroticism, Extraversion, Agreeableness and
Conscientiousness but not Openness. Moreover,
the correlation between the MMPI measure for
the Openness and the IPI measure for the same
factor was 0.01, and the correlation between the
MMPI and the IPI for Agreeableness was 0.13.
These findings showed a lack of convergent
validity for these measures. Cortina et al. (1992)
suggested that Openness may not be
represented in either the IPI or the MMPI, or
in one but not the other. The same explanation
may be extended to Agreeableness. Also, in the
Lillibridge and Williams (1992) study, there are
problems with regard to their method of
measuring the Big Five. These authors assessed
four of the five factors, with Neuroticism absent.
For their part, Van der Berg and Feij (1993) failed
to obtain a measure of Agreeableness. The lack
of convergent validity and the fact that these
instruments were not developed within the framework
of the Big Five model could explain the
inconsistencies in the findings previously
reported.
However, at present there are several
questionnaires that were developed to assess the
Big Five. The first inventories using the Big Five
as the model of personality were the Hogan
Personality Inventory (HPI) (Hogan 1982, 1986)
and the NEO-PI (Costa and McCrae 1985,
1992). Recently, Barrick and Mount (1993), and
Barrick, Mount and Strauss (1993), assessed the
`Big Five' using the PCI, an inventory for the
comprehensive description of the five
personality constructs. Currently, there are some
other `Big Five' questionnaires. For example,
recent instruments were developed by Bartram
(1993), Caprara, Barbaranelli and Borgogni
(1994) and Salgado (1994), and the number is
growing.
Several studies have been conducted using
these questionnaires to examine their predictive
validity. For example, Hogan and colleagues
have developed some indices to use in
organizational environments, such as an index
for Service Orientation, or for Reliability, Stress
Tolerance, Management Potential, Sales
Potential and Clerical Potential (Hogan 1991;
Hogan and Hogan 1989; Hogan, Hogan and
Busch 1984). For their part, Barrick and Mount
(1993) found that Conscientiousness and
Extraversion were related to job performance
in a managerial sample, although the level of job
autonomy is a moderator of the validity. In
another study, Barrick et al. (1993) found that
Conscientiousness is a valid predictor of job
performance in sales representatives. Also, their
results show that autonomous goal setting and
goal commitment mediate the relationship
between Conscientiousness and two measures
of job proficiency.
However, a characteristic of these last studies
is that they are not focused on testing the Big
Five model as a whole. Therefore, part of the
validity research with the five-factor model has
been carried out to test specific aspects (e.g.
Barrick and Mount 1993; Barrick et al. 1993;
Hogan and Hogan 1989; Hogan et al. 1984),
while the rest of the studies have been carried
out with questionnaires based on other
personality models than the five-factor model
(e.g. Cortina et al. 1992; Lillibridge and Williams
1992; Salgado et al. 1995; Van der Berg and Feij
1993). Thus, it appears necessary to conduct
studies for checking the model as a whole with
questionnaires based on the five-factor model.
This research has as its primary goal the
testing of the validity of the Big Five for
predicting job performance in financial services
managers, and we will use a questionnaire
specifically designed to assess the Big Five:
NEO-FFI (Costa and McCrae 1992). Based on
the results of the meta-analyses of the
personality validity, we hypothesized that the
Big Five will be valid predictors of job
performance. More specifically, we stated the
following predictions: (a) Neuroticism will be
negatively correlated with job performance; (b)
Conscientiousness will be positively correlated
with job performance; (c) Conscientiousness will
be the dimension that shows the highest validity;
and (d) a Big Five composite will show higher
validity than any single personality dimension.
Method
Sample
The subjects were 125 middle managers from a
Spanish financial services organization (a savings
and loan institution) with around 2700 employees. All the subjects were males. Their ages
ranged from 25 to 57 years. Middle managers
carry out the following functions as their main
duties: providing financial services to customers,
directing and coordinating a group of employees, assisting in cash management activities,
examining documents prepared by subordinates
and ensuring that the security procedures are
followed. They examine, evaluate and process
loan applications, and prepare, type and maintain
records of financial transactions. The middle
manager is in charge of the office when the
Director is absent.
Predictors
The `Big Five' are the predictors used in this
research, and the NEO-FFI (Costa and McCrae
1992) is the tool used to measure them. This
questionnaire has 60 items that assess Neuroticism, Extraversion, Openness, Agreeableness
and Conscientiousness. Each factor is measured
by 12 items. For this research, a Spanish
translation was carried out and the process of
adaptation was as follows: first, the senior
researcher translated the NEO-FFI to Spanish;
once the translation was complete, a back-translation was conducted by a bilingual person
who was unfamiliar with the English version of
the NEO-FFI; then the two versions were
forwarded to the authors and PAR Inc. for
review and suggestions for further revision.
When the Spanish version was accepted by the
authors and PAR Inc., data collection commenced.
In the present sample, the reliability (internal
consistency) for N, E, O, A and C was 0.76, 0.72,
0.58, 0.58 and 0.74, respectively.
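The internal consistency values above are presumably coefficient alpha, although the article does not name the statistic, so that is an assumption. As a minimal sketch under that assumption, alpha for a k-item scale can be computed from the item variances and the variance of the summed score. The function name and the toy responses are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    Assumption: the 'internal consistency' reported in the article is
    coefficient alpha; the article does not state this explicitly.
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([person[i] for person in responses]) for i in range(k)]
    # Sample variance of the total (summed) scale score.
    total_var = variance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 respondents answering a 3-item scale.
toy = [[1, 2, 2], [3, 3, 4], [2, 2, 3], [4, 5, 5]]
alpha = cronbach_alpha(toy)
```

For a NEO-FFI scale the same function would be applied to the 12 item scores of each respondent.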
Criterion measures
The criterion measures in this study were nine
rating scales that assessed the competency of the
individuals in nine characteristics of job performance. The scales had five points: deficient,
insufficient, sufficient, notable and excellent. The
characteristics assessed were: knowledge, efficiency, problem comprehension, adaptability to
the job, leadership, ability for relations, aspiration level, initiative and attitude. A nine-point
scale was added to this set of scales in order to
assess global job performance. In this study, four
supervisors served as raters, but each subject was
assessed by only one judge who rated him/her
on the scales. The interrater reliability cannot,
therefore, be estimated. The average test–retest
reliability, computed from two measures made
two months apart, is 0.58.
Although the interrater reliability for each
single scale was not computed, it is known that it
is very low. Hunter and Hirsh (1987), and King,
Hunter and Schmidt (1980), have shown that for
a single rating on a particular trait, the interrater
reliability is 0.31, and that the reliability of
summated ratings by a single supervisor is 0.43.
Nothing suggests that the reliability is greater in
this study. Furthermore, in the present study the
correlations between the single scales were high,
suggesting the presence of a Halo effect, and
that they are not discrete and separable facets of
the criterion. These two reasons suggest that a
good solution would be to factor analyze the
correlation matrix of the criterion facets in order
to decide what dimensions of work performance
are to be employed and to improve the
reliability of the criteria.
The criterion ratings were factor analyzed using
principal component analysis and Varimax
rotation (an oblique solution was also tried, but
the results were essentially the same as those
found in the orthogonal solution). In fact,
the correlation between factors was 0.14, and
this value indicates an angle of 82°. Therefore,
this result is very close to orthogonality. We
used three criteria to decide the number of
factors to retain: Parallel Analysis (Horn 1965),
eigenvalues greater than one (Kaiser 1960) and
the Scree Test (Cattell 1966). The three criteria
agreed in that two components should be
rotated (see Table 1). The eigenvalues for the
nine components in the unrotated solution were
4.45, 1.32, 0.88, 0.62, 0.44, 0.41, 0.35, 0.29 and
0.25 respectively. The two retained components
accounted for 64.07% of variance. To interpret
the results of the factor analysis we took into
account only weights greater than 0.50, and we
assumed that the weight of each variable is only
in the factor with the higher loading. In Table 1
it can be seen that the first component has the
following facets of job performance: knowledge,
efficiency, problem comprehension, adaptability
to the job, leadership, ability for relations and
initiative. We named this factor job
problem-solving ability. The second factor is
composed of aspiration level and attitude, and
they appear to represent job motivation.
Therefore, job performance may be correctly
represented by three scores: (a) global job
performance, (b) job problem-solving ability
and (c) job motivation. The internal consistency
for the first composite is 0.87, and for the second
is 0.58.
Table 1: Rotated Factor Loadings of Criterion Measures
                         Factor 1   Factor 2
Knowledge                 0.797     −0.167
Efficiency                0.758      0.227
Problem comprehension     0.884     −0.020
Adaptability to job       0.700      0.253
Leadership                0.700      0.072
Ability for relations     0.554      0.251
Aspiration level          0.570      0.646
Initiative                0.700      0.432
Attitude                 −0.088     −0.922
VP                        4.102      1.665
Note: VP = Variance explained by the factor. The VP is computed
as the sum of squares for the element of the factor's column in the
factor loading matrix.
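The arithmetic behind the factor-retention decision can be checked with a short sketch. It uses only the eigenvalues, the Table 1 loadings and the factor correlation reported above; the variable names are illustrative, and Parallel Analysis and the Scree Test are not reproduced here because they need the raw data or a plot.

```python
import math

# Eigenvalues of the nine unrotated components, as reported in the text.
eigenvalues = [4.45, 1.32, 0.88, 0.62, 0.44, 0.41, 0.35, 0.29, 0.25]

# Kaiser criterion: retain components whose eigenvalue exceeds one.
n_retained = sum(1 for ev in eigenvalues if ev > 1.0)

# Rotated loadings from Table 1 as (Factor 1, Factor 2), in table order.
loadings = [
    (0.797, -0.167),  # Knowledge
    (0.758, 0.227),   # Efficiency
    (0.884, -0.020),  # Problem comprehension
    (0.700, 0.253),   # Adaptability to job
    (0.700, 0.072),   # Leadership
    (0.554, 0.251),   # Ability for relations
    (0.570, 0.646),   # Aspiration level
    (0.700, 0.432),   # Initiative
    (-0.088, -0.922), # Attitude
]

# VP: sum of squared loadings in each factor's column.
vp = [sum(row[j] ** 2 for row in loadings) for j in range(2)]

# Nine standardized variables carry a total variance of 9.
pct_variance = 100 * sum(vp) / len(loadings)

# Angle between the oblique factors implied by their correlation of 0.14.
angle_deg = math.degrees(math.acos(0.14))
```

Under these inputs the Kaiser rule retains two components, the column sums of squares agree with the VP row to within rounding, and the angle rounds to 82 degrees.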
Procedure
The subjects of this study were participants in a
process of potential evaluation conducted by
two researchers of the company. Therefore, the
performance ratings were collected for administrative purposes. The raters were given a one-session training course in which the motives of
the programme were explained, as were the
system characteristics and the appraisal procedure. Over the following days, the supervisors
rated the subjects. When they finished the
evaluation of the subjects, they had an individual
session with those responsible for the process. In
this session, all the ratings were reviewed in
order to reduce errors.
The NEO-FFI was answered collectively in
small groups. The researchers of the company
conducted this process on site.
Results
The validity of the NEO-FFI scales for predicting
job performance was appropriately estimated by
correlations. In Table 2, the intercorrelations
between the NEO-FFI and the single facets of
the criteria are shown. The correlations between
the NEO-FFI and Global Job Performance (GJP),
and the two dimensions of work performance,
are also shown. Regarding the single criterion
measures, in general the direction of the
correlations was as expected. Neuroticism shows
negative correlations with all criterion measures
and Conscientiousness shows positive correlations with the same measures. For their part,
Extraversion, Openness and Agreeableness are
positively correlated with some facets and
negatively with the rest. Furthermore, Conscientiousness shows the highest correlations of the
five personality factors.
Table 2: Correlations Between the NEO-FFI Scales and the Criterion Measures
     N^a    E^b    O^b    A^b    C^a    K      EF     PC     AJ     LI     AR     AL     IN     AT     JPA    JM     GJP
N    —
E    54     —
O    02     00     —
A    −34    39     03     —
C    −38    47     −14    −16    —
K    −13    −01    −15    05     08     —
EF   −06    −15    −19*   10     19*    55     —
PC   −14    −06    −19*   −03    14     63     62     —
AJ   −13    −02    −20*   −23**  07     45     47     58     —
LI   −08    −06    −06    −01    09     48     51     54     33     —
AR   −15    22**   −15    −22*   19*    24     33     53     54     34     —
AL   −30**  16     −15    −12    25**   39     57     47     46     44     36     —
IN   −12    −02    −09    −03    14     45     57     57     59     47     37     65     —
AT   00     07     14     −18*   19*    −11    15     −06    13     06     15     45     25     —
JPA  −16*   −02    −20*   −07    17*    72     79     84     76     69     63     65     78     12     —
JM   −22*   15     −04    −17    27**   23     48     31     39     34     32     91     58     77     52     —
GJP  −23**  13     −11    −03    32**   43     65     52     44     41     46     62     56     37     67     61     —
X    17.1   32.1   24.1   32.1   36.1   4.9    4.8    4.9    4.7    4.7    4.8    4.5    4.3    5.4    33.2   9.8    6.3
SD   7.3    6.1    5.8    5.3    5.4    0.8    1.0    0.8    0.9    0.8    0.9    1.1    1.0    0.7    4.7    1.6    1.0
Note: N = Neuroticism; E = Extraversion; O = Openness to Experience; A = Agreeableness; C = Conscientiousness; K = Knowledge; EF = Efficiency; PC = Problem
Comprehension; AJ = Adaptability to Job; LI = Leadership; AR = Ability for Relations; AL = Aspiration Level; IN = Initiative; AT = Attitude; JPA = Job Problem-Solving
Ability; JM = Job Motivation; GJP = Global Job Performance; a = one-tailed test; b = two-tailed test; *p < 0.05; **p < 0.01.
However, taking into account the low level of
reliability of the single criterion measures, it is
preferable to centre the analysis on the criterion
dimensions and on the Global Job Performance
measure. The first dimension of the criterion, the
Job Problem-Solving Ability (JPA), is significantly correlated with Neuroticism, Openness
and Conscientiousness. Job Motivation (JM), the
second criterion dimension, is significantly
correlated with Neuroticism and Conscientiousness. For its part, Global Job Performance
is significantly correlated with Neuroticism and
Conscientiousness. Therefore, Neuroticism and
Conscientiousness are the only factors that show
significant correlations with these three criterion
measures. The average uncorrected validity is
0.20, 0.09, 0.12, 0.09 and 0.25 for Neuroticism,
Extraversion, Openness, Agreeableness and
Conscientiousness, respectively. Consequently,
regarding the size of the coefficients, both
Neuroticism and Conscientiousness show acceptable validity for use in personnel assessment.
It is necessary to take into account that all
reported validity coefficients are affected by
measurement errors in predictor and criterion. As
we are interested in the true validity between the
Big Five and job performance, those validity
coefficients should be corrected for measurement
errors in the criterion in order to obtain an
unbiased estimation of validity (Society for
Industrial and Organizational Psychology (SIOP)
1987).
The correction of validity for the
measurement errors may be made using the
appropriate coefficients. For the criterion, the
appropriate reliability estimate should be the
interrater agreement (Guilford and Fruchter
1979; Schmidt and Hunter 1995) or the temporal
stability coefficient (Schmidt, Hunter and Urry
1976). We cannot compute any estimate of the
interrater reliability because each subject was
rated by a single rater. Alternatively, we know
the test±retest reliability for each supervisor and
the average test±retest reliability. The average
test±retest reliability is 0.58, and this value is
close to the value of 0.52 found by Rothstein
(1990) for the interrater agreement when a single
rater assesses the subjects. Barrick and Mount
(1991) use Rothstein's value for the supervisory
rating criterion and Tett et al. (1991) use a value
of 0.50 for the same criterion. For his part,
Salgado (1997) used a value of 0.62, and the
same author found a mean value of 0.575 for the
supervisory rating in the studies conducted in
Spain (Salgado 1995). Here we will use 0.58 as
an estimate of interrater reliability for correcting
validity coefficients. The downward bias in the
interrater reliability made by this estimate, if it is
possible, would be very small in all cases. The
corrected validity coefficients among the five
factors, and the criterion dimensions and Global
Job Performance, are shown in Table 3.
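The correction described above can be sketched as the classical correction for attenuation applied to the criterion only: the observed correlation is divided by the square root of the criterion reliability. The function and variable names are illustrative; the observed correlations are those with Global Job Performance in Table 2, and 0.58 is the reliability estimate adopted in the text.

```python
import math


def correct_for_criterion_unreliability(r_xy, r_yy):
    """Disattenuate an observed validity for unreliability in the criterion only."""
    return r_xy / math.sqrt(r_yy)


R_YY = 0.58  # criterion reliability estimate adopted in the text

# Observed correlations with Global Job Performance, taken from Table 2.
observed_gjp = {"N": -0.23, "E": 0.13, "O": -0.11, "A": -0.03, "C": 0.32}

corrected_gjp = {
    scale: round(correct_for_criterion_unreliability(r, R_YY), 2)
    for scale, r in observed_gjp.items()
}
```

With these inputs the corrected values are −0.30, 0.17, −0.14, −0.04 and 0.42, which agree with the Global Job Performance entries of Table 3.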
When the validity coefficients are corrected
for measurement errors, all personality factors
that appear to be valid in the prior analysis now
show coefficients with a medium effect size,
according to Cohen's (1988) rule that coefficients
around 0.30 are medium size. Again,
Neuroticism and Conscientiousness have the
highest validity coefficients, Openness is a valid
predictor of Job Problem-Solving Ability (p =
0.021; two-tailed test) and Agreeableness is a
valid predictor of Job Motivation (p = 0.052;
two-tailed test).
Table 3: Validity of the NEO-FFI Scales Corrected for Measurement Error in Criterion
                      JPA                               JM                                GJP
                      rc      p       90%CI             rc      p       90%CI             rc      p       90%CI
Neuroticism^a         −0.21   0.034   −0.40 to −0.02    −0.29   0.005   −0.47 to −0.11    −0.30   0.004   −0.48 to −0.12
Extraversion^b        −0.03   0.823   −0.22 to 0.16     0.20    0.088   0.28 to 0.01      0.17    0.141   0.36 to −0.02
Openness^b            −0.26   0.021   −0.45 to −0.08    −0.05   0.655   0.25 to −0.14     −0.14   0.108   0.33 to −0.05
Agreeableness^b       −0.09   0.433   −0.28 to 0.10     −0.22   0.052   −0.41 to 0.04     −0.04   0.738   −0.23 to 0.15
Conscientiousness^a   0.22    0.026   0.41 to 0.04      0.35    0.000   0.53 to 0.18      0.42    0.000   0.59 to 0.25
Note: JPA = Job Problem-Solving Ability; JM = Job Motivation; GJP = Global Job Performance;
a = one-tailed test; b = two-tailed test; rc = validity corrected for
measurement error in criterion; p = probability of rc; 90%CI = 90% Confidence Interval.
In connection with the hypotheses previously
stated, these results confirm the first three
hypotheses. Indeed, Neuroticism presents a
negative correlation with job performance and
Conscientiousness shows positive correlations.
Furthermore, Conscientiousness has the highest
validity coefficients. No hypotheses were stated
for Openness and Agreeableness, and both
dimensions correlated negatively with the
criterion components.
However, a characteristic of this study is that
it is a small-sample one. This type of study has a
problem related to sampling error. The smaller
the sample, the larger the sampling error. This
error may affect the significance tests in both
directions, accepting or rejecting the null
hypothesis, and significance tests cannot control
for this. Hunter and Schmidt (1990, p. 31)
suggest using confidence intervals as an
alternative to significance tests at the level of
single studies. Confidence intervals are
generated by the standard error of the mean
effect size and they reflect the extent to which
sampling error remains in the estimate of effect
size (Whitener 1990). Because we use the
correlation as an estimate of the validity, we
must use the standard error of a correlation to
generate the confidence interval around the
validity found for each personality dimension.
The 90% confidence intervals are shown in Table
3.
If the results of the corrected validity are
interpreted with the confidence intervals, the
conclusions are very similar to those reached when
significance tests were used. We made three confidence
intervals for each personality dimension, one
interval for each criterion measure, although we
only comment on the results for Neuroticism
and Conscientiousness. In the case of
Neuroticism, the limits of the 90% confidence
intervals closest to zero are: JPA = −0.02; JM = −0.11 and
GJP = −0.12. For Conscientiousness, the lower
limits are: JPA = 0.04, JM = 0.18 and GJP =
0.25. Therefore, for Neuroticism and
Conscientiousness, none of the 90% confidence
intervals includes 0, and therefore it is not a
reasonable possibility that the validity is 0 for
these two personality dimensions. On the
contrary, the confidence intervals suggest that
the corrected validity is significantly different
from zero (Whitener 1990).
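A confidence interval of this kind can be sketched as follows. The article does not spell out its formula, so two assumptions are made here: the standard error of the observed correlation is taken as (1 − r²)/√(N − 1), and both the correlation and its standard error are divided by the square root of the criterion reliability. The function name is illustrative.

```python
import math
from statistics import NormalDist


def corrected_validity_ci(r_xy, n, r_yy, level=0.90):
    """Corrected validity with a confidence interval around it.

    Assumptions (not stated in the article): SE of the observed r is
    (1 - r^2) / sqrt(n - 1), and r and its SE are both divided by
    sqrt(r_yy) to correct for criterion unreliability.
    """
    attenuation = math.sqrt(r_yy)
    r_c = r_xy / attenuation
    se_c = (1 - r_xy ** 2) / math.sqrt(n - 1) / attenuation
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.645 for a 90% interval
    return r_c, r_c - z * se_c, r_c + z * se_c


# Conscientiousness and Global Job Performance: r = 0.32 (Table 2), N = 125.
r_c, lower, upper = corrected_validity_ci(0.32, 125, 0.58)
```

For this pair the sketch gives a corrected validity of about 0.42 with limits of about 0.25 and 0.59, in line with Table 3. Other entries may differ in the second decimal, depending on where rounding is applied.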
The last hypothesis stated in the introduction
is to check the FFM as a whole, and one method
is to compute the multiple correlation between
the Big Five as a composite and the criteria. As
can be seen from Table 2, the Big Five are
intercorrelated but, from a theoretical point of
view, they are orthogonal personality factors.
Could these correlations change the above
conclusions in respect to Neuroticism and
Conscientiousness? Could these correlations be
explained by the presence of a single common
factor in the predictor variables? The correlations
between the personality dimensions are not a
strange finding. For example, Costa and McCrae
(1992, p. 100) showed that the Big Five, as they
are assessed by the NEO-PI-R, are correlated
(e.g. Neuroticism–Conscientiousness, r =
−0.53; Extraversion–Openness, r = 0.40).
However, these correlations are based on raw scale scores.
If the correlations are estimated at the factor
level, the Big Five are orthogonal. According to
Costa and McCrae (1995), these correlations
mean that the NEO-PI-R scores are not perfect
measures of the FFM. In the same article, Costa
and McCrae indicate that the raw scale scores of
the NEO-FFI also show modest intercorrelations
in most samples, but `The item factor structure of
the NEO-FFI has been confirmed in both
Canadian . . . and German . . . samples', and
`although these brief scales are not completely
uncorrelated, they appear to give useful
approximations to five orthogonal factors'
(Costa and McCrae 1995, p. 218).
Obviously, if the Big Five assessed by the
NEO-FFI are independent factors, and they
contribute independently to explain the criteria,
the multiple correlation between the five factors
and the criteria must be remarkably higher than
the correlation between one single dimension
(e.g. Neuroticism or Conscientiousness) and the
criteria. However, if the intercorrelations are due
to a single common factor in the predictor
variables, then a personality composite will show
a validity of similar size or only slightly higher
than the single factors. Table 4 shows
the results of three multiple regression
analyses using Job Problem-Solving Ability, Job
Motivation and Global Job Performance as
dependent variables, respectively.
Table 4: Multiple Regression Analyses using Big Five as Independent Variables and the Criteria Measure as
Dependent Variables
Criterion   R      RA     R²     90%CI          Rc
JPA         0.51   0.48   0.26   0.33 to 0.63   0.63
JM          0.41   0.38   0.17   0.23 to 0.53   0.50
GJP         0.45   0.41   0.20   0.26 to 0.56   0.54
Note: JPA = Job Problem-Solving Ability; JM = Job Motivation; GJP = Global Job Performance; R = multiple correlation coefficient; RA
= adjusted multiple correlation; 90%CI = 90% Confidence Interval using RA as centre of the interval; Rc = adjusted multiple correlation
coefficient corrected by criterion reliability.
The results of the multiple regression analyses
clearly show that a personality composite has
greater validity than any one dimension
considered individually. Furthermore, 90%
confidence intervals indicate that the composite
validity is significantly different from zero.
Therefore, (a) the intercorrelations do not
suggest one single common factor and (b) using
the five factors together in a composite, the
validity is remarkably high, even though the
multiple correlation was corrected for shrinkage.
The size of the validity is similar to the validity
found for ability composites (Hunter and Hunter
1984). These results confirm our fourth
hypothesis, according to which the Big Five
composite would have higher validity than the
single personality dimensions. Thus, the present
test of the FFM to predict job performance
suggests that the Big Five may be reasonably
included in a personnel selection battery.
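
The two corrections reported in Table 4 (adjustment for shrinkage and correction for criterion unreliability) can be illustrated with a short sketch. It assumes a Wherry-type shrinkage formula and the classical correction for attenuation; this section does not state which shrinkage formula the study used, and the criterion reliability of 0.58 is an assumption back-solved from the ratio of RA to Rc in Table 4, not a value quoted here. The function names are illustrative only.

```python
import math


def adjusted_multiple_r(r, n, k):
    """Shrinkage-adjusted multiple correlation (Wherry-type formula).

    r: observed multiple correlation, n: sample size, k: number of predictors.
    Returns 0.0 when the adjusted R-squared would be negative.
    """
    adj_r2 = 1.0 - (1.0 - r ** 2) * (n - 1) / (n - k - 1)
    return math.sqrt(max(adj_r2, 0.0))


def correct_for_criterion_unreliability(r, r_yy):
    """Classical correction for attenuation in the criterion only."""
    return r / math.sqrt(r_yy)


# Assumed inputs: N = 125 managers and k = 5 predictors (from the article);
# r_yy = 0.58 is a hypothetical criterion reliability, not a reported value.
N, K, R_YY = 125, 5, 0.58

ra_jpa = adjusted_multiple_r(0.51, N, K)
rc_jpa = correct_for_criterion_unreliability(0.48, R_YY)
print(round(ra_jpa, 2), round(rc_jpa, 2))
```

With these assumed inputs the sketch gives values that round to the JPA row of Table 4 (0.48 and 0.63). Agreement for the other rows is only approximate, which is one reason to treat the choice of formula as an assumption.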
Discussion
The objective of this study was to test the Five
Factor Model of personality as a predictor of job
performance. Based on previous research using
the Big Five, we thought that Neuroticism would
be negatively correlated with performance
criterion measures and that Conscientiousness
would be positively correlated. Also, we
hypothesized that Conscientiousness would
show the highest validity coefficient.
With respect to the major goals of this
research, the findings confirm the stated
hypotheses. A close examination of the results
shows that they are consistent with the findings
of previous research conducted with the Big
Five. For example, Barrick and Mount (1991)
found validities (corrected for unreliability and
range restriction) ranging from 0.04 for
Openness to 0.22 for Conscientiousness. Using
procedures different from those of Barrick and
Mount (1991), Tett et al. (1991) found that the
'Big Five' are valid predictors and that the
validities ranged from 0.16 for Extraversion to
0.33 for Agreeableness. In their meta-analysis,
Hough et al. (1990) found uncorrected validities
of 0.13 and 0.16 for Neuroticism and
Conscientiousness, and Salgado (1997) found
validities ranging from 0.02 for Agreeableness to
0.25 for Conscientiousness. For their part,
Cortina et al. (1992) found that Neuroticism,
Agreeableness and Conscientiousness, as
measured by the IPI, are correlated significantly
with supervisor ratings for police recruits. In the
Cortina et al. study, neither Openness nor
Extraversion proved to be a valid predictor.
Similarly, Van der Berg and Feij (1993) found
that Emotional Stability (r = 0.21) and
Extraversion (r = 0.20) correlated significantly
with self-appraised performance. Also, Salgado et
al. (1995) found that Neuroticism showed a
significant correlation with a measure of Global
Job Performance, while Extraversion,
Openness and Agreeableness were also correlated
with some facets of the criterion.
The sizes of the validities found here are similar
to those of Cortina et al. (1992), Salgado et al.
(1995), Tett et al. (1991) and Van der Berg and Feij
(1993), and slightly higher than those reported by
Barrick and Mount (1991), Hough et al. (1990) and
Salgado (1997). Specifically, in the present research,
Neuroticism shows a high validity coefficient
compared with the one found by Barrick and
Mount (1991), the most extensive Big Five
meta-analysis to date. Although one
single study is not sufficient evidence to
contradict the Barrick and Mount conclusion
(based on 117 single studies) that
Conscientiousness is the only factor that
generalizes validity across persons, jobs and
situations, the present findings, along with the
meta-analyses by Hough et al. (1990) and Salgado
(1997), and the single studies of Salgado et al.
(1995) and Van der Berg and Feij (1993), suggest
an alternative explanation. There may be
cross-cultural differences in the relevance
attributed to Neuroticism for performing a job
acceptably. Perhaps Neuroticism is viewed
differently in Europe than in America, so that
its validity would be more local, or less
generalizable, in the USA than in Europe. In
other words, the Big Five personality factors
would differ in how far their validity
generalizes across cultures, with
Conscientiousness being more generalizable and
Neuroticism more local in its validity. However,
this hypothesis cannot be tested in this study.
In connection with the fourth hypothesis, the
findings suggest that the Big Five may be used
as a composite predictor in a similar way to the
ability composite. Used in this way, the Big Five
shows an impressive validity, ranging from 0.50
to 0.63 (Rc in Table 4). No single personality
predictor reaches a similar validity size. These
findings also support the independence of the
Big Five, although the factors are not completely
uncorrelated.
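
The logic behind this comparison can be made concrete with the standard formula for the multiple correlation of a criterion with two predictors. The sketch below is only an illustration: it uses two predictors rather than five, and the single-predictor validities (0.30) and predictor intercorrelations (0.0 and 0.8) are hypothetical values, not results from this study.

```python
import math


def multiple_r_two_predictors(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: correlations of each predictor with the criterion.
    r12: correlation between the two predictors (must satisfy |r12| < 1).
    """
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical single-predictor validities.
r1 = r2 = 0.30

independent = multiple_r_two_predictors(r1, r2, 0.0)  # uncorrelated predictors
overlapping = multiple_r_two_predictors(r1, r2, 0.8)  # strong common factor

print(round(independent, 3), round(overlapping, 3))
```

With uncorrelated predictors the composite reaches about 0.42, well above either single validity of 0.30; with heavily overlapping predictors it reaches only about 0.32, that is, a validity only slightly higher than that of a single factor.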
Another aspect of this study concerns the
reporting of the results. The usual practice is
to report the probability level of the observed
(uncorrected) validities and to describe the
corrected validities as the best point estimate
of the validity. We report the probabilities of the
corrected validities using the standard error
estimates, applying the formulas provided by
Bobko and Rieck (1980). Thus, it is possible to
apply a significance test to the hypotheses
concerning the corrected validities as well as the
uncorrected validities. Also, we reported the
confidence intervals of the corrected validities
because a growing number of authors suggest
that significance tests cannot adequately
control Type II error (Cohen 1994; Hunter
and Schmidt 1990). By reporting the significance
tests along with the confidence intervals, the
reader has a better picture of the results of the
study. In the future, we suggest using both the
significance test and the confidence intervals in
reporting the findings of single validity studies.
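
As a rough illustration of reporting an interval alongside a point estimate, the sketch below builds a symmetric confidence interval around a correlation. It assumes the common large-sample standard error (1 - r^2) / sqrt(n - 1) for an observed correlation and, for a validity corrected for criterion unreliability, simply divides that standard error by sqrt(r_yy), treating the reliability as a known constant. This is a simplification and not the Bobko and Rieck (1980) formulas applied in the study; the example inputs (r = 0.30, r_yy = 0.60) are hypothetical.

```python
import math

Z_90 = 1.645  # two-sided 90% normal critical value


def se_observed_r(r, n):
    """Large-sample standard error of an observed correlation."""
    return (1.0 - r ** 2) / math.sqrt(n - 1)


def confidence_interval(estimate, se, z=Z_90):
    """Symmetric interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se


def corrected_validity_interval(r, n, r_yy, z=Z_90):
    """Interval for r corrected for criterion unreliability.

    Treats r_yy as fixed, so both the estimate and its standard
    error are divided by sqrt(r_yy).
    """
    factor = math.sqrt(r_yy)
    return confidence_interval(r / factor, se_observed_r(r, n) / factor, z)


# Hypothetical example: r = 0.30 observed in a sample of 125.
low, high = confidence_interval(0.30, se_observed_r(0.30, 125))
c_low, c_high = corrected_validity_interval(0.30, 125, r_yy=0.60)
print(round(low, 2), round(high, 2), round(c_low, 2), round(c_high, 2))
```

An interval whose lower bound stays above zero leads to the same conclusion as the significance test, while also showing how imprecise the estimate is.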
The findings of this research have certain
implications for the practice of personnel
selection. First, they suggest that personality
measures are valid predictors of job performance,
and therefore should be used in personnel
selection. Secondly, the results also suggest that
the five-factor model appears to be a good
model for predicting job behaviour and that
assessment instruments could be developed with
this model.
In addition, a few words about the
NEO-FFI questionnaire used in this research
appear necessary in the light of the recent
research by Schmit and Ryan (1993). The
inventory is a brief version of the NEO-PI,
which is the model suggested by Matarazzo
(1992) for the personality inventories for the
21st century. The findings of the present
research show that the NEO-FFI is an acceptable
instrument for predicting job performance in
financial services managers. According to the
results of Schmit and Ryan (1993), however, the
assessment situation (anonymous vs.
non-anonymous, as in personnel selection)
dramatically changes the factor structure of the
NEO-FFI, and this change in structure may result
in low validity for the NEO-FFI scales. The
present results contradict Schmit and Ryan's
(1993) suggestion and indicate that the NEO-FFI
is a valid procedure for personnel selection.
Finally, we should comment on the criterion
measures used here. In this research we used a
criterion collected for administrative purposes,
and this can result in a stronger 'halo effect'
and more extreme ratings than criteria obtained
for research purposes (Guion 1965; Veres, Field
and Boyles 1983; Warmke and Billings 1979).
Furthermore, McDaniel, Whetzel, Schmidt and
Maurer (1994) found that the validity
coefficients based on research criteria are
generally greater than the validity coefficients
based on administrative criteria. Therefore, the
use of administrative measures is likely to have
biased the validity coefficients found here
downwards.
In summary, in contrast to the pessimism of the
classical reviews of personality measure
validities, recent research provides evidence to
the contrary.
Personality measures can be valid predictors of
job performance (Day and Silverman 1989;
Robertson and Kinder 1993; Salgado 1996a;
Salgado et al. 1995; Van der Berg and Feij 1993)
and the 'Big Five' is a relevant model to use in
personnel selection. Three or four factors are
significantly associated with job performance.
Also, this research shows that the NEO-FFI is a
brief measure of these, with acceptable reliability
and criterion validity. We suggest using the
NEO-FFI when a rapid measure of personality is
needed.
Acknowledgement
This research was supported by the XUGA
Grant 21104A95 from the Xunta de Galicia
(Spain) to the first author.
References
Barrick, M.R. and Mount, M.K. (1991) The Big Five
personality dimensions and job performance: a
meta-analysis. Personnel Psychology, 44, 1–26.
Barrick, M.R. and Mount, M.K. (1993) Autonomy as
a moderator of the relationships between the Big
Five personality dimensions and job performance.
Journal of Applied Psychology, 78, 111–118.
Barrick, M.R., Mount, M.K. and Strauss, J.P. (1993)
Conscientiousness and performance of sales
representatives: test of the mediating effects of
goal setting. Journal of Applied Psychology, 78,
715–722.
Bartram, D. (1993) Validation of the 'ICES'
personality inventory. European Review of Applied
Psychology, 41, 207–218.
Block, J. (1995) A contrarian view of the five-factor
approach to personality description. Psychological
Bulletin, 117, 187–213.
Bobko, P. and Rieck, A. (1980) Large sample
estimators for standard errors of functions of
correlation coefficients. Applied Psychological
Measurement, 4, 385–398.
Borkenau, P. (1992) Implicit personality and the
five-factor model. Journal of Personality, 60, 295–327.
Caprara, G.V., Barbaranelli, C. and Borgogni, L.
(1994) The Big Five Questionnaire. Firenze (Italy):
OS Organizzazione Speciali.
Cattell, R.B. (1966) The scree test for the number of
factors. Multivariate Behavioral Research, 1,
245–276.
Cohen, J. (1988) Statistical power analysis for the
behavioral sciences. 2nd edition. Hillsdale, NJ:
Erlbaum.
Cohen, J. (1994) The earth is round (p < 0.05).
American Psychologist, 49, 997–1003.
Cortina, J.M., Doherty, M.L., Schmitt, N., Kaufman,
G. and Smith, R.G. (1992) The 'Big Five'
personality factors in the IPI and MMPI:
predictors of police performance. Personnel
Psychology, 45, 119–140.
Costa, P.T. and McCrae, R.R. (1985) The NEO-PI
Personality Inventory. Odessa, FL: Psychological
Assessment Resources.
Costa, P.T. and McCrae, R.R. (1992) The NEO-PI
Personality Inventory. Odessa, FL: Psychological
Assessment Resources.
Costa, P.T. and McCrae, R.R. (1995) Solid ground in
the wetlands of personality: a reply to Block.
Psychological Bulletin, 117, 216±220.
Costa, P.T., Zonderman, A.B., McCrae, R.R. and
Williams, R.B. (1985) Content and
comprehensiveness in the MMPI: an item factor
analysis in a normal adult sample. Journal of
Personality and Social Psychology, 48, 925–933.
Costa, P.T., Busch, C.M., Zonderman, A.B. and
McCrae, R.R. (1986) Correlations of MMPI factor
scales with measures of the five-factor model of
personality. Journal of Personality Assessment, 50,
640–650.
Day, D.V. and Silverman, S.B. (1989) Personality and
job performance: evidence of incremental validity.
Personnel Psychology, 42, 25–36.
Digman, J.M. (1990) Personality structure: emergence
of the five-factor model. Annual Review of
Psychology, 41, 417–420.
Edwards, A.L. (1959) Edwards Personality Preference
Schedule. New York: Psychological Corporation.
Eysenck, H.J. (1992) Four ways five factors are not
basic. Personality and Individual Differences, 6,
667–673.
Ghiselli, E.E. (1973) The validity of aptitude tests in
personnel selection. Personnel Psychology, 26,
461–477.
Guilford, J.P. and Fruchter, B. (1979) Fundamental
statistics in psychology and education. New York:
McGraw-Hill.
Guilford, J.S., Zimmerman, W.S. and Guilford, J.P.
(1976) The Guilford–Zimmerman Temperament
Survey Handbook: twenty-five years of research and
applications. San Diego: EdITS Publishers.
Guion, R.M. (1965) Personnel Testing. New York:
McGraw-Hill.
Guion, R.M. and Gottier, R.F. (1965) Validity of
personality measures in personnel selection.
Personnel Psychology, 18, 135–164.
Hogan, R. (1982) A socioanalytic theory of personality. In M. Page (ed.), Personality: current theory
and research. Nebraska Symposium on Motivation.
Lincoln, Nebraska: Nebraska University Press.
Hogan, R. (1986) Manual for the Hogan Personality
Inventory. Minneapolis, MN: National Computer
Systems.
Hogan, R. (1991) Personality and personality
measurement. In M.D. Dunnette and L.H. Hough
(eds.), Handbook of Industrial and Organizational
Psychology, Vol. II. Palo Alto, CA: Consulting
Psychologists Press.
Hogan, J. and Hogan, R. (1989) How to measure
employee reliability. Journal of Applied Psychology,
74, 273–279.
Hogan, R., Hogan, J. and Busch, C. (1984) How to
measure service orientation. Journal of Applied
Psychology, 69, 157–163.
Horn, J.L. (1965) A rationale and test for the number
of factors in factor analysis. Psychometrika, 30, 179–185.
Hough, L.M. (1992) The 'Big Five' personality
variables-construct confusion: description versus
prediction. Human Performance, 5, 139–155.
Hough, L.M., Eaton, N.K., Dunnette, M.D., Kamp,
J.D. and McCloy, R.A. (1990) Criterion-related
validities of personality constructs and the effect
of response distortion on those validities. Journal
of Applied Psychology, 75, 581–585.
Hunter, J.E. and Hunter, R.F. (1984) Validity and
utility of alternate predictors of job performance.
Psychological Bulletin, 96, 72–98.
Hunter, J.E. and Hirsh, H.R. (1987) Applications of
meta-analysis. In C.L. Cooper and I.T. Robertson
(eds.), International Review of Industrial and
Organizational Psychology. Chichester, UK: Wiley.
Hunter, J.E. and Schmidt, F.L. (1990) Methods of
meta-analysis. Newbury Park, CA: Sage.
Inwald, E.E., Knatz, H.F. and Shusman, E.J. (1983)
Inwald Personality Inventory Technical Manual. New
York: Hilson Research.
John, O.P. (1990) The 'Big Five' factor taxonomy:
dimensions of personality in the natural language
and in questionnaires. In L. Pervin (ed.), Handbook
of Personality: theory and research. New York:
Guilford.
Johnson, J.H., Butcher, J.N., Null, C. and Johnson,
K.N. (1984) Replicated item level factor analysis of
the full MMPI. Journal of Personality and Social
Psychology, 49, 105–114.
Kaiser, H.F. (1960) The applications of electronic
computers to factor analysis. Educational and
Psychological Measurement, 20, 141–151.
King, L.M., Hunter, J.E. and Schmidt, F.L. (1980) Halo
in a multidimensional forced-choice performance
evaluation scale. Journal of Applied Psychology, 65,
507–516.
Lillibridge, J.R. and Williams, K.J. (1992) Another look
at personality and managerial potential:
application of the five-factor model. In K. Kelley
(ed.), Issues, Theory and Research in Industrial
Organizational Psychology. Amsterdam: North-Holland.
Matarazzo, J.D. (1992) Psychological testing and
assessment in the 21st century. American
Psychologist, 47, 1007–1018.
McDaniel, M.A., Whetzel, D.L., Schmidt, F.L. and
Maurer, S.D. (1994) The validity of employment
interviews: a comprehensive review and
meta-analysis. Journal of Applied Psychology, 79,
599–616.
Mount, M.K. and Barrick, M.R. (1995) The Big Five
personality dimensions: implications for research
and practice in human resources management. In
K.M. Rowland and G. Ferris (eds.), Research in
personnel and human resources management, 13, pp.
153–200. Greenwich, CT: JAI Press.
Robertson, I.T. (1993) Personality assessment and
personnel selection. European Review of Applied
Psychology, 43, 187–194.
Robertson, I.T. (1994) Personality and personnel
selection. In C.L. Cooper and D.M. Rousseau
(eds.), Trends in Organizational Behavior, 1. London:
Wiley.
Robertson, I.T. and Kinder, A. (1993) Personality and
job competences: the criterion-related validity of
some personality variables. Journal of Occupational
and Organizational Psychology, 66, 225–244.
Rothstein, H.R. (1990) Interrater reliability of job
performance ratings: growth to asymptote level
with increasing opportunity to observe. Journal of
Applied Psychology, 75, 322–327.
Salgado, J.F. (1994) Manual Tecnico del IP/5F.
Departamento de Psicología, Universidad de
Santiago de Compostela (Spain). Manuscrito no
publicado. [Technical Manual for IP/5F. Dept. of
Psychology, University of Santiago de
Compostela, Spain. Unpublished manuscript.]
Salgado, J.F. (1995) Fiabilidad intraevaluador e
interevaluador de la valoraciones de rendimiento
en el trabajo. Unpublished manuscript.
Salgado, J.F. (1996) Personality and job competences:
a comment on Robertson and Kinder's (1993)
study. Journal of Occupational and Organizational
Psychology, 69, 373–375.
Salgado, J.F. (1997) The five-factor model of
personality and job performance in the European
Community (EC). Journal of Applied Psychology, 82,
30–43.
Salgado, J.F., Rumbo, A., Santamaría, G. and Losada,
M.R. (1995) El 16PF, el modelo de personalidad de
cinco factores y el rendimiento en el trabajo.
Revista de Psicología Social Aplicada, 5, 81–94. [The
16PF, five-factor model of personality and job
performance.]
Schmidt, F.L. and Hunter, J.E. (1996)
Measurement error in psychological research:
lessons from 26 research scenarios. Psychological
Methods, 1, 199–223.
Schmidt, F.L., Hunter, J.E. and Urry, V.W. (1976)
Statistical power in criterion-related validation
studies. Journal of Applied Psychology, 61, 473–485.
Schmit, M.J. and Ryan, A.M. (1993) The Big Five in
personnel selection: factor structure in applicant
and nonapplicant populations. Journal of Applied
Psychology, 78, 966–974.
Schmitt, N., Gooding, R.Z., Noe, R.D. and Kirsch, M.
(1984) Meta-analyses of validity studies published
between 1964 and 1982, and the investigation of
study characteristics. Personnel Psychology, 37,
407–422.
Society for Industrial and Organizational Psychology,
Inc. (1987) Principles for the validation and use of
personnel selection procedures. 3rd edition. College
Park, MD: Author.
Tett, R.P., Jackson, D.N. and Rothstein, M. (1991)
Personality measures as predictors of job
performance: A meta-analytic review. Personnel
Psychology, 44, 703–742.
Van der Berg, P.T. and Feij, J.A. (1993) Personality
traits and job characteristics as predictors of job
experiences. European Journal of Psychology, 7,
337–357.
Veres, J.G., Field, H.S., and Boyles, W.R. (1983)
Administrative versus research performance
ratings: An empirical test of rating data quality.
Public Personnel Management, 12, 290–298.
Warmke, D.L. and Billings, R.S. (1979) Comparison of
training methods for improving the psychometric
quality of experimental and administrative
performance ratings. Journal of Applied Psychology,
64, 124–131.
Whitener, E.M. (1990) Confusion of confidence
intervals and credibility intervals. Journal of Applied
Psychology, 75, 315–321.