IBM SPSS Statistics Base

C6021_book.
fm Page 203 Thursday, January 26, 2006 11:28 AM
12
Factor Analysis
12.1 Aim
The major aim of factor analysis is the orderly simplification of a large number
of intercorrelated measures to a few representative constructs or factors. Suppose that a researcher wants to identify the major dimensions underlying a
number of personality tests. He begins by administering the personality tests
to a large sample of people (N = 1000), with each test supposedly measuring
a specific facet of a person's personality (e.g., ethnocentrism, authoritarianism, and locus of control). Assume that there are 30 such tests, each consisting
of ten test items. What the researcher will end up with is a mass of numbers
that will say very little about the dimensions underlying these personality
tests. On average, some of the scores will be high, some will be low, and some
intermediate, but interpretation of these scores will be extremely difficult if
not impossible. This is where factor analysis comes in. It allows the researcher
to reduce this mass of numbers to a few representative factors, which can
then be used for subsequent analysis.
Factor analysis is based on the assumption that all variables are correlated
to some degree. Therefore, those variables that share similar underlying
dimensions should be highly correlated, and those variables that measure
dissimilar dimensions should yield low correlations. Using the earlier example, if the researcher intercorrelates the scores obtained from the 30 personality tests, then those tests that measure the same underlying personality
dimension should yield high correlation coefficients, whereas those tests that
measure different personality dimensions should yield low correlation coefficients. These high/low correlation coefficients will become apparent in the
correlation matrix because they form clusters indicating which variables
hang together. For example, measures of ethnocentrism, authoritarianism,
and aggression may be highly intercorrelated, indicating that they form an
identifiable personality dimension. The primary function of factor analysis
is to identify these clusters of high intercorrelations as independent factors.
There are three basic steps to factor analysis:
2006 by Taylor & Francis Group, LLC
Univariate and Multivariate Data Analysis and Interpretation with SPSS
1. Computation of the correlation matrix for all variables.

2. Extraction of initial factors.
3. Rotation of the extracted factors to a terminal solution.
12.1.1 Computation of the Correlation Matrix
As factor analysis is based on correlations between measured variables, a
correlation matrix containing the intercorrelation coefficients for the variables must be computed. The variables should be measured at least at the
ordinal level, although two-category nominal variables (coded 1-2) can be
used. If all variables are nominal variables, then specialized forms of factor
analysis, such as Boolean factor analysis (BMDP, 1992), are more appropriate.
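The computation of the correlation matrix can be sketched in a few lines of Python. This is only an illustration using the standard library: the three score lists and their names are hypothetical, and SPSS performs the equivalent computation internally.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(scores):
    """Return the full matrix of intercorrelations as a list of rows."""
    names = list(scores)
    return [[pearson(scores[r], scores[c]) for c in names] for r in names]

# Hypothetical scores of five people on three tests (illustration only).
scores = {
    "ethnocentrism": [2, 4, 6, 8, 10],
    "authoritarianism": [1, 4, 5, 8, 9],
    "locus_of_control": [5, 3, 6, 2, 4],
}

R = correlation_matrix(scores)
for name, row in zip(scores, R):
    print(f"{name:<18}", "  ".join(f"{r:6.3f}" for r in row))
```

In this made-up data the first two tests correlate highly with each other and weakly with the third, which is the kind of clustering that factor analysis looks for.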
12.1.2 Extraction of Initial Factors
At this phase, the number of common factors needed to adequately describe
the data is determined. To do this, the researcher must decide on (1) the
method of extraction, and (2) the number of factors selected to represent the
underlying structure of the data.
12.1.2.1 Method of Extraction
There are two basic methods for obtaining factor solutions: Principal Components analysis and common Factor Analysis. (Note: SPSS provides six methods of extraction under the common factor analysis model:
principal-axis factoring, unweighted least-squares, generalized
least-squares, maximum-likelihood, alpha factoring, and image factoring.)
The choice between these two basic methods of factor extraction lies with
the objective of the researcher. If the purpose is no more than to reduce
data to obtain the minimum number of factors needed to represent the
original set of data, then Principal Components analysis is appropriate. The
researcher works from the premise that the factors extracted need not have
any theoretical validity. Conversely, when the primary objective is to identify
theoretically meaningful underlying dimensions, the common Factor Analysis method is the appropriate model. Given the more restrictive assumptions underlying common factor analysis, the principal components method
has attracted more widespread use.
12.1.2.2 Determining the Number of Factors to Be Extracted
There are two conventional criteria for determining the number of initial
unrotated factors to be extracted. These are the Eigenvalues criterion and
the Scree test criterion.
12.1.2.2.1 Eigenvalues
Only factors with eigenvalues of 1 or greater are considered to be significant;
all factors with eigenvalues less than 1 are disregarded. An eigenvalue is the
amount of variance in the full set of variables that is accounted for by an extracted factor; it equals the sum of the squared loadings of all variables on that factor. Because each standardized variable contributes one unit of variance, the rationale for using the
eigenvalue criterion is that an extracted factor should account for at least
as much variance as a single variable if that factor is to be retained for interpretation. An eigenvalue greater than 1 indicates that the factor explains more
variance than a single variable does.
12.1.2.2.2 Scree Test
This test is used to identify the optimum number of factors that can be
extracted before the amount of unique variance begins to dominate the
common variance structure (Hair, Anderson, Tatham, & Black, 1995). The
scree test is derived by plotting the eigenvalues (on the Y axis) against the
number of factors in their order of extraction (on the X axis). The initial
factors extracted are large factors (with high eigenvalues), followed by
smaller factors. Graphically, the plot will show a steep slope between the
large factors and the gradual trailing off of the rest of the factors. The point
at which the curve first begins to straighten out is considered to indicate the
maximum number of factors to extract. That is, those factors above this point
of inflection are deemed meaningful, and those below are not. As a general
rule, the scree test results in at least one and sometimes two or three more
factors being considered significant than does the eigenvalue criterion (Cattell, 1966).
12.1.3 Rotation of Extracted Factors
Factors produced in the initial extraction phase are often difficult to interpret.
This is because the procedure in this phase ignores the possibility that variables identified to load on or represent factors may already have high loadings (correlations) with previous factors extracted. This may result in
significant cross-loadings in which many factors are correlated with many
variables. This makes interpretation of each factor difficult, because different
factors are represented by the same variables. The rotation phase serves to
sharpen the factors by identifying those variables that load on one factor
and not on another. The ultimate effect of the rotation phase is to achieve a
simpler, theoretically more meaningful factor pattern.
12.1.4 Rotation Methods
There are two main classes of factor rotation method: Orthogonal and
Oblique. Orthogonal rotation assumes that the factors are independent, and
the rotation process maintains the reference axes of the factors at 90°. Oblique
rotation allows for correlated factors instead of maintaining independence
between the rotated factors. The oblique rotation process does not require
that the reference axes be maintained at 90°. Of the two rotation methods,
oblique rotation is more flexible because the factor axes need not be orthogonal. Moreover, at the theoretical level, it is more realistic to assume that
influences in nature are correlated. By allowing for correlated factors, oblique
rotation often represents the clustering of variables more accurately.
There are three major methods of orthogonal rotation: varimax, quartimax,
and equimax. Of the three approaches, varimax has achieved the most
widespread use as it seems to give the clearest separation of factors. It does
this by producing the maximum possible simplification of the columns (factors) within the factor matrix. In contrast, both quartimax and equimax
approaches have not proven very successful in producing simpler structures,
and have not gained widespread acceptance. Whereas the orthogonal
approach to rotation has several choices provided by SPSS, the oblique
approach is limited to two methods: direct oblimin and promax.
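The effect of an orthogonal rotation can be sketched with two factors, where the rotation is simply a turn of both reference axes by the same angle. The loadings and the angle below are hypothetical and chosen by hand; an actual varimax rotation finds the angle by optimizing a simplicity criterion, which this sketch does not attempt.

```python
from math import cos, sin, radians

def rotate(loadings, degrees):
    """Rotate two-factor loadings, keeping the two axes at right angles."""
    t = radians(degrees)
    c, s = cos(t), sin(t)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def communality(row):
    """Sum of squared loadings of one variable across the factors."""
    return sum(v * v for v in row)

# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [(0.70, 0.50), (0.65, 0.55), (0.60, -0.55), (0.55, -0.60)]
rotated = rotate(unrotated, -45)  # angle picked by eye for this illustration

for before, after in zip(unrotated, rotated):
    print(
        f"before {before[0]:5.2f} {before[1]:5.2f}   "
        f"after {after[0]:5.2f} {after[1]:5.2f}   "
        f"communality {communality(before):.3f} -> {communality(after):.3f}"
    )
```

After the turn each variable loads mainly on one factor, while its communality is unchanged: the rotation redistributes the explained variance between the factors without altering how much of each variable is explained.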
12.1.5 Orthogonal vs. Oblique Rotation
In choosing between orthogonal and oblique rotation, there is no compelling
analytical reason to favor one method over the other. Indeed, there are no
hard and fast rules to guide the researcher in selecting a particular orthogonal
or oblique rotational technique. However, convention suggests that the following guidelines may help in the selection process. If the goal of the research
is no more than to reduce the data to more manageable proportions,
regardless of how meaningful the resulting factors may be, and if there is
reason to assume that the factors are uncorrelated, then orthogonal rotation
should be used. Conversely, if the goal of the research is to discover theoretically meaningful factors, and if there are theoretical reasons to assume
that the factors will be correlated, then oblique rotation is appropriate.
Sometimes the researcher may not know whether or not the extracted factors
might be correlated. In such a case, the researcher should try an oblique
solution first. This suggestion is based on the assumption that, realistically,
very few variables in a particular research project will be uncorrelated. If the
correlations between the factors turn out to be very low (e.g., < 0.20), the
researcher could redo the analysis with an orthogonal rotation method.
12.1.6 Number of Factor Analysis Runs
It should be noted that when factor analysis is used for research (either for
the purpose of data reduction or to identify theoretically meaningful dimensions), a minimum of two runs will normally be required. In the first run,
the researcher allows factor analysis to extract factors for rotation. All factors
with eigenvalues of 1 or greater will be subjected to varimax rotation by
default within SPSS. However, even after rotation, not all extracted rotated
factors will be meaningful. For example, some small factors may be represented by very few items, and there may still be significant cross-loading of
items across several factors. At this stage, the researcher must decide which
factors are substantively meaningful (either theoretically or intuitively), and
retain only these for further rotation. It is not uncommon for a data set to
be subjected to a series of factor analyses and rotations before the obtained
factors can be considered clean and interpretable.
12.1.7 Interpreting Factors
In interpreting factors, the size of the factor loadings (correlation coefficients between the variables and the factors they represent) will help in
the interpretation. As a general rule, variables with large loadings indicate
that they are representative of the factor, while small loadings suggest that
they are not. In deciding what is large or small, a rule of thumb suggests
that factor loadings greater than 0.33 are considered to meet the minimal
level of practical significance. The reason for using the 0.33 criterion is
that if the value is squared, the squared value represents the amount of
the variable's total variance accounted for by the factor. Therefore, a factor
loading of 0.33 denotes that approximately 10% of the variable's total
variance is accounted for by the factor. The grouping of variables with
high factor loadings should suggest what the underlying dimension is for
that factor.
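The arithmetic behind the 0.33 rule of thumb is a single squaring, shown here for a few loadings. The function name is illustrative only.

```python
def variance_explained(loading):
    """Proportion of a variable's variance accounted for by a factor."""
    return loading ** 2

for loading in (0.33, 0.50, 0.70):
    share = variance_explained(loading)
    print(f"loading {loading:.2f} -> {share:.1%} of the variable's variance")
```

A loading of 0.33 squares to about 0.109, which is the "approximately 10%" quoted above.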
12.2 Checklist of Requirements

Variables for factor analysis should be measured at least at the ordinal level.
If the researcher has some prior knowledge about the factor structure, then several variables (five or more) should be included to
represent each proposed factor.
The sample size should be 100 or larger. A basic rule of thumb is to
have at least five times as many cases as variables entered into the
factor analysis. A more acceptable range would be a ten-to-one ratio.
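The sample-size rules of thumb above can be checked mechanically. The sketch below applies them to the two examples in this chapter (400 respondents rating nine statements, and 91 smokers rating 25 statements); the function and key names are illustrative only.

```python
def sample_size_check(n_cases, n_variables):
    """Apply the rules of thumb: N >= 100, and 5:1 or 10:1 cases per variable."""
    ratio = n_cases / n_variables
    return {
        "cases_per_variable": ratio,
        "at_least_100_cases": n_cases >= 100,
        "five_to_one": ratio >= 5,
        "ten_to_one": ratio >= 10,
    }

example_1 = sample_size_check(n_cases=400, n_variables=9)
example_2 = sample_size_check(n_cases=91, n_variables=25)

for label, result in (("Example 1", example_1), ("Example 2", example_2)):
    print(label, result)
```

By these guidelines the first data set is comfortably large, whereas the second falls below both the 100-case and the five-to-one guideline, which is worth bearing in mind when interpreting its factor structure.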
12.3 Assumptions
The assumptions underlying factor analysis can be classified as statistical
and conceptual.
12.3.1 Statistical Assumptions
Statistical assumptions include normality and linearity, and sufficient significant
correlations in the data matrix.
Normality and linearity: Departures from normality and linearity
can diminish the observed correlation between measured variables
and thus degrade the factor solution.
Sufficient significant correlations in data matrix: The researcher
must ensure that the data matrix has sufficient correlations to justify
the application of factor analysis. If visual inspection reveals no
substantial number of correlations of 0.33 or greater, then factor
analysis is probably inappropriate.
12.3.2 Conceptual Assumptions
Conceptual assumptions include selection of variables and homogeneity.
Selection of variables: Variables should be selected to reflect the
underlying dimensions that are hypothesized to exist in the set of
selected variables. This is because factor analysis has no means to
determine the appropriateness of the selected variables other than
the correlations among the variables.
Homogeneity: The sample must be homogeneous with respect to
the underlying factor structure. If the sample consists of two or more
distinct groups (e.g., males and females), separate factor analyses
should be performed.
12.4 Factor Analysis: Example 1

A study was designed to investigate how the defenses of self-defense, provocation, and insanity influence jurors' verdict judgments in trials of battered
women who killed their abusive spouses (Ho & Venus, 1995). Nine statements were written to reflect these three defense strategies. Each statement
was rated on an 8-point scale, with high scores indicating strong support for
that particular defense strategy. A total of 400 subjects provided responses
to these nine statements. Factor analysis (with principal components extraction) was employed to investigate whether these nine defense statements
represent identifiable factors, i.e., defense strategies. The nine statements
(together with their SPSS variable name) written to reflect the three defense
strategies are listed in the following.
1. Provocation Defense
PROVO: In killing her husband, the defendant's action reflects
a sudden and temporary loss of self-control as a result of the
provocative conduct of the deceased.
CAUSED: The nature of the provocative conduct by the deceased
was such that it could have caused an ordinary person with
normal powers of self-control to do what the defendant did.
PASSION: In killing her husband, the defendant acted in the
heat of passion as a response to the deceased's sudden provocation
on that fateful day.
2. Self-Defense Defense
PROTECT: In killing her husband, the defendant was justified
in using whatever force (including lethal force) to protect herself.
SAVE: The defendant's lethal action was justified in that she
acted to save herself from grievous bodily harm.
DEFEND: In killing her husband, the defendant used such force
as was necessary to defend herself.
3. Insanity Defense
MENTAL: The action of the defendant is the action of a mentally
impaired person.
INSANE: In killing her husband, the defendant was either irrational or insane.
STABLE: The action of the accused is not typical of the action of
a mentally stable person.
12.4.1 Data Entry Format
Note: The survey questionnaire employed in this study was designed as part
of a larger study. It contains additional variables apart from the nine defense
strategy variables. The data set is named DOMES.SAV.

Variables   Column(s)   Code
Sex         1           1 = male, 2 = female
Age         2           In years
Educ        3           1 = primary to 6 = tertiary
Income      4           1 = < $10,000 per year, 5 = $40,000 per year
Provo       5           1 = strongly disagree, 8 = strongly agree
Protect     6           1 = strongly disagree, 8 = strongly agree
Mental      7           1 = strongly disagree, 8 = strongly agree
Caused      8           1 = strongly disagree, 8 = strongly agree
Save        9           1 = strongly disagree, 8 = strongly agree
Insane      10          1 = strongly disagree, 8 = strongly agree
Passion     11          1 = strongly disagree, 8 = strongly agree
Defend      12          1 = strongly disagree, 8 = strongly agree
Stable      13          1 = strongly disagree, 8 = strongly agree
Real        14          1 = totally unbelievable syndrome, 8 = totally believable syndrome
Support     15          1 = no support at all, 8 = a great deal of support
Apply       16          1 = does not apply to the defendant at all, 8 = totally applies to the defendant
Respon      17          1 = not at all responsible, 8 = entirely responsible
Verdict     18          1 = not guilty, 2 = guilty of manslaughter, 3 = guilty of murder
Sentence    19          1 = 0 yr, 6 = life imprisonment
12.4.2 Windows Method
1. From the menu bar, click Analyze, then Data Reduction, and then
Factor. The Factor Analysis window will open.
2. Transfer the nine variables PROVO, PROTECT, MENTAL,
CAUSED, SAVE, INSANE, PASSION, DEFEND, and STABLE to
the Variables field by highlighting these variables and then
clicking the arrow button.
3. To (1) obtain a correlation matrix for the nine variables, and (2)
test that the correlation matrix has sufficient correlations to justify
the application of factor analysis, click Descriptives. The
Factor Analysis: Descriptives window will open. Check the Coefficients and KMO and Bartlett's test of sphericity fields, and then
click Continue.
4. When the Factor Analysis window opens, click Extraction. This
will open the Factor Analysis: Extraction window. In the Method:
drop-down list, choose Principal components as the extraction
method. In the Eigenvalues over field, accept the default value of
1. Leave the Number of factors field blank (i.e., allow principal
components analysis to extract as many factors as there are with
eigenvalues greater than 1). To obtain a Scree plot of the number of
factors extracted, check the Scree plot field. Click Continue.
5. When the Factor Analysis window opens, click Rotation. This
will open the Factor Analysis: Rotation window. To subject the
extracted factors to Varimax rotation, check the Varimax field. Click
Continue.
6. When the Factor Analysis window opens, click Options. This will
open the Factor Analysis: Options window. If the data set has missing values, the researcher can choose one of the three methods
offered to deal with the missing values: (1) Exclude cases listwise:
any case (subject) with a missing value for any of the variables
in the factor analysis will be excluded from the analysis; this method
is the most restrictive as the presence of missing values can reduce
the sample size substantially, (2) Exclude cases pairwise: a case is
excluded only from the computation of those correlations that
involve a variable on which it has a missing value; this method is
less restrictive as each correlation is based on all cases with valid
values for that pair of variables, and
(3) Replace with mean: each missing value is replaced with the mean
of that variable; this method is the least restrictive as all cases
will be included in the analysis. Under Coefficient Display
Format, check the Sorted by size field. This procedure will present
the factor loadings (correlation coefficients) in a descending order
of magnitude format in the output. Check the Suppress absolute
values less than field, and then type the coefficient 0.33 in the field
next to it. This procedure will suppress the presentation of any factor
loadings with values less than 0.33 in the output (an item with a
factor loading of 0.33 or higher indicates that approximately 10% or
more of the variance in that item is accounted for by its common
factor). Click OK to complete
the analysis. See Table 12.1 for the results.
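How much data each of the three missing-value options keeps can be sketched as follows. The tiny data set is hypothetical, with None marking a missing response, and the helper functions are illustrative only; they are not part of SPSS.

```python
# Hypothetical ratings from six respondents; None marks a missing value.
data = {
    "PROVO":   [5, 6, None, 7, 4, 8],
    "PROTECT": [3, None, 4, 6, 2, 7],
    "MENTAL":  [2, 3, 5, None, 1, 4],
}

def listwise_cases(data):
    """Indices of cases with valid values on every variable."""
    rows = zip(*data.values())
    return [i for i, row in enumerate(rows) if None not in row]

def pairwise_cases(data, var_a, var_b):
    """Indices of cases with valid values on both variables of one pair."""
    pairs = zip(data[var_a], data[var_b])
    return [i for i, (a, b) in enumerate(pairs) if a is not None and b is not None]

def mean_replaced(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

print("listwise keeps cases:", listwise_cases(data))
print("pairwise (PROVO, PROTECT) keeps cases:", pairwise_cases(data, "PROVO", "PROTECT"))
print("PROVO after mean replacement:", mean_replaced(data["PROVO"]))
```

In this illustration listwise deletion keeps only three of the six cases, pairwise deletion keeps four cases for the PROVO and PROTECT correlation, and mean replacement keeps all six.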
12.4.3 SPSS Syntax Method
FACTOR VARIABLES=PROVO TO STABLE
  /FORMAT=SORT BLANK(.33)
  /PRINT=INITIAL EXTRACTION ROTATION CORRELATION KMO
  /PLOT=EIGEN
  /EXTRACTION=PC
  /ROTATION=VARIMAX.
(Note: The method of factor extraction used is Principal Components
analysis.)
12.4.4 SPSS Output
TABLE 12.1
Factor Analysis Output

Correlation Matrix
Correlation  PROVO   PROTECT  MENTAL  CAUSED  SAVE    INSANE  PASSION  DEFEND  STABLE
PROVO        1.000   .183     .144    .277    .133    .214    .288     .086    .099
PROTECT      .183    1.000    .066    .356    .662    .177    .023     .554    .014
MENTAL       .144    .066     1.000   .107    .031    .474    .035     .011    .446
CAUSED       .277    .356     .107    1.000   .446    .044    .095     .232    .173
SAVE         .133    .662     .031    .446    1.000   .211    .119     .655    .119
INSANE       .214    .177     .474    .044    .211    1.000   .095     .111    .407
PASSION      .288    .023     .035    .095    .119    .095    1.000    .134    .060
DEFEND       .086    .554     .011    .232    .655    .111    .134     1.000   .067
STABLE       .099    .014     .446    .173    .119    .407    .060     .067    1.000

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy       .690
Bartlett's Test of Sphericity   Approx. chi-square    930.250
                                df                    36
                                Sig.                  .000

Communalities
           Initial   Extraction
PROVO      1.000     .651
PROTECT    1.000     .744
MENTAL     1.000     .695
CAUSED     1.000     .491
SAVE       1.000     .808
INSANE     1.000     .635
PASSION    1.000     .509
DEFEND     1.000     .661
STABLE     1.000     .622
Extraction method: principal component analysis.

Total Variance Explained
           Initial Eigenvalues                  Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %   Total   % of Variance  Cumulative %
1          2.663   29.591         29.591        2.663   29.591         29.591         2.452   27.244         27.244
2          1.977   21.963         51.554        1.977   21.963         51.554         1.933   21.482         48.726
3          1.176   13.069         64.623        1.176   13.069         64.623         1.431   15.897         64.623
4          .874    9.708          74.330
5          .616    6.839          81.169
6          .569    6.322          87.491
7          .509    5.651          93.142
8          .345    3.835          96.977
9          .272    3.023          100.000
FIGURE 12.1
Scree plot (eigenvalue on the Y axis plotted against component number on the X axis).
Component Matrix(a)
Variables (sorted by size): SAVE, PROTECT, DEFEND, CAUSED, MENTAL, INSANE, STABLE, PASSION, PROVO
Loadings displayed (absolute values below .33 suppressed): .878, .795, .758, .613, .333, .331, .748, .723, .680, .513, .606, .568
a. Three components extracted.
Rotated Component Matrix(a)
           Component
           1       2       3
SAVE       .883
PROTECT    .861
DEFEND     .813
MENTAL             .830
STABLE             .787
INSANE             .726
PROVO                      .779
PASSION                    .713
CAUSED     .455            .483
Rotation method: Varimax with Kaiser Normalization.
a. Rotation converged in four iterations.

Component Transformation Matrix
Component  1       2       3
1          .919    .288    .268
2          .153    .890    .429
3          .362    .354    .862

12.4.5 Results and Interpretation
12.4.5.1 Correlation Matrix
Examination of the Correlation Matrix (see Table 12.1) reveals fairly high
correlations between the nine variables written to measure specific defense
strategies. For example, the intercorrelations between the variables of PROTECT, SAVE, and DEFEND (self-defense defense strategy) are greater than
0.33. Similarly, the intercorrelations between MENTAL, INSANE, and STABLE (insanity defense strategy) are also greater than 0.33. Given the number
of high intercorrelations between the defense-specific variables, the hypothesized factor model appears to be appropriate.
Bartlett's test of sphericity can be used to test for the adequacy of the
correlation matrix, i.e., whether the correlation matrix has significant correlations
among at least some of the variables. If the variables are independent, the
observed correlation matrix is expected to have small off-diagonal coefficients.
Bartlett's test of sphericity tests the hypothesis that the correlation matrix is
an identity matrix, that is, all the diagonal terms are 1 and all off-diagonal
terms are 0. If the test value is large and the significance level is small (< 0.05),
the hypothesis that the variables are independent can be rejected. In the present
analysis, Bartlett's test of sphericity yielded a value of 930.25 and an
associated level of significance smaller than 0.001. Thus, the hypothesis that
the correlation matrix is an identity matrix is rejected.
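For reference, the statistic reported in the output follows the usual form of Bartlett's test, written here for N cases, p variables, and correlation matrix R. The implied determinant is only a back-calculation that assumes all 400 respondents entered the analysis, which the output shown here does not confirm.

```latex
\chi^2 = -\left(N - 1 - \frac{2p + 5}{6}\right)\ln\lvert\mathbf{R}\rvert,
\qquad
df = \frac{p(p - 1)}{2}

% With p = 9 variables:
df = \frac{9 \times 8}{2} = 36

% Assuming N = 400 (no cases lost to missing values):
\ln\lvert\mathbf{R}\rvert \approx -\frac{930.25}{399 - 23/6} \approx -2.354,
\qquad
\lvert\mathbf{R}\rvert \approx 0.095
```

The degrees of freedom agree with the value of 36 shown in Table 12.1. A determinant well below 1 reflects the substantial intercorrelations among the nine variables; an identity matrix would have a determinant of exactly 1.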
12.4.5.2 Factor Analysis Output
The Communalities section presents the communality of each variable (i.e.,
the proportion of variance in each variable accounted for by the common
factors). In using the principal components method of factor extraction, it
is possible to compute as many factors as there are variables. When all
factors are included in the solution, all of the variance of each variable is
accounted for by the common factors. Thus, the proportion of variance
accounted for by the common factors, or the communality of a variable is
1 for all the variables.
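Once only the three retained components are kept, the communalities drop below 1 (the Extraction column of Table 12.1). A quick consistency check, sketched below with the values copied from that table, is that the extraction communalities sum to the same total as the eigenvalues of the retained components, because both count the variance the three components account for. The variable names in the code are illustrative.

```python
# Extraction communalities as reported in Table 12.1.
communalities = {
    "PROVO": 0.651, "PROTECT": 0.744, "MENTAL": 0.695,
    "CAUSED": 0.491, "SAVE": 0.808, "INSANE": 0.635,
    "PASSION": 0.509, "DEFEND": 0.661, "STABLE": 0.622,
}
# Eigenvalues of the three retained components, also from Table 12.1.
retained_eigenvalues = [2.663, 1.977, 1.176]

total_communality = sum(communalities.values())
total_retained = sum(retained_eigenvalues)

print(f"sum of extraction communalities: {total_communality:.3f}")
print(f"sum of retained eigenvalues:     {total_retained:.3f}")
print(f"mean communality: {total_communality / len(communalities):.3f}")
```

The two totals agree to rounding, and the mean communality is the proportion of total variance that the three-component solution accounts for.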
The Total Variance Explained section presents the number of common
factors computed, the eigenvalues associated with these factors, the percentage of total variance accounted for by each factor, and the cumulative
percentage of total variance accounted for by the factors. Although nine
factors have been computed, it is obvious that not all nine factors will be
useful in representing the list of nine variables. In deciding how many
factors to extract to represent the data, it is helpful to examine the eigenvalues associated with the factors. Using the criterion of retaining only
factors with eigenvalues of 1 or greater, the first three factors will be
retained for rotation. These three factors account for 29.59%, 21.96%, and
13.07% of the total variance, respectively. That is, almost 65% of the total
variance is attributable to these three factors. The remaining six factors
together account for only approximately 35% of the variance. Thus, a model
with three factors may be adequate to represent the data. From the Scree
plot, it again appears that a three-factor model should be sufficient to
represent the data set.
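The percentages in the Total Variance Explained section can be reproduced from the eigenvalues alone, since with nine standardized variables the total variance is 9. The sketch below uses the eigenvalues as printed in Table 12.1, so the results differ from the SPSS figures in the last decimal place because of rounding.

```python
from itertools import accumulate

# Initial eigenvalues as printed in Table 12.1.
eigenvalues = [2.663, 1.977, 1.176, 0.874, 0.616, 0.569, 0.509, 0.345, 0.272]
n_variables = len(eigenvalues)  # total variance of nine standardized variables

percent = [100 * ev / n_variables for ev in eigenvalues]
cumulative = list(accumulate(percent))

# Eigenvalue criterion: retain components with eigenvalues of 1 or greater.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev >= 1.0]

for i, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    flag = "retain" if i in retained else ""
    print(f"{i}  {ev:5.3f}  {pct:6.2f}%  {cum:7.2f}%  {flag}")
```

Three components meet the eigenvalue criterion, and together they account for roughly 65% of the total variance, as stated above.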
The Component Matrix represents the unrotated component analysis
factor matrix, and presents the correlations that relate the variables to the
three extracted factors. These coefficients, called factor loadings, indicate
how closely the variables are related to each factor. However, as the factors
are unrotated (the factors were extracted on the basis of the proportion of
total variance explained), significant cross-loadings have occurred. For
example, the variable CAUSED has loaded highly on Factor 1 and Factor
3; the variable INSANE has loaded highly on Factor 1 and Factor 2; the
variable PROVO has loaded highly on Factor 2 and Factor 3. These high
cross-loadings make interpretation of the factors difficult and theoretically
less meaningful.
The Rotated Component Matrix presents the three factors after Varimax
(orthogonal) rotation. To subject the three factors to Oblique (nonorthogonal) rotation, (1) check the Direct Oblimin field in the Factor Analysis:
Rotation window, or (2) substitute the word VARIMAX with OBLIMIN in
the ROTATION subcommand in the SPSS Syntax Method in Subsection
12.4.3. The OBLIMIN rotation output is presented in Table 12.2.
TABLE 12.2
Oblique (OBLIMIN) Rotation Output

Pattern Matrix(a)
           Component
           1       2       3
SAVE       .878
PROTECT    .876
DEFEND     .825
MENTAL             .844
STABLE             .794
INSANE             .704
PROVO                      .776
PASSION                    .724
CAUSED     .394            .459
Rotation method: Oblimin with Kaiser Normalization.
a. Rotation converged in five iterations.

Component Correlation Matrix
Component  1          2          3
1          1.000      .140       .177
2          .140       1.000      6.465E-02
3          .177       6.465E-02  1.000
Rotation method: Oblimin with Kaiser Normalization.
As there is no overwhelming theoretical reason to employ one rotation
method over another, the decision to interpret either the varimax rotated
matrix or the oblimin matrix depends on the magnitude of the factor
correlations presented in the Component Correlation Matrix (in Table
12.2). Examination of the factor correlations indicates that the three factors
are not strongly correlated (all coefficients are less than 0.20), which suggests that the varimax (orthogonal) matrix should be interpreted. However,
should the decision be made to interpret the oblimin rotated matrix, then
a further decision must be made to interpret either the Pattern Matrix or
the Structure Matrix. The structure matrix presents the correlations
between variables and factors, but these may be confounded by correlations between the factors. The pattern matrix shows the uncontaminated
correlations between variables and factors and is generally used for interpreting factors.
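The decision rule described above can be written out explicitly. The sketch below uses the component correlations as printed in Table 12.2 and the 0.20 guideline from Subsection 12.1.5; the names and the exact wording of the outcomes are illustrative only.

```python
# Component correlations as printed in Table 12.2 (6.465E-02 = 0.06465).
component_correlations = {
    (1, 2): 0.140,
    (1, 3): 0.177,
    (2, 3): 0.06465,
}
THRESHOLD = 0.20  # guideline for "very low" factor correlations

largest = max(abs(r) for r in component_correlations.values())
if largest < THRESHOLD:
    suggested = "orthogonal (varimax) solution"
else:
    suggested = "oblique (oblimin) solution"

print(f"largest absolute factor correlation: {largest:.3f}")
print(f"interpret the {suggested}")
```

With the largest correlation at 0.177, the rule points to the varimax solution, in line with the conclusion reached in the text.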
Examination of the factor loadings presented in the Varimax Rotated Component Matrix (Table 12.1) shows that eight of the nine variables loaded
highly on the three factors representing the three defense strategies of self-defense, insanity, and provocation. One variable, CAUSED, cross-loaded
significantly across Factor 1 and Factor 3. Convention suggests three possible
ways of handling significant cross-loadings.
1. If the matrix indicates many significant cross-loadings, this may
suggest further commonality between the cross-loaded variables
and the factors. The researcher may decide to rerun factor analysis,
stipulating a smaller number of factors to be extracted.
2. Examine the wording of the cross-loaded variables, and based on
their face-validity, assign them to the factors that they are most
conceptually/logically representative of.
3. Delete all cross-loaded variables. This will result in clean factors
and will make interpretation of the factors that much easier. This
method works best when there are only a few significant cross-loadings.
In the present example, the cross-loaded variable of CAUSED appears to
be more conceptually relevant to Factor 3 (provocation defense) than to
Factor 1 (self-defense defense). Thus, the decision may be made to retain
this variable to represent Factor 3. Alternatively, the researcher may decide
to delete this variable. In any case, no further analysis (rotation) is required
as the factor structure is clearly consistent with the hypothesized three-factor model.
In summary, it can be concluded that factor analysis has identified three
factors from the list of nine variables. In the main, these factors are represented by the specific statements written to reflect the three defense strategies
of self-defense, insanity, and provocation.
12.5 Factor Analysis: Example 2

This example demonstrates the use of factor analysis in deriving meaningful factors using multiple runs. A study was designed to identify the
motives for the maintenance of smoking behavior and its possible cessation
(Ho, 1989). Twenty-five statements were written to represent these motives.
Each statement was rated on a five-point scale with high scores indicating
strong agreement with that motive as a reason for smoking. A total of 91
smokers provided responses to these 25 statements. Factor analysis (with
principal components extraction, followed by varimax rotation) was
employed to investigate the factor structure of this 25-item smoking inventory. The 25 statements written to reflect smoking motives are listed in
Table 12.3.
TABLE 12.3
Reasons for Smoking
1. I find smoking enjoyable.
2. When I feel stressed, tense, or nervous, I light up a cigarette.
3. Smoking lowers my appetite and therefore keeps my weight down.
4. Lighting up a cigarette is a habit to me.
5. I smoke cigarettes to relieve boredom.
6. Smoking gives me something to do with my hands.
7. I feel secure when I am smoking.
8. I enjoy lighting up after pleasurable experiences, e.g., after a good meal.
9. Smoking relaxes me.
10. Smoking gives me a lift.
11. Smoking allows me to be part of a crowd.
12. I smoke because I am addicted to cigarettes.
13. I smoke because members of my family smoke.
14. Smoking is a means of socializing.
15. Smoking helps me to concentrate when I am working.
16. I smoke because most of my friends smoke.
17. I smoke because it makes me feel confident.
18. Smoking makes me feel sophisticated and glamorous.
19. I smoke as an act of defiance.
20. I smoke because I find it difficult to quit.
21. I enjoy the taste of cigarettes.
22. I find smoking pleasurable.
23. I smoke to annoy nonsmokers.
24. The health statistics regarding smoking cigarettes and health problems don't bother me, as they are highly exaggerated anyway.
25. I am willing to live with my health problems that my smoking may cause me.
12.5.1 Data Entry Format
The data set has been saved under the name: SMOKE.SAV.

Variables   Columns   Code
s1 to s25   1-25      1 = strongly agree, 5 = strongly disagree

12.5.2 Windows Method (First Run)
1. From the menu bar, click Analyze, then Data Reduction, and then
Factor. The Factor Analysis window will open.
2. Transfer the 25 variables S1 to S25 to the Variables field by
highlighting these variables and then clicking the arrow button.
3. To test that the correlation matrix generated from the 25 variables
has sufficient correlations to justify the application of factor analysis,
click Descriptives. The Factor Analysis: Descriptives
window will open. Check the KMO and Bartlett's test of sphericity field, and then click Continue.
4. When the Factor Analysis window opens, click Extraction. This
will open the Factor Analysis: Extraction window. In the Method
drop-down list, choose Principal components as the extraction
method. Ensure that the Correlation matrix field is checked. In the
Eigenvalues over field, accept the default value of 1. Leave the
Number of factors field blank (i.e., allow principal components
analysis to extract as many factors as there are with eigenvalues
greater than 1). To obtain a Scree plot of the number of factors
extracted, check the Scree plot field. Click Continue.
5. When the Factor Analysis window opens, click Rotation. This
will open the Factor Analysis: Rotation window. To subject the
extracted factors to Varimax rotation, check the Varimax field. Click
Continue.
6. When the Factor Analysis window opens, click Options. This will
open the Factor Analysis: Options window. If the data set has missing values, the researcher can choose one of the three methods
offered to deal with the missing values: (1) Exclude cases listwise,
(2) Exclude cases pairwise, and (3) Replace with mean. Under Coefficient Display Format, check the Sorted by size cell. This procedure
will present the factor loadings (correlation coefficients) in a
descending order of magnitude format in the output. Check the
Suppress absolute values less than field, and then type the coefficient of 0.33 in the field next to it. This procedure will suppress the
presentation of any factor loadings with values less than 0.33 in the
output. Click OK to complete the analysis. See Table 12.4 for the results.
12.5.3 SPSS Syntax Method (First Run)
FACTOR VARIABLES=S1 TO S25
  /FORMAT=SORT BLANK(.33)
  /PRINT=INITIAL EXTRACTION ROTATION KMO
  /PLOT=EIGEN
  /EXTRACTION=PC
  /ROTATION=VARIMAX.
12.5.4 SPSS Output
TABLE 12.4
Factor Analysis Output

KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy            .687
Bartlett's Test of Sphericity    Approx. Chi-Square     876.290
                                 df                         300
                                 Sig.                      .000

Communalities
Variable   Initial   Extraction
S1         1.000     .747
S2         1.000     .636
S3         1.000     .399
S4         1.000     .535
S5         1.000     .751
S6         1.000     .794
S7         1.000     .698
S8         1.000     .687
S9         1.000     .658
S10        1.000     .723
S11        1.000     .722
S12        1.000     .727
S13        1.000     .610
S14        1.000     .703
S15        1.000     .593
S16        1.000     .795
S17        1.000     .698
S18        1.000     .740
S19        1.000     .648
S20        1.000     .695
S21        1.000     .578
S22        1.000     .722
S23        1.000     .558
S24        1.000     .716
S25        1.000     .403

Total Variance Explained
            Initial Eigenvalues          Extraction Sums of           Rotation Sums of
                                         Squared Loadings             Squared Loadings
Component   Total   % of Var.   Cum. %   Total   % of Var.   Cum. %   Total   % of Var.   Cum. %
 1          5.315   21.258     21.258    5.315   21.258     21.258    4.199   16.797     16.797
 2          2.891   11.565     32.823    2.891   11.565     32.823    2.550   10.201     26.998
 3          2.636   10.544     43.367    2.636   10.544     43.367    2.169    8.675     35.672
 4          1.666    6.662     50.029    1.666    6.662     50.029    2.165    8.660     44.332
 5          1.512    6.049     56.078    1.512    6.049     56.078    2.022    8.089     52.421
 6          1.324    5.296     61.375    1.324    5.296     61.375    1.979    7.916     60.338
 7          1.190    4.760     66.134    1.190    4.760     66.134    1.449    5.797     66.134
 8           .937    3.748     69.882
 9           .897    3.590     73.471
10           .806    3.224     76.696
11           .732    2.930     79.625
12           .689    2.755     82.381
13           .571    2.286     84.666
14           .561    2.244     86.910
15           .514    2.057     88.968
16           .451    1.805     90.773
17           .370    1.479     92.252
18           .354    1.414     93.667
19           .322    1.287     94.954
20           .307    1.228     96.182
21           .258    1.034     97.216
22           .232     .929     98.145
23           .180     .722     98.866
24           .166     .666     99.532
25           .117     .468    100.000

FIGURE 12.2
Scree plot: eigenvalue (vertical axis) plotted against component number (horizontal axis, 1 to 25).

Component Matrix (a)
Components: 1 to 7
Variables (rows, sorted by size of loading): S18, S17, S16, S11, S14, S7, S10, S13, S5, S3, S12, S20, S4, S2, S24, S1, S22, S21, S23, S6, S19, S9, S8, S25, S15
Loadings as listed (absolute values below 0.33 suppressed): .772, .769, .764, .751, .680, .676, .606, .576, .493, .388, .431, .442, .355, .337, .734, .607, .584, .547, .448, .339, .347, .357, .409, .371, .357, .501, .391, .459, .377, .357, .844, .809, .623, .404, .418, .361, .355, .520, .507, .363, .345, .336, .476, .360, .552, .440, .511, .350, .367, .395, .384
a. Seven components extracted.

Rotated Component Matrix (a)
Components: 1 to 7
Variables (rows, sorted by size of loading): S14, S11, S18, S17, S16, S10, S19, S1, S22, S21, S3, S24, S13, S23, S20, S12, S4, S2, S9, S7, S5, S6, S15, S25, S8
Loadings as listed (absolute values below 0.33 suppressed): .802, .780, .770, .748, .744, .592, .485, .401, .441, .477, .390, .842, .830, .691, .357, .791, .680, .643, .809, .788, .594, .347, .737, .734, .502, .460, .383, .357, .812, .808, .724, .536, .493, .346
a. Rotation converged in nine iterations.

Component Transformation Matrix (absolute values of coefficients)
Component      1      2      3      4      5      6      7
1           .830   .099   .379   .066   .218   .303   .116
2           .173   .055   .316   .727   .427   .343   .195
3           .144   .945   .038   .032   .255   .139   .006
4           .371   .112   .581   .182   .383   .552   .167
5           .188   .199   .142   .346   .683   .190   .532
6           .250   .206   .603   .078   .297   .427   .506
7           .158   .017   .185   .554   .063   .498   .618

12.5.5

12.5.5.1 Correlation Matrix

Bartlett's Test of Sphericity (see Table 12.4) tests the adequacy of the
correlation matrix, and yielded a chi-square value of 876.29 (df = 300) with
an associated level of significance smaller than 0.001. Thus, the hypothesis
that the correlation matrix is an identity matrix can be rejected, i.e., the
correlation matrix has significant correlations among at least some of the variables.
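
The degrees of freedom reported for Bartlett's test in Table 12.4 can be checked by hand: the test examines the p(p - 1)/2 distinct off-diagonal correlations among p variables. The following Python sketch is only an illustrative check (the function name bartlett_df is hypothetical and not part of SPSS); it reproduces the df value for the 25 smoking items.

```python
def bartlett_df(n_variables: int) -> int:
    """Degrees of freedom for Bartlett's test of sphericity.

    The test compares the correlation matrix with an identity matrix,
    so df equals the number of distinct off-diagonal correlations,
    p * (p - 1) / 2.
    """
    return n_variables * (n_variables - 1) // 2


# 25 items (S1 to S25) in the smoking inventory
df = bartlett_df(25)
print(df)  # 300, matching the df row in Table 12.4
```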
12.5.5.2 Factor Analysis Output
The Total Variance Explained section presents the number of common factors extracted, the eigenvalues associated with these factors, the percentage
of total variance accounted for by each factor, and the cumulative percentage
of total variance accounted for by the factors. Using the criterion of retaining
only factors with eigenvalues of 1 or greater, seven factors were retained for
rotation. These seven factors accounted for 21.26%, 11.56%, 10.54%, 6.66%,
6.05%, 5.30%, and 4.76% of the total variance, respectively, for a total of
66.13%. The scree plot, however, suggests a four-factor solution.
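
The retention rule and the percentages quoted above can be reproduced from the Initial Eigenvalues column of Table 12.4. The sketch below is an illustrative check in Python, not SPSS output; it assumes that, because the analysis is based on the correlation matrix, each of the 25 standardized items contributes one unit of variance, so the total variance equals the number of items. The eigenvalues are typed in as rounded in the table, so results may differ from the SPSS figures in the third decimal place.

```python
# Initial eigenvalues as printed in Table 12.4 (rounded to three decimals)
eigenvalues = [
    5.315, 2.891, 2.636, 1.666, 1.512, 1.324, 1.190, 0.937, 0.897,
    0.806, 0.732, 0.689, 0.571, 0.561, 0.514, 0.451, 0.370, 0.354,
    0.322, 0.307, 0.258, 0.232, 0.180, 0.166, 0.117,
]

# With a correlation matrix, total variance = number of variables
n_variables = len(eigenvalues)

# Percentage of total variance accounted for by each component
pct_variance = [100 * ev / n_variables for ev in eigenvalues]

# Eigenvalue-greater-than-1 criterion for retaining factors
retained = [ev for ev in eigenvalues if ev > 1]
cumulative_retained = 100 * sum(retained) / n_variables

print(len(retained))  # 7 factors retained
print([f"{p:.2f}" for p in pct_variance[:len(retained)]])
print(f"{cumulative_retained:.1f}")
```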
The Rotated Component Matrix presents the seven factors after varimax
rotation. To identify what these factors represent, it would be necessary to
consider what items loaded on each of the seven factors. The clustering of
the items in each factor and their wording offer the best clue as to the
meaning of that factor. For example, eight items loaded on Factor 1. An
inspection of these items (see Table 12.3) clearly shows that the majority of
these items reflect a social motive for smoking (e.g., smoking is a means of
socializing; smoking allows me to be part of a crowd; I smoke because most
of my friends smoke, etc.). Factor 2 contains five items that clearly reflect
the pleasure that a smoker gains from smoking (e.g., I find smoking enjoyable; I find smoking pleasurable; I enjoy the taste of cigarettes, etc.). Factors
4, 5, and 6 contain items that appear to reflect two related motives, addiction and habit (e.g., I smoke because I find it difficult to quit; I smoke because
I am addicted to cigarettes; lighting up is a habit to me; smoking gives me
something to do with my hands, etc.). The two remaining factors, Factor 3
and Factor 7, contain items that do not hang together conceptually, and
as such, are not easily interpretable. In fact, some of the items that load on
these two factors appear to overlap in meaning with other factors. For
example, item s13 (I smoke because members of my family smoke) in Factor
3 appears to reflect a social motive, and thus overlaps in meaning with Factor
1. Similarly, item s8 (I enjoy lighting up after pleasurable experiences) in
Factor 7 appears to overlap in meaning with Factor 2 (pleasure motive). The
commonality in meaning of some of these factors suggests that a number of
factors can be combined. The combination of factors is purely a subjective
decision, aimed at reducing the number of extracted factors to a smaller,
more manageable, and ultimately more meaningful set of factors. Given that
the present factor structure appears to be represented by three dimensions
of smoking motives (Social, Pleasure, and Addiction/Habit), it was decided
to rerun Factor Analysis, stipulating the extraction of only three factors.
12.5.6 Windows Method (Second Run)
2. Transfer the 25 variables S1 to S25 to the Variables field by clicking
   (highlighting) these variables and then clicking the arrow button.
3. Click Extraction to open the Factor Analysis: Extraction window.
   In the Method drop-down list, choose Principal components as the
   extraction method. To extract only three factors from the correlation
   matrix, check the Number of factors field, and then type 3 in the
   field next to it. This procedure will override the default extraction
   of all factors with eigenvalues greater than 1. Click Continue.
4. When the Factor Analysis window opens, click Rotation. This
   will open the Factor Analysis: Rotation window. To subject the
   extracted factors to Varimax rotation, check the Varimax field. Click
   Continue.
5. When the Factor Analysis window opens, click Options. This will
   open the Factor Analysis: Options window. Under Coefficient
   Display Format, check the Sorted by size field. Check the Suppress
   absolute values less than field, and then type the coefficient of 0.33
   in the field next to it. Click Continue.
6. When the Factor Analysis window opens, click OK to complete
   the analysis. See Table 12.5 for the results.
12.5.7 SPSS Syntax Method (Second Run)
FACTOR VARIABLES=S1 TO S25
  /FORMAT=SORT BLANK(.33)
  /CRITERIA=FACTOR(3)
  /EXTRACTION=PC
  /ROTATION=VARIMAX.
12.5.8 SPSS Output
TABLE 12.5
Three-Factor Structure Output

Total Variance Explained
            Initial Eigenvalues          Extraction Sums of           Rotation Sums of
                                         Squared Loadings             Squared Loadings
Component   Total   % of Var.   Cum. %   Total   % of Var.   Cum. %   Total   % of Var.   Cum. %
 1          5.315   21.258     21.258    5.315   21.258     21.258    4.953   19.814     19.814
 2          2.891   11.565     32.823    2.891   11.565     32.823    3.163   12.651     32.465
 3          2.636   10.544     43.367    2.636   10.544     43.367    2.725   10.902     43.367
 4          1.666    6.662     50.029
 5          1.512    6.049     56.078
 6          1.324    5.296     61.375
 7          1.190    4.760     66.134
 8           .937    3.748     69.882
 9           .897    3.590     73.471
10           .806    3.224     76.696
11           .732    2.930     79.625
12           .689    2.755     82.381
13           .571    2.286     84.666
14           .561    2.244     86.910
15           .514    2.057     88.968
16           .451    1.805     90.773
17           .370    1.479     92.252
18           .354    1.414     93.667
19           .322    1.287     94.954
20           .307    1.228     96.182
21           .258    1.034     97.216
22           .232     .929     98.145
23           .180     .722     98.866
24           .166     .666     99.532
25           .117     .468    100.000

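Component Matrix (a)
Components: 1 to 3
Variables (rows, sorted by size of loading): S18, S17, S16, S11, S14, S7, S10, S13, S5, S6, S23, S3, S19, S9, S15, S12, S20, S4, S2, S24, S1, S22, S21, S8, S25
Loadings as listed (absolute values below 0.33 suppressed): .772, .769, .764, .751, .680, .676, .606, .576, .493, .418, .404, .388, .361, .355, .431, .442, .355, .363, .337, .336, .734, .607, .584, .547, .448, .345, .844, .809, .623, .476
a. Three components extracted.
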
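Rotated Component Matrix (a)
Components: 1 to 3
Variables (rows, sorted by size of loading): S11, S16, S18, S17, S14, S13, S24, S23, S19, S10, S12, S7, S20, S4, S2, S5, S6, S3, S15, S1, S22, S21, S8, S9, S25
Loadings as listed (absolute values below 0.33 suppressed): .796, .790, .783, .750, .721, .567, .525, .486, .463, .439, .478, .350, .370, .685, .643, .607, .561, .555, .502, .491, .421, .356, .421, .850, .827, .629, .487, .392
a. Rotation converged in four iterations.

12.5.9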
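The initial eigenvalues presented in the Total Variance Explained section (see Table 12.5)
are identical to those obtained in the first run (Table 12.4). This is not surprising as the same extraction method (principal components analysis) was
applied to the same 25 items. Thus, the same seven components have eigenvalues greater than 1,
accounting for a combined 66.13% of the total variance. Because the extraction of only three
factors was stipulated, however, only the first three were extracted and rotated; together they
account for 43.37% of the total variance.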
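
As a quick arithmetic check on Table 12.5, an orthogonal rotation such as varimax redistributes variance among the retained factors without changing the total they explain. The Python sketch below uses the rounded totals printed in the table, so the two sums agree only to within rounding; it again assumes that total variance equals the number of items (25) because the correlation matrix was analyzed.

```python
# Totals printed in Table 12.5 for the three retained factors
extraction_totals = [5.315, 2.891, 2.636]  # Extraction Sums of Squared Loadings
rotation_totals = [4.953, 3.163, 2.725]    # Rotation Sums of Squared Loadings
n_items = 25                               # total variance for a correlation matrix

extraction_sum = sum(extraction_totals)
rotation_sum = sum(rotation_totals)

# Varimax shifts variance between factors but preserves the overall total
print(f"{extraction_sum:.3f} vs {rotation_sum:.3f}")

# Percentage of total variance explained by the three-factor solution
pct_explained = 100 * extraction_sum / n_items
print(f"{pct_explained:.2f}")
```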
The Rotated Component Matrix presents only three rotated factors as
stipulated in both the SPSS windows and syntax file methods. The rotated
factor structure shows a number of cross-loaded items (s10, s7, s5, and s8)
that were deleted prior to interpretation. Deletion of cross-loaded items
serves to clarify the factors and makes their interpretation easier. Factor 1
contains nine items that clearly reflect the social motive for smoking, and
was thus labeled SOCIAL. Factor 2 contains six items that reflect addiction
and habit as motives for smoking, and was labeled ADDICTION/HABIT.
Factor 3 contains four items that reflect the pleasure gained from smoking,
and was labeled PLEASURE. This three-factor model represents the combination of the seven original factors, and appears to reflect adequately the
underlying factor structure of the 25-item smoking inventory.
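
The screening for cross-loaded items described above can be sketched in a few lines of Python. The loadings below are hypothetical illustration values, NOT the loadings from Table 12.5, and the item names are placeholders; the 0.33 cutoff is the one used in the Options window. An item is treated as cross-loaded when its absolute loading reaches the cutoff on more than one factor.

```python
CUTOFF = 0.33  # same threshold as "Suppress absolute values less than"

# Hypothetical rotated loadings on three factors (illustration only,
# not taken from the smoking inventory output)
loadings = {
    "item_a": (0.80, 0.10, 0.05),
    "item_b": (0.44, 0.12, 0.42),
    "item_c": (0.20, 0.69, 0.08),
    "item_d": (0.48, 0.64, 0.02),
    "item_e": (0.05, 0.15, 0.85),
}


def salient_factors(row, cutoff=CUTOFF):
    """Return the 1-based factor numbers on which an item loads saliently."""
    return [i + 1 for i, value in enumerate(row) if abs(value) >= cutoff]


# Items loading on more than one factor are candidates for deletion
cross_loaded = [
    name for name, row in loadings.items() if len(salient_factors(row)) > 1
]

# Remaining items are assigned to the single factor they load on
retained = {
    name: salient_factors(row)[0]
    for name, row in loadings.items()
    if len(salient_factors(row)) == 1
}

print(cross_loaded)  # ['item_b', 'item_d']
print(retained)      # {'item_a': 1, 'item_c': 2, 'item_e': 3}
```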

2006 by Taylor & Francis Group, LLC

Univariate and Multivariate Data Analysis and Interpretation with SPSS
