IBM SPSS Statistics Base
12
Factor Analysis
12.1 Aim
The major aim of factor analysis is the orderly simplification of a large number
of intercorrelated measures to a few representative constructs or factors. Suppose that a researcher wants to identify the major dimensions underlying a
number of personality tests. He begins by administering the personality tests
to a large sample of people (N = 1000), with each test supposedly measuring
a specific facet of a person's personality (e.g., ethnocentrism, authoritarianism, and locus of control). Assume that there are 30 such tests, each consisting
of ten test items. What the researcher will end up with is a mass of numbers
that will say very little about the dimensions underlying these personality
tests. On average, some of the scores will be high, some will be low, and some
intermediate, but interpretation of these scores will be extremely difficult if
not impossible. This is where factor analysis comes in. It allows the researcher
to reduce this mass of numbers to a few representative factors, which can
then be used for subsequent analysis.
Factor analysis is based on the assumption that all variables are correlated
to some degree. Therefore, those variables that share similar underlying
dimensions should be highly correlated, and those variables that measure
dissimilar dimensions should yield low correlations. Using the earlier example, if the researcher intercorrelates the scores obtained from the 30 personality tests, then those tests that measure the same underlying personality
dimension should yield high correlation coefficients, whereas those tests that
measure different personality dimensions should yield low correlation coefficients. These high/low correlation coefficients will become apparent in the
correlation matrix because they form clusters indicating which variables
hang together. For example, measures of ethnocentrism, authoritarianism,
and aggression may be highly intercorrelated, indicating that they form an
identifiable personality dimension. The primary function of factor analysis
is to identify these clusters of high intercorrelations as independent factors.
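The clustering idea can be illustrated with a minimal sketch. The three score lists below are hypothetical (they are not data from this chapter), and the helper name pearson_r is an assumption made for illustration. The sketch only shows that two tests tapping the same dimension correlate highly, whereas a test tapping a different dimension does not.

```python
import math

# Hypothetical scores of six people on three tests (illustration only).
ethnocentrism = [1, 2, 3, 4, 5, 6]
authoritarianism = [2, 3, 5, 4, 6, 7]
locus_of_control = [3, 1, 4, 1, 5, 2]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

r_same = pearson_r(ethnocentrism, authoritarianism)   # tests sharing a dimension
r_diff = pearson_r(ethnocentrism, locus_of_control)   # tests on different dimensions
print(f"same dimension: r = {r_same:.2f}")
print(f"different dimensions: r = {r_diff:.2f}")
```

With these made-up scores the first coefficient is about 0.94 and the second about 0.13, which is the kind of high/low contrast that forms clusters in a correlation matrix.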
There are three basic steps to factor analysis:
12.1.1
12.1.2
12.1.2.2.1 Eigenvalues
Only factors with eigenvalues of 1 or greater are considered to be significant;
all factors with eigenvalues less than 1 are disregarded. An eigenvalue is the
amount of variance in the full set of variables that is accounted for by an
extracted factor (the sum of the squared loadings on that factor). Because each
standardized variable contributes one unit of variance, the rationale for using
the eigenvalue criterion is that an extracted factor should account for at
least as much variance as a single variable if that factor is to be retained
for interpretation. An eigenvalue greater than 1 indicates that the factor
explains more variance than any single variable contributes.
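The eigenvalue criterion can be sketched in a few lines. The eigenvalues below are the nine initial eigenvalues reported later in Table 12.1; the variable names are chosen here for illustration only.

```python
# Initial eigenvalues of the nine components reported in Table 12.1.
eigenvalues = [2.663, 1.977, 1.176, 0.874, 0.616, 0.569, 0.509, 0.345, 0.272]

# Retain only the components whose eigenvalue is 1 or greater.
retained = [(number, value)
            for number, value in enumerate(eigenvalues, start=1)
            if value >= 1.0]

for number, value in retained:
    print(f"component {number}: eigenvalue {value:.3f} retained")
```

For these values the criterion retains the first three components, which matches the three factors shown in the extraction columns of Table 12.1.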
12.1.2.2.2 Scree Test
This test is used to identify the optimum number of factors that can be
extracted before the amount of unique variance begins to dominate the
common variance structure (Hair, Anderson, Tatham, & Black, 1995). The
scree test is derived by plotting the eigenvalues (on the Y axis) against the
number of factors in their order of extraction (on the X axis). The initial
factors extracted are large factors (with high eigenvalues), followed by
smaller factors. Graphically, the plot will show a steep slope between the
large factors and the gradual trailing off of the rest of the factors. The point
at which the curve first begins to straighten out is considered to indicate the
maximum number of factors to extract. That is, those factors above this point
of inflection are deemed meaningful, and those below are not. As a general
rule, the scree test results in at least one and sometimes two or three more
factors being considered significant than does the eigenvalue criterion (Cattell, 1966).
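As a rough sketch of what the scree plot conveys, the following prints a crude text version of the plot and the drop in eigenvalue from each component to the next. It uses the nine eigenvalues reported in Table 12.1. Judging where the curve straightens out remains a visual decision, so the code does not attempt to pick the cut-off automatically.

```python
# Initial eigenvalues of the nine components reported in Table 12.1.
eigenvalues = [2.663, 1.977, 1.176, 0.874, 0.616, 0.569, 0.509, 0.345, 0.272]

# Drop in eigenvalue from each component to the next (the slope of the plot).
drops = [round(a - b, 3) for a, b in zip(eigenvalues, eigenvalues[1:])]

# Crude text rendering of the scree plot: one '#' per 0.1 of eigenvalue.
for number, value in enumerate(eigenvalues, start=1):
    print(f"{number}: {'#' * round(value * 10)} {value:.3f}")

print("successive drops:", drops)
```

Large drops correspond to the steep part of the plot, and a run of small drops corresponds to the gradual trailing off described above.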
12.1.3
Factors produced in the initial extraction phase are often difficult to interpret.
This is because the procedure in this phase ignores the possibility that variables identified to load on or represent factors may already have high loadings (correlations) with previous factors extracted. This may result in
significant cross-loadings in which many factors are correlated with many
variables. This makes interpretation of each factor difficult, because different
factors are represented by the same variables. The rotation phase serves to
sharpen the factors by identifying those variables that load on one factor
and not on another. The ultimate effect of the rotation phase is to achieve a
simpler, theoretically more meaningful factor pattern.
12.1.4 Rotation Methods
There are two main classes of factor rotation method: Orthogonal and
Oblique. Orthogonal rotation assumes that the factors are independent, and
the rotation process maintains the reference axes of the factors at 90°. Oblique
rotation allows for correlated factors instead of maintaining independence
between the rotated factors. The oblique rotation process does not require
that the reference axes be maintained at 90°. Of the two rotation methods,
oblique rotation is more flexible because the factor axes need not be orthogonal. Moreover, at the theoretical level, it is more realistic to assume that
influences in nature are correlated. By allowing for correlated factors, oblique
rotation often represents the clustering of variables more accurately.
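The effect of an orthogonal rotation can be sketched for the two-factor case. The loadings below are hypothetical, and the function names are assumptions made for illustration. The sketch turns both reference axes through the same angle, so they stay at right angles, and shows that each variable's communality (the sum of its squared loadings) is unchanged even though the individual loadings change.

```python
import math

# Hypothetical unrotated loadings of four variables on two factors.
loadings = [(0.70, 0.50), (0.65, 0.55), (0.60, -0.45), (0.55, -0.50)]

def rotate(rows, degrees):
    """Rotate two-factor loadings orthogonally; the axes remain 90 degrees apart."""
    angle = math.radians(degrees)
    c, s = math.cos(angle), math.sin(angle)
    return [(a * c + b * s, -a * s + b * c) for a, b in rows]

def communalities(rows):
    """Sum of squared loadings for each variable."""
    return [a * a + b * b for a, b in rows]

rotated = rotate(loadings, 40)
for before, after in zip(loadings, rotated):
    print(f"{before} -> ({after[0]:.3f}, {after[1]:.3f})")
print([round(h, 3) for h in communalities(loadings)])
print([round(h, 3) for h in communalities(rotated)])
```

An oblique rotation would let the two axes move independently of each other, which is why it can accommodate correlated factors.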
There are three major methods of orthogonal rotation: varimax, quartimax,
and equimax. Of the three approaches, varimax has achieved the most
widespread use as it seems to give the clearest separation of factors. It does
this by producing the maximum possible simplification of the columns (factors) within the factor matrix. In contrast, both quartimax and equimax
approaches have not proven very successful in producing simpler structures,
and have not gained widespread acceptance. Whereas SPSS provides these three
choices for orthogonal rotation, its oblique options are fewer: direct oblimin
and promax.
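The column simplification that varimax aims for can be sketched for two factors. This is not the algorithm SPSS uses internally; it is a simple grid search, on hypothetical loadings, over rotation angles that maximizes the raw varimax criterion (the variance of the squared loadings within each column, summed over columns). All names are assumptions made for illustration.

```python
import math

# Hypothetical unrotated loadings of four variables on two factors.
loadings = [(0.70, 0.50), (0.65, 0.55), (0.60, -0.45), (0.55, -0.50)]

def rotate(rows, degrees):
    """Orthogonal rotation of two-factor loadings by the given angle."""
    angle = math.radians(degrees)
    c, s = math.cos(angle), math.sin(angle)
    return [(a * c + b * s, -a * s + b * c) for a, b in rows]

def varimax_criterion(rows):
    """Sum over factors of the variance of the squared loadings in that column."""
    n_variables = len(rows)
    total = 0.0
    for column in zip(*rows):
        squares = [x * x for x in column]
        mean = sum(squares) / n_variables
        total += sum((sq - mean) ** 2 for sq in squares) / n_variables
    return total

# Try every whole-degree angle from 0 to 89 and keep the best one.
best_angle = max(range(90), key=lambda d: varimax_criterion(rotate(loadings, d)))
best = rotate(loadings, best_angle)
print("best angle:", best_angle)
for a, b in best:
    print(f"{a:6.3f} {b:6.3f}")
```

A higher criterion means that each column contains a mix of large and near-zero loadings, which is what makes a factor easier to name.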
12.1.5
12.1.6
It should be noted that when factor analysis is used for research (either for
the purpose of data reduction or to identify theoretically meaningful dimensions), a minimum of two runs will normally be required. In the first run,
the researcher allows factor analysis to extract factors for rotation. All factors
with eigenvalues of 1 or greater will be subjected to varimax rotation by
default within SPSS. However, even after rotation, not all extracted rotated
factors will be meaningful. For example, some small factors may be represented by very few items, and there may still be significant cross-loading of
items across several factors. At this stage, the researcher must decide which
factors are substantively meaningful (either theoretically or intuitively), and
retain only these for further rotation. It is not uncommon for a data set to
be subjected to a series of factor analyses and rotations before the obtained
factors can be considered clean and interpretable.
12.1.7 Interpreting Factors
In interpreting factors, the size of the factor loadings (correlation coefficients between the variables and the factors they represent) will help in
the interpretation. As a general rule, variables with large loadings indicate
that they are representative of the factor, while small loadings suggest that
they are not. In deciding what is large or small, a rule of thumb suggests
that factor loadings greater than 0.33 are considered to meet the minimal
level of practical significance. The reason for using the 0.33 criterion is
that if the value is squared, the squared value represents the amount of
the variable's total variance accounted for by the factor. Therefore, a factor
loading of 0.33 denotes that approximately 10% of the variable's total
variance is accounted for by the factor. The grouping of variables with
high factor loadings should suggest what the underlying dimension is for
that factor.
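The arithmetic behind the 0.33 rule of thumb is simply the square of the loading. The short sketch below applies it to the 0.33 threshold and to two larger, arbitrarily chosen loadings; the function name is an assumption made for illustration.

```python
def variance_explained(loading):
    """Proportion of a variable's variance accounted for by a factor."""
    return loading ** 2

for loading in (0.33, 0.50, 0.70):
    share = variance_explained(loading)
    print(f"loading {loading:.2f}: {share:.1%} of the variable's variance")
```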
12.3 Assumptions
The assumptions underlying factor analysis can be classified as statistical
and conceptual.
12.3.1 Statistical Assumptions
12.3.2 Conceptual Assumptions
was rated on an 8-point scale, with high scores indicating strong support for
that particular defense strategy. A total of 400 subjects provided responses
to these nine statements. Factor analysis (with principal components extraction) was employed to investigate whether these nine defense statements
represent identifiable factors, i.e., defense strategies. The nine statements
(together with their SPSS variable names) written to reflect the three defense
strategies are listed in the following.
1. Provocation Defense
PROVO: In killing her husband, the defendant's action reflects
a sudden and temporary loss of self-control as a result of the
provocative conduct of the deceased.
CAUSED: The nature of the provocative conduct by the deceased
was such that it could have caused an ordinary person with
normal powers of self-control to do what the defendant did.
PASSION: In killing her husband, the defendant acted in the
heat of passion as a response to the deceased's sudden provocation
on that fateful day.
2. Self-Defense Defense
PROTECT: In killing her husband, the defendant was justified
in using whatever force (including lethal force) to protect herself.
SAVE: The defendant's lethal action was justified in that she
acted to save herself from grievous bodily harm.
DEFEND: In killing her husband, the defendant used such force
as was necessary to defend herself.
3. Insanity Defense
MENTAL: The action of the defendant is the action of a mentally
impaired person.
INSANE: In killing her husband, the defendant was either irrational or insane.
STABLE: The action of the accused is not typical of the action of
a mentally stable person.
12.4.1
Note: The survey questionnaire employed in this study was designed as part
of a larger study. It contains additional variables apart from the nine defense
strategy variables. The data set is named DOMES.SAV.
Variables   Column(s)   Code
Sex         1           1 = male, 2 = female
Age         2           In years
Educ        3           1 = primary to 6 = tertiary
Income      4           1 = < $10,000 per year, 5 = $40,000 per year
Provo       5           1 = strongly disagree, 8 = strongly agree
Protect     6           1 = strongly disagree, 8 = strongly agree
Mental      7           1 = strongly disagree, 8 = strongly agree
Caused      8           1 = strongly disagree, 8 = strongly agree
Save        9           1 = strongly disagree, 8 = strongly agree
Insane      10          1 = strongly disagree, 8 = strongly agree
Passion     11          1 = strongly disagree, 8 = strongly agree
Defend      12          1 = strongly disagree, 8 = strongly agree
Stable      13          1 = strongly disagree, 8 = strongly agree
Real        14          1 = totally unbelievable syndrome, 8 = totally believable syndrome
Support     15          1 = no support at all, 8 = a great deal of support
Apply       16          1 = does not apply to the defendant at all, 8 = totally applies to the defendant
Respon      17          1 = not at all responsible, 8 = entirely responsible
Verdict     18          1 = not guilty, 2 = guilty of manslaughter, 3 = guilty of murder
Sentence    19          1 = 0 yr, 6 = life imprisonment
12.4.2 Windows Method
1. From the menu bar, click Analyze, then Data Reduction, and then
Factor. The following Factor Analysis window will open.
3. To (1) obtain a correlation matrix for the nine variables, and (2) to
test that the correlation matrix has sufficient correlations to justify
the application of factor analysis, click
. The following
Factor Analysis: Descriptives window will open. Check the Coefficients and KMO and Bartlett's test of sphericity fields, and then
click
to complete
12.4.3
12.4.4 SPSS Output
TABLE 12.1
Factor Analysis Output
Correlation Matrix

Correlation  PROVO  PROTECT  MENTAL  CAUSED   SAVE  INSANE  PASSION  DEFEND  STABLE
PROVO        1.000     .183    .144    .277   .133    .214     .288    .086    .099
PROTECT       .183    1.000    .066    .356   .662    .177     .023    .554    .014
MENTAL        .144     .066   1.000    .107   .031    .474     .035    .011    .446
CAUSED        .277     .356    .107   1.000   .446    .044     .095    .232    .173
SAVE          .133     .662    .031    .446  1.000    .211     .119    .655    .119
INSANE        .214     .177    .474    .044   .211   1.000     .095    .111    .407
PASSION       .288     .023    .035    .095   .119    .095    1.000    .134    .060
DEFEND        .086     .554    .011    .232   .655    .111     .134   1.000    .067
STABLE        .099     .014    .446    .173   .119    .407     .060    .067   1.000

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy       .690
Bartlett's Test of Sphericity   Approx. Chi-Square    930.250
                                df                    36
                                Sig.                  .000

Communalities

           Initial   Extraction
PROVO      1.000     .651
PROTECT    1.000     .744
MENTAL     1.000     .695
CAUSED     1.000     .491
SAVE       1.000     .808
INSANE     1.000     .635
PASSION    1.000     .509
DEFEND     1.000     .661
STABLE     1.000     .622
Total Variance Explained

            Initial Eigenvalues                    Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %   % of Variance   Cumulative %
1           2.663   29.591          29.591         2.663   29.591          29.591         27.244          27.244
2           1.977   21.963          51.554         1.977   21.963          51.554         21.482          48.726
3           1.176   13.069          64.623         1.176   13.069          64.623         15.897          64.623
4            .874    9.708          74.330
5            .616    6.839          81.169
6            .569    6.322          87.491
7            .509    5.651          93.142
8            .345    3.835          96.977
9            .272    3.023         100.000
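Two checks on the figures in Table 12.1 can be sketched as follows. With principal components extraction, the extraction communalities should sum to the total of the retained eigenvalues, and dividing that total by the number of variables gives the cumulative percentage of variance explained. The numbers are copied from the table; the variable names are chosen here for illustration.

```python
# Extraction communalities from Table 12.1, in the order PROVO ... STABLE.
communalities = [0.651, 0.744, 0.695, 0.491, 0.808, 0.635, 0.509, 0.661, 0.622]

# Eigenvalues of the three retained components from Table 12.1.
retained_eigenvalues = [2.663, 1.977, 1.176]

n_variables = len(communalities)
total_communality = sum(communalities)
total_eigenvalue = sum(retained_eigenvalues)
percent_explained = 100.0 * total_eigenvalue / n_variables

print(f"sum of communalities:        {total_communality:.3f}")
print(f"sum of retained eigenvalues: {total_eigenvalue:.3f}")
print(f"variance explained:          {percent_explained:.2f}%")
```

Both sums come to about 5.816, and the percentage agrees, within rounding, with the cumulative figure of 64.623 reported for the three components.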
FIGURE 12.1
Scree plot: eigenvalue (Y axis) against component number (X axis).
Component Matrix
Component
1
2
SAVE
PROTECT
DEFEND
CAUSED
MENTAL
INSANE
STABLE
PASSION
PROVO
.878
.795
.758
.613
.333
.331
.748
.723
.680
.513
.606
.568
.883
.861
.813
.830
.787
.726
.779
.713
.483
.455
.919
.153
.362
.288
.890
.354
3
.268
.429
.862
12.4.5
Pattern Matrix
1
SAVE
PROTECT
DEFEND
MENTAL
STABLE
INSANE
PROVO
PASSION
CAUSED
Component
2
.878
.876
.825
.844
.794
.704
.776
.724
.459
.394
Component Correlation Matrix

Component   1           2           3
1           1.000       .140        .177
2           .140        1.000       6.465E-02
3           .177        6.465E-02   1.000
TABLE 12.3
Reasons for Smoking
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
12.5.1
The data set has been saved under the name: SMOKE.SAV.
12.5.2
Variables    Columns   Code
s1 to s25    1-25      1 = strongly agree, 5 = strongly disagree
1. From the menu bar, click Analyze, then Data Reduction, and then
Factor. The following Factor Analysis window will open.
12.5.3
to com-
12.5.4 SPSS Output
TABLE 12.4
Factor Analysis Output
KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy       .687
Bartlett's Test of Sphericity   Approx. Chi-Square    876.290
                                df                    300
                                Sig.                  .000
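The degrees of freedom reported for Bartlett's test of sphericity equal the number of distinct correlations among the variables, p(p - 1)/2. The short sketch below checks this for the two examples in this chapter; the function name is an assumption made for illustration.

```python
def bartlett_df(n_variables):
    """Number of distinct off-diagonal correlations among n_variables variables."""
    return n_variables * (n_variables - 1) // 2

print("9 defense statements:", bartlett_df(9))
print("25 smoking items:", bartlett_df(25))
```

The results, 36 and 300, match the df values shown in the two KMO and Bartlett's Test tables.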
Communalities

       Initial   Extraction
S1     1.000     .747
S2     1.000     .636
S3     1.000     .399
S4     1.000     .535
S5     1.000     .751
S6     1.000     .794
S7     1.000     .698
S8     1.000     .687
S9     1.000     .658
S10    1.000     .723
S11    1.000     .722
S12    1.000     .727
S13    1.000     .610
S14    1.000     .703
S15    1.000     .593
S16    1.000     .795
S17    1.000     .698
S18    1.000     .740
S19    1.000     .648
S20    1.000     .695
S21    1.000     .578
S22    1.000     .722
S23    1.000     .558
S24    1.000     .716
S25    1.000     .403
Total Variance Explained

            Initial Eigenvalues                    Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           5.315   21.258          21.258         5.315   21.258          21.258         4.199   16.797          16.797
2           2.891   11.565          32.823         2.891   11.565          32.823         2.550   10.201          26.998
3           2.636   10.544          43.367         2.636   10.544          43.367         2.169    8.675          35.672
4           1.666    6.662          50.029         1.666    6.662          50.029         2.165    8.660          44.332
5           1.512    6.049          56.078         1.512    6.049          56.078         2.022    8.089          52.421
6           1.324    5.296          61.375         1.324    5.296          61.375         1.979    7.916          60.338
7           1.190    4.760          66.134         1.190    4.760          66.134         1.449    5.797          66.134
8            .937    3.748          69.882
9            .897    3.590          73.471
10           .806    3.224          76.696
11           .732    2.930          79.625
12           .689    2.755          82.381
13           .571    2.286          84.666
14           .561    2.244          86.910
15           .514    2.057          88.968
16           .451    1.805          90.773
17           .370    1.479          92.252
18           .354    1.414          93.667
19           .322    1.287          94.954
20           .307    1.228          96.182
21           .258    1.034          97.216
22           .232     .929          98.145
23           .180     .722          98.866
24           .166     .666          99.532
25           .117     .468         100.000
FIGURE 12.2
Scree plot: eigenvalue (Y axis) against component number (X axis, components 1 to 25).
1
S18
S17
S16
S11
S14
S7
S10
S13
S5
S3
S12
S20
S4
S2
S24
S1
S22
S21
S23
S6
S19
S9
S8
S25
S15
.772
.769
.764
.751
.680
.676
.606
.576
.493
.388
.431
Component Matrix
Component
3
4
5
.442
.355
.337
.734
.607
.584
.547
.448
.339
.347
.357
.409
.371
.357
.501
.391
.459
.377
.357
.844
.809
.623
.404
.418
.361
.355
.520
.507
.363
.345
.336
.476
.360
.552
.440
.511
.350
.367
.395
.384
Rotated Component Matrix
Component
2
3
4
5
1
S14
S11
S18
S17
S16
S10
S19
S1
S22
S21
S3
S24
S13
S23
S20
S12
S4
S2
S9
S7
S5
S6
S15
S25
S8
.802
.780
.770
.748
.744
.592
.485
.401
.441
.477
.390
.842
.830
.691
.357
.791
.680
.643
.809
.788
.594
.347
.737
.734
.502
.460
.383
.357
.812
.808
.724
.536
.493
.346
Component
1
2
3
4
5
6
7
.830
.173
.144
.371
.188
.250
.158
.099
.055
.945
.112
.199
.206
.017
.379
.316
.038
.581
.142
.603
.185
.066
.727
.032
.182
.346
.078
.554
.218
.427
.255
.383
.683
.297
.063
.303
.343
.139
.552
.190
.427
.498
.116
.195
.006
.167
.532
.506
.618
12.5.5
12.5.6
1. From the menu bar, click Analyze, then Data Reduction, and then
Factor. The following Factor Analysis window will open.
3. Click
to open the Factor Analysis: Extraction window.
In the Method drop-down list, choose Principal components as the
extraction method. To extract only three factors from the correlation
matrix, check the Number of factors field, and then type 3 in the
field next to it. This procedure will override the default extraction
of all factors with eigenvalues greater than 1. Click
to com-
12.5.7
12.5.8 SPSS Output
TABLE 12.5
Three-Factor Structure Output
Total Variance Explained

            Initial Eigenvalues                    Extraction Sums of Squared Loadings    Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           5.315   21.258          21.258         5.315   21.258          21.258         4.953   19.814          19.814
2           2.891   11.565          32.823         2.891   11.565          32.823         3.163   12.651          32.465
3           2.636   10.544          43.367         2.636   10.544          43.367         2.725   10.902          43.367
4           1.666    6.662          50.029
5           1.512    6.049          56.078
6           1.324    5.296          61.375
7           1.190    4.760          66.134
8            .937    3.748          69.882
9            .897    3.590          73.471
10           .806    3.224          76.696
11           .732    2.930          79.625
12           .689    2.755          82.381
13           .571    2.286          84.666
14           .561    2.244          86.910
15           .514    2.057          88.968
16           .451    1.805          90.773
17           .370    1.479          92.252
18           .354    1.414          93.667
19           .322    1.287          94.954
20           .307    1.228          96.182
21           .258    1.034          97.216
22           .232     .929          98.145
23           .180     .722          98.866
24           .166     .666          99.532
25           .117     .468         100.000
Component Matrix
Component
1
2
S18
S17
S16
S11
S14
S7
S10
S13
S5
S6
S23
S3
S19
S9
S15
S12
S20
S4
S2
S24
S1
S22
S21
S8
S25
.772
.769
.764
.751
.680
.676
.606
.576
.493
.418
.404
.388
.361
.355
.431
.442
.355
.363
.337
.336
.734
.607
.584
.547
.448
.345
.844
.809
.623
.476
.796
.790
.783
.750
.721
.567
.525
.486
.463
.439
.478
.350
.370
.685
.643
.607
.561
.555
.502
.491
.421
.356
.421
.850
.827
.629
.487
.392
12.5.9
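The initial eigenvalues presented in the Total Variance Explained section (see Table 12.5)
are identical to those obtained in the first run (Table 12.4). This is not surprising as the same extraction method (principal components analysis) was
applied to the same 25 items. Thus, the same seven components have eigenvalues greater than 1,
accounting for a combined 66.13% of the total variance. Because the extraction
was restricted to three factors, however, only the first three were retained,
and together they account for 43.37% of the total variance.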
The Rotated Component Matrix presents only three rotated factors as
stipulated in both the SPSS windows and syntax file methods. The rotated
factor structure shows a number of cross-loaded items (s10, s7, s5, and s8)
that were deleted prior to interpretation. Deletion of cross-loaded items
serves to clarify the factors and makes their interpretation easier. Factor 1
contains nine items that clearly reflect the social motive for smoking, and
was thus labeled SOCIAL. Factor 2 contains six items that reflect addiction
and habit as motives for smoking, and was labeled ADDICTION/HABIT.
Factor 3 contains four items that reflect the pleasure gained from smoking,
and was labeled PLEASURE. This three-factor model represents the combination of the seven original factors, and appears to reflect adequately the
underlying factor structure of the 25-item smoking inventory.
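The trade-off between the seven-factor and three-factor solutions can be sketched from the eigenvalues in Tables 12.4 and 12.5. Each eigenvalue is divided by the number of items (25) to give a percentage of total variance; the function and variable names are assumptions made for illustration.

```python
N_ITEMS = 25

# Eigenvalues of the seven components with eigenvalues greater than 1
# (Tables 12.4 and 12.5).
eigenvalues = [5.315, 2.891, 2.636, 1.666, 1.512, 1.324, 1.190]

def cumulative_percent(values, n_factors, n_variables):
    """Percentage of total variance explained by the first n_factors factors."""
    return 100.0 * sum(values[:n_factors]) / n_variables

three_factor = cumulative_percent(eigenvalues, 3, N_ITEMS)
seven_factor = cumulative_percent(eigenvalues, 7, N_ITEMS)
print(f"three factors: {three_factor:.2f}% of total variance")
print(f"seven factors: {seven_factor:.2f}% of total variance")
```

The small differences from the tabled cumulative percentages arise because the eigenvalues are printed to three decimals. The three-factor model explains less variance than the seven-factor solution, which is the price paid for a simpler and more interpretable structure.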