SPSS Lecture Note 2022
Email: [email protected]
Introduction to SPSS
❖ It is used to:
❖ The easiest way to start SPSS is from the Start button located at the bottom
of the Windows desktop.
❖ You may select Run the tutorial to have a tour of SPSS's most basic
features.
❖ If you select Type in data, the SPSS Data Editor will be opened.
SPSS windows
❖ A function is executed by placing the pointer over the
corresponding button and clicking it
...Data Editor Toolbar
C. Status Bar
❖ When the statement SPSS Processor is ready appears in the Status Bar,
SPSS is ready to receive your instructions.
Data Editor Window…Cont’d
❖ The cell is the intersection between row and column in the grid
❖ A cell will therefore contain the score of a particular subject (or case) on
one particular variable
…the Data View
❖ Used for;
❖ In this window the rows are variables and columns are variable attributes
❖ Used to define the type of information that is entered into each column
in data view
❖ Changes such as adding, deleting, and modifying variable attributes can be made
in this window
…the variable view
2. The viewer window
o The right pane contains statistical tables, charts, and text output.
❖ You can edit the output in this window and save it for later use.
❖ This window opens automatically the first time you run a procedure
that generates output
…the viewer window
3. The Chart Editor Window
❖ This window is used to edit charts and plots.
❖ You can use the window to change the colors, select different type
fonts or sizes, rotate axes, change the chart type, and the like.
❖ Most SPSS commands are accessible from the SPSS menus and dialog
boxes.
❖ However, some commands and options are available only by using the
SPSS command language.
❖ You will also use this window if you wish to run SPSS commands instead
of clicking on the pull-down menus.
o There are three steps that must be followed to create a new data set
in SPSS
❖ Whenever you are working with data, it is important to make sure the
variables in the data are defined.
❖ When you change the name of a variable, it does not change the data
❖ Note that this changes how the numbers are displayed, but does not
change the values in the dataset.
❖ To specify the number of decimal places for a numeric variable, click the cell
corresponding to the “Decimals” column for that variable.
❖ Then click the “up” or “down” arrow icons to increase or decrease the
number of decimal places
Label
❖ A brief but descriptive definition or display name for the variable.
❖ When defined, a variable's label will appear in the output in place of its
name.
❖ Used for coded categorical variables; they indicate which code represents
which category
❖ Value labels are useful primarily for categorical (i.e., nominal or ordinal)
variables, especially if they have been recorded as codes (e.g., 1, 2, 3).
❖ Note that defining value labels only affects the labels associated with each
value, and does not change the recorded values themselves
…values
❖ When value labels are defined, the labels will display in the output
instead of the original codes.
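The idea that value labels change only what is displayed, not what is stored, can be sketched outside SPSS. The snippet below is a minimal illustration in Python, not SPSS syntax; the variable `class_rank` and every label other than "Freshman" are assumptions made up for the example.

```python
# Hypothetical coded variable: stored values are numeric codes.
class_rank = [1, 3, 2, 1, 4]

# Value labels map each code to a label.
# "Freshman" appears in the notes; the other labels are assumed.
value_labels = {1: "Freshman", 2: "Sophomore", 3: "Junior", 4: "Senior"}

# Labels are applied only for display; the stored codes stay untouched.
displayed = [value_labels.get(code, code) for code in class_rank]

print(displayed)   # what the output would show
print(class_rank)  # what the data set still contains
```

A code without a label falls back to the code itself, which mirrors how an unlabeled value is shown as its raw number.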
…values
❖ Type the first possible value (1) for your variable in the Value field.
❖ In the Label field type the label exactly as you want it to display (e.g.,
"Freshman").
❖ Click Add when you are finished defining the value and label.
❖ Your variable value and label will appear in the center box.
❖ Repeat these steps for each possible value for your variable.
❖ Make changes to the selected value or label as needed, and then click
Change
❖ Note that this does not affect or eliminate SPSS's default missing value
code (".").
❖ This column merely allows the user to specify alternative codes for
missing values
❖ To set user defined missing value codes, click inside the cell
corresponding to the “Missing” column for that variable.
❖ Click the option that best matches how you wish to define
missing data and enter any associated values, then click OK at the
bottom of the window.
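The effect of a user-defined missing value code can be illustrated with a short Python sketch. It is not SPSS syntax: the code `99`, the variable `ages`, and the helper `valid_values` are assumptions for illustration, and `None` stands in for the default missing value (".").

```python
from statistics import mean

# Hypothetical data: None plays the role of the default missing value ".",
# and 99 is an assumed user-defined missing code (e.g., "no response").
ages = [25, 31, 99, 42, None, 28]
USER_MISSING = {99}

def valid_values(values, user_missing):
    """Keep only values that are neither default-missing nor user-missing."""
    return [v for v in values if v is not None and v not in user_missing]

valid = valid_values(ages, USER_MISSING)
print(len(valid), mean(valid))  # statistics use valid cases only
```

Without the user-missing declaration, the code 99 would be treated as a real age and would inflate the mean.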
Columns
❖ This simply refers to the width of the actual column in the spreadsheet
❖ Then click the “up” or “down” arrow icons to increase or decrease the
column width
Align
❖ The alignment of content in the cells of the SPSS Data View spreadsheet.
❖ To set the alignment for a variable, click inside the cell corresponding to
the "Align" column for that variable.
❖ Then use the dropdown menu to select your preferred alignment: Left,
Right, or Center
Measure
❖ The level of measurement for the variable (e.g., nominal, ordinal, or
scale)
❖ Then click the dropdown arrow to select the level of measurement for
that variable: Scale, Ordinal, or Nominal.
…Measure
❖ The role that a variable will play in your analyses (i.e., independent
variable, dependent variable, both independent and dependent).
❖ Any variable that meets the role requirements will be available for use in
such analyses
…Role
❖ Partition: The variable will partition the data into separate samples
❖ Split: Used with the IBM® SPSS® Modeler (not IBM® SPSS®
Statistics)
…Role
❖ To define a variable's role in your analysis, click inside the cell
corresponding to the “Role” column for that variable.
❖ Then use the dropdown menu to select the role that variable will take:
Input, Target, Both, None, Partition, or Split.
Exercise: Define Variables Below
ID No. Facility Name
………………………………………… ……………………………………………………
Town Date of interview
……………………………….. ……………………………...…….
Interviewer Name and signature Supervisor Name and signature
………………………………………… …………………………………….
Part - I: Background information (before counseling )
SN  Question                               Response
01  Respondent’s age (in completed years)  _______________________
02  Ethnicity                              1. Oromo  2. Amhara  3. Others (specify) __________
03  Religion                               1. Muslim  2. Orthodox  3. Protestant  4. Others (specify) _______________
04  Level of education                     1. Illiterate
                                           2. Read and write
                                           3. Elementary 1st cycle (1-4)
                                           4. Elementary 2nd cycle (4-8)
                                           5. High school (9-10)
                                           6. Preparatory (11-12)
                                           7. College/university
05  Marital status                         1. Single  2. Divorced  3. Married  4. Widowed  5. Other (specify) _____________
06  Residence                              1. Rural  2. Urban
Step-2: Entering Data
❖ Data entry is accomplished in the Data View window.
❖ Clicking any cell will highlight it (active cell) and its contents will appear
in the cell editor.
❖ Data values are not recorded until you press Enter or select another
cell.
…Entering Data
❖ Enter the values for all cases on one variable (column) and then repeat
the procedure for all values in the remaining columns.
Editing Data
❖ To delete the old value and enter a new value: click the cell, enter the
new value, press Enter.
❖ To modify a data value: click the cell, click the cell editor, edit the data
value, and press Enter.
❖ To delete the values in a range, select (highlight) the area concerned and
press Delete.
❖ Use the Undo command in Edit to undo any action you just performed.
For example, use the Undo command to delete the value you have just
entered in the Data Editor window.
Adding Cases
❖ To insert a new case (row) in between cases that already exist in your
data file: click the row below the row where you wish to enter the new
case, click Data on the menu bar, click Insert Case from the pull down
menu.
Deleting Cases
❖ To delete a case, click the case number that you wish to delete, click Edit
from the menu, and then click Clear.
❖ The selected case will be deleted and the rows below will shift upward
Exercise: Perform Data Entry
Saving Data Files
❖ From the main menu choose File and then Save As…. The following
Save Data As dialog box is displayed
…Saving Data Files
❖ Choose the appropriate directory in the Look in: box to save your file.
❖ Then type the name of the data file in the File name: box. No extension
(i.e. a dot followed by three letters) is required.
❖ To save changes to an SPSS data file make the Data Editor the active
window and from the menus choose File and then Save.
❖ The modified data file is saved, overwriting the previous version of the
file.
❖ By default, this will save the data file as an SPSS data file.
Data Manipulation Using SPSS
Data
•Merge files
•Sort cases
•Split file
•Select cases
I. Merging Data
❖ Two data files can be merged (combined) using the SPSS program
❖ To merge data, the two files should have variables with the same
characteristics.
❖Procedure
1. Open one of the two files (the file that you assume will be the first file)
2. From the pull down menu click “Data” & go down to select “Merge
files”
➔ Browse to the data file you want to merge and click
Continue
You will find two windows:
1. Unpaired variables:- list of variables that are not matched thus could
not be included in the new file
Cont’d...
2. Variables in the new working file
❖ If all variables appear in this window, click “OK”; this creates
a new file, which should be saved for further analysis
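The logic of this kind of merge (stacking the cases of two files and setting aside the unpaired variables) can be sketched in Python. This is an illustration rather than SPSS syntax; the function `add_cases` and the sample records are hypothetical, and the sketch assumes every case within a file carries the same variables.

```python
def add_cases(first, second):
    """Stack the cases of two files.

    Returns the merged cases (paired variables only) and the sorted list
    of unpaired variable names that were left out of the new file.
    """
    vars_first = set(first[0])
    vars_second = set(second[0])
    paired = sorted(vars_first & vars_second)      # present in both files
    unpaired = sorted(vars_first ^ vars_second)    # present in only one file
    merged = [{name: case[name] for name in paired} for case in first + second]
    return merged, unpaired

# Hypothetical files: "sex" exists only in file A, "town" only in file B.
file_a = [{"id": 1, "age": 25, "sex": 1}, {"id": 2, "age": 31, "sex": 2}]
file_b = [{"id": 3, "age": 40, "town": "X"}]

merged, unpaired = add_cases(file_a, file_b)
print(unpaired)
print(merged)
```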
2. Merging to add variables
2. From the pull down menu click “Data” & go down to select “Merge
files”
2nd: Click “Match cases on key variables in sorted files”
3rd: Pass the common identity variable to the key variables box by clicking the arrow
If you are sure that both data files are sorted and saved, click OK
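Matching cases on a key variable can be sketched in Python as follows. This is a simplified illustration, not SPSS syntax: the function `add_variables` and the sample files are hypothetical, and cases that appear only in the second file are dropped here, which is a simplification of what a full merge may keep.

```python
def add_variables(first, second, key):
    """Combine the variables of two files for cases that share a key value."""
    lookup = {case[key]: case for case in second}   # index second file by key
    merged = []
    # Sorting by the key mirrors the requirement that files be sorted first.
    for case in sorted(first, key=lambda c: c[key]):
        extra = lookup.get(case[key], {})           # no match: nothing added
        merged.append({**case, **extra})
    return merged

# Hypothetical files sharing the identity variable "id".
demographic = [{"id": 2, "age": 31}, {"id": 1, "age": 25}]
laboratory = [{"id": 1, "hiv": 0}, {"id": 2, "hiv": 1}]

combined = add_variables(demographic, laboratory, "id")
print(combined)
```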
II. Sort
❖ When you sort ascending or descending, you can find ‘missing data’,
‘unknown (unexpected)’ data & ‘outliers’
❖ If you find such cases, you can re-check them against the hard-copy data for possible
correction
o From the pull-down menu click “Data” and go down to select
“Sort cases”
o Then select the variable you want to sort by and pass it to “Sort
by”
2. Right clicking
o Go to the name of the variable you want to sort by and right-click it
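Why sorting helps to spot missing and implausible values can be seen in a small Python sketch. It is illustrative only: the helper `sort_cases`, the variable names, and the data (including the implausible age of 250) are assumptions, and placing missing values first is a choice made for this example.

```python
def sort_cases(cases, variable, descending=False):
    """Sort cases by one variable; missing values (None) are listed first."""
    missing = [c for c in cases if c[variable] is None]
    present = [c for c in cases if c[variable] is not None]
    present.sort(key=lambda c: c[variable], reverse=descending)
    return missing + present

# Hypothetical data with one missing age and one implausible age.
cases = [
    {"id": 1, "age": 25},
    {"id": 2, "age": None},
    {"id": 3, "age": 250},
    {"id": 4, "age": 31},
]

by_age_desc = sort_cases(cases, "age", descending=True)
print([(c["id"], c["age"]) for c in by_age_desc])
```

After sorting in descending order, the missing case and the outlier sit at the top of the list, where they are easy to check against the source records.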
❖ E.g., if the data are split by sex, the results of any analysis will be displayed first for
males and then for females (first for value 1, then for value 2)
Procedure:
❖ From the pull down menu click “Data” & go down to select “Split file”
❖ Then pass the variable to “Groups based on” & click “OK”
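The effect of splitting a file can be sketched in Python: the same analysis is repeated separately for each value of the grouping variable, in order of the codes. The helper `split_file`, the variable names, and the scores below are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

def split_file(cases, group_var):
    """Group cases by the value of one variable, ordered by that value."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[group_var]].append(case)
    return dict(sorted(groups.items()))

# Hypothetical data: sex coded 1 and 2, with a score for each case.
cases = [
    {"sex": 2, "score": 14},
    {"sex": 1, "score": 16},
    {"sex": 1, "score": 18},
    {"sex": 2, "score": 12},
]

# The same analysis (here, a mean) is produced once per group.
means = {
    value: mean(c["score"] for c in group)
    for value, group in split_file(cases, "sex").items()
}
print(means)
```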
o E.g., you can analyze data among the male population only
❖ It can also be used to select study subjects from a sampling frame (simple
random sampling)
How is it performed?
1. From the pull down menu click “Data” & go down to select “Select
cases”
2. Then in the displayed window select “If condition is satisfied” and click
“If”
3. In the “Select cases: If” window, select the variable and pass it to the
expression box
4. Complete the selection condition using the operators and
functions provided
Cont’d...
5. Click “continue”
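Selecting cases by a condition amounts to keeping only the cases for which a logical expression is true. The Python sketch below illustrates this; the helper `select_if`, the coding of sex (1 taken to mean male), and the data are assumptions for the example.

```python
def select_if(cases, condition):
    """Keep only the cases for which the condition is true."""
    return [case for case in cases if condition(case)]

# Hypothetical data; sex == 1 is assumed to mean male.
cases = [
    {"id": 1, "sex": 1, "age": 17},
    {"id": 2, "sex": 2, "age": 30},
    {"id": 3, "sex": 1, "age": 45},
]

males = select_if(cases, lambda c: c["sex"] == 1)
adult_males = select_if(cases, lambda c: c["sex"] == 1 and c["age"] >= 18)

print([c["id"] for c in males])
print([c["id"] for c in adult_males])
```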
❖ Select if
B. Simple random selection
How is it performed?
1. From the pull-down menu click “Data” and go down to select “Select
cases”
2. Then in the displayed window select “Random sample of cases” and click
“Sample”
3. In the “Select cases: random sample” window, click “Exactly” and
write the sample size in the first box and the total population size in
the second box
4. Click “Continue”
5. Back in the “Select cases” window, select “Deleted” under
“Unselected cases are”
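Drawing exactly n cases from a frame of N cases can be sketched with Python's standard `random` module. This illustrates simple random sampling without replacement and is not how SPSS implements it internally; the frame of 100 ID numbers, the sample size, and the seed are assumptions.

```python
import random

def random_sample_of_cases(cases, sample_size, seed=None):
    """Draw exactly sample_size cases without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(cases, sample_size)

# Hypothetical sampling frame: ID numbers 1 to 100.
frame = list(range(1, 101))
sample = random_sample_of_cases(frame, 10, seed=2022)
print(sorted(sample))
```

Unselected cases are simply absent from `sample`, which corresponds to deleting them rather than filtering them.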
2nd: Click “Sample”
3rd: Write the total population in the list
❖ How would you sort the data by the ‘Age’ in descending order?
❖ Compute
❖ Rank Cases
❖ Count
❖ Recode
Transform ➔ Compute
❖ From the pull down menu click “Transform” and go down to select
“Compute”
❖ Then in the displayed window write a new variable name and enter
the “Numeric Expression” that defines its value (usually built from numeric variables)
❖ On the other hand, all string variables (e.g., sex) are identified by a
categorical variable icon (two circles) with the letter a
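Computing a new variable means evaluating one numeric expression for every case. A minimal Python sketch follows; the variables `weight_kg` and `height_m` and the body mass index formula are chosen purely as an example and do not come from the notes.

```python
# Hypothetical cases with two numeric variables.
cases = [
    {"id": 1, "weight_kg": 70, "height_m": 1.75},
    {"id": 2, "weight_kg": 60, "height_m": 1.60},
]

# New variable name: bmi. Numeric expression: weight_kg / height_m ** 2.
for case in cases:
    case["bmi"] = round(case["weight_kg"] / case["height_m"] ** 2, 1)

print([(c["id"], c["bmi"]) for c in cases])
```

The original variables are left unchanged; only a new column is added to each case.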
…Compute
Operators used in the numeric expression: greater than or equal to (>=), less than or equal to (<=), not equal (~=), or (|), and (&)
o E.g., knowledge of HIV/AIDS
❖ Make sure the questions used for the scale share the same numeric
code for a correct answer
❖ Below the variable list, click “Define Values”; another window
for entering the value will open
❖ Write the common correct value in the “Value” box, pass it
to “Values to count” by clicking “Add”, and click
Continue to return.
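Counting how often one value occurs across several variables is what produces a knowledge score. The Python sketch below shows the idea; the item names `k1` to `k4`, the coding (1 assumed to mean a correct answer), and the helper `count_values` are hypothetical.

```python
def count_values(case, variables, value):
    """Count how many of the listed variables equal the given value."""
    return sum(1 for name in variables if case.get(name) == value)

# Hypothetical knowledge items; 1 is assumed to be the common code
# for a correct answer on every item.
knowledge_items = ["k1", "k2", "k3", "k4"]
case = {"id": 7, "k1": 1, "k2": 0, "k3": 1, "k4": 1}

knowledge_score = count_values(case, knowledge_items, 1)
print(knowledge_score)
```

This only works when every item uses the same code for a correct answer, which is why the coding must be checked first.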
3rd: Select the variables used for the scale and transfer them to “Variables”
Click “Define Values”
1st: Write the common correct value in “Value”
2nd: Transfer it into “Values to count” by clicking “Add”
❖ Be clear about what you want to recode and how the values will be
collapsed; taking notes is essential.
❖ Repeat the procedure until all old values are recoded, then click
Continue to return.
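Recoding old values into new ones by ranges can be sketched in Python as a list of rules applied in order. The helper `recode`, the cut-points, and the ages are assumptions for illustration; the open-ended bounds play the role of "lowest" and "highest" in a range.

```python
LOWEST, HIGHEST = float("-inf"), float("inf")

def recode(value, rules, else_value=None):
    """Return the new value of the first rule whose range contains value.

    rules is a list of (low, high, new_value) with inclusive bounds.
    """
    for low, high, new_value in rules:
        if low <= value <= high:
            return new_value
    return else_value  # values covered by no rule

# Hypothetical cut-points: lowest to 24 -> 1, 25 to 34 -> 2, 35 to highest -> 3.
age_rules = [(LOWEST, 24, 1), (25, 34, 2), (35, HIGHEST, 3)]

ages = [19, 25, 34, 35, 60]
age_group = [recode(age, age_rules) for age in ages]
print(age_group)
```

Writing the rules down before recoding, as the notes advise, makes it easy to confirm that every old value is covered exactly once.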
Range of values: lowest to highest
Add: to pass (add) the selected values
Change: to change a mistakenly selected value
Remove: to remove the selected value
At the end, click Continue
Click OK to complete the recoding
❖ Be clear about what you want to recode and how the values will be
collapsed; taking notes is essential
3rd: Click here to pass the selected variable
4th: Do the function
5th: Click Continue
Cont’d...
❖ Continue the procedure on number 5 above till all old values are
recoded and click continue to return back.
❖ Dates can be entered as day (1-31), month (1-12), and a four-digit
year
❖ Descriptive Statistics
❖ Graphics
❖ Comparing means
❖ Comparing proportion
❖ Regression…
Prerequisites for analysis
❖ Be clear with the objectives of study
❖ Eg;
Types of variables: Qualitative (or categorical) and Quantitative (measurement)
Measurement scales
Dependent vs Independent Variables
❖ Dependent variable
o Example ;
• Depression status
• HIV status
• Condom use
• Treatment defaulting
o Example
• Experience of violence
❖ Bivariate analysis
Analyze ➔ Descriptive Statistics ➔ Crosstabs
Under ‘Statistics’, select ‘Chi-square’ and ‘Risk’.
Under ‘Cells’, select ‘Row’ percentages.
gender * depression diagnosis Crosstabulation
1. This table gives us the ‘OR’ or ‘RR’ if and only if the variables in the
model form a 2x2 table
2. The first row value of the independent variable is taken as the
reference in the OR (1st row) and RR (2nd row) of the above
analysis result
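The arithmetic behind the odds ratio and relative risk of a 2x2 table can be sketched in Python. The counts below are invented for illustration and are not the gender by depression results; the direction of comparison (row 1 against row 2) is also an assumption, so check which row your output treats as the reference before interpreting.

```python
def odds_ratio_and_relative_risk(a, b, c, d):
    """Compute OR and RR for a 2x2 table comparing row 1 with row 2.

                outcome yes   outcome no
        row 1       a             b
        row 2       c             d
    """
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    return odds_ratio, relative_risk

# Hypothetical counts: 20 of 100 with the outcome in row 1, 10 of 100 in row 2.
odds_ratio, relative_risk = odds_ratio_and_relative_risk(20, 80, 10, 90)
print(odds_ratio, relative_risk)
```

With a table larger than 2x2 there is no single a, b, c, d layout, which is why the risk estimate is only produced for 2x2 tables.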
Dependent variable
Independent variable
Reference category: Last or First
                  Frequency   Parameter coding (1)
gender   female   855         .000
         male     580         1.000
The reference is female
Here B is the regression coefficient, giving the slope (and, for the constant, the
intercept). It is the change in the logit of the outcome variable associated with a
one-unit change in the predictor variable.
The most important and most commonly reported result for interpreting logistic regression is
the value of Exp(B) and its 95% CI; Exp(B) is the change in odds resulting from a unit
change in the predictor
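The relation between B and Exp(B) can be made concrete with a short Python sketch. The coefficient and standard error below are hypothetical, and the interval uses the usual normal approximation (B plus or minus 1.96 standard errors, then exponentiated), which is an assumption about how the interval is formed.

```python
import math

def odds_ratio_with_ci(b, se, z=1.96):
    """Turn a logistic coefficient B and its standard error into
    Exp(B) with an approximate 95% confidence interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical coefficient and standard error.
odds_ratio, lower, upper = odds_ratio_with_ci(0.62, 0.25)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

A coefficient of 0 gives Exp(B) = 1, meaning no change in odds; an interval that includes 1 therefore indicates a non-significant association.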
Exp(B) scale: between 0 and 1 = preventive; above 1 = risk
The Exp (B) odds ratio & its 95% CI are the only result usually displayed
How should we display
Say we got:
                          OR = Exp(B) (95% CI)
❖ Sex
o Male                    1.00
o Female                  1.86 (1.05, 2.46)
❖ Residence
o Urban                   1.00
o Rural                   2.78 (0.78, 5.64)
❖ Marital status
o Single                  1.00
o Married                 0.67 (0.25, 0.89)
o Divorced/widowed        1.82 (1.04, 2.56)
Interpretation
non-Exposure (so reference)
Pearson Correlation (r)
The result of analysis
❖ Pearson’s Correlation Coefficient (r) tells you about
o Strength and
• Kendall’s tau-b or
• Spearman’s rho
Analyze ➔ Correlate ➔ Bivariate
Similar to the Pearson correlation,
but select Kendall’s tau-b and Spearman’s rho
❖ Similar interpretation of the correlation coefficient
r & P-value
3. Qualitative & Quantitative Variables
❖ Here you can look at a difference in mean values between two or
more groups
Analyze ➔ Compare Means ➔ Independent-Samples T Test
Within the independent samples t-test:
❖ Move the dependent variable to the ‘Test Variable(s)’ box and the
independent variable to the ‘Grouping Variable’ box
❖ This will give you the mean difference and its significance using the t-test
Eg. Sex vs Verbal fluency
Eg ‘Sexno’ is defined
1. Female
2. Male
OUTPUT: Group Statistics
                                       gender   N     Mean    Std. Deviation   Std. Error Mean
verbal fluency - animal naming score   female   855   15.24   5.711            .195
                                       male     580   15.95   5.493            .228
The group statistics table tells us the mean animal naming score among
males and females
Independent Samples Test
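The t statistic can be approximated directly from the group statistics reported above. The Python sketch below uses the pooled-variance (equal variances assumed) form of the test; whether that row or the unequal-variance row of the output applies depends on Levene's test, which is not reproduced here, so treat this as a worked illustration rather than the reported result.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-samples t statistic assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

# Values taken from the Group Statistics table: female first, then male.
t_value, df = pooled_t(855, 15.24, 5.711, 580, 15.95, 5.493)
print(round(t_value, 2), df)
```

The negative sign only reflects the order of subtraction (female minus male); the size of the statistic is what is compared against the t distribution.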
‘Tukey’
e.g. Verbal fluency Vs Marital
status
• Descriptive
• Means plot
OUTPUT: Descriptives
verbal fluency - animal naming score
                                                                              95% Confidence Interval for Mean
                                  N      Mean    Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Never married                     74     15.42   5.581            .649         14.13         16.71         6         35
currently married or cohabiting   759    16.27   5.471            .199         15.88         16.66         0         36
separated or divorced             58     17.55   6.319            .830         15.89         19.21         5         32
widowed                           427    14.36   5.466            .265         13.84         14.88         0         42
not known                         28     10.07   2.340            .442         9.16          10.98         4         17
Total                             1346   15.54   5.600            .153         15.24         15.84         0         42
The group descriptive statistics tell us the mean animal naming score for each
marital status
Test of Homogeneity of Variances
ANOVA
The ANOVA table tells us that there is a statistically significant difference in mean animal naming
score between groups
Here the mean of a single category is compared with the mean of each other category,
and the result is displayed as the mean difference, with the P-value for the difference
Multiple Comparisons
                                                                      Mean Difference                        95% Confidence Interval
(I) marital status                (J) marital status                (I-J)      Std. Error   Sig.   Lower Bound   Upper Bound
Never married                     currently married or cohabiting   -.85       .666         .709   -2.67         .97
                                  separated or divorced             -2.13      .959         .172   -4.75         .49
                                  widowed                           1.06       .689         .541   -.83          2.94
                                  not known                         5.35*      1.213        .000   2.03          8.66
currently married or cohabiting   Never married                     .85        .666         .709   -.97          2.67
                                  separated or divorced             -1.29      .745         .419   -3.32         .75
                                  widowed                           1.90*      .331         .000   1.00          2.81
                                  not known                         6.19*      1.052        .000   3.32          9.07
separated or divorced             Never married                     2.13       .959         .172   -.49          4.75
                                  currently married or cohabiting   1.29       .745         .419   -.75          3.32
                                  widowed                           3.19*      .765         .000   1.10          5.28
                                  not known                         7.48*      1.259        .000   4.04          10.92
widowed                           Never married                     -1.06      .689         .541   -2.94         .83
                                  currently married or cohabiting   -1.90*     .331         .000   -2.81         -1.00
                                  separated or divorced             -3.19*     .765         .000   -5.28         -1.10
                                  not known                         4.29*      1.067        .001   1.38          7.21
not known                         Never married                     -5.35*     1.213        .000   -8.66         -2.03
                                  currently married or cohabiting   -6.19*     1.052        .000   -9.07         -3.32
                                  separated or divorced             -7.48*     1.259        .000   -10.92        -4.04
                                  widowed                           -4.29*     1.067        .001   -7.21         -1.38
*. The mean difference is significant at the .05 level.
❖ After clicking ‘Statistics’, choose ‘Estimates’, ‘Model fit’, ‘Confidence
intervals’ and ‘R squared change’, then click ‘OK’
o This gives the between-group and within-group differences, with
significance measured using the F-test
o It also gives you the regression coefficients (the intercept and the slope)
OUTPUT: Model Summary
                                                                            Change Statistics
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2    Sig. F Change
1       .193(a)   .037       .037                5.496                        .037              52.271     1     1344   .000
a. Predictors: (Constant), marital status
The model summary shows R², which tells us how much of the variation in the outcome variable
the predictor variables explain; in this example it is 3.7%
ANOVA(b)
Model            Sum of Squares   df     Mean Square   F        Sig.
1   Regression   1578.905         1      1578.905      52.271   .000(a)
    Residual     40597.181        1344   30.206
    Total        42176.086        1345
a. Predictors: (Constant), marital status
b. Dependent Variable: verbal fluency - animal naming score
The ANOVA table also tells us, using the F-test, whether the explanatory variable predicts the outcome
variable well
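The R Square and F values can be recovered from the sums of squares in the ANOVA table, which is a useful check on how the two tables relate. The Python sketch below uses only the figures printed above; the variable names are chosen for the example.

```python
# Figures copied from the ANOVA table.
ss_regression, df_regression = 1578.905, 1
ss_residual, df_residual = 40597.181, 1344

ss_total = ss_regression + ss_residual

# R squared: share of the total variation explained by the model.
r_squared = ss_regression / ss_total

# F: mean square for regression divided by mean square for residual.
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)

print(round(r_squared, 3), round(f_statistic, 3))
```

Both results agree with the Model Summary (R Square of .037) and the ANOVA table (F of 52.271), up to rounding.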
OUTPUT: Coefficients(a)
                     Unstandardized Coefficients   Standardized Coefficients                    95% Confidence Interval for B
Model                B        Std. Error           Beta                        t        Sig.   Lower Bound   Upper Bound
1   (Constant)       17.779   .344                                             51.718   .000   17.105        18.454
    marital status   -.808    .112                 -.193                       -7.230   .000   -1.027        -.589
a. Dependent Variable: verbal fluency - animal naming score
1. B is the coefficient showing how much each independent variable contributes to the dependent
variable: for a predictor it is the slope (ß), and for the constant it is the intercept, the value of the outcome when X is 0
It tells us to what extent (degree) each predictor affects the outcome when the effects of all
other predictors are held constant
The equation takes the form: Verbal fluency score = ß0 + ß1 × Marital status + ……..
= 17.78 – 0.81 × Marital status + ……..
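The fitted equation can be used to obtain a predicted score for any marital status code. The Python sketch below plugs the coefficients from the table into the equation; the function name is hypothetical, and treating the marital status code as a number (as this model does) is taken from the example as given.

```python
# Coefficients copied from the Coefficients table.
INTERCEPT = 17.779   # B for (Constant)
SLOPE = -0.808       # B for marital status

def predicted_fluency(marital_status_code):
    """Predicted animal naming score for a given marital status code."""
    return INTERCEPT + SLOPE * marital_status_code

for code in (1, 2, 3):
    print(code, round(predicted_fluency(code), 3))
```

Each one-unit increase in the code lowers the predicted score by 0.808, which is exactly what the slope expresses.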
4. Student’s t-test is the statistic that estimates significance; the result is
significant if the lower and upper bounds of the 95% CI are both negative or both
positive (i.e., the interval does not include 0)
Asymmetrical Dependent Variable
Use non-parametric analysis
1. Mann-Whitney Test
Analyze ➔ Nonparametric Tests ➔ Legacy Dialogs ➔ 2 Independent Samples
Within 2 independent samples
❖ Move the dependent variable to the ‘Test Variable List’ box and the independent
variable to the ‘Grouping Variable’ box.
❖ This will give you the difference in mean ranks and its
significance using the Z score.
‘Sexno’ is defined
1. Male
2. Female
Ranks
                                       gender   N      Mean Rank   Sum of Ranks
verbal fluency - animal naming score   female   855    700.21      598676.02
                                       male     580    744.23      431653.99
                                       Total    1435

                         verbal fluency - animal naming score
Mann-Whitney U           232736.000
Wilcoxon W               598676.000
Z                        -1.979
Asymp. Sig. (2-tailed)   .048
a. Grouping Variable: gender
Mean rank of animal scoring by sex
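The Mann-Whitney U value can be derived from the sum of ranks, which shows how the Ranks table and the test statistics are connected. The Python sketch below uses the standard relation U = W - n(n+1)/2 with the female group's figures from the output; the function name is chosen for the example.

```python
def mann_whitney_u(n1, rank_sum1, n2):
    """Mann-Whitney U from the rank sum of group 1.

    u1 = W1 - n1(n1 + 1)/2, u2 = n1 * n2 - u1; the smaller one is reported.
    """
    u1 = rank_sum1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return min(u1, u2)

# Female group: N = 855, sum of ranks = 598676 (Wilcoxon W); male N = 580.
u_value = mann_whitney_u(855, 598676, 580)
print(u_value)
```

The result matches the Mann-Whitney U of 232736 reported in the output.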
                                      verbal fluency - animal naming score
Most Extreme Differences   Absolute   .082
                           Positive   .082
                           Negative   -.001
Kolmogorov-Smirnov Z                  1.528
Asymp. Sig. (2-tailed)                .019
a. Grouping Variable: gender
End