Spss 1

INTRODUCTION
SPSS is the acronym for Statistical Package for the Social Sciences. It is a
comprehensive system used to analyse data. SPSS can use data from almost any type of file to
generate tabulated reports, charts, plots of distributions and trends, descriptive statistics, as well
as more complex statistical analyses. This proprietary software is one of the most popular
statistical packages and can perform highly complex data manipulation and analysis with
simple instructions.
SPSS has a good breadth of functionality: it offers scores of statistical and
mathematical functions, scores of statistical procedures, and the capability to flexibly handle
various types and formats of data such as numeric, alphanumeric, binary and date. Some of
the functions of SPSS are:
Data transformation
Data examination
Descriptive Statistics
Reliability test
Correlation
t-test
ANOVA
MANOVA
Regression
Discriminant analysis
Graphics and graphical interface and many more
A) FREQUENCY TABLE AND PIE CHARTS
Schools which participated in the survey

                          Frequency  Percent  Valid Percent  Cumulative Percent
Valid  Public School A    35         25.5     25.5           25.5
       Public School B    55         40.1     40.1           65.7
       Private School A   34         24.8     24.8           90.5
       Private School B   13         9.5      9.5            100.0
       Total              137        100.0    100.0

Table 1: Frequency Table For Schools Which Participated in the Survey
Postgraduate qualifications - Master's degree

                                Frequency  Percent  Valid Percent  Cumulative Percent
Valid  Without Master's degree  65         47.4     47.4           47.4
       With Master's degree     72         52.6     52.6           100.0
       Total                    137        100.0    100.0

Table 2: Frequency Table For Postgraduate Qualification
A total of 137 respondents from two types of school were chosen for this survey. Based on
Table 1, two public schools took part: Public School A with 35 respondents and Public
School B with 55 respondents. Two private schools were also involved: Private School A with
34 respondents and Private School B with 13 respondents.
Table 2 shows the frequency table for postgraduate qualification: 65 respondents
(47.4% of the total) do not hold a Master's degree and 72 respondents (52.6% of the total)
hold a Master's degree.
Chart 1: Pie Chart For Schools Which Participated in the Survey
Chart 2: Pie Chart For Postgraduate Qualification
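The Percent and Cumulative Percent columns of a frequency table follow directly from the counts. The sketch below recomputes them for Table 1; the counts come from the table, while the function name frequency_table is only an illustrative choice and not part of SPSS.

```python
# Counts taken from Table 1; the helper name is illustrative only.
counts = {
    "Public School A": 35,
    "Public School B": 55,
    "Private School A": 34,
    "Private School B": 13,
}


def frequency_table(counts):
    """Return (label, frequency, percent, cumulative percent) rows."""
    total = sum(counts.values())
    rows = []
    running = 0
    for label, freq in counts.items():
        running += freq  # cumulative count up to and including this row
        rows.append((
            label,
            freq,
            round(100 * freq / total, 1),
            round(100 * running / total, 1),
        ))
    return rows


table1 = frequency_table(counts)
for row in table1:
    print(row)
```

The cumulative percent is computed from the running count rather than by adding the rounded percentages, which avoids accumulating rounding error.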

B) RANDOM SAMPLING
Random sampling is the basic sampling technique where we select a group of subjects
(a sample) for study from a larger group (a population). Each individual is chosen entirely by
chance and each member of the population has an equal chance of being included in
the sample.
Age of teachers
Age Frequency Percent Valid Percent Cumulative Percent
Valid 20 1 3.3 3.3 3.3
22 1 3.3 3.3 6.7
24 1 3.3 3.3 10.0
28 1 3.3 3.3 13.3
30 1 3.3 3.3 16.7
33 1 3.3 3.3 20.0
35 1 3.3 3.3 23.3
36 1 3.3 3.3 26.7
38 2 6.7 6.7 33.3
40 1 3.3 3.3 36.7
43 1 3.3 3.3 40.0
44 1 3.3 3.3 43.3
45 1 3.3 3.3 46.7
47 3 10.0 10.0 56.7
49 2 6.7 6.7 63.3
50 1 3.3 3.3 66.7
53 1 3.3 3.3 70.0
55 3 10.0 10.0 80.0
56 1 3.3 3.3 83.3
59 1 3.3 3.3 86.7
60 1 3.3 3.3 90.0
67 3 10.0 10.0 100.0
Total 30 100.0 100.0
Table 3: Random Sampling 30 cases of total respondents (Teachers Age)
Table 3 shows a random sample of 30 cases drawn from the 137 respondents, tabulated by
teacher age. The sampled ages range from 20 to 67, and some ages (38, 47, 49, 55 and 67)
were picked more than once (frequency greater than 1).
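Simple random sampling of 30 cases from 137 can be sketched as follows. This is a minimal illustration, not the SPSS procedure itself: the case IDs 1 to 137 are hypothetical stand-ins for the respondents, and the seed is fixed only so the sketch is repeatable.

```python
import random

# Hypothetical case IDs standing in for the 137 respondents.
population = list(range(1, 138))

# A fixed seed is an assumption made here so the result can be reproduced.
rng = random.Random(2024)

# Draw 30 cases without replacement; every case has the same chance
# of being included in the sample.
sample = rng.sample(population, k=30)

print(sorted(sample))
```

Because the draw is made on cases, not on ages, two sampled teachers can still share the same age, which is why some ages in Table 3 have a frequency above 1.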
C) AGE VARIABLE
Statistics: Age Categories
N  Valid          137
   Missing        0
Mean              2.54
Mode              3
Std. Deviation    .607

Age Categories

                           Frequency  Percent  Valid Percent  Cumulative Percent
Valid  20 years and below  3          2.2      2.2            2.2
       21 to 40 years      62         45.3     45.3           47.4
       41 to 60 years      67         48.9     48.9           96.4
       61 years and above  5          3.6      3.6            100.0
       Total               137        100.0    100.0

Table 4: Age Categories
Based on Table 4, only 3 teachers (2.2%) are aged 20 years and below, 62 teachers
(45.3%) are aged 21 to 40 years, 67 teachers (48.9%) are aged 41 to 60 years and only
5 teachers (3.6%) are aged 61 years and above.
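Recoding age into categories can be sketched as below. The codes 1 to 4 are an assumption about how the categories were numbered; they are consistent with the Mean of 2.54 and Mode of 3 reported for Table 4, which describe the category codes rather than ages in years. The function name age_category is illustrative.

```python
# Assumed coding of the four categories in Table 4.
LABELS = {
    1: "20 years and below",
    2: "21 to 40 years",
    3: "41 to 60 years",
    4: "61 years and above",
}


def age_category(age):
    """Map an age in years to its (assumed) category code."""
    if age <= 20:
        return 1
    if age <= 40:
        return 2
    if age <= 60:
        return 3
    return 4


# Frequencies per category code, taken from Table 4.
category_counts = {1: 3, 2: 62, 3: 67, 4: 5}

n = sum(category_counts.values())
mean_code = sum(code * freq for code, freq in category_counts.items()) / n
mode_code = max(category_counts, key=category_counts.get)

print(LABELS[age_category(47)], round(mean_code, 2), mode_code)
```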
D) COMPUTING MEASURES OF CENTRAL TENDENCY
Statistics: Age of teachers
N  Valid          137
   Missing        0
Mean              42.07
Median            44.00
Mode              47
Std. Deviation    11.742
Variance          137.877
Range             47

Age of teachers
Age Frequency Percent Valid Percent Cumulative Percent
Valid 20 3 2.2 2.2 2.2
22 1 .7 .7 2.9
24 1 .7 .7 3.6
25 11 8.0 8.0 11.7
26 5 3.6 3.6 15.3
27 1 .7 .7 16.1
28 1 .7 .7 16.8
29 1 .7 .7 17.5
30 5 3.6 3.6 21.2
32 1 .7 .7 21.9
33 1 .7 .7 22.6
34 5 3.6 3.6 26.3
35 3 2.2 2.2 28.5
36 12 8.8 8.8 37.2
37 4 2.9 2.9 40.1
38 7 5.1 5.1 45.3
40 3 2.2 2.2 47.4
43 2 1.5 1.5 48.9
44 5 3.6 3.6 52.6
45 6 4.4 4.4 56.9
47 16 11.7 11.7 68.6
48 4 2.9 2.9 71.5
49 5 3.6 3.6 75.2
50 3 2.2 2.2 77.4
53 2 1.5 1.5 78.8
54 1 .7 .7 79.6
55 8 5.8 5.8 85.4
56 5 3.6 3.6 89.1
58 4 2.9 2.9 92.0
59 4 2.9 2.9 94.9
60 2 1.5 1.5 96.4
62 1 .7 .7 97.1
67 4 2.9 2.9 100.0
Total 137 100.0 100.0
Table 5: Measures of Central Tendency and Measures of Dispersion
Based on Table 5, the measures of central tendency are:

i. Mean : 42.07 (the average age of the teacher-respondents)
ii. Median : 44.00 (the middle age of the teacher-respondents)
iii. Mode : 47.00 (the most frequent age among the teachers in the survey)
The measures of dispersion are:

i. Range : 47
ii. Variance : 137.877
iii. Std. Deviation : 11.742
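The measures listed above can be recomputed from the frequency table in Table 5. The sketch below uses Python's standard statistics module rather than SPSS; the age counts are copied from Table 5, and the variance is the sample variance (divisor n - 1), which is assumed to match what SPSS reports.

```python
import statistics

# Age -> frequency, copied from Table 5.
age_counts = {
    20: 3, 22: 1, 24: 1, 25: 11, 26: 5, 27: 1, 28: 1, 29: 1, 30: 5,
    32: 1, 33: 1, 34: 5, 35: 3, 36: 12, 37: 4, 38: 7, 40: 3, 43: 2,
    44: 5, 45: 6, 47: 16, 48: 4, 49: 5, 50: 3, 53: 2, 54: 1, 55: 8,
    56: 5, 58: 4, 59: 4, 60: 2, 62: 1, 67: 4,
}

# Expand the frequency table back into one value per respondent.
ages = [age for age, freq in age_counts.items() for _ in range(freq)]

# Measures of central tendency.
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

# Measures of dispersion (sample variance and standard deviation).
age_range = max(ages) - min(ages)
variance_age = statistics.variance(ages)
std_age = statistics.stdev(ages)

print(round(mean_age, 2), median_age, mode_age)
print(age_range, round(variance_age, 3), round(std_age, 3))
```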
E) CROSSTABULATION SCHOOL VARIABLE AND AGE CATEGORY
Case Processing Summary

                                     Cases
                                     Valid          Missing        Total
                                     N    Percent   N    Percent   N    Percent
Age Categories * Schools which
participated in the survey           137  100.0%    0    0.0%      137  100.0%

Age Categories * Schools which participated in the survey Crosstabulation

Count
                    Schools which participated in the survey
Age Categories      Public School A   Public School B   Private School A  Private School B  Total
20 years and below  0                 3                 0                 0                 3
21 to 40 years      19                22                16                5                 62
41 to 60 years      16                28                15                8                 67
61 years and above  0                 2                 3                 0                 5
Total               35                55                34                13                137

Table 6: Crosstab Age Categories and School Categories
Based on Table 6, Public School A has 19 respondents aged 21 to 40 years and
16 respondents aged 41 to 60 years, a total of 35 respondents. Public School B has 3
respondents aged 20 years and below, 22 respondents aged 21 to 40 years, 28 respondents
aged 41 to 60 years and 2 respondents aged 61 years and above, a total of 55 respondents.
Private School A has 16 respondents aged 21 to 40 years, 15 respondents aged 41 to 60 years
and 3 respondents aged 61 years and above, a total of 34 respondents. Private School B
has only 5 respondents aged 21 to 40 years and 8 respondents aged 41 to 60 years, a total
of 13 of the 137 participants in the survey.
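The row and column totals of a crosstabulation are the sums of its cells, so they can be checked against the frequency tables. The sketch below uses the cell counts from Table 6; the variable names are illustrative.

```python
schools = [
    "Public School A", "Public School B",
    "Private School A", "Private School B",
]
age_groups = [
    "20 years and below", "21 to 40 years",
    "41 to 60 years", "61 years and above",
]

# Cell counts from Table 6: one row per age group, one column per school.
cells = [
    [0, 3, 0, 0],
    [19, 22, 16, 5],
    [16, 28, 15, 8],
    [0, 2, 3, 0],
]

row_totals = [sum(row) for row in cells]            # per age group
column_totals = [sum(col) for col in zip(*cells)]   # per school
grand_total = sum(row_totals)

for group, total in zip(age_groups, row_totals):
    print(group, total)
for school, total in zip(schools, column_totals):
    print(school, total)
print("Total", grand_total)
```

The column totals reproduce the school frequencies in Table 1 and the row totals reproduce the age categories in Table 4.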
F) EXPLORATORY DATA ANALYSIS
Exploratory Data Analysis (EDA) is a data analysis approach that employs various
techniques to maximize insight into a data set, uncover underlying structure, extract important
variables, detect outliers and anomalies, test underlying assumptions, develop parsimonious
models, and determine optimal factor settings. EDA relies heavily on the use of graphics to
gain insight into the data. Some common graphical techniques used in EDA are plotting raw
data (data traces, histograms, stem-and-leaf displays, scatter plots), plotting simple statistics
(mean plots, standard deviation plots, and box plots) and positioning such plots to maximize
our natural pattern-recognition abilities.
Exploring data can help to determine whether the statistical techniques that you are
considering for data analysis are appropriate. The Explore procedure provides a variety of
visual and numerical summaries of the data, either for all cases or separately for groups of
cases. The dependent variable must be a scale variable, while the grouping variables may
be ordinal or nominal.
Case Processing Summary: School administrator - teacher relationship

Postgraduate qualifications -  Cases
Master's degree                Valid          Missing        Total
                               N    Percent   N    Percent   N    Percent
Without Master's degree        65   100.0%    0    0.0%      65   100.0%
With Master's degree           72   100.0%    0    0.0%      72   100.0%

Descriptives: School administrator - teacher relationship

Without Master's degree                 Statistic  Std. Error
Mean                                    4.1004     .08232
95% Confidence Interval for Mean
  Lower Bound                           3.9360
  Upper Bound                           4.2649
5% Trimmed Mean                         4.1290
Median                                  4.0694
Variance                                .441
Std. Deviation                          .66370
Minimum                                 2.53
Maximum                                 5.00
Range                                   2.47
Interquartile Range                     1.08
Skewness                                -.322      .297
Kurtosis                                -.644      .586

With Master's degree                    Statistic  Std. Error
Mean                                    3.9126     .08665
95% Confidence Interval for Mean
  Lower Bound                           3.7398
  Upper Bound                           4.0854
5% Trimmed Mean                         3.9363
Median                                  4.0903
Variance                                .541
Std. Deviation                          .73526
Minimum                                 2.36
Maximum                                 5.00
Range                                   2.64
Interquartile Range                     .93
Skewness                                -.523      .283
Kurtosis                                -.706      .559

Table 7: Exploratory Data Analysis
Tests of Normality: School administrator - teacher relationship

Postgraduate qualifications -  Kolmogorov-Smirnov (a)    Shapiro-Wilk
Master's degree                Statistic  df  Sig.       Statistic  df  Sig.
Without Master's degree        .096       65  .200*      .951       65  .012
With Master's degree           .114       72  .022       .940       72  .002

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

Table 8: Normality Test
Table 7 shows that, for teachers with a Master's degree, the school administrator-teacher
relationship score has a mean of 3.9126. Skewness (-.523) and kurtosis (-.706) are both
slightly negative, which indicates a distribution with a longer left tail and a flatter
peak than the normal distribution.
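The Std. Error column of Table 7 can be related to the other figures in the table. The sketch below assumes the usual formulas for the standard error of the mean and the standard error of skewness; they are not stated in the SPSS output itself, but they reproduce the reported values. The function names are illustrative.

```python
import math


def standard_error_of_mean(std_dev, n):
    """Standard error of the mean: s / sqrt(n)."""
    return std_dev / math.sqrt(n)


def standard_error_of_skewness(n):
    """Standard error of sample skewness, which depends only on n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


# Std. Deviation and N for each group, taken from Table 7.
groups = {
    "Without Master's degree": (0.66370, 65),
    "With Master's degree": (0.73526, 72),
}

for name, (std_dev, n) in groups.items():
    print(
        name,
        round(standard_error_of_mean(std_dev, n), 5),
        round(standard_error_of_skewness(n), 3),
    )
```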
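Table 8 shows the normality tests for the variable. For the With Master's Degree group,
both the Kolmogorov-Smirnov and Shapiro-Wilk tests indicate that the data are not normally
distributed, because the sig. values of 0.022 (Kolmogorov-Smirnov) and 0.002 (Shapiro-Wilk)
are smaller than 0.05. For the Without Master's Degree group, the Shapiro-Wilk test
(sig. 0.012) also indicates non-normality, although the Kolmogorov-Smirnov value (0.200) is
above 0.05.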
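The decision rule used here, rejecting normality when Sig. is below 0.05, can be written out explicitly. The p-values below are the Sig. values from Table 8; the 0.05 threshold is the one used in the text, and the dictionary layout is only an illustrative choice.

```python
ALPHA = 0.05  # significance level used in the text

# Sig. values from Table 8, keyed by (group, test).
p_values = {
    ("Without Master's degree", "Kolmogorov-Smirnov"): 0.200,
    ("Without Master's degree", "Shapiro-Wilk"): 0.012,
    ("With Master's degree", "Kolmogorov-Smirnov"): 0.022,
    ("With Master's degree", "Shapiro-Wilk"): 0.002,
}

# True means the test rejects the hypothesis of a normal distribution.
rejects_normality = {key: p < ALPHA for key, p in p_values.items()}

for (group, test), rejected in rejects_normality.items():
    verdict = "not normal" if rejected else "normality not rejected"
    print(f"{group} / {test}: {verdict}")
```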
Data 1: Stem-and-Leaf Plot for School Administrator - Teacher Relationship
(Master = With Master's degree)

Frequency   Stem &  Leaf
 3.00       2 .     344
 8.00       2 .     55788899
 6.00       3 .     000002
18.00       3 .     555566777788888999
23.00       4 .     01111222222233333344444
12.00       4 .     566788888999
 2.00       5 .     00

Stem width: 1.00
Each leaf: 1 case(s)
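A stem-and-leaf display can be read back into approximate data values: with a stem width of 1.00, each leaf is the first decimal digit of one case. The sketch below decodes the display for the With Master's degree group. The values are only approximate, because the display keeps one decimal digit (the minimum of 2.36 in Table 7 appears as 2.3).

```python
# (stem, leaves) rows copied from the stem-and-leaf display.
rows = [
    (2, "344"),
    (2, "55788899"),
    (3, "000002"),
    (3, "555566777788888999"),
    (4, "01111222222233333344444"),
    (4, "566788888999"),
    (5, "00"),
]

# Each leaf digit is one case: value = stem + leaf / 10.
values = [stem + int(leaf) / 10 for stem, leaves in rows for leaf in leaves]

# The number of leaves per row is the Frequency column.
frequencies = [len(leaves) for _, leaves in rows]

print(frequencies, len(values), min(values), max(values))
```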
Data 1, and especially the boxplot, shows that in the With Master's Degree group the
scores are more spread out in the lowest quartile (below the 25th percentile) than in the
highest quartile (above the 75th percentile). This suggests that the school
administrator-teacher relationship is not influenced by holding a postgraduate Master's
degree.
G) ADDRESSING MISSING VALUES IN A DATASET
1. Listwise Deletion: Delete all data from any participant with missing values. If the
sample is large enough, then we likely can drop data without substantial loss of
statistical power. Be sure that the values are missing at random and that we are not
inadvertently removing a class of participants.
2. Recover the Values: We can sometimes contact the participants and ask them to fill
out the missing values. For in-person studies, we've found that having an additional check
for missing values before the participant leaves helps.
Imputation
Imputation is replacing missing values with substitute values. The following methods
use some form of imputation.
3. Educated Guessing: It sounds arbitrary and isn't our preferred course of action, but
we can often infer a missing value. For related questions, such as those often
presented in a matrix, if the participant responds with all 4s, assume that the
missing value is a 4.
4. Average Imputation: Use the average value of the responses from the other
participants to fill in the missing value. If the average of the 30 responses on the
question is a 4.1, use a 4.1 as the imputed value. This choice is not always
recommended because it can artificially reduce the variability of our data but in some
cases makes sense.
5. Common-Point Imputation: For a rating scale, use the middle point or the most
commonly chosen value. For example, on a five-point scale, substitute a 3, the
midpoint, or a 4, the most common value (in many cases). This is a bit more
structured than guessing, but it's still among the riskier options. Use caution
unless we have good reason and data to support using the substitute value.
6. Regression Substitution: We can use multiple-regression analysis to estimate a
missing value. We use this technique to deal with missing SUS scores. Regression
substitution predicts the missing value from the other values. In the case of missing
SUS data, we had enough data to create stable regression equations and predict the
missing values automatically in the calculator.
7. Multiple Imputation: The most sophisticated and, currently, most popular approach
is to take the regression idea further and take advantage of correlations between
responses. In multiple imputation, software creates plausible values based on the
correlations for the missing data and then averages the simulated datasets by
incorporating random errors in our predictions. It is one of a number of examples
where computers continue to change the statistical landscape. Most statistical
packages, such as SPSS, come with a multiple-imputation feature.
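Three of the simpler options above (listwise deletion, average imputation and common-point imputation using the most common value) can be sketched in a few lines. The ratings below are hypothetical values invented for illustration, with None marking a missing answer; they are not data from this survey.

```python
import statistics

# Hypothetical ratings on a five-point scale; None marks a missing answer.
responses = [4, 5, None, 3, 4, None, 5, 4]

# Listwise deletion: drop every case that has a missing value.
observed = [r for r in responses if r is not None]

# Average imputation: fill each gap with the mean of the observed answers.
mean_value = statistics.mean(observed)
mean_imputed = [mean_value if r is None else r for r in responses]

# Common-point imputation: fill each gap with the most common answer.
common_value = statistics.mode(observed)
mode_imputed = [common_value if r is None else r for r in responses]

# Average imputation keeps the mean but shrinks the variance.
print(statistics.variance(observed), statistics.variance(mean_imputed))
```

The last line illustrates the caution given for average imputation: the imputed cases sit exactly on the mean, so the variance of the filled-in data is smaller than the variance of the observed answers.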
H) THE DIFFERENCES BETWEEN DEPENDENT VARIABLE AND
INDEPENDENT VARIABLE
An independent variable, sometimes called an experimental or predictor variable, is a
variable that is being manipulated in an experiment in order to observe the effect on
a dependent variable, sometimes called an outcome variable.
Imagine that a tutor asks 100 students to complete a maths test. The tutor wants to
know why some students perform better than others. Whilst the tutor does not know the
answer to this, she thinks that it might be because of two reasons: (1) some students spend
more time revising for their test; and (2) some students are naturally more intelligent than
others. As such, the tutor decides to investigate the effect of revision time and intelligence on
the test performance of the 100 students. The dependent and independent variables for the
study are:
Dependent Variable: Test mark (measured from 0 to 100)

Independent Variables: Revision time (measured in hours) and intelligence (measured using IQ
score)
The dependent variable is simply that: a variable that is dependent on one or more
independent variables. For example, in our case the test mark that a student achieves is
dependent on revision time and intelligence. Whilst revision time and intelligence (the
independent variables) may (or may not) cause a change in the test mark (the dependent
variable), the reverse is implausible. In other words, whilst the number of hours a student
spends revising and a student's IQ score may (or may not) change the test mark that the
student achieves, a change in a student's test mark has no bearing on whether the student
revises more or is more intelligent (this simply doesn't make sense).
Therefore, the aim of the tutor's investigation is to examine whether these independent
variables - revision time and IQ - result in a change in the dependent variable, the students'
test scores. However, it is also worth noting that whilst this is the main aim of the experiment,
the tutor may also be interested to know if the independent variables - revision time and IQ -
are also connected in some way.
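The direction of the relationship can be made concrete with a least-squares line that predicts the dependent variable from one independent variable. The sketch below is only an illustration: the five pairs of revision hours and test marks are hypothetical values invented for this example, IQ is left out for brevity, and the function name least_squares is an illustrative choice.

```python
# Hypothetical records for five students (illustrative values only).
revision_hours = [1, 2, 3, 4, 5]    # independent variable
test_marks = [52, 58, 61, 70, 74]   # dependent variable


def least_squares(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares(revision_hours, test_marks)

# The slope is the expected change in test mark per extra hour of revision.
print(round(slope, 2), round(intercept, 2))
```

The test mark appears on the left-hand side of the fitted equation because it is the outcome being explained; swapping the two variables would describe the implausible reverse relationship.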
