Spss 1
SPSS is the acronym for Statistical Package for the Social Sciences. It is a
comprehensive system used to analyse data. SPSS uses data from almost any type of file to
generate tabulated reports, charts, plots of distributions and trends, descriptive statistics as
well as more complex statistical analyses. This proprietary software is one of the most popular
statistical packages and can perform highly complex data manipulation and analysis with
simple instructions.
SPSS has a good breadth of functionality: it offers scores of statistical and
mathematical functions, scores of statistical procedures and the capability to flexibly handle
various types and formats of data such as numeric, alphanumeric, binary, date etc. Some of
the functions of SPSS are:
Data transformation
Data examination
Descriptive Statistics
Reliability test
Discriminant analysis
Graphics and graphical interface and many more
A total of 137 respondents from two types of schools were chosen for this survey. Based on
Table 1, two public schools took part: Public School A with 35 respondents and Public
School B with 55 respondents. Two private schools were also involved: Private School A with
34 respondents and Private School B with 13 respondents.
Table 2 shows the frequency table for postgraduate qualification: 65 respondents (47.4% of
the total) do not have a Master's degree, and 72 respondents (52.6% of the total) have a
Master's degree.
Chart 1: Pie Chart For Schools Which Participated in the Survey
Random sampling is the basic sampling technique where we select a group of subjects
(a sample) for study from a larger group (a population). Each individual is chosen entirely by
chance and each member of the population has an equal chance of being included in
the sample.
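The selection described above can be sketched in a few lines of Python. This is only an illustration, not the procedure SPSS runs internally: the respondent IDs 1 to 137 and the seed value are assumptions made for the example, and `simple_random_sample` is a hypothetical helper name.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct members; every member has the same chance of selection."""
    rng = random.Random(seed)          # a fixed seed makes the draw repeatable
    return rng.sample(population, k)   # sampling without replacement


# Hypothetical respondent IDs for the 137 teachers in the survey.
respondent_ids = list(range(1, 138))

# Draw 30 cases, as in the random sample of 30 respondents.
sample_ids = simple_random_sample(respondent_ids, 30, seed=42)
print(sorted(sample_ids))
```

Because the draw is made without replacement, no respondent can appear twice in the sample, although two different respondents may of course share the same age.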
Age of teachers
Frequency Percent Valid Percent Cumulative Percent
Valid 20 1 3.3 3.3 3.3
22 1 3.3 3.3 6.7
24 1 3.3 3.3 10.0
28 1 3.3 3.3 13.3
30 1 3.3 3.3 16.7
33 1 3.3 3.3 20.0
35 1 3.3 3.3 23.3
36 1 3.3 3.3 26.7
38 2 6.7 6.7 33.3
40 1 3.3 3.3 36.7
43 1 3.3 3.3 40.0
44 1 3.3 3.3 43.3
45 1 3.3 3.3 46.7
47 3 10.0 10.0 56.7
49 2 6.7 6.7 63.3
50 1 3.3 3.3 66.7
53 1 3.3 3.3 70.0
55 3 10.0 10.0 80.0
56 1 3.3 3.3 83.3
59 1 3.3 3.3 86.7
60 1 3.3 3.3 90.0
67 3 10.0 10.0 100.0
Total 30 100.0 100.0
Table 3: Random Sampling 30 cases of total respondents (Teachers Age)
Based on Table 3, a random sample of 30 cases was drawn from the respondents and tabulated
by teacher age. The sampled ages range from 20 to 67, and some ages (38, 47, 49, 55 and 67)
were picked more than once (frequency greater than 1).
Age Categories
N Valid 137
Missing 0
Mean 2.54
Mode 3
Age Categories
Frequency Percent Valid Percent Cumulative Percent
Based on Table 4, only 3 teachers (2.2%) are aged 20 years and below, 62 teachers (45.3%)
are aged 21 to 40 years, 67 teachers (48.9%) are aged 41 to 60 years and only 5 teachers
(3.6%) are aged 61 years and above.
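The age categories and their percentages can be reproduced with a short Python sketch. The numeric codes 1 to 4 for the four categories are an assumption (the report does not list the coding), although they are consistent with the reported mean of 2.54 and mode of 3. The function name `age_category` is hypothetical.

```python
def age_category(age):
    """Recode an age in years into a category code (codes 1-4 are assumed)."""
    if age <= 20:
        return 1   # 20 years and below
    if age <= 40:
        return 2   # 21 to 40 years
    if age <= 60:
        return 3   # 41 to 60 years
    return 4       # 61 years and above


# Category counts as reported for the 137 teachers.
category_counts = {1: 3, 2: 62, 3: 67, 4: 5}
total = sum(category_counts.values())

# Percentage of respondents in each category, rounded to one decimal place.
percents = {code: round(100 * n / total, 1) for code, n in category_counts.items()}

# Mean and mode of the category codes.
mean_code = round(sum(code * n for code, n in category_counts.items()) / total, 2)
mode_code = max(category_counts, key=category_counts.get)

print(percents, mean_code, mode_code)
```

Note that the mean of a category code is only a rough summary, because the codes are ordinal labels rather than measurements.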
Age of teachers
N Valid 137
Missing 0
Mean 42.07
Median 44.00
Mode 47
Std. Deviation 11.742
Variance 137.877
Range 47
Age of teachers
Frequency Percent Valid Percent Cumulative Percent
Valid 20 3 2.2 2.2 2.2
22 1 .7 .7 2.9
24 1 .7 .7 3.6
25 11 8.0 8.0 11.7
26 5 3.6 3.6 15.3
27 1 .7 .7 16.1
28 1 .7 .7 16.8
29 1 .7 .7 17.5
30 5 3.6 3.6 21.2
32 1 .7 .7 21.9
33 1 .7 .7 22.6
34 5 3.6 3.6 26.3
35 3 2.2 2.2 28.5
36 12 8.8 8.8 37.2
37 4 2.9 2.9 40.1
38 7 5.1 5.1 45.3
40 3 2.2 2.2 47.4
43 2 1.5 1.5 48.9
44 5 3.6 3.6 52.6
45 6 4.4 4.4 56.9
47 16 11.7 11.7 68.6
48 4 2.9 2.9 71.5
49 5 3.6 3.6 75.2
50 3 2.2 2.2 77.4
53 2 1.5 1.5 78.8
54 1 .7 .7 79.6
55 8 5.8 5.8 85.4
56 5 3.6 3.6 89.1
58 4 2.9 2.9 92.0
59 4 2.9 2.9 94.9
60 2 1.5 1.5 96.4
62 1 .7 .7 97.1
67 4 2.9 2.9 100.0
Total 137 100.0 100.0
Table 5: Measures of Central Tendency and Measures of Dispersion
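As a cross-check, the measures of central tendency and dispersion can be recomputed from the frequency table with Python's standard `statistics` module. This is a sketch for verification only. The dictionary below is copied from the frequency table, and the variance and standard deviation are taken to be the sample versions (divisor n - 1), which is an assumption about how the reported figures were computed.

```python
import statistics

# Age -> frequency, copied from the "Age of teachers" frequency table.
age_frequencies = {
    20: 3, 22: 1, 24: 1, 25: 11, 26: 5, 27: 1, 28: 1, 29: 1, 30: 5,
    32: 1, 33: 1, 34: 5, 35: 3, 36: 12, 37: 4, 38: 7, 40: 3, 43: 2,
    44: 5, 45: 6, 47: 16, 48: 4, 49: 5, 50: 3, 53: 2, 54: 1, 55: 8,
    56: 5, 58: 4, 59: 4, 60: 2, 62: 1, 67: 4,
}

# Expand the table into one value per respondent.
ages = [age for age, n in age_frequencies.items() for _ in range(n)]

summary = {
    "n": len(ages),
    "mean": round(statistics.mean(ages), 2),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "std_dev": round(statistics.stdev(ages), 3),      # sample standard deviation
    "variance": round(statistics.variance(ages), 3),  # sample variance
    "range": max(ages) - min(ages),
}
print(summary)
```

Under that assumption the recomputed values agree with the reported ones: mean 42.07, median 44, mode 47, standard deviation 11.742, variance 137.877 and range 47.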
Based on Table 6, Public School A has 19 respondents aged 21 to 40 years and 16
respondents aged 41 to 60 years, a total of 35 respondents. Public School B has 3
respondents aged 20 years and below, 22 respondents aged 21 to 40 years, 28 respondents
aged 41 to 60 years and 2 respondents aged 61 years and above, a total of 55 respondents.
Private School A has 16 respondents aged 21 to 40 years, 15 respondents aged 41 to 60 years
and 3 respondents aged 61 years and above, a total of 34 respondents. Meanwhile, Private
School B has only 5 respondents aged 21 to 40 years and 8 respondents aged 41 to 60 years,
a total of 13 of the 137 participants in the survey.
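The cross-tabulation of school by age category can be checked by summing its rows and columns. The sketch below is a plain-Python illustration. The counts come from the paragraph above, while the dictionary layout and variable names are assumptions made for the example.

```python
# School -> {age category -> count}; empty cells are simply omitted.
crosstab = {
    "Public School A": {"21 to 40": 19, "41 to 60": 16},
    "Public School B": {"20 and below": 3, "21 to 40": 22,
                        "41 to 60": 28, "61 and above": 2},
    "Private School A": {"21 to 40": 16, "41 to 60": 15, "61 and above": 3},
    "Private School B": {"21 to 40": 5, "41 to 60": 8},
}
age_groups = ["20 and below", "21 to 40", "41 to 60", "61 and above"]

# Row totals: respondents per school.
row_totals = {school: sum(cells.values()) for school, cells in crosstab.items()}

# Column totals: respondents per age category across all schools.
col_totals = {
    group: sum(cells.get(group, 0) for cells in crosstab.values())
    for group in age_groups
}

grand_total = sum(row_totals.values())
print(row_totals, col_totals, grand_total)
```

The row totals match the number of respondents per school, and the column totals match the age-category frequencies of 3, 62, 67 and 5.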
Exploratory Data Analysis (EDA) is an approach to data analysis that employs various
techniques to maximize insight into a data set, uncover underlying structure, extract important
variables, detect outliers and anomalies, test underlying assumptions, develop parsimonious
models, and determine optimal factor settings. EDA relies heavily on the use of graphics to
gain insight into the data. Some common graphical techniques used in EDA are plotting raw
data (data traces, histograms, stem-and-leaf displays, scatter plots), plotting simple statistics
(mean plots, standard deviation plots, and box plots) and positioning such plots to maximize
natural pattern-recognition abilities.
Exploring data can help to determine whether the statistical techniques that you are
considering for data analysis are appropriate. The Explore procedure provides a variety of
visual and numerical summaries of the data, either for all cases or separately for groups of
cases. The dependent variable must be a scale variable, while the grouping variables may
be ordinal or nominal.
Tests of Normality
Variable: School administrator - teacher relationship
Postgraduate qualifications     Kolmogorov-Smirnov(a)        Shapiro-Wilk
- Master's degree               Statistic   df   Sig.        Statistic   df   Sig.
Without Master's degree         .096        65   .200*       .951        65   .012
With Master's degree            .114        72   .022        .940        72   .002
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction
Table 8: Normality Test
Table 7 shows that the School administrator-teacher relationship variable for respondents
with a Master's degree has a mean of 3.9126. Skewness and kurtosis are both slightly negative,
which indicates that the distribution is slightly skewed to the left (a longer left tail) and
slightly flatter than a normal distribution.
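Table 8 shows the normality tests for the variable involved. For respondents with a Master's
degree, both Kolmogorov-Smirnov and Shapiro-Wilk indicate that the data are not normally
distributed, because the sig. values 0.022 (Kolmogorov-Smirnov) and 0.002 (Shapiro-Wilk) are
smaller than 0.05. For respondents without a Master's degree, Shapiro-Wilk (sig. 0.012) also
rejects normality, while Kolmogorov-Smirnov (sig. 0.200) does not.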
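The decision rule used when reading the normality table can be written out explicitly. The sketch below does not rerun the tests. It only applies the 0.05 cut-off to the sig. values printed in the Tests of Normality table, and the names `ALPHA`, `sig_values` and `rejects_normality` are assumptions made for the example.

```python
ALPHA = 0.05  # significance level used in the report

# (group, test) -> sig. value, copied from the Tests of Normality table.
# The .200 entry is reported by SPSS as a lower bound of the true significance.
sig_values = {
    ("Without Master's degree", "Kolmogorov-Smirnov"): 0.200,
    ("Without Master's degree", "Shapiro-Wilk"): 0.012,
    ("With Master's degree", "Kolmogorov-Smirnov"): 0.022,
    ("With Master's degree", "Shapiro-Wilk"): 0.002,
}


def rejects_normality(p_value, alpha=ALPHA):
    """Return True when the test rejects the hypothesis of a normal distribution."""
    return p_value < alpha


decisions = {key: rejects_normality(p) for key, p in sig_values.items()}
for (group, test), rejected in decisions.items():
    verdict = "not normal" if rejected else "no evidence against normality"
    print(f"{group:25s} {test:20s} {verdict}")
```

A sig. value below 0.05 leads to rejecting normality. A value above it means only that the test found no evidence against normality, not that the data are proven normal.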
3.00 2 . 344
8.00 2 . 55788899
6.00 3 . 000002
18.00 3 . 555566777788888999
23.00 4 . 01111222222233333344444
12.00 4 . 566788888999
2.00 5 . 00
1. Listwise Deletion: Delete all data from any participant with missing values. If the
sample is large enough, then we likely can drop data without substantial loss of
statistical power. Be sure that the values are missing at random and that we are not
inadvertently removing a class of participants.
2. Recover the Values: We can sometimes contact the participants and ask them to fill
out the missing values. For in-person studies, we've found that an additional check
for missing values before the participant leaves helps.
Imputation is replacing missing values with substitute values. The following methods
use some form of imputation.
3. Educated Guessing: It sounds arbitrary and isn't our preferred course of action, but
we can often infer a missing value. For related questions, for example those often
presented in a matrix, if the participant responds with all 4s, assume that the
missing value is a 4.
4. Average Imputation: Use the average value of the responses from the other
participants to fill in the missing value. If the average of the 30 responses on the
question is a 4.1, use a 4.1 as the imputed value. This choice is not always
recommended because it can artificially reduce the variability of our data but in some
cases makes sense.
5. Common-Point Imputation: For a rating scale, use the middle point or the most
commonly chosen value. For example, on a five-point scale, substitute a 3, the
midpoint, or a 4, the most common value (in many cases). This is a bit more
structured than guessing, but it's still among the riskier options. Use caution
unless we have good reason and data to support using the substitute value.
6. Regression Substitution: We can use multiple-regression analysis to estimate a
missing value. We use this technique to deal with missing SUS scores. Regression
substitution predicts the missing value from the other values. In the case of missing
SUS data, we had enough data to create stable regression equations and predict the
missing values automatically in the calculator.
7. Multiple Imputation: The most sophisticated and, currently, most popular approach
is to take the regression idea further and take advantage of correlations between
responses. In multiple imputation, software creates plausible values for the missing
data based on these correlations, incorporating random errors in the predictions, and
then averages the simulated datasets. It is one of a number of examples
where computers continue to change the statistical landscape. Most statistical
packages like SPSS come with a multiple-imputation feature.
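Three of the simpler approaches in the list above (listwise deletion, average imputation and common-point imputation) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not how SPSS implements them: missing answers are represented by `None`, the rating scale is assumed to run from 1 to 5, and all function names and sample responses are hypothetical.

```python
from statistics import mean, mode


def listwise_delete(rows):
    """Drop every participant (row) that has at least one missing value."""
    return [row for row in rows if None not in row]


def mean_impute(values):
    """Replace missing answers with the average of the observed answers."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def common_point_impute(values, scale_min=1, scale_max=5, use_mode=False):
    """Replace missing answers with the scale midpoint or the most common answer."""
    observed = [v for v in values if v is not None]
    fill = mode(observed) if use_mode else (scale_min + scale_max) / 2
    return [fill if v is None else v for v in values]


# Hypothetical data: one row per participant, one column per question.
rows = [[4, 5, 3], [4, None, 2], [5, 5, 5]]
print(listwise_delete(rows))

# Hypothetical answers to a single five-point question.
answers = [4, 5, None, 3, 4]
print(mean_impute(answers))
print(common_point_impute(answers))
print(common_point_impute(answers, use_mode=True))
```

Average and common-point imputation give every missing answer the same value, which is why they can artificially reduce the variability of the data.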