Statistical Analysis: College Graduates' Starting Compensations
Ryan Butler
1040 Final Project
Eportfolio Link: http://rbutler635147.weebly.com
Field of Study                   Sample Mean   Sample Std. Dev.
Business                         54537         7232
Communications                   46227         7214
Computer Science                 59542         4670
Education                        40021         2365
Engineering                      60664         7879
Humanities & Social Sciences     38817         7146
Math & Sciences                  41923         4938
Data Collection
Histograms for Compensation Distribution for the Following Fields of Study
[Figure: one histogram per field of study, including Business, Communications, Computer Science, Education, Engineering, and Math & Sciences. Vertical axis: Frequency.]
Based on a visual analysis of the data, the highest-paying degree is Engineering and the lowest-paying degree is Humanities and Social Sciences. Not all of the graphs exhibit a normal distribution: Math and Sciences is strongly right skewed, while Computer Science has a slight left skew.
After analyzing the data, this would seem discouraging because I have chosen Sociology and Social Work as my focus. I guess one could say it's a good thing I'm not doing it for the money.
Next we will evaluate our data by using a 5-number summary and boxplots for
visual analysis.
Boxplot: Graduating Student Compensation Distribution for each Field of Study
[Figure: one boxplot per field of study, including Business, Communications, Computer Science, Education, and Engineering. Horizontal axis: starting compensation in dollars.]
Communications:
IQR: 54728 - 39263 = 15465
Outliers: below 39263 - 1.5 x 15465 = 16065.50, rounded to 16066
or above 54728 + 1.5 x 15465 = 77925.50, rounded to 77926
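The IQR and outlier limits reported for Communications can be checked with a short Python sketch. The function name `outlier_fences` is hypothetical. The two quartiles are the ones given for Communications, and the 1.5 x IQR rule is the one used in the calculation.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (iqr, lower_limit, upper_limit) using the k x IQR rule."""
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr

# Quartiles reported for Communications (Q1 = 39263, Q3 = 54728)
iqr, lower, upper = outlier_fences(39263, 54728)
print(iqr, lower, upper)  # 15465 16065.5 77925.5
```

Any compensation below the lower limit or above the upper limit would be flagged as an outlier.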
Ryan Butler
Term Project- Part 2
Discussing and Interpreting the Confidence Interval Results:
We can say with 95% confidence that the mean starting compensation for students graduating in Humanities and Social Sciences is between $36,786.71 and $40,847.29. We can also say with 99% confidence that the true standard deviation of starting compensations for students graduating in Business is between $5,801.10 and $9,288.10. On top of that, we can say with 80% confidence that the proportion of all students with starting compensation over $50,000 is between 39.22% and 45.98%.
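As a rough check on the first interval, here is a minimal Python sketch of a t-based confidence interval for a mean. The sample mean (38,817) and standard deviation (7,146) are the Humanities and Social Sciences values from the data table. The sample size n = 50 and the critical value 2.009 are assumptions: the report states n = 50 only for Education, and 2.009 is a typical table value for roughly 49 degrees of freedom at 95% confidence. The function name is hypothetical.

```python
import math

def mean_confidence_interval(sample_mean, s, n, t_crit):
    """Return (low, high) for the interval sample_mean +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Assumed inputs: n = 50 and t_crit = 2.009 are not stated for this field
low, high = mean_confidence_interval(38817, 7146, 50, 2.009)
print(f"{low:.2f} to {high:.2f}")  # 36786.71 to 40847.29
```

Under these assumptions the sketch reproduces the interval quoted for Humanities and Social Sciences.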
Hypothesis Tests
The purpose of hypothesis testing is to have a procedure for testing a claim about a property of a population. A hypothesis test can also be called a test of significance.
The claim that students graduating in Education have an average starting compensation of under $35,000, tested at a 0.05 significance level:
H0: μ = 35,000
Ha: μ < 35,000
α = 0.05
x̄ = 40,021
n = 50
s = 2,365
Confidence Level = 95% = 0.9500
Critical Value = -1.676 (left-tailed test)
Test Statistic: t = (40,021 - 35,000) / (2,365 / √50) ≈ 15.01
We fail to reject the null hypothesis. The sample mean of $40,021 is above $35,000, so the test statistic is positive and nowhere near the left-tail rejection region. There is not sufficient evidence to support the claim that the average starting compensation package for graduating students in Education is less than $35,000.
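The test statistic for a mean can be computed directly from the summary values listed for Education (x̄ = 40,021, s = 2,365, n = 50, hypothesized mean 35,000). The following is a minimal Python sketch, not the spreadsheet method used for the project. The function name is hypothetical, and the critical value is the table value 1.676 from the text, negated on the assumption that the test is left-tailed.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, s, n):
    """Return the one-sample t statistic (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (s / math.sqrt(n))

t_stat = one_sample_t(40021, 35000, 2365, 50)
critical = -1.676                 # left-tailed critical value at alpha = 0.05
reject_null = t_stat < critical   # reject only if t falls in the left tail
print(round(t_stat, 2), reject_null)  # 15.01 False
```

Because the sample mean is above the hypothesized mean, the statistic is positive and a left-tailed test cannot reject the null hypothesis.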
The claim that over 80% of students graduating with a college degree will find a starting compensation package valued at over $40,000, tested at a 0.01 significance level:
H0: p = 0.80
Ha: p > 0.80
α = 0.01
x̄ = 41,923
s = 4,938
x = 269 (graduates with a starting compensation over $40,000)
n = 350 (since p̂ = 269/350)
Confidence Level = 99% = 0.9900
p = 0.80
q = 0.20
p̂ = 0.768571
Test Statistic: z = (0.768571 - 0.80) / √(0.80 x 0.20 / 350) ≈ -1.47
We fail to reject the null hypothesis. The sample proportion is below 0.80, so the test statistic is negative and cannot fall in the right-tail rejection region. There is not sufficient evidence to support the claim that over 80% of graduating students with a college degree will find a starting compensation package that is valued over $40,000.
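A one-proportion z test of this claim can be sketched in Python using only the standard library. The counts x = 269 and n = 350 are assumptions inferred from the reported sample proportion (269/350 = 0.768571), and the function name is hypothetical. This is an illustration, not the method used for the project.

```python
import math
from statistics import NormalDist

def one_proportion_z(x, n, p0):
    """Return (p_hat, z, right_tail_p_value) for H0: p = p0 vs Ha: p > p0."""
    p_hat = x / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return p_hat, z, p_value

# Assumed counts: 269 of 350 graduates above $40,000
p_hat, z, p_value = one_proportion_z(269, 350, 0.80)
alpha = 0.01
print(round(p_hat, 6), round(z, 2), p_value < alpha)  # 0.768571 -1.47 False
```

With these inputs the p-value is far above 0.01, so the sketch does not reject the null hypothesis.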
Reflection:
The conditions for doing an interval estimate and hypothesis test for a population proportion are that the sample observations must be a simple random sample and must meet the conditions for a binomial distribution. The conditions for a binomial distribution are that there is a fixed number of independent trials, that each trial has a constant probability of success, and that each trial has two outcomes: the success category and the failure category. The conditions for doing an interval estimate and hypothesis test for a population mean when sigma is not known are that the sample must be a simple random sample and that either the population is normally distributed or n>30. The conditions for doing an interval estimate and hypothesis test for sigma (standard deviation and variance) are that the sample must be a simple random sample and that the population must be normally distributed, even if the sample is large.
We do not know, however, whether our data come from a simple random sample or from normally distributed populations. Therefore, we could have made the error of assuming that the data given are normally distributed and come from a simple random sample when they actually do not. For the population mean, though, the normality condition is satisfied because n>30 (n=50). Therefore, only the procedures for the population mean can be assumed to have met that condition. The sampling method could be improved by letting us know ahead of time whether the data are a simple random sample and normally distributed. We can conclude from this statistical research that it is important to challenge claims and statements. When we are able to do the statistical research, we are able to look more in depth at what is being claimed and see whether we reject or fail to reject the null hypothesis. We are also able to report a confidence level and confidence interval for the data and claims given.
Final Reflection:
What have you learned as a result of this project?
I could really use a course in the applications of Excel and Word. But overall, what I have learned from the data analysis is that the career path I have chosen is the lowest paying. But I knew this before I started. I'm not doing it for the money.
Discuss how the math skills that you applied in this project will impact
other classes you will take in your school career.
I'm currently graduating with a degree in Sociology. I have to be able to analyze social statistics to be an effective sociologist (if I continue down that path). It's looking more like Social Work now.
Identify specific parts of the project and your own process in completing
the project that may have applications for other classes.
Using formulas in Excel and creating different types of graphs as visual aids are the parts that will carry over. I expect this to be a valuable asset in the future, especially with PowerPoint presentations.
Discuss how the project helped to develop your problem solving skills.
I learned to ask for outside help when solving problems. Trying to teach yourself how to do something is always harder than asking questions and seeking out those who can help.
Discuss how this project changed the way you think about real-world math
applications. If your thinking was not changed, then discuss how the
project supported your views about real-world math applications.
This course has been most beneficial; I can see and understand the practical application of statistics. We see it every day in newspapers, on social media, and on TV. Being able to understand the process of data collection and the analysis of the data is indispensable. I can now spot biased and unreliable data sources, which cuts down on my frustration and strengthens my arguments, which is all part of being an effective student.