Factors Affecting Happiness Score


SUBJECT - Research Methods and
Statistical Packages

Submitted in partial fulfillment of the requirements for the degree of

B.A (HONS) BUSINESS ECONOMICS

By:

Harjas - 195030
Neha - 195034
Shailjaa - 195040
Simarpreet - 195041
Dashmeet - 195043

ACKNOWLEDGMENT
I would like to express my gratitude to Mrs. Riyanka Jain
for guiding me throughout the project. I am also thankful to
our Principal, Dr. J. B. Singh, for allowing me to conduct this
project on factors affecting happiness.

The project was carried out under the supervision of our
teacher. I thank all participants for their positive support and
guidance. I am immensely obliged to my friends for their
inspiration, encouragement and help in the
completion of my project.

I am thankful to the college staff for giving me this
opportunity. I affirm that this project was done by me and is
not copied.

DECLARATION
This is to certify that the material embodied in this project is based on our
original research work, performed under the guidance of Mrs. Riyanka Jain,
faculty member at the Department of Business Economics, Sri Guru Gobind Singh
College of Commerce. Our indebtedness to other works, studies and publications
has been duly acknowledged at the relevant places. This project work has not
been submitted, in part or in full, for any other diploma or degree in this or any other
university.

Student Members                 Project Supervisor

Harjas Kaur (195030)            Mrs. Riyanka Jain
Neha Mourya (195034)
Shailjaa Kotnala (195040)
Simarpreet (195041)
Dashmeet (195043)

INDEX

S. No.   Particulars

1        Introduction
2        Objective
3        Hypothesis
4        Data Analysis & Interpretation
5        Problem Detection
6        Conclusion
7        Recommendation
8        Bibliography

INTRODUCTION

The World Happiness Report is a publication of the United Nations Sustainable
Development Solutions Network. It contains articles and rankings of national
happiness, based on respondent ratings of their own lives, which the report also
correlates with various quality-of-life factors. The report primarily uses data from
the Gallup World Poll. Each annual report is available to the public to download
on the World Happiness Report website.

The first report was published in 2012, the second in 2013, the third in 2015, and
the fourth in the 2016 Update. The World Happiness Report 2017, which ranks 155
countries by their happiness levels, was released at the United Nations at an
event celebrating the International Day of Happiness on March 20th.

The report continues to gain global recognition as governments, organizations
and civil society increasingly use happiness indicators to inform their
policy-making decisions. Leading experts across fields (economics, psychology, survey
analysis, national statistics, health, public policy and more) describe how
measurements of well-being can be used effectively to assess the progress of
nations. The reports review the state of happiness in the world today and show
how the new science of happiness explains personal and national variations in
happiness.

OBJECTIVE

To test whether Happiness Score is significantly affected by Economy (GDP per
capita), Family and Health (Life expectancy).

Y : Happiness Score
X1: Economy (GDP per capita)
X2: Family
X3: Health (Life expectancy)

HYPOTHESIS
Model Hypothesis-

H0: Happiness Score is not significantly affected by Economy (GDP per capita),
Family and Health (Life expectancy), if other factors are held constant.

H1: Happiness Score is significantly affected by Economy (GDP per capita),
Family and Health (Life expectancy), if other factors are held constant.

X1: Economy (GDP per capita)

H0: Economy (GDP per capita) does not significantly affect the happiness score,
if other factors are held constant.

H1: Economy (GDP per capita) significantly affects the happiness score, if other
factors are held constant.

X2: Family
H0: Family does not significantly affect the happiness score, if other factors are
held constant.

H1: Family significantly affects the happiness score, if other factors are held
constant.

X3: Health (Life expectancy)


H0: Health (Life expectancy) does not significantly affect the happiness score, if
other factors are held constant.

H1: Health (Life expectancy) significantly affects the happiness score, if other
factors are held constant.
DATA ANALYSIS AND INTERPRETATION
Research data
Happiness Score   Economy (GDP per Capita)   Family   Health (Life Expectancy)
7.587 1.39651 1.34951 0.94143
7.561 1.30232 1.40223 0.94784
7.527 1.32548 1.36058 0.87464
7.522 1.459 1.33095 0.88521
7.427 1.32629 1.32261 0.90563
7.406 1.29025 1.31826 0.88911
7.378 1.32944 1.28017 0.89284
7.364 1.33171 1.28907 0.91087
7.286 1.25018 1.31967 0.90837
7.284 1.33358 1.30923 0.93156
7.278 1.22857 1.22393 0.91387
7.226 0.95578 1.23788 0.86027
7.2 1.33723 1.29704 0.89042
7.187 1.02054 0.91451 0.81444
7.119 1.39451 1.24711 0.86179
6.983 0.98124 1.23287 0.69702
6.946 1.56391 1.21963 0.91894
6.94 1.33596 1.36948 0.89533
6.937 1.30782 1.28566 0.89667
6.901 1.42727 1.12575 0.80925
6.867 1.26637 1.28548 0.90943
6.853 1.36011 1.08182 0.76276
6.81 1.04424 1.25596 0.72052
6.798 1.52186 1.02 1.02525
6.786 1.06353 1.1985 0.79661
6.75 1.32792 1.29937 0.89186
6.67 1.10715 1.12447 0.85857
6.611 1.69042 1.0786 0.79733
6.575 1.27778 1.26038 0.94579
6.574 1.05351 1.24823 0.78723
6.505 1.17898 1.20643 0.84483
6.485 1.06166 1.2089 0.8116
6.477 0.91861 1.24018 0.69077
6.455 0.9669 1.26504 0.7385
6.411 1.39541 1.08393 0.72025

6.329 1.23011 1.31379 0.95562
6.302 1.2074 1.30203 0.88721
6.298 1.29098 1.07617 0.8753
6.295 1.55422 1.16594 0.72492
6.269 0.99534 0.972 0.6082
6.168 1.21183 1.18354 0.61483
6.13 0.76454 1.02507 0.67737
6.123 0.74553 1.04356 0.64425
6.003 0.63244 1.34043 0.59772
5.995 1.16891 1.26999 0.78902
5.987 1.27074 1.25712 0.99111
5.984 1.24461 0.95774 0.96538
5.975 0.86402 0.99903 0.79075
5.96 1.32376 1.21624 0.74716
5.948 1.25114 1.19777 0.95446
5.89 0.68133 0.97841 0.5392
5.889 0.59448 1.01528 0.61826
5.878 0.75985 1.30477 0.66098
5.855 1.12254 1.12241 0.64368
5.848 1.18498 1.27385 0.87337
5.833 1.14723 1.25745 0.73128
5.828 0.59325 1.14184 0.74314
5.824 0.90019 0.97459 0.73017
5.813 1.03192 1.23289 0.73608
5.791 1.12555 1.27948 0.77903
5.77 1.12486 1.07023 0.72394
5.759 1.08254 0.79624 0.78805
5.754 1.13145 1.11862 0.7038
5.716 1.13764 1.23617 0.66926
5.709 0.81038 1.15102 0.68741
5.695 1.20806 1.07008 0.92356
5.689 1.20813 0.89318 0.92356
5.605 0.93929 1.07772 0.61766
5.589 0.80148 0.81198 0.63132
5.548 0.95847 1.22668 0.53886
5.477 1.00761 0.98521 0.7095
5.474 1.38604 1.05818 1.01328
5.429 1.15174 1.22791 0.77361
5.399 0.82827 1.08708 0.63793
5.36 0.63216 0.91226 0.74676
5.332 1.06098 0.94632 0.73172
5.286 0.47428 1.15115 0.65088

5.268 0.65435 0.90432 0.16007
5.253 0.77042 1.10395 0.57407
5.212 1.02389 0.93793 0.64045
5.194 0.59543 0.41411 0.51466
5.192 0.90198 1.05392 0.69639
5.192 0.97438 0.90557 0.72521
5.14 0.89012 0.94675 0.81658
5.129 0.47038 0.91612 0.29924
5.124 1.04345 0.88588 0.7689
5.123 0.92053 1.00964 0.74836
5.102 1.15991 1.13935 0.87519
5.098 1.11312 1.09562 0.72437
5.073 0.70532 1.03516 0.58114
5.057 0.18847 0.95152 0.43873
5.013 0.73479 0.64095 0.60954
5.007 0.91851 1.00232 0.73545
4.971 0.08308 1.02626 0.09131
4.959 0.87867 0.80434 0.81325
4.949 0.83223 0.91916 0.79081
4.898 0.37545 1.04103 0.07612
4.885 0.89537 1.17202 0.66825
4.876 0.59066 0.73803 0.54909
4.874 0.82819 1.3006 0.60268
4.867 0.71206 1.07284 0.07566
4.857 1.15406 0.92933 0.88213
4.839 1.02564 0.80001 0.83947
4.8 1.12094 1.20215 0.75905
4.788 0.59532 0.95348 0.6951
4.786 0.39047 0.85563 0.57379
4.739 0.88113 0.60429 0.73793
4.715 0.59867 0.92558 0.66015
4.694 0.39753 0.43106 0.60164
4.686 1.0088 0.54447 0.69805
4.681 0.79907 1.20278 0.6739
4.677 0.98549 0.81889 0.60237
4.642 0.92049 1.18468 0.27688
4.633 0.54558 0.67954 0.40132
4.61 0.271 1.03276 0.33475
4.571 0.0712 0.78968 0.34201
4.565 0.64499 0.38174 0.51529
4.55 0.52107 1.01404 0.36878
4.518 0.26673 0.74302 0.38847

4.517 0 1.0012 0.09806
4.514 0.35997 0.86449 0.56874
4.512 0.19073 0.60406 0.44055
4.507 0.33024 0.95571 0
4.436 0.45407 0.86908 0.35874
4.419 0.36471 0.99876 0.41435
4.369 0.44025 0.59207 0.36291
4.35 0.76821 0.77711 0.7299
4.332 0.99355 1.10464 0.04776
4.307 0.27108 0.70905 0.48246
4.297 0.7419 0.38562 0.72926
4.292 0.01604 0.41134 0.22562
4.271 0.83524 1.01905 0.70806
4.252 0.4225 0.88767 0.23402
4.218 1.01216 1.10614 0.76649
4.194 0.8818 0.747 0.61712
4.077 0.54649 0.68093 0.40064
4.033 0.75778 0.8604 0.16683
3.995 0.26074 1.03526 0.20583
3.989 0.67866 0.6629 0.31051
3.956 0.23906 0.79273 0.36315
3.931 0.21102 1.13299 0.33861
3.904 0.36498 0.97619 0.4354
3.896 1.06024 0.90528 0.43372
3.845 0.0694 0.77265 0.29707
3.819 0.46038 0.62736 0.61114
3.781 0.2852 1.00268 0.38215
3.681 0.20824 0.66801 0.46721
3.678 0.0785 0 0.06699
3.667 0.34193 0.76062 0.1501
3.656 0.17417 0.46475 0.24009
3.655 0.46534 0.77115 0.15185
3.587 0.25812 0.85188 0.27125
3.575 0.31982 0.30285 0.30335
3.465 0.22208 0.7737 0.42864
3.34 0.28665 0.35386 0.3191
3.006 0.6632 0.47489 0.72193
2.905 0.0153 0.41587 0.22396
2.839 0.20868 0.13995 0.28443

Regression in Excel

Regression in SPSS
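
The kind of fit produced by Excel and SPSS can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the procedure used in the project: the function name fit_ols is hypothetical, the solver is ordinary least squares through the normal equations, and SAMPLE holds just eight rows copied from the research data table, so its coefficients will not match the values reported from the full data set.

```python
def fit_ols(rows, targets):
    """Return [intercept, b1, ..., bk] that minimise the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]          # add the intercept column
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(design, targets)) for i in range(k)]
    aug = [row + [v] for row, v in zip(xtx, xty)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Eight rows from the research data: (happiness, economy, family, health).
# A subset only, chosen for illustration.
SAMPLE = [
    (7.587, 1.39651, 1.34951, 0.94143),
    (6.329, 1.23011, 1.31379, 0.95562),
    (6.130, 0.76454, 1.02507, 0.67737),
    (5.268, 0.65435, 0.90432, 0.16007),
    (5.013, 0.73479, 0.64095, 0.60954),
    (4.517, 0.00000, 1.00120, 0.09806),
    (3.678, 0.07850, 0.00000, 0.06699),
    (2.839, 0.20868, 0.13995, 0.28443),
]

coefficients = fit_ols([row[1:] for row in SAMPLE], [row[0] for row in SAMPLE])
print(coefficients)  # [intercept, economy, family, health] for the subset only
```

Running the same function on all rows of the research data should give coefficients close to those reported by the statistical packages, up to rounding.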

Testing significance of partial regression coefficients and the
overall model
Unstandardized coefficients

Happiness Score = 2.199 + 0.888 (Economy) + 1.697 (Family) + 1.180 (Health)

Where,
A = 2.199, is the intercept.
B1 = 0.888, measures the change in mean value of Y (Happiness Score) for a unit
change in X1 (Economy), keeping other factors constant.
B2 = 1.697, measures the change in mean value of Y (Happiness Score) for a unit
change in X2 (Family), keeping other factors constant.

B3 = 1.180, measures the change in mean value of Y (Happiness Score) for a unit
change in X3 (Health), keeping other factors constant.
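
As a small worked example of how the unstandardized equation is used, the sketch below plugs the first row of the research data into the reported coefficients. The function name predict_happiness is hypothetical; the coefficients and the data row are taken from this report.

```python
def predict_happiness(economy, family, health):
    """Predicted Happiness Score from the reported unstandardized equation."""
    return 2.199 + 0.888 * economy + 1.697 * family + 1.180 * health


# First row of the research data: score 7.587, economy 1.39651,
# family 1.34951, health 0.94143.
predicted = predict_happiness(1.39651, 1.34951, 0.94143)
residual = 7.587 - predicted   # observed minus predicted

print(round(predicted, 3))     # about 6.84
print(round(residual, 3))      # about 0.747
```

The positive residual means the fitted equation under-predicts the score for this observation.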

1. For partial slope term B1, H0: B1 = 0 and H1: B1 ≠ 0.
2. For partial slope term B2, H0: B2 = 0 and H1: B2 ≠ 0.
3. For partial slope term B3, H0: B3 = 0 and H1: B3 ≠ 0.

Since all the coefficients in our model have p-values less than 0.05, we can
reject each null hypothesis. Thus, all the partial regression coefficients are
significant.

Standardized coefficients

Happiness Score = 0.313 (Economy) + 0.404 (Family) + 0.255 (Health)

Where,
B1 = 0.313, measures the change in Y (Happiness Score), in standard deviations, for a
one standard deviation change in X1 (Economy), keeping other factors constant.
B2 = 0.404, measures the change in Y (Happiness Score), in standard deviations, for a
one standard deviation change in X2 (Family), keeping other factors constant.
B3 = 0.255, measures the change in Y (Happiness Score), in standard deviations, for a
one standard deviation change in X3 (Health), keeping other factors constant.

1. For partial slope term B1, H0: B1 = 0 and H1: B1 ≠ 0.
2. For partial slope term B2, H0: B2 = 0 and H1: B2 ≠ 0.
3. For partial slope term B3, H0: B3 = 0 and H1: B3 ≠ 0.

Since all the coefficients in our model have p-values less than 0.05, we can
reject each null hypothesis. Thus, all the partial regression coefficients are
significant.

Overall model

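Overall significance of the model is tested using the F test. If F(observed) > F(critical),
or equivalently if the significance value (p-value) of F is less than 0.05,
we reject the null hypothesis.

H0: Overall model is not significant.
H1: Overall model is significant.

Here F(observed) = 137.021 and the value reported alongside it is 0, which is the
significance value of F (zero to the displayed precision) rather than a critical value.
Since this significance value is less than 0.05, we reject the null
hypothesis. Thus, our model is significant.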
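
The F statistic can be cross-checked against R square with the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below is only a consistency check, not output from Excel or SPSS. The function name is hypothetical, and n = 158 is an assumption obtained by counting the rows of the research data table.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic implied by R-squared for a model with an intercept."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / (n_obs - n_predictors - 1)
    return explained / unexplained


# R-squared (0.727) and the three predictors come from the report;
# n_obs = 158 is assumed from the number of rows in the data table.
f_value = f_statistic(0.727, n_obs=158, n_predictors=3)
print(round(f_value, 2))  # about 136.7
```

The small gap from the reported 137.021 is what one would expect from using R square rounded to three decimals.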

Interpretations
Multiple R - The multiple correlation coefficient measures the strength of the
linear relationship between the dependent variable and the independent variables
taken together. It equals the correlation between the observed and predicted
values of Y, so it always lies between 0 and 1.
Multiple R obtained from the model is 0.853. Thus, we can say that the
linear relationship is strong.

R-squared (R2): It is a statistical measure that represents the proportion of the
variance of the dependent variable that is explained by the independent variable or
variables in a regression model. Whereas correlation measures the strength of the
relationship between an independent and a dependent variable, R-squared measures
to what extent the variance of one variable explains the variance of the second
variable.
R square obtained from the model is 0.727, which shows that 72.7% of the total
variation in the dependent variable is explained by the variation in the independent
variables. Thus, we can say that the model is a good fit.

Adjusted R-squared: It is an indicator of the adequacy of the model. It is a better
measure than R square as it takes into account the degrees of freedom. It is
never greater than R square. When new variables are added to the model, R square
never decreases. However, adjusted R square will increase only if the added
variables improve the fit by more than would be expected by chance.
Adjusted R Square obtained from the model is 0.722, which is close to R square (0.727)
and implies that the fit of the model is good.
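
The degrees-of-freedom adjustment can be made concrete with the usual formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The following is a minimal sketch with a hypothetical function name. It assumes n = 158 observations, which is the number of rows in the research data table, and k = 3 predictors.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared for a regression that includes an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# 0.727 is the R square reported in the model summary; n_obs = 158 is assumed.
adjusted = adjusted_r_squared(0.727, n_obs=158, n_predictors=3)
print(round(adjusted, 3))  # 0.722
```

Under these assumptions the result agrees with the reported Adjusted R Square of 0.722.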

Unstandardized coefficients

Happiness Score = 2.199 + 0.888 (Economy) + 1.697 (Family) + 1.180 (Health)

A = 2.199, indicates the mean value of the happiness score when all the independent
variables are 0.

B1 = 0.888, indicates that the mean value of Happiness Score will change by 0.888
units if GDP per capita changes by 1 unit, other factors held constant.

B2 = 1.697, indicates that the mean value of Happiness Score will change by 1.697
units if Family changes by 1 unit, other factors held constant.

B3 = 1.180, indicates that the mean value of Happiness Score will change by 1.180
units if Life expectancy changes by 1 unit, other factors held constant.

Standardized coefficients

Happiness Score = 0.313 (Economy) + 0.404 (Family) + 0.255 (Health)

B1 =0.313, indicates that a change of one Standard Deviation in the GDP per
capita will result in 0.313 Standard Deviation change in the Happiness Score.

B2 = 0.404, indicates that a change of one Standard Deviation in Family will
result in a 0.404 Standard Deviation change in the Happiness Score.

B3 = 0.255, indicates that a change of one Standard Deviation in Life
expectancy will result in a 0.255 Standard Deviation change in the Happiness Score.
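
The link between the two sets of coefficients can be sketched as follows: a standardized coefficient equals the unstandardized one multiplied by sd(X)/sd(Y). The function name below is hypothetical. Because the report does not list the standard deviations, the second part only derives the sd(X)/sd(Y) ratio implied by each reported pair of coefficients.

```python
from statistics import stdev


def standardized_beta(b, x_values, y_values):
    """Convert an unstandardized slope b into a standardized (beta) coefficient."""
    return b * stdev(x_values) / stdev(y_values)


# (unstandardized, standardized) pairs as reported in this section.
REPORTED = {
    "Economy": (0.888, 0.313),
    "Family": (1.697, 0.404),
    "Health": (1.180, 0.255),
}

# Ratio sd(X)/sd(Y) implied by each pair, since beta = b * sd(X) / sd(Y).
implied_sd_ratio = {name: beta / b for name, (b, beta) in REPORTED.items()}
for name, ratio in implied_sd_ratio.items():
    print(name, round(ratio, 3))
```

Comparing standardized coefficients puts the three factors on a common scale, which is why Family (0.404) shows the largest standardized effect of the three.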

PROBLEM DETECTION:
MULTICOLLINEARITY
Multicollinearity means that two or more of the independent variables in a regression model
have a linear relationship.

It is a phenomenon in which two or more independent variables in a multiple
regression are highly correlated, meaning that one can be linearly predicted from the
others with a substantial degree of accuracy.

This causes a problem in the interpretation of the regression results: the t-statistics
may not be able to properly isolate the unique effect of each variable, and
the confidence with which we can presume these effects to be true is reduced.

There are a number of methods to detect multicollinearity among regressors. Two of
them are the VIF (variance inflation factor) and the tolerance level (TOL).

→ VIF is the reciprocal of tolerance: VIF = 1/TOL, and equivalently TOL = 1/VIF.
If TOL approaches 1 (so VIF approaches 1), there is no multicollinearity, and if
TOL is 0, there is perfect multicollinearity.
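
The reciprocal relationship between the two measures can be checked against the values in the collinearity tables of this section. The sketch below uses hypothetical function names. It also uses the standard definition TOL = 1 − R², where R² comes from regressing one independent variable on the others; the report does not state this definition.

```python
def tolerance_from_r_squared(aux_r_squared):
    """Tolerance of a regressor, given R-squared of its auxiliary regression."""
    return 1.0 - aux_r_squared


def vif_from_tolerance(tolerance):
    """Variance inflation factor as the reciprocal of tolerance."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive (0 means perfect collinearity)")
    return 1.0 / tolerance


# (tolerance, VIF) pairs as reported in the three collinearity tables.
REPORTED = [(0.333, 3.000), (0.718, 1.393), (0.584, 1.714)]

for tolerance, reported_vif in REPORTED:
    print(tolerance, round(vif_from_tolerance(tolerance), 3), reported_vif)
```

The recomputed values differ from the reported ones only in the third decimal, which is consistent with the tolerances having been rounded.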

To test for multicollinearity, each independent variable in turn
was set as the dependent variable and a multiple linear regression was run on the
remaining independent variables.

The following results were obtained:

Coefficients (a)

Collinearity Statistics

Model                          Tolerance   VIF
1  Economy (GDP per Capita)    .333        3.000
   Health (Life Expectancy)    .333        3.000

a. Dependent Variable: Family

Since the VIF of each variable is less than 6, there is no/negligible
multicollinearity.

Coefficients (a)

Collinearity Statistics

Model                          Tolerance   VIF
1  Health (Life Expectancy)    .718        1.393
   Family                      .718        1.393

a. Dependent Variable: Economy (GDP per Capita)

Since the VIF of each variable is less than 6, there is no/negligible
multicollinearity.

Coefficients (a)

Collinearity Statistics

Model                          Tolerance   VIF
1  Family                      .584        1.714
   Economy (GDP per Capita)    .584        1.714

a. Dependent Variable: Health (Life Expectancy)

Since the VIF of each variable is less than 6, there is no/negligible
multicollinearity.

HETEROSCEDASTICITY
It is the problem of unequal variance of the error
terms. The variance of the error terms should be constant, i.e. homoscedastic.
It commonly occurs in cross-sectional data.
It can be examined by interpreting a scatter plot of the residuals. If the data points
exhibit a particular pattern, the problem of heteroscedasticity exists. If the
data points do not exhibit any pattern
and are random, the problem of heteroscedasticity does not exist.
We have incorporated the analysis of 3 charts (obtained from running the regression):
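Here, the histogram obtained is quite symmetrical, as the mode lies roughly in the
middle of the graph. The histogram of the standardized residuals therefore suggests
normality, i.e. the curve is approximately normal.
Hence, we can conclude that the residuals are approximately normally distributed,
which supports the fit of the model. This chart checks normality; the evidence on
heteroscedasticity comes from the scatter plot of the residuals.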

Here, the data points are fairly close to the 45° line, and a
moderate share of the standardized residuals lies close to it.
Hence, we can conclude that the residuals are approximately normal, which again
supports the fit of the model.

The scatter plot does not exhibit any pattern, i.e. the residuals take up random
values with no systematic change in their spread.
Hence we can conclude that there is no
heteroscedasticity.

AUTOCORRELATION
It refers to correlation between members of a series of observations ordered in
time (as in time series data) or in space (as in cross-sectional data).
This can be symbolically represented as: E(u_i u_j) ≠ 0 for i ≠ j

METHOD- DURBIN-WATSON TEST

The d statistic lies between 0 and 4 and is interpreted as follows:
• If d = 2 (approximately), there is NO AUTOCORRELATION
• If d < 2, there is POSITIVE AUTOCORRELATION
• If d > 2, there is NEGATIVE AUTOCORRELATION
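
The d statistic behind these rules is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below uses hypothetical function names, and the two residual sequences are made up purely to illustrate the extremes; they are not residuals from this study. The approximation d ≈ 2(1 − ρ) is a standard large-sample result.

```python
def durbin_watson(residuals):
    """Durbin-Watson d statistic for an ordered sequence of residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def implied_rho(d):
    """First-order autocorrelation implied by d, using d ≈ 2 * (1 - rho)."""
    return 1.0 - d / 2.0


# Illustrative sequences only (not from the study):
print(durbin_watson([1, 1, 1, 1]))      # 0.0 -> strong positive autocorrelation
print(durbin_watson([1, -1, 1, -1]))    # 3.0 -> negative autocorrelation

# The value reported in the model summary of this report is 1.333.
print(round(implied_rho(1.333), 4))     # 0.3335
```

A value of d below 2 therefore corresponds to a positive implied ρ, and a value above 2 to a negative one.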

Model Summary (b)

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .853 (a)   .727       .722                .6035461                     1.333

a. Predictors: (Constant), Health (Life Expectancy), Family, Economy (GDP per Capita)

b. Dependent Variable: Happiness Score

The value of the Durbin-Watson statistic is 1.333.

We observe that our d-statistic (1.333) is lower than 2. This shows the
presence of positive autocorrelation in our model.

CORRECTION/RECTIFYING THE PROBLEM OF POSITIVE AUTOCORRELATION

CONCLUSION
a) In our research on the determinants of Happiness Score, we have studied
the impact of three factors, namely Economy (GDP per capita), Family and
Health (Life expectancy).
These factors have been studied over a pool of 158 countries, the number of
observations listed in the research data. We used
multiple regression analysis to measure this impact.
b) Since all the coefficients in our model have p-values less than
0.05, we can reject each null hypothesis. Thus, all the partial
regression coefficients are significant.
H0: Happiness Score is not significantly affected by Economy (GDP per
capita), Family and Health (Life expectancy), if other factors are held
constant.
H1: Happiness Score is significantly affected by Economy (GDP per
capita), Family and Health (Life expectancy), if other factors are held
constant.

c) Also, F(observed) = 137.021 and its significance value is approximately 0,
which is less than 0.05, so we reject
the null hypothesis. Thus, our model is overall significant.

d) R square obtained from the model is 0.727, which shows that 72.7%
of the total variation in dependent variable is explained by the variation
in independent variables. Thus, we can say that the model is a good fit.

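e) Based on the VIF values and the residual plots, there is no problem of
multicollinearity among the independent factors or of heteroscedasticity in the
regression model. The Durbin-Watson statistic (1.333), however, indicates
positive autocorrelation.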

RECOMMENDATIONS
~More factors could have been included to get a better regression
analysis, and hence a better R square.

~More time could have been devoted to thorough research and to a
better understanding of the factors.

~Factors could have been studied over a period of at least 5 years
to understand the changes that happened over a longer stretch of
years.

~There are various ways of dealing with autocorrelation. Some of the most
common are:
1. Try to find out if the autocorrelation is pure autocorrelation and not
the result of mis-specification of the model. Sometimes we observe
patterns in residuals because the model is mis-specified (that is, it has
excluded some important variables) or because its functional form is
incorrect.
2. Include dummy variables in the model.
3. Use Estimated Generalized Least Squares (EGLS).
4. Include a linear (trend) term if the residuals show a consistent
increasing or decreasing pattern.
5. In large samples, we can use the Newey–West method to obtain
standard errors of OLS estimators that are corrected for
autocorrelation.

BIBLIOGRAPHY
Text Sources:
● Statistical Analysis in Microsoft Excel and SPSS by Mrs. Riyanka Jain
● Basic Econometrics by Damodar N. Gujarati
Web Sources:
● https://en.wikipedia.org/wiki/World_Happiness_Report
● https://worldhappiness.report/ed/2020/
