Can Well-Being Be Predicted? A Machine Learning Approach: Max Wilckens, Margeret Hall


Can Well-Being be Predicted?

A Machine Learning Approach

Max Wilckens, Margeret Hall


Karlsruhe Service Research Institute, Karlsruhe Institute of Technology, Germany
[email protected]

Abstract
The study of well-being is an interdisciplinary field integrating aspects from psychology, economics, and the social and political sciences. To date, research still struggles to provide a robust definition of well-being that explains its variation and its dependencies on, e.g., personality, demographics, way of life and life events. For this study, several machine learning techniques, including kernel smoothing algorithms, neural networks and feature selection methods, are applied in order to expand the structural understanding of subjective well-being and its dependencies. Well-being data from a four-week study sequence including 362 participants is analyzed for non-parametric structures across thirteen predictor variables, including the big five personality traits, a maximizer-satisficer scale, a fairness measure and six demographic variables. Neuroticism, extraversion and conscientiousness were confirmed as the most important predictors. Although the identified non-parametric structures do not lead to significantly higher prediction accuracy, 54% of the well-being variance between participants was explained by the predictor set. Surprisingly, the cross-validated machine learning algorithms were not found to achieve higher accuracy than the linear model.

Keywords: well-being, big-five personality, maximizer-satisficer, fairness, machine learning, predictive analytics, non-parametric regression, neural network, extreme learning machine, k-nearest neighbor, feature selection, lasso regression, lazy lasso regression, small data

Electronic copy available at: http://ssrn.com/abstract=2562051


Contents

List of Figures
List of Tables
List of Abbreviations

1 Introduction
1.1 Background
1.2 Purpose of the Study
1.3 Layout of the Study

2 Literature Review
2.1 Defining Well-Being
2.1.1 Perspectives on Well-being
2.1.1.1 Eudemonia and Psychological Well-being
2.1.1.2 Hedonism and Subjective Well-being
2.1.1.3 Economic Well-being
2.1.2 Well-Being Baseline
2.1.3 The Influence of Positive and Negative Affect
2.1.4 Equilibrium Theory
2.2 Determinants of Well-Being
2.2.1 Demographics / One’s Background
2.2.2 Personality Traits
2.2.3 Life Situation
2.3 Measuring Well-Being
2.4 Machine Learning on Well-Being

3 Research Questions

4 Methodology
4.1 Participants
4.2 Apparatus and Materials
4.3 Data Retrieval Procedure
4.4 Analysis Procedure
4.4.1 Comparison of Datasets
4.4.2 Algorithms and Methods used
4.4.3 Cross Validation and Testing

5 Results
5.1 Descriptive Analysis
5.2 Generalized Linear Model
5.3 Kernel Smoothing Algorithms
5.3.1 K-nearest Neighbor
5.3.2 Non-parametric Regression
5.3.2.1 LOESS
5.3.2.2 Splines
5.3.2.3 npreg
5.3.3 Support Vector Machines (SVM)
5.4 Neural Network Algorithms
5.4.1 Stuttgart Neural Network Simulator (SNNS)
5.4.2 Extreme Learning Machine (ELM)
5.5 Feature Selection Algorithms
5.5.1 Lasso and Elastic Net Regression
5.5.2 Lazy Lasso Regression
5.6 Accuracy Comparison

6 Evaluation
6.1 Hypotheses
6.1.1 Existence of Well-Being Baseline (Hypothesis 1)
6.1.2 Predictability of Well-Being Baseline (Hypothesis 2)
6.1.3 Characterization of Well-Being Trajectory (Hypothesis 3 & 4)
6.2 Further Findings
6.3 Limitations

7 Implications and Further Research

References
List of Figures
1.1 J-Curve

2.1 Objective vs. subjective happiness
2.2 Stocks and flows framework
2.3 SWB homeostasis
2.4 Well-being equilibrium definition
2.5 Sigmoid / logistic regression function
2.6 Neural network example (1 hidden layer with 5 hidden nodes)
2.7 Different kernel shapes
2.8 Kernel-smoothing example: Epanechnikov kernel with local linear regression
2.9 Support Vector Regression: Fitting inside the kernel

4.1 Participants’ demographic structure
4.2 Independent and dependent variables
4.3 Anova Type-I
4.4 Caret cross-validation procedure

5.1 HFI distribution and density
5.2 Correlation matrix (absolute values)
5.3 GLM fitted with caret package
5.4 Variable importance in GLM (t-statistic)
5.5 GLM Regression coefficients with standard error bars
5.6 GLM for participants’ in person well-being variance
5.7 RMSE for k-nearest neighbor using Euclidean metric
5.8 Variance importance for k-nearest neighbor using Euclidean metric
5.9 RMSE for gamLoess
5.10 RMSE for gamSplines
5.11 RMSE for npreg with least-squares cross-validation (left) and Kullback-Leibler cross-validation (right)
5.12 RMSE density plot for 10-fold cross-validation runs (kernel bandwidth selection upon least-squares cross-validation)
5.13 npreg predictors’ partial regression influence
5.14 npreg accuracy for reduced predictor dimensionality
5.15 npreg accuracy for reduced predictor dimensionality
5.16 npreg predictors’ partial regression influence for reduced predictor dimensionality (1)
5.17 npreg predictors’ partial regression influence for reduced predictor dimensionality (2)
5.18 RMSE accuracy for support vector machine
5.19 RMSE accuracy for feedforward neural network
5.20 RMSE accuracy for extreme learning machine (ELM)
5.21 Cross-validation results for extreme learning machine (ELM) for 12.000 hidden nodes
5.22 RMSE accuracy for ELM in trajectory prediction problem
5.23 Lasso regression path (left) and RMSE accuracy (right)
5.24 Lazy Lasso Algorithm
5.25 RMSE accuracy for lazy lasso regression
5.26 Lazy lasso predictor weights
5.27 Lazy lasso: percentage of local lasso regressions with predictor coefficient unequal to zero
5.28 Accuracy comparison between deployed algorithms for well-being baseline prediction
5.29 RMSE accuracy gains with increased number of training points for neural network
5.30 RMSE accuracy gains with increased number of training points for npreg
List of Tables
2.1 Big Five Trait Taxonomy - Factor Definition (based on John, Naumann, & Soto, 2008)

4.1 Participants descriptive statistics
4.2 Descriptive statistics for dataset comparison
4.3 Applied algorithms

5.1 Weekly HFI correlation matrix
5.2 Explained variance of weekly HFI by the HFI average
5.3 Standard Deviation between and within participants’ HFI trajectory
5.4 Predictor importance by group
List of Abbreviations
R²       Coefficient of Determination
RMSE     Root Mean Squared Error
AIC      Akaike Information Criterion
ANOVA    Analysis of Variance
CV       Cross-validation
ELM      Extreme Learning Machine
GAM      Generalized Additive Model
GLM      Generalized Linear Model
HFI      Human Flourishing Index
LOESS    Locally Weighted Scatterplot Smoothing
OLS      Ordinary Least Squares
PWB      Psychological Well-being
SD       Standard Deviation
SVM      Support Vector Machine
SWB      Subjective Well-being
1. Introduction

1.1 Background

What determines whether we judge a glass to be half full or half empty? The determinants of human well-being have been addressed extensively by research in recent years. The possibility of defining well-being as one of the most governing aspects of one’s life attracts researchers to identify the factors influencing well-being and its impact on life, economy and society. The list of statistically significant correlational findings is already extensive¹, reaching from one’s sex life and health to personality, employment and education, to pets in households (Veenhoven, 2013). Even though correlational findings do not in themselves imply causality, there is a common understanding that personality and basic demographic characteristics have a determining or moderating influence on well-being, whereas variables such as health, longevity or productivity have been identified as positive outcomes of well-being (Diener, 2013). Owing to the high degree of complexity, none of the studies conducted thus far has been able to describe computable dependencies that allow well-being to be predicted.

¹ The World Database of Happiness by Veenhoven (2013) lists 12564 correlational findings on happiness found in 1203 publications.

This study aims to predict individual well-being. To do so, it focuses on personality traits and basic demographic variables such as age, employment and gender. Being able to predict individual well-being from personality traits makes it possible to isolate other well-being factors and analyze their influence specifically. If personality and demography account for the foundational base level of well-being, additional influences such as certain work environments or life situations could be researched independently. Consequently, the interdependencies between personality and well-being could be used to measure and monitor well-being more independently of conventional methods such as well-being questionnaires, and could aid in finding a robust definition of well-being.

Well-being, or happiness as it is referred to in parts of the literature, has an extensive influence on human lives. Happy people have been found not only to live longer and healthier lives but also to be more productive and successful (Diener & Chan, 2011; Diener & Tay, 2013; Lyubomirsky, King, & Diener, 2005). Happiness is in itself a desired feeling that makes us confident and satisfied with the life we live. Diener (2013, p. 665) argued that well-being “broadly mirrors the quality of life in societies beyond economic factors and thus reflect social capital, a clean environment, and other variables” (see also Diener & Seligman, 2004). Page and Vella-Brodrick (2008) identified close links between well-being and employee performance, with consideration of employee turnover rates, and proposed well-being as a “valuable tool” (p. 454) to measure the return on investment of employee enhancement programs and to “track employee reaction to workplace changes” (p. 455). Moreover, high well-being values have been shown to be beneficial for turnover, customer loyalty, productivity and profitability (cf. Harter, Schmidt, & Keyes, 2003).

The social dimension of well-being is closely linked to social stability. In 1962 Davies outlined his theory of revolution, arguing that revolutionary forces are based on a “dissatisfied state of mind rather than the tangible provision of adequate or inadequate supplies of food, equality, or liberty” (p. 6) within the majority of a society. Figure 1.1 illustrates the gap between personal expectations and reality, which leads to dissatisfaction and finally to revolution if the gap becomes too big (Davies, 1962). The described state of dissatisfaction is similar to low well-being values, since subjective well-being is mainly determined by life satisfaction. High well-being is therefore important not only for individuals themselves but also for well-functioning societies.

[Figure 1.1: J-Curve (based on Davies, 1962, p. 6). Satisfaction plotted over time for expectations and reality, marking an acceptable and an unacceptable gap between the two.]

Consequently, well-being is increasingly discussed as an indicator for any kind of human environment, such as societies, institutions, companies, social interactions, services or even products. Measuring well-being makes it possible to identify the effects of social change within societies on people’s life satisfaction and happiness, and might therefore serve as a measure of quality of life. Bhutan was the first country to establish a national well-being measure alongside economic indicators such as the gross national product (Thinley, 2011). Political movements in several countries call for similar well-being indicators in order to support political decisions that lead not necessarily to economic wealth but to higher life satisfaction and well-being in society². In order to develop, understand and interpret well-being measures, deeper insights into the determinants of well-being and its internal structure are required.

² See e.g. Stiglitz, Sen, and Fitoussi (2009) for well-being indices in France and Waldron (2010) for subjective well-being measures by the UK Office for National Statistics.

Machine learning is used in this study to add to the scientific discussion about the definition of well-being, its predictors and the characteristics of the dependencies. To do so, this study aims to predict well-being from the structures that machine learning analysis identifies within the dependencies. Predictive analytics aims to predict real-world variables based upon historic data. It relates to topics of ’big data,’ since predictive analytics powers the information mining of those approaches. The computational power developed in recent years allows us to analyze and find non-parametric correlations within vast and complex amounts of data. The ability to trace correlational linkages on a numeric level, which a human brain or, more generally, rule-based engines are not able to grasp, enables a new understanding of those complex contexts within datasets. Initially, machine learning algorithms were developed for data compression³ upon underlying structures, but nowadays they also allow for an interpretation of the data’s context (Hastie et al., 2009; Nilsson, 2005).

³ Compare for example mp3 and jpeg data compression techniques, email-spam detection or handwritten digit recognition (Hastie, Tibshirani, & Friedman, 2009).

Predictive machine learning algorithms not only learn from historic data by identifying the numeric linkages between variables over time and provide an understanding of the contexts; they also predict future trajectories of certain variables. These algorithms utilize the basics of statistical analysis, such as Bayes’ rule or Gaussian distributions, and scale them through repeated application by computer. Numeric approaches calculate conditional probabilities, conduct multidimensional and multilayered regressions, optimize the parameters in recursive procedures according to, e.g., the maximum likelihood principle, and thereby achieve an expressiveness that has not been reached before (cf. Heckerman, 1996). What is achieved, for example, by interaction terms within linear models is implemented natively by most machine learning algorithms using, for example, local neighborhood estimators or multilayered regressions.
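The remark that local neighborhood estimators capture natively what a linear model needs explicit interaction terms for can be illustrated with a minimal sketch. It assumes synthetic data in which the outcome is a pure interaction of two predictors; the data, the function names (`compare_models`, `predict_knn`, `fit_predict_ols`, `add_interaction`) and all parameter values are hypothetical choices for illustration and are not taken from the study or its dataset.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_predict_ols(X_train, y_train, X_test):
    """Ordinary least squares with an intercept, fitted via np.linalg.lstsq."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def predict_knn(X_train, y_train, X_test, k=10):
    """k-nearest-neighbor regression: average the outcomes of the k
    training points closest (Euclidean distance) to each test point."""
    dist = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def add_interaction(X):
    """Append the product of the two predictors as an explicit interaction term."""
    return np.column_stack([X, X[:, 0] * X[:, 1]])


def compare_models(seed=0, n_train=300, n_test=200, noise_sd=0.1):
    """Hypothetical experiment: the outcome depends only on x1 * x2 plus noise."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_train + n_test, 2))
    y = X[:, 0] * X[:, 1] + rng.normal(0.0, noise_sd, size=len(X))
    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]
    return {
        # Additive linear model: no way to represent the interaction.
        "additive_linear": rmse(y_te, fit_predict_ols(X_tr, y_tr, X_te)),
        # Linear model that is told about the interaction explicitly.
        "linear_with_interaction": rmse(
            y_te,
            fit_predict_ols(add_interaction(X_tr), y_tr, add_interaction(X_te)),
        ),
        # Local neighborhood estimator: no interaction term supplied.
        "knn": rmse(y_te, predict_knn(X_tr, y_tr, X_te)),
    }


if __name__ == "__main__":
    for name, value in compare_models().items():
        print(f"{name}: RMSE = {value:.3f}")
```

Under these assumptions the additive linear model cannot represent the interaction, whereas both the linear model with the added product term and the neighborhood estimator should reach a clearly lower test error. The sketch says nothing about real well-being data, where the abstract reports that the machine learning algorithms did not outperform the linear model.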
For social problems, such as the present well-being prediction study, machine learning is still rarely applied. However, machine learning promises valuable insights into the underlying structures that determine well-being. Besides the predominant goal of machine learning algorithms, which is to explain as much of the variance enclosed in the data as possible, a second goal arises: identifying the predominant predictors. Since social problems generally include a high number of predictors with relatively low explanatory power, profound interpretability is essential (Hainmueller & Hazlett, 2013). Consequently, this study applies a wide range of different machine learning algorithms, on the one hand to test their capabilities for the prediction of well-being and on the other to identify the underlying dependencies between demographics, personality and well-being.

1.2 Purpose of the Study

Within this study machine learning algorithms are applied to improve the understanding of correlative dependencies between well-being, personality traits and demographic data. The main intention of this study is to predict the future well-being level of individuals from the analyzed data. Moreover, the predictors’ importance and non-parametric influences are analyzed in order to identify the underlying structures and to understand each predictor’s partial influence on individual well-being and its development over time. More generally, the knowledge gained is intended to contribute to the scientific aim of a robust definition of well-being.

This study reviews first results from a feasibility study on the predictability of well-being by Hall, Caton, and Weinhardt (2013, p. 21), amends the data set and aims to underpin their findings through predictive analytics with machine learning techniques. These include kernel smoothing algorithms such as k-nearest neighbor and local linear regressions, as well as neural network approaches.

1.3 Layout of the Study

This study consists of seven chapters. A general literature review of the relevant historical research is given in the following chapter. This includes attempts to define well-being, identified determinants of well-being, the measurement of well-being and basic background on the machine learning techniques utilized in this study. Given this background, the study’s research questions are derived, explained and condensed into four hypotheses in chapter three. The fourth chapter explains the methodology by which the empirical survey is conducted and the data analyzed. Moreover, it includes an overview of the participants’ demographics, the questionnaires and the applied algorithms. The cross-validation techniques used are explained in order to provide an understanding of the validity tests conducted. Results for the applied algorithms are given in chapter five. For each analysis, a subsection first explains the applied algorithm in detail, including the definition and setting of parameters, and then outlines the results obtained. In order to evaluate the results with regard to the formulated hypotheses, chapter six aggregates the findings, before implications for further research are presented in the final, seventh chapter.
2. Literature Review

2.1 Defining Well-Being

Research has not yet come up with an agreed, finalized definition of well-being. This is especially astonishing given the long history of well-being literature. Aristotle already proposed attempts to define well-being more than 2000 years ago (Aristotle, 2002). However, over the centuries and especially in the last thirty years the attitude towards happiness has changed and new dimensions of well-being have defined the common understanding. Moreover, the evaluation of individual well-being varies from concept to concept, including psychological well-being, subjective well-being and economically calculated well-being (Diener & Suh, 1997; Frey & Stutzer, 2002; Hall et al., 2013). The absence of a unified measurement of well-being supports the lack of a single definition (Dodge, Daly, Huyton, & Sanders, 2012; Ryan & Deci, 2001; Ryff, 1989).

Dodge et al. (2012) identified historic definition attempts, but point out that these focused on the dimensions of well-being rather than on its actual definition. The existence of a well-being baseline influenced by individual life satisfaction is widely supported (Brickman & Campbell, 1971; Headey & Wearing, 1991; Veenhoven, 1984). Additionally, positive and negative affect seem to influence individual well-being (Ryff, 1989). Diener (1994) stated that well-being “comprises people’s longer-term levels of pleasant affect, lack of unpleasant affect, and life satisfaction” (p. 103).

Moreover, it has to be questioned whether well-being actually exists as an objectively measurable ’real thing’ or whether it might be a construct of several different factors, each measurable individually and all together misinterpreted as a measurable phenomenon. Nevertheless, Dodge et al. (2012) disagreed with the proposition of well-being as a construct and proposed “that well-being should be considered to be a state” (p. 226), in which the essential qualities in life are relatively stable (Headey & Wearing, 1991).

The aim is for well-being ultimately to be defined, with subjective, individually perceived well-being offering a way to measure it (Frey & Stutzer, 2002). However, considering the approaches towards a definition of well-being proposed so far is important in order to understand the dependencies found, the measurements applied and hence this study’s hypotheses regarding the nature of well-being.

2.1.1 Perspectives on Well-being

To understand well-being, it is important to differentiate between hedonistic and eudemonistic approaches to well-being. While hedonism focuses on the individual, subjective perception of happiness, eudemonism describes well-being as an objective state in life, reached not by the notion that “one is pleased with one’s life,” but if the individual has “what is worth desiring and worth having in life” (Diener, 1984; Telfer, 1980; Waterman, 1993).

2.1.1.1 Eudemonia and Psychological Well-being

The eudemonistic, normative perspective has its roots in the work of Aristotle, who rejected the hedonistic view of well-being as perceived happiness, because striving for hedonistic well-being would result in a life of gratification similar to that of “grazing animals” (Aristotle, 2002, p. 98). Instead he proclaimed well-being as an overall goal in life based on virtues and the achievement of what is desirable in life, nowadays also referred to as the feeling of personal expressiveness and self-realization (Waterman, 1993). The concept of eudemonia calls on individuals “to realize their full potentialities in order to achieve a good life” (Diener & Suh, 1997, p. 189). The eudemonic perspective is rather objective, since the standards defining ’being well’ are assumed to be universal and equal for every single individual.

Closely linked is psychological well-being (PWB), which includes indicators that assess individual well-being from different psychological perspectives. Ryff and
Keyes (1995) proposed six measures for psychological well-being, namely “positive evaluations of oneself and one’s past life (self-acceptance), a sense of continued growth and development as a person (personal growth), the belief that one’s life is purposeful and meaningful (purpose in life), the possession of quality relations with others (positive relations with others), the capacity to manage effectively one’s life and surrounding world (environmental mastery), and a sense of self-determination (autonomy)” (p. 720). Significant correlations between these measures and other measures such as life satisfaction and self-esteem were found (Ryff, 1989). The underlying aim of psychological well-being is to cover positive human functioning as an important part of well-being. Again, this is closely linked to Aristotle’s eudemonic perspective on well-being. However, when self-reports and questionnaires are used to gather psychological well-being information, the standards for being well according to psychological well-being are partially adapted to the individual’s valuation (cf. e.g. ’self-acceptance’, ’purpose in life’).

Moreover, this perspective on well-being is supported by the possibilities of objective physiological measurement. Being able to detect the brain waves responsible for positive emotions and well-being allows for a technical definition and measurement of well-being. An objective rating and a valuation-independent measurement of well-being thereby become possible. Figure 2.1 illustrates the differences between subjective and objective well-being as well as possible measures (Frey & Stutzer, 2002).

[Figure 2.1: Objective vs. subjective happiness (based on Frey & Stutzer, 2002, p. 4). Labels shown between objective and subjective happiness: physiological measures (brain waves); psychological measures (experience sampling measures, global self-reports); affect; cognition, memory.]

2.1.1.2 Hedonism and Subjective Well-being

In contrast to eudemonia, modern well-being literature revives the hedonistic perspective on well-being, as individual subjective well-being (SWB) measures have come to the fore. In this regard, well-being is defined “in terms of pleasure versus pain” (Ryan & Deci, 2001, p. 144), essentially equivalent to hedonism as initially introduced by Aristotle (2002). In contrast to Aristotle’s attitude, each individual’s perceived happiness is regarded as important, so that the individual’s relative standards are applied in measuring well-being, and the measurement is hence subjective.

Nowadays, most studies on well-being correlation employ a subjective well-being measure in order to assess well-being by the standards of the respondent (Diener, 1984). Frey and Stutzer (2002, p. 2) proposed that SWB is a “useful way out” of the well-being definition dilemma, because one can just “ask the individuals how happy they feel themselves to be.” In one of the first meta-analyses on subjective well-being, Diener (1984) concluded that subjective well-being covers happiness and life satisfaction, as well as positive affect. The eudemonic, objective well-being conditions such as health, virtue and wealth are not directly part of subjective well-being, but are seen as potential influences. These days, there is evidence in the literature that subjective well-being generally aggregates three interrelated components, namely positive affect, the absence of negative affect and overall life satisfaction (Diener, 1984). Many studies agreed with this classification and defined subjective well-being

upon these components (see e.g. Dodge et al., 2012; Veenhoven, 2010).

In order to decouple the results from short-term effects some studies apply the naturalistic experience-sampling method (ESM), in which “researchers assess respondents’ SWB at random moments in their everyday lives, usually over a period of one to four weeks” (Diener, 2000). Kahneman, Krueger, Schkade, Schwarz, and Stone (2004) proposed the day reconstruction method (DRM) for the measurement of SWB. Different approaches on well-being measures are discussed in section 2.3.

Even if contrarily defined by Aristotle (2002), eudemonia or personal expressiveness is nowadays seen as one possible track to achieve hedonistic well-being, measured as subjective well-being (Telfer, 1980; Waterman, 1993). This finding is supported by correlational dependencies between personal expressiveness and hedonic enjoyment identified by Waterman (1993). However, for the understanding of the feeling “happiness” both perspectives remain important. Ryan and Deci (2001) concluded that “the two traditions – hedonism and eudemonism – are founded on distinct views of human nature and of what constitutes a good society” (p. 143).

2.1.1.3 Economic Well-being

Besides PWB and SWB, economic well-being refers to external measures upon economic indicators, including for example income, wealth, social security and safety. It is based on the assumption that certain levels of these economic measures allow individuals to achieve personal fulfillment, which again results in well-being. According to the new welfare economics individual choices are based on maximizing utility in order to achieve personal well-being. Utility, initially introduced as a cardinal value, was then interpreted as an ordinal measure indicating personal preferences for maximizing life satisfaction (cf. Frey & Stutzer, 2002). However, recent developments show that these self-concerned preferences might not be sufficient to explain changes in personal well-being (cf. Ng, 1997). It is for example argued that “people are not always able to choose the greatest amount of utility for themselves” (Frey & Stutzer, 2002, p. 22). Furthermore, Scitovsky (1976) found in his book The Joyless Economy that most pleasures are not achieved as a result of economic decisions. Following recent progress on the measurement of well-being, it is consequently demanded that “utility should be given content in terms of happiness, and that it [...] should be measured” (Frey & Stutzer, 2002, p. 20). Moreover, the explicit measurement of well-being as utility would enable interpersonal comparison within economic theory (cf. Easterlin, 1974; Ng, 1997).

However, the conventional economic perspective on well-being is still popular, since it is relatively easy to measure, tangible and widely used to support political decision-making (cf. Diener & Seligman, 2004). Economic well-being measures are not intended to provide insights about personal well-being levels, but about well-being on a more general, averaged or even national basis. Nevertheless, several studies support a correlation between economic indicators and subjective well-being, but rather on a macroeconomic scale (see e.g. Stevenson et al., 2008). Diener and Seligman (2004) explained the importance of economic measures for well-being particularly for the “early stages of economic development, when the fulfillment of basic needs was the main issue” (p. 1), but qualified this importance for highly developed countries. This assessment is based on the ‘Easterlin paradox’ describing a saturation point in the relationship between income and well-being on a national basis (cf. Easterlin, 1974). The finding has been confirmed several times (cf. e.g. Easterlin, 1995; Oswald, 1997) and is not only observed in comparisons between countries, but also in time-series analyses for averaged national data. Economically saturated countries such as the US do not obtain higher averaged well-being when the income per capita rises over time (cf. Clark, Frijters, & Shields, 2008). The paradox is explained by the decreasing importance of additional income when basic needs are satisfied (cf. Stevenson et al., 2008).

Consequently, several countries including e.g. the US, France, Germany and the UK initiated programs measuring well-being independently from economic measures (see e.g. Enquete-Kommission, 2013; Stiglitz et al., 2009; Waldron, 2010). Nevertheless, decision-making is still mainly based on the underlying idea of economic well-being that increased wealth
and social status lead to higher well-being within the society. Economic well-being is therefore widely used as an argument in favor of economically beneficial development (cf. Diener & Seligman, 2004) and does not entirely reflect the individual self-perceived well-being. Economic indicators are therefore not directly considered within this study, but might have influenced the applied subjective well-being measures.

2.1.2 Well-Being Baseline

Subjective well-being has been found to be ‘fairly stable’ for most people most of the time (Headey & Wearing, 1991; Veenhoven, 1984). This position is based on the finding of Brickman and Campbell (1971), who identified a baseline of well-being that an individual tends to return to after positive or negative external influences. Headey and Wearing (1991) explained this with an equilibrium of “stock levels, psychic income flows and subjective well-being” (p. 49). Stock levels include the individual’s social background, networks and personality, while the flows of psychic income describe favorable and adverse events yielding satisfaction or distress.

According to the model, given in Figure 2.2, well-being in the dimensions of life satisfaction, as well as positive and negative affect, is dependent on the income flows covering all recent experiences in life and on the stock levels, which are stable in the mid-term. Stock levels are thereby responsible for the stability of well-being, while positive and negative income flows cause fluctuation around the baseline. Furthermore, the stocks also provide the resources to cope with life experiences. Headey and Wearing (1991) therefore proposed that “favorable events and high levels of psychic income are due high stock levels” (p. 61). Consequently, the stocks including background, personality and social network influence well-being in two different ways: firstly, they determine the well-being baseline; secondly, they moderate the effect of life experiences on well-being (see also Figure 2.2; Headey & Wearing, 1991).

Figure 2.2: Stocks and flows framework (based on Headey & Wearing, 1991, p. 56)
[Diagram labels: Stocks: social background (sex, age, socio-economic status), personality (extraversion, neuroticism, openness), social network (intimate attachments, friendship network). Flows / psychic income: favorable events which yield satisfaction (income gains), adverse events which yield distress (income loss). Subjective well-being: life satisfaction, positive affect, negative affect.]

As a result each individual has their own level of subjective well-being depending on individual stocks and flows. The dependency on personality and social background provides each individual with different well-being values even if they experienced the exact same favorable or adverse events. Headey and Wearing (1991) concluded that the model “account for stability and change in subjective well-being in the medium term; say, five to ten years” (p. 66), but also that stock levels (e.g. personality) may change in the long term, leading to changes of the well-being baseline.

Several other authors have empirically validated the baseline theory, nowadays also referred to as the set-point theory. For example, Suh, Diener, and Fujita (1996) found within a study of 115 participants “that only life events during the previous 3 months influenced life satisfaction and positive and negative affect” and that their “magnitude drops quickly afterward” (p. 1095). Cummins (2009) described the essence of the theory as “a self-correcting process that maintains stability around set-points that differ between individuals” and that “SWB is a neurologically maintained in a state of dynamic equilibrium” (p. 4).

Nevertheless, Fujita and Diener (2005) also had to limit the idea of a well-being baseline as they analyzed data from a representative 17-year study from Germany. It was found that the life satisfaction in 9% of the respondents had changed more than 3 points on a 10-point scale from the first to the last five years on average. They concluded that “life satisfaction can and does change for some people, even in the face of significant stabilizing factors such as heritable disposition” (p. 162). They also supported the general idea of “a ‘soft baseline’ for life satisfaction,
with people fluctuating around a stable set point that nonetheless does move for about a quarter of the population” (p. 162).

2.1.3 The Influence of Positive and Negative Affect

Diener, Sandvik, and Pavot (1991) already argued that “well-being can be equated with the relative amount of time a person experiences positive vs. negative affect” (p. 136). The intensity of the affects seems to have less impact (see also Larsen, Diener, & Emmons, 1985; Lyubomirsky et al., 2005). Following the idea of a “set-point” for well-being, Cummins (2009) stated that the theory does “not attempt to account for the nature of the relationship between SWB in dynamic equilibrium, and other demographic and psychological variables” (p. 4). To cope with this, Cummins (2009) proposed a certain process of SWB management for positive and negative affects, called SWB homeostasis.

The process addresses the question of which level of affect has what impact on SWB. A high resilience of the participant’s SWB against moderately challenging life conditions has been found, leading to a wide defensive range, in which the affects have only a small influence on SWB, until the challenge becomes too strong and SWB drops out of the homeostatic defense range (see Figure 2.3). Cummins (2009) concluded if “this condition is chronic, people experience depression” (p. 15).

2.1.4 Equilibrium Theory

Dodge et al. (2012) proposed an integrated approach towards a definition of well-being, containing the baseline theory as well as the concepts of positive and negative affect. They identified well-being as a state reached when the individual’s resources fit with the individual’s challenges, so that the equilibrium is stable. The resources and challenges are thereby influenced by long-term and short-term changes in the psychological, social and physical domains. Dodge et al. (2012) described well-being as stable according to their model when “individuals have the psychological, social and physical resources they need to meet a particular psychological, social and/or physical challenge” (p. 230). If challenges and resources are out of balance, the individual well-being drops (see Figure 2.4 for a graphical representation).

Figure 2.3: SWB homeostasis (based on Cummins, 2009, p. 5)
[Plot: SWB (vertical axis, ticks at 80, 70 and 50) against the strength of the challenging agent (horizontal axis, from no challenge to very strong challenge). The dominant source of SWB control shifts from the set-point, through homeostasis (defensive range), to the challenging conditions. An upper and a lower threshold bound the set-point range, within which the homeostatic defense is strong.]
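
The shape of the homeostasis curve in Figure 2.3 can be illustrated with a short sketch. The following Python function is an assumption made for illustration only and is not Cummins' (2009) model: it scales the strength of the challenging agent to a number between 0 and 1, borrows the values 70 and 50 from the axis ticks of the figure, and assumes a set point of 75 and a hypothetical defense limit of 0.6. Within the defensive range SWB declines only slightly; beyond it SWB drops steeply.

```python
def swb_under_challenge(challenge, set_point=75.0, lower_threshold=70.0,
                        defense_limit=0.6, floor=50.0):
    """Illustrative piecewise-linear SWB response to a challenging agent.

    All parameter values are assumptions chosen for illustration:
    challenge is scaled to [0, 1], and defense_limit marks the hypothetical
    strength at which the homeostatic defense is overwhelmed.
    """
    if not 0.0 <= challenge <= 1.0:
        raise ValueError("challenge must lie between 0 and 1")
    if challenge <= defense_limit:
        # Defensive range: affect has only a small influence on SWB.
        share = challenge / defense_limit
        return set_point - share * (set_point - lower_threshold)
    # Beyond the defensive range: SWB falls out of the set-point range.
    share = (challenge - defense_limit) / (1.0 - defense_limit)
    return lower_threshold - share * (lower_threshold - floor)


if __name__ == "__main__":
    for strength in (0.0, 0.3, 0.6, 0.8, 1.0):
        print(f"challenge={strength:.1f}  SWB={swb_under_challenge(strength):.1f}")
```

With these assumed parameters, SWB falls by only five points across the whole defensive range and by a further twenty points once the challenge exceeds the defense limit, which mirrors the qualitative statement that affect has a small influence on SWB until the challenge becomes too strong.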


Figure 2.4: Well-being equilibrium definition (based on Dodge et al., 2012, p. 230)
[Diagram: well-being balanced between resources (psychological, social, physical) and challenges (psychological, social, physical).]

To summarize, well-being definitions described in literature share the common idea of an individual level of well-being determined by personality, social and physical factors. Those determinants influence the general well-being level on the one hand, and oppose challenging conditions and corresponding affect on the other, in order to preserve the individual well-being level. Identified determinants for well-being are hence outlined in the following chapter.

2.2 Determinants of Well-Being

Numerous studies aim to identify the most important determinants of well-being. Even more researchers have identified correlational linkages to well-being within their research (Veenhoven, 1984). For example, Sheldon and Hoon (2006) confirmed their hypothesis that psychological need-satisfaction, a positive Big Five trait profile, good personal goal-progress, high self-esteem, positive social support, and a happiness-conducing cultural membership would each uniquely correlate with SWB. However, most studies agreed that “mutual interaction must be taken in account” and that “causation may go in both directions” (Frey & Stutzer, 2002, p. 103). Ryff and Keyes (1995) identified in their study autonomy, environmental mastery, personal growth, positive relations with others, purpose in life and self-acceptance as key determinants for well-being. Diener, Suh, Lucas, and Smith (1999) reviewed the last thirty years of SWB research and predicted that “progress will have been even more rapid [thirty years from now] than it has been in the past three decades” (p. 295). They drew a picture of their ‘happy person’ and concluded that the “person is blessed with a positive temperament, tends to look on the bright side of things, and does not ruminate excessively about bad events, and is living in an economically developed society, has social confidants, and possesses adequate resources for making progress toward valued goals” (p. 295). This section reviews the most relevant determinants in the three categories demographics, personality traits and the way of life in order to extend Diener’s picture (see Diener et al., 1999), but it does not claim completeness in this regard.

2.2.1 Demographics / One’s Background

One of the most discussed determinants of well-being in recent years is the correlation of well-being and age identified by Blanchflower and Oswald (2008). During their study they examined 500,000 randomly sampled Americans and Europeans from the General Social Surveys of the United States and the Eurobarometer Surveys. They found a robust U-shape, reaching the minimum of well-being in middle age (see also Clark & Oswald, 2006). This finding is also supported by Stone, Schwartz, Broderick, and Deaton (2010) and Blanchflower (2001). Frey and Stutzer (2002, p. 53) added that “the old have lower expectation and aspirations,” so that “the gap between goals and achievement is smaller” and the perceived life satisfaction is consequently higher. They even reported older people to be better adjusted “to their conditions” and therefore happier. This would support a positive correlation of well-being with age. Age is therefore considered to be important for our predictive analytics approach.

Related is the finding that different generations, also referred to as cohorts, are characterized by different average well-being levels. Blanchflower and Oswald (2008) traced this finding to the circumstances, good or bad times, the cohorts experienced in their life. Interestingly, they found evidence “that successive American birth-cohorts have become progressively less happy between 1900 and today” (p. 1740).

Less important, but still discussed in research, is the linkage between well-being and gender. However, there is no conclusive evidence of a significant correlation between sex and well-being (cf. Diener & Lucas, 1999). This finding is especially astonishing, since
women suffer significantly more often from depression and unpleasant affect (cf. for a literature review Fujita, Diener, & Sandvik, 1991). Diener and Lucas (1999, p. 292) identified one possible explanation: women experience “both positive and negative emotions more strongly and frequently than men.” The finding is based on research by Fujita et al. (1991), in which sex accounted for only 1% of the well-being variance, but for 13% of the variance of the intensity of positive and negative affect.

Further determinants such as the ethnic group and religion have been tested for correlation. Regarding religion, Frey and Stutzer (2002, p. 59), for example, summarized that “the effect is not large.” Nevertheless, many studies found significant positive influences (cf. Diener et al., 1999; Dolan, Peasgood, & White, 2008). Ellison (1991, p. 80) suggested that the “positive influence of religious certainty on well-being [...] is direct and substantial: individuals with strong religious faith report higher levels of life satisfaction, greater personal happiness, and fewer negative psychosocial consequences of traumatic life events.” According to the study, religion was found to explain 5% to 7% of the well-being variance. Moreover, Diener et al. (1999) concluded that religion “may provide both psychological and social benefit,” since it provides a “sense of meaning in daily life” and furthermore offers a “collective identity and reliable social networks” (p. 289).

Regarding ethnic differences within one country like the US it was found that the “gap between the happiness of the white and the black populations has narrowed” (Frey & Stutzer, 2002, p. 56). The authors traced this back to reduced discrimination. However, differences between ethnic groups can still be observed. Luttmer (2005) for example found that Hispanics show significantly higher well-being values than other ethnic groups. Moreover, whites tend to show higher levels of well-being than African Americans (cf. Dolan et al., 2008). However, in this study religion and ethnic groups are not considered directly, as the influences found are comparatively small. Nevertheless, regional differences, which embody those influences at least partially, are accounted for. Diener, Suh, Smith, and Shao (1995) found regional differences between the Pacific Rim countries (e.g. Japan, China, Korea) and the US, which were explained by “differences in the norms governing the experience of emotion [...] due to affective regulation” (p. 7). Moreover, regional differences have been found between European countries as well as in worldwide comparisons (cf. Deaton, 2007). However, for the latter, results were found to highly correlate with the national gross domestic product per capita. To summarize, a variable for the location is included within the analysis. Location refers to the cultural system the participant lives in and therefore accounts for cultural, geographic and ethnic differences.

2.2.2 Personality Traits

Personality determines well-being on several levels. DeNeve and Cooper (1998) conducted a meta-analysis and identified 137 distinct personality constructs correlating with subjective well-being. The importance of personality traits for well-being is explained with the top-down theory of well-being assuming that there is a “global propensity to experience things in a positive way” (Diener, 1984, p. 565), which is based on the individual’s personality (DeNeve & Cooper, 1998). Contrariwise, bottom-up theories¹ explain well-being as a “sum [of] the momentary pleasures and pains” (Diener, 1984, p. 565), which is closely linked to the theory of positive and negative affect (Diener et al., 1991). Additionally, Steel, Schmidt, and Shultz (2008) found that “considerable advances have been made in the psychobiology of both SWB and personality” (p. 139) and “that the two constructs share common physical underpinnings” (p. 139), which explain the correlational findings between personality and well-being. Besides these direct psychobiologic linkages, it is argued, “personality may help create life events that influence SWB” (p. 139). An often replicated finding in this regard is the linkage between sociability, a facet of extraversion, and positive affect as a component of well-being (Steel et al., 2008). In general, extraversion² is positively correlated to well-being, while neuroticism has a negative influence (Şimşek & Koydemir, 2012; Steel et al., 2008).

¹ For a comparison of top-down and bottom-up well-being see also Headey, Veenhoven, and Wearing (1991).
² ‘Extraversion’ and ‘extroversion’ are often used interchangeably within literature.

Personality is measured and categorized in different dimensions and scales, of which the most common set is the “big five personality trait taxonomy” initially proposed by McCrae and Costa Jr. (1985) upon their previously published NEO (Neuroticism, Extraversion, Openness) model and the five personality factors initially found by Norman (1963). The factors originate from a set of 16 personality factors identified by Cattell (1947), of which the big five have been proven to be replicable in many other studies (see Goldberg, 1993). The big five scale measures personality in five dimensions, namely extraversion vs. introversion, agreeableness vs. antagonism, conscientiousness vs. lack of direction, neuroticism vs. emotional stability and openness vs. closeness to experience (John et al., 2008; John & Srivastava, 1999). This study utilizes the 44-item scale proposed by John, Donahue, and Kentle (1991) with a five-point scale for each item. Each item adds to one of the five factors; some of the items are reverse coded. For the original conceptual definition of the five factors see Table 2.1.

Table 2.1: Big Five Trait Taxonomy - Factor Definition (based on John et al., 2008)

Factor labels: Conceptual definition

Extraversion (Energy, Enthusiasm): Implies an energetic approach toward the social and material world and includes traits such as sociability, activity, assertiveness, and positive emotionality.

Agreeableness (Altruism, Affection): Contrasts a prosocial and communal orientation toward others with antagonism and includes traits such as altruism, tender-mindedness, trust, and modesty.

Conscientiousness (Constraint, Control of impulse): Describes socially prescribed impulse control that facilitates task- and goal-directed behavior, such as thinking before acting, delaying gratification, following norms and rules, and planning, organizing, and prioritizing tasks.

Neuroticism (Negative Emotionality, Nervousness): Contrasts emotional stability and even-temperedness with negative emotionality, such as feeling anxious, nervous, sad, and tense.

Openness (Originality, Open-Mindedness): Describes the breadth, depth, originality, and complexity of an individual’s mental and experiential life.

Although different views regarding the importance of each personality trait occur, extraversion, agreeableness and neuroticism have been found to be the most important determinants of well-being (DeNeve & Cooper, 1998; Haslam, Whelan, & Bastian, 2009; Steel et al., 2008). DeNeve and Cooper (1998) proposed that neuroticism, or respectively emotional stability, is the most important determinant of those three. This finding is also supported by Vittersø (2001). In order to perform predictive analytics on well-being the big five must be considered as important predictors.

Another dimension of personality to be considered is the differentiation between maximizing and satisficing individuals. Nenkov, Morrin, Ward, Hulland, and Schwartz (2008) developed a short form of the maximization scale, initially introduced by Schwartz et al. (2002). The question addressed, whether “people [can] feel worse off as the options they face increase,” is related to well-being (Schwartz et al., 2002, p. 1178). It has been found “that maximizers reported significantly less life satisfaction, happiness, optimism, and self-esteem, and significantly more regret and depression, than did satisficers.” However, it is also doubted that maximizing always prevents individuals from being well. In order to address this question, more advanced research exceeding correlational analysis has to be conducted.

The third psychometric measure included in this study is a fairness test. The questions are based on research by Schmitt and Dörfel (1999), who found a negative correlation between procedural injustice and psychometric well-being. The correlational findings were moderated by justice sensitivity, which is referred to as fairness in this study. Schmitt, Gollwitzer, Maes, and Arbach (2005) refined the findings as they analyzed justice sensitivity from three perspectives (victim, observer and perpetrator). In order to measure the overall fairness sensitivity for participants, this study averages a simplified version of the questionnaire by Schmitt et al. (2005) for all three perspectives.

In conclusion, many studies verify the importance of personality as a determinant of well-being. Consequently, the big five trait taxonomy as well as the maximizer vs. satisficer and the fairness score are included in this study in order to cover a broad range of personality factors.

2.2.3 Life Situation

Besides the rather static predictors personality and demographics, background decisions and circumstances in life have been found to be significantly correlated to well-being including life satisfaction as well as positive and negative affect.

Firstly, employment has been found to be important. McKee-Ryan, Song, Wanberg, and Kinicki (2005) reviewed 104 studies and concluded “unemployed individuals had lower psychological and physical well-being than did their employed counterparts” (p. 67). Frey and Stutzer (2002) supported the finding, explaining, “job satisfaction is a crucial part in life satisfaction” (p. 107). The measured drop in SWB is not only due to income losses, but also due to social effects, because “not having work leads to isolation, which makes it difficult or impossible to lead a satisfactory life” (p. 107). Lucas, Clark, Georgellis, and Diener (2004) continued to support the set-point theory, identifying an influence of employment as “individuals first reacted strongly to unemployment and then shifted back toward their former (or ‘baseline’) levels of life satisfaction” (p. 2). However, on average they found that “individuals did not completely return to their former levels of life satisfaction, even after they became re-employed” (p. 2). Unemployment influences the well-being baseline in the long run, even if “considerable individual differences in reaction and adaptation to unemployment” were found (Lucas et al., 2004, p. 2). Blanchflower (2001) supported this finding, but also outlined moderating geographic and demographic influences on this correlational finding. Differences between the examined countries in East Europe and the individuals’ age were observed. Moreover, Frey and Stutzer (2002, p. 101) reminded that “individuals tend to evaluate their own situation relative to others,” so that “both, the psychic and the social effects are mitigated” when “unemployment is seen to hit many persons one knows or hears of.” This finding is supported by Shields and Price (2005), who analyzed the UK general health questionnaire and found that the effect of individual unemployment is neutralized in areas with the highest employment deprivation values (> 22%). In those areas the average unemployed person was “estimated to have at least the same level of psychological well-being as an equivalent employee” (p. 531). The mitigating effect is not only found for geographic areas, but also for partners. Employees having an unemployed partner show significantly lower well-being levels, whereas having an employed partner increases the well-being of the unemployed partner (cf. Clark, 2003). Dolan et al. (2008) argued in their literature review that those impacts result from “the extent to which the individual can substitute other activities for work, belong to non-work based social networks and are able to legitimize their unemployment” (p. 102). Generally, Frey and Stutzer (2002) pointed out the importance of one’s reference group for the influence of unemployment on well-being and concluded that “unemployed people’s well-being [...] depends on the strength of the social norm to work” (p. 102).

Closely linked to this insight are the findings on income. Contrary to the ‘Easterlin paradox’ (see Easterlin, 1974), which suggests well-being to be independent from the income per capita as a measure of the society’s economic development after a certain saturation point, Stevenson et al. (2008) found a “clear positive link” and “no evidence of a satiation point beyond which wealthier countries have no further increases in subjective well-being” (p. 1). Nevertheless, it has to be taken into account that the studies refer to countrywide averages and do not assess the dependency on an individual level. Stevenson and Wolfers (2013) refined their previous conclusion from 2008 and added in-country analyses. They found a roughly linear-log relationship, but still rejected the existence of a satiation point or a “critical income level beyond which the well-being–income relation-
ship is qualitatively different” (p. 598). Frey and toward their goals or to adapt to changes in the world
Stutzer (2002) supported the finding of positive cor- around them” (p. 293). Nevertheless, it is also seen
relation, but propose the existence of an aspiration that education raises expectations as well as aspira-
level, from which well-being is constant with increas- tion and might therefore lower SWB due to a less
ing income. Similar results have recently been found by Jorm and Ryan (2014), who analyzed eight research databases and found an increase of subjective well-being with increasing income per capita on a national basis, even if these gains decrease for richer countries. Poverty affects well-being when it affects basic needs, but once those are satisfied the linkage becomes less significant and more complex. It has also been found that income is judged in relation to one’s social environment, so that high income raises well-being only when it is comparably high in relation to others (Frey & Stutzer, 2002, p. 85f.). Clark et al. (2008) supported this finding and argued that the macroeconomic finding by Easterlin (1974) is consistent with the positive correlations found between income and subjective well-being on an individual level, when relative income terms are used to explain the gains in utility. Diener et al. (1999) concluded that “wealthy people are only somewhat happier than poor people in rich nations, whereas wealthy nations appear much happier than poor ones.”

Education is a further variable for which an influence has been found, even if the correlational findings seem to correlate significantly with those for income (cf. e.g. Clark, 2003; Diener, Sandvik, Seidlitz, & Diener, 1993; Witter, Okun, Stock, & Haring, 1984). Higher education leads to higher income, so that it has to be questioned whether “education may be only indirectly related to well-being” (Diener et al., 1999, p. 293). However, Blanchflower and Oswald (2004) found significant positive influences of higher education levels within their US and UK studies and concluded that “education is playing a role independently of income” (p. 1371). Diener et al. (1999) summarized that “education is more highly related to well-being for individuals with lower incomes” (p. 293), but the independent influences have not yet been conclusively identified (cf. Diener et al., 1993). Moreover, the influence is found in particular for low income countries (cf. Dolan et al., 2008). Diener et al. (1999) proposed that education has an influence on well-being as it allows “individuals to make progress

satisfied life (cf. Diener et al., 1999). Because of the correlational findings between education and income, several studies accounting for both variables found negative influences of higher education on well-being (cf. Campbell, Converse, & Rodgers, 1976; Clark, 2003). Clark (2003) concluded that education either “raises expectations at the same time as outcomes” or is “endogenous, being chosen by people who are ’naturally’ more difficult to please” (p. 331). In general, the findings regarding education differ crucially depending on whether correlated variables such as income and occupational status are accounted for.

Similar research has been conducted on the linkage between success and subjective well-being. Lyubomirsky et al. (2005) argued that the correlation is found because “success makes people happy, but also because positive affect engenders success” (p. 803). The latter is also supported by Wright and Cropanzano (2000) in the field of job performance, because well-being was found to be a predictor of job satisfaction (cf. also Harter et al., 2003). While there is agreement in the literature that higher levels of well-being lead to success in life (see also Diener & Tay, 2013), it is disputed whether success results in higher well-being. For example, Samuel, Bergman, and Hupka-Brunner (2013) found no evidence for adolescents that a “lack of educational and occupational success [...] leads to a decrease in well-being as hypothesized” (p. 90). On the other hand, Diener et al. (1999) concluded that success or achievement of personal goals has a positive effect on subjective well-being and that people “react negatively when they fail to achieve goals” (p. 284). However, they also emphasized that those “goals serve as an important reference standard for the affect system” (p. 284) and may hence lead to higher or lower well-being values depending on whether the goal meets the person’s needs and whether they are therefore valued (see also Brunstein, Schultheiss, & Grässmann, 1998). Moreover, it was found that “simply having valued goals independent of past success, was associated with higher life satisfaction” (p. 285), so that actual successful


achievement of those goals might not be the predominant factor (see also Emmons & Diener, 1986). Since Diener’s review in 1999 only little insight has been gained regarding the relation of well-being and success, even though the dependencies between goal achievement, valuation of goals and the influence on well-being are still insufficiently explained.

Furthermore, the linkage between well-being and health, not only mental but also physical health, has been researched extensively. Within their meta-analysis Diener et al. (1999) concluded that SWB is strongly correlated to health, but only if the health measure is self-reported. The correlation with health as objectively measured by physicians is considerably weaker. But they also note that the perception of health is influenced by the actual objective health, negative affect and the individual’s personality (Diener et al., 1999). Although objective health plays this minor role for SWB, it is rated highest when “respondents are asked to judge the importance of various domains in life” (Diener et al., 1999, p. 293). One explanation given for this contradiction is that “people appear to be remarkably effective in coping, using cognitive strategies [...] that induce a positive image of their health condition” (Diener et al., 1999, p. 287). The subjectively perceived health is consequently higher than the actual. Nevertheless, it has also been found that perceived health sometimes does not recover entirely after periods of serious illness. The more different diseases were diagnosed per test person, the weaker was the capability to recover after the drop of well-being (Diener et al., 1999). A positive correlation between health and well-being has also recently been reported by Jorm and Ryan (2014) and Lacey et al. (2008), who examined the effect of illnesses on the recalibration of the quality of life, respectively well-being, scale. Within their study with patients and non-patients no evidence could be found for a significant recalibration of the well-being scale due to the illnesses (Lacey et al., 2008). However, subjectively perceived health is considered important in this study in order to predict subjective well-being, since it reflects both the eudemonic virtue of a healthy life and the hedonistic perspective of perceived physical health.

Taking all determinants of SWB (demographic background, personality traits and life situation) together, the review shows that many different determinants explain a significant share of the SWB variance. Still, little research has actually addressed the linkages and moderating effects between those significant determinants in order to predict well-being sufficiently. Before the existing literature on predictive analytics on well-being is reviewed, the question of how to measure well-being has to be addressed. Within the reviewed studies on determinants several different measurements have been used. However, every single study persistently discussed the same question of how to measure well-being.

2.3 Measuring Well-Being

Due to the complexity of defining well-being, there is no single right answer on how to measure well-being (see e.g. Larsen et al., 1985). Nevertheless, in order to study well-being and its determinants one well-founded measuring approach has to be applied. Generally, most studies within the last two decades have used subjective well-being measures including “life satisfaction, the presence of positive mood, and the absence of negative mood” (Ryan & Deci, 2001, p. 144).

Within this study well-being is assessed via the Human Flourishing Index (HFI) developed as a conceptual framework by Huppert and So (2011). Well-being is considered as “positive mental health” (p. 837) and subjectively measured with a ten-feature questionnaire. The features are derived by inversion of “internationally agreed criteria for depression and anxiety” (p. 837), which are defined as the opposite of mental health, respectively well-being. Depression and anxiety were selected, as they have the “highest prevalence in the population” (p. 842) and also because “the other categories of anxiety disorder (phobias, OCD, PTSD) do not have a polar opposite” (p. 842). The features include feelings as well as functioning and therefore account for both hedonistic and eudemonic well-being. Namely they are stated as: “competence, emotional stability, engagement, meaning, optimism, positive emotion, positive relationships, resilience, self-esteem, and vitality” (p. 837). A panel of three psychologists and one lay person developed each feature as the mirror opposite of


a symptom of one of the mental disorders, depression or anxiety.

Huppert and So (2011) validated the framework upon the European Social Survey (ESS). They found two underlying factors explaining 43% of the variance. Additional analysis evaluating the differences between the framework and the single-item measure of life satisfaction from the ESS showed high correlation between positive emotion and life satisfaction, which can be equated with happiness and hedonic well-being. In the resulting model explaining 52% of the variance, both items loaded on a third factor, which was hence explained as the hedonic part of the HFI. Consequently, Huppert and So (2011) concluded that the first two factors including “all other items are measuring eudemonic aspects of well-being” (p. 845) and explained the two eudemonic factors as positive characteristics and positive functioning. Hence, the HFI is a framework measuring hedonic well-being in terms of positive emotion, as well as the eudemonic constructs of positive characteristics and positive functioning.

In order to propose an operational definition of flourishing, Huppert and So (2011) defined the criterion for flourishing as “having all but one of the features of positive characteristics and all but one of the features of positive functioning, together with positive emotion” (p. 845). This again is founded upon the definitions for depression and anxiety requiring at least one of the inverted features to be present. Hall, Kimbrough, Haas, Weinhardt, and Caton (2012) emphasized that the operationalized definition suggested by Huppert and So (2011) “is an excellent representation of current well-being literature and its multidimensional properties” (p. 3). A mathematical representation of the operational definition is provided:

\[ HFI = p_e \cdot I_c \cdot I_f \cdot \left( \sum_{j=1}^{n} c_j + \sum_{k=1}^{m} f_k \right) \]

\[ I_c = \begin{cases} 1, & \text{if } |P_c| \geq n - 1 \\ 0, & \text{else} \end{cases} \qquad I_f = \begin{cases} 1, & \text{if } |P_f| \geq m - 1 \\ 0, & \text{else} \end{cases} \]

\[ P_c = \{ c_j : c_j > 0 \}, \qquad P_f = \{ f_k : f_k > 0 \}, \]

where p_e stands for the question on positive emotion, c_j for the positive characteristics (vitality, resilience, optimism, happiness, self-esteem) and f_k for positive functioning items (engagement, meaning, positive relationships, competence).

Hall et al. (2012) used the HFI framework for their approach towards the “gamification” of well-being measurement. They developed a Facebook game assessing the participants’ well-being via the HFI framework. Participants were able to track their SWB over time and send their data enriched by demographic information to the authors for scientific purposes. The authors demonstrated the feasibility of measuring well-being via social networks, concluding “well-being games are a means to support the design and management of complex institutions and virtual communities” (Hall et al., 2012, p. 8).

Since the HFI intends to measure hedonic as well as eudemonic well-being, it also reduces the risk embedded in exclusively hedonic approaches. Veenhoven (1984) outlined that “people may in their heart know that they are disappointed with life, but repress that thought, because they cannot deal with its consequence” (pp. 44-45). Moreover, the diversification of well-being over different features reduces the impact of a single measure. Overstatement and misinformation, widely reported in SWB measures, are therefore less likely (Veenhoven, 1984). Veenhoven (1984) moreover suggested including “non-verbal cues” (p. 46) and “expert ratings” (p. 47) in the assessment, but for simplicity this is not considered any further within this study. He also addressed the problems of contextual and response-type biases as well as different participants’ moods, which cause the well-being questions to be answered in different ways. For “overall well-being” he concluded that “happiness can be assessed only by asking people about it” and that “self-ratings are to be preferred to ratings by others” (Veenhoven, 1984, p. 62).

Other techniques address the problem of biased information, for example with a close link of the question to a certain event or activity. The “Day Reconstruction Method” (DRM) by Kahneman et al. (2004) identified the remembered well-being for each


activity and experience of the preceding day. The participants “first revive memories of the previous day by constructing a diary consisting of a sequence of episodes. Then they describe each episode by answering questions about the situation and about the feelings that they experienced” (p. 1776). The review of the previous day causes recent memories to lose dominance, so that errors and biases of recall are reduced (Kahneman et al., 2004). The survey part of this method is based on the experience sampling method (ESM) developed by Diener (2000), as feelings in different situations are aggregated toward an overall well-being measure. But deviating from the ESM, Kahneman et al. (2004) proposed that the DRM allows for measuring a sufficient number of different events during just one day and that it is therefore more efficient.

All currently discussed well-being measures have one characteristic in common: They aim for an average of the participants’ well-being feelings either over time and therefore different events or via different dimensions measured on various scales. Diener, Emmons, Larsen, and Griffin (1985, p. 13) claimed that single item measures are “generally less reliable over time than multi-item scales, are probably more susceptible to acquiescence response bias, are more likely to be affected by the particular wording of the item, may not be entirely suitable for parametric analysis since responses tend to be highly skewed, and do not provide an assessment of the separate components of subjective well-being.” A multivariate measure is therefore also used within this thesis. Larsen et al. (1985) also suggested that life satisfaction should play a major role within the multivariate analysis, since it accounts for the rather stable parts of well-being and provides “high temporal reliability” (p. 14). The HFI embodies this finding as life satisfaction, respectively positive emotion, is multiplicatively included in the HFI calculation.

Moreover, Kahneman and Krueger (2006) gave advice on how to measure well-being and reported findings on the influence of changing weather conditions. Weather has been found to be an important determinant of well-being (higher well-being on nicer days). On the other hand, according to Kahneman and Krueger (2006) this influence is eliminated if “individuals are first asked explicitly about the weather” (p. 6). In this study participants are consequently asked about the weather or the temperature outside before addressing HFI questions.

2.4 Machine Learning on Well-Being

So far little research has been conducted on the explanation of well-being observations with machine learning techniques. Although machine learning has gained growing importance for the analysis of high dimensional, non-linear data,³ numerical problems in social science are rarely treated with machine learning so far. The following section reviews the rare examples of machine learning within the social sciences, especially on personality as in this study, with due regard to the underlying principles of the various algorithms used.

Two different types of machine learning need to be distinguished: (1) Firstly, supervised learning refers to a setup in which the dependencies of one dependent variable or clustering upon the other independent variables, also referred to as predictors, are learned. The dependent variable is present and known within the historic data, which is also referred to as the training set of data points. (2) Secondly and in contrast, unsupervised learning means the identification of clusters within the given historic data set. The dependent variable is not part of the training set and describes some sort of higher order classification assigning each training data point to a specific cluster. Unsupervised learning is used to reduce complexity, gain understanding of the data and interpret complex data structures, while supervised learning is generally suitable for the prediction of future development and the closing of gaps within the dataset (Hastie et al., 2009; Nilsson, 2005).

Both approaches for understanding contexts within data are computationally expensive, since the learning is based on repeated sequential adaptation of the algorithm’s parameters. Thereby, depending on the algorithm’s characteristics, local and global optima are found via a stepwise approach upon the objective function’s gradient (Hastie et al., 2009).

³ Compare public discussions on the fields of ’big data’.
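The stepwise search along the objective function's gradient described above can be illustrated with plain gradient descent on a least-squares objective. This is only a minimal sketch and not the procedure used in this study; the toy data, the learning rate, the number of steps and the function name `fit_line_gd` are assumptions made for illustration.

```python
def fit_line_gd(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = a + b*x by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current fit.
        residuals = [a + b * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. a and b.
        grad_a = 2.0 / n * sum(residuals)
        grad_b = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        # Take one small step against the gradient.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical toy data generated from y = 1 + 2x (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line_gd(xs, ys)
```

Each pass only adjusts the parameters slightly, which is why many sequential iterations are needed and why such procedures become expensive for larger models.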


Supervised learning has been applied, for example, by Minbashian, Bright, and Bird (2009), who analyzed the applicability and accuracy of neural networks in comparison to multiple regressions on the prediction of work performance upon the big five trait taxonomy (see chapter 2.2.2 Personality Traits). The authors were able to “identify the specific [...] nonlinear relationships” (p. 540) and found “superior performance of the neural networks” (p. 540) regarding a relative accuracy measure. However, both methods achieved comparable absolute accuracy values.

Neural networks, also referred to as artificial neural networks (ANN) and multi-layer perceptrons (MLP), are non-parametric classification and/or regression tools to solve supervised learning problems. They consist of a usually fully-connected⁴ network with several layers of nodes. The first layer is the input layer, including one node for each independent variable of the dataset. The last layer is the output layer, which consists of one node for each dependent variable or class in classification problems. The layers in-between are “hidden layers.” The number of hidden layers and number of nodes per hidden layer are the most important model parameters. Each node performs a weighted linear combination of a bias constant (usually 1) and all node values of the previous layer. The sum is processed through a non-linear activation function, which is usually the sigmoid / logistic regression function resulting in a node value between zero and one (see Figure 2.5). Therefore, inputs and outputs have to be standardized on a scale between zero and one before being processed by a neural network. The edge weights of the linear combination are computed via an iterative training procedure, comparing the predicted output with the training value and then adjusting the weights. Several different learning functions for different types of data and networks have been developed in recent years (e.g. backpropagation, scaled conjugate gradient descent). Figure 2.6 illustrates the basic structure of a neural network. For further details on the algorithms applied in this study see the methodology chapter (Anthony & Bartlett, 2009).

Figure 2.5: Sigmoid / logistic regression function

⁴ There are edges between all nodes of two adjacent layers.

In the study on the predictability of work performance, Minbashian et al. (2009) found that “neural networks performed as well or better than MR [multiple regression] equations [...] with respect to a relational index of predictive accuracy” (p. 554). The use of neural networks is therefore recommended for prediction upon unknown and complex data, especially “when theory about the underlying functional relationships is not strong” (p. 554). The neural networks applied in their study achieved a correlation between predicted and original values of r = 0.55 with one hidden layer. Comparable research was conducted by Martínez, Rodríguez-díaz, Licea, and Castro (2010), who used neural networks extended by fuzzy systems (namely an adaptive neuro-fuzzy inference system) with personality data as inputs in order to assign employees to different software engineering roles. The algorithm is capable of describing the roles upon personality profiles and is furthermore used to develop decision rules for the employee-role match.

One of the first applications of neural networks to personality-related prediction problems is found in the study on workplace behavior by Collins and Clark (1993). They utilized the then newly developed neural networks for classification into a low and a high performance group. It was concluded that the neural network performed at least as well as other classification methods. Nowadays, neural networks also contribute to the understanding of human personality itself. Read et al. (2010) proposed a complex, highly integrative neural network structure in order to accurately model personality. Quek and


[Figure: input layer (independent variables 1 to 4), hidden layer, output layer (dependent variable)]

Figure 2.6: Neural network example (1 hidden layer with 5 hidden nodes)
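The forward computation of a network shaped like the one in Figure 2.6 (four input nodes, one hidden layer with five nodes, one output node) can be sketched as follows. The weights below are arbitrary, untrained values chosen purely for illustration, and the helper names `layer` and `forward` are assumptions; the sketch shows only how a prediction is computed, not how the weights are learned.

```python
import math


def sigmoid(z):
    """Logistic activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights):
    """Compute one layer; each weight row is [bias weight, w_1, ..., w_n]."""
    return [
        # Weighted sum of the bias constant 1 and all previous node values.
        sigmoid(row[0] * 1.0 + sum(w * x for w, x in zip(row[1:], inputs)))
        for row in weights
    ]


def forward(inputs, hidden_weights, output_weights):
    """Propagate the inputs through the hidden layer to the output layer."""
    hidden = layer(inputs, hidden_weights)
    return layer(hidden, output_weights)


# Hypothetical, untrained weights: 5 hidden nodes (4 inputs each), 1 output node.
hidden_weights = [[0.1 * (i + 1), 0.2, -0.1, 0.05, 0.3] for i in range(5)]
output_weights = [[-0.5, 0.4, 0.3, 0.2, 0.1, -0.2]]

# Inputs are assumed to be already scaled to the range [0, 1].
prediction = forward([0.2, 0.8, 0.5, 0.1], hidden_weights, output_weights)
```

Because the output node also passes through the sigmoid, the prediction always lies between zero and one, which is why the dependent variable has to be scaled accordingly.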

Moskowitz (2007) focused on the analysis of event-contingent recorded data in order to identify personality patterns upon neural networks.

Nevertheless, neural networks lack easily interpretable results. While prediction accuracy and mathematical expressiveness are comparatively high, the computed weights within the neural network are rarely interpretable, especially if the neural network consists of multiple hidden layers. For social problems, however, interpretability is an important consideration. Often, a high number of predictors explain a comparatively small amount of variance, so that identifying the important predictors and individual influences is crucial for scientific results (cf. Hainmueller & Hazlett, 2013).

Kernel smoothing algorithms provide an alternative allowing for an individual inspection of single predictors. Kernel smoothing algorithms solve supervised non-parametric regression problems by estimation upon the nearest neighbors of the new requested data point in the training sample (Nilsson, 2005). The kernel describes the selection of the neighborhood in which the estimation takes place. Thereby, the kernel size as well as the individual local fitting operations can be subjected to closer analysis and interpretation. Friedman (2006) provided a good mathematical definition of kernel smoothing.

Several different kernels and estimation methods have to be distinguished. Firstly, the kernels differ in shape. Uniform kernels are used for example in k-nearest neighbor; Gaussian, triangular or Epanechnikov kernels weight the neighbors’ influence on the result upon their distance to the requested point, which is the center of the kernel. Different kernels embody various parameters, as for example the number of neighbors included, or the bandwidth and scale of the kernel, determining the kernel’s width. For each independent variable a kernel with different characteristics can be used. See Figure 2.7 for a graphical representation of possible kernel types (cf. Hofmann, Schölkopf, & Smola, 2008).

Secondly, within the kernel different methods can be applied to calculate the dependent variable for the kernel center (the requested point). Most common are, for example, the average or a local linear regression. Figure 2.8 illustrates an Epanechnikov kernel with a local linear regression estimator with one independent variable. The actual regression line results from repeated application of the kernel algorithm for each step of the input scale.

Other machine learning algorithms have also been used for predictions related to personality. For example, Chittaranjan, Blom, and Gatica-Perez (2011) used support vector machines (SVM) to predict the big five personality traits upon long-term recorded smartphone data. SVMs were initially developed to solve two-class supervised classification problems (see Cortes & Vapnik, 1995). To this end, the algorithm computes a hyperplane separating the two classes by maximizing the minimum distance between the training points and the hyperplane. In order to describe non-parametric structures SVMs ap-


[Figure: five panels showing the uniform / rectangular, Epanechnikov, biweight, Gauss and triangular kernels]

Figure 2.7: Different kernel shapes
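The five kernel shapes named in Figure 2.7 can be written as weight functions of the scaled distance u between a training point and the kernel center. The definitions below are the common textbook forms, each normalized to integrate to one; whether the figure uses exactly this scaling is an assumption.

```python
import math


def uniform(u):
    """Uniform / rectangular kernel: equal weight inside the window."""
    return 0.5 if abs(u) <= 1 else 0.0


def epanechnikov(u):
    """Epanechnikov kernel: parabolic weights that reach zero at |u| = 1."""
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0


def biweight(u):
    """Biweight (quartic) kernel: smoother decay toward the window edge."""
    return 15.0 / 16.0 * (1 - u * u) ** 2 if abs(u) <= 1 else 0.0


def triangular(u):
    """Triangular kernel: weights fall linearly with distance."""
    return 1 - abs(u) if abs(u) <= 1 else 0.0


def gaussian(u):
    """Gaussian kernel: unbounded support, standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


KERNELS = {
    "uniform": uniform,
    "epanechnikov": epanechnikov,
    "biweight": biweight,
    "triangular": triangular,
    "gaussian": gaussian,
}

# Weight each kernel assigns to a point located exactly at the kernel center.
heights_at_center = {name: kernel(0.0) for name, kernel in KERNELS.items()}
```

All kernels except the Gaussian assign zero weight to points farther than one bandwidth from the center, so only a local neighborhood enters the estimate.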

ply kernel smoothing. The hyperplanes of all kernel environments result in a non-parametric separation between the two classes. Friedman (2006) provided a good mathematical description of SVMs (see also Basak, Pal, & Patranabis, 2007; Cortes & Vapnik, 1995).

Further developments of the SVM perform regressions as well. This is referred to as support vector regression (SVR, cf. Vapnik, Golowich, & Smola, 1997). In comparison to the kernel smoothing algorithms mentioned before (e.g. local linear regression) the function inside the kernel of SVMs does not min-

[Figure: dependent variable plotted against the independent variable, showing the data points, the data points in the kernel, the kernel environment, the original function and the kernel average smoother fit]

Figure 2.8: Kernel-smoothing example: Epanechnikov kernel with local linear regression
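The estimator illustrated in Figure 2.8 can be sketched for one independent variable as a weighted least-squares line that is refitted around every requested point. The bandwidth, the toy data and the function name `local_linear` are assumptions for illustration; this is not the implementation used in this study.

```python
def epanechnikov(u):
    """Epanechnikov weight for the scaled distance u."""
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0


def local_linear(x0, xs, ys, bandwidth):
    """Fit a kernel-weighted least-squares line around x0 and evaluate it at x0."""
    weights = [epanechnikov((x - x0) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0:
        raise ValueError("no training points inside the kernel")
    # Weighted means of the predictor and the response.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Weighted variance and covariance give the local slope.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y + slope * (x0 - mean_x)


# Hypothetical noise-free data on a grid from 0.0 to 2.0.
xs = [i / 10 for i in range(21)]
ys = [x * x for x in xs]

# Repeating the local fit for each step of the input scale yields the curve.
curve = [local_linear(x0, xs, ys, bandwidth=0.5) for x0 in xs]
```

Replacing the local line by the weighted mean `mean_y` alone would give the kernel average smoother instead of the local linear fit.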


[Figure: dependent variable plotted against the independent variable, showing the data points, the local fitting area (kernel) with central prediction line, the accepted variation range ε and the penalized points]

Figure 2.9: Support Vector Regression: Fitting inside the kernel
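The accepted variation range ε shown in Figure 2.9 corresponds to the ε-insensitive loss commonly used in support vector regression: deviations inside the range cost nothing, deviations outside are penalized linearly. The sketch below shows only this loss for a given line; the regularization term of the full SVR objective and the optimization itself are omitted, and the data and parameter values are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the accepted range, linear penalty outside of it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def total_penalty(xs, ys, intercept, slope, epsilon):
    """Sum of the penalties of all points for the line intercept + slope * x."""
    return sum(
        epsilon_insensitive_loss(y, intercept + slope * x, epsilon)
        for x, y in zip(xs, ys)
    )


# Hypothetical observations scattered around the line y = x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.6, 3.0]
penalty = total_penalty(xs, ys, intercept=0.0, slope=1.0, epsilon=0.2)
```

In this toy example only the third point lies outside the accepted range, so it alone contributes to the penalty.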

imize the training error (e.g. least squares), but instead minimizes the generalization error bounds of the local linear function (Basak et al., 2007). All training points within an ε-environment of the linear function are excluded from the error calculation. All points exceeding this error bound are penalized (see Figure 2.9). Thereby the SVM generates a more general prediction model and is computationally less expensive than minimizing the entire training error (Basak et al., 2007).

Social sciences often encounter the problem of comparatively noisy and high dimensional data, since questionnaire answers are based on subjective judgments and interpretations. Within machine learning, feature extraction algorithms address this demand for a reduction of dimensionality and identification of the most important predictors in a model. Many algorithms of varying complexity have been developed over the last decades. Two different approaches can be observed. Firstly, feature selection upon generalized linear models, as for example lasso regression, least angle regression or elastic net regression (Efron, Hastie, Johnstone, & Tibshirani, 2004; Tibshirani, 1996; Zou & Hastie, 2005). Those algorithms penalize the number of predictors included in the linear model. Secondly, this procedure can be applied inside a kernel smoothing environment. Thereby, the regularization can either be done on a global level or for each kernel individually, as for example in the lazy lasso algorithm (Vidaurre, Bielza, & Larrañaga, 2011).
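How a lasso penalty removes predictors can be sketched with cyclic coordinate descent. The lasso adds the sum of the absolute coefficient values to the least-squares objective, which drives weak coefficients exactly to zero. The toy data, the penalty value and the function names are assumptions for illustration; the sketch omits the intercept and any standardization, and it is not the implementation used in this study.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero and cut it off inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, penalty, sweeps=200):
    """Minimize (1/2n) * sum((y - X b)^2) + penalty * sum(|b_j|), no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Partial residual: leave predictor j out of the current fit.
            partial = [
                y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                for i in range(n)
            ]
            rho = sum(X[i][j] * partial[i] for i in range(n)) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, penalty) / scale
    return beta


# Hypothetical data: y depends strongly on the first and weakly on the second predictor.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0 * a + 0.1 * b for a, b in X]

unpenalized = lasso_coordinate_descent(X, y, penalty=0.0)
selected = lasso_coordinate_descent(X, y, penalty=0.5)
```

With the penalty switched on, the coefficient of the weak second predictor becomes exactly zero, while the coefficient of the strong first predictor is shrunk but kept.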

3. Research Questions

The research conducted tests four hypotheses on the nature of personal well-being. Derived from the literature on well-being, each participant’s well-being is assumed to follow certain, still unknown rules. The first assumption is based on the proposals by Brickman and Campbell (1971) and Headey and Wearing (1991) that well-being is characterized by a baseline, which is constant over medium-term periods. It is assumed that their finding can be supported by this study’s dataset. The first hypothesis is therefore as follows:

Hypothesis 1: Each individual has a well-being baseline, which is constant in the mid-run.

As stated in the literature review, well-being has been found to be dependent on personality and demographics (Diener et al., 1999; Veenhoven, 1984). Obviously, this does not include daily oscillations, but should explain a large proportion of the variance between participants’ well-being baselines. Previous research focused on linear models and simple explanation of variance to prove dependencies. In this study several non-linear and non-parametric approaches are tested in order to predict the well-being baseline. It is questioned whether this approach upon demographics and personality leads to higher proportions of explained variance than simple linear models and whether the individual well-being level can be predicted sufficiently. The second hypothesis tested is therefore stated as follows:

Hypothesis 2: Each individual’s well-being baseline is primarily dependent on and predictable by the individual’s psychometric profile and demographic variables.

Besides the aim for a sufficient well-being prediction, the analysis with machine learning techniques is expected to provide a new perspective on the importance of the included single personality and demographic measures, as for example extraversion, neuroticism and agreeableness as well as age and education. Which of the determinants has a significant influence when analyzed by advanced statistical methods? Are interaction effects beyond simple multiplication of determinants as in an ANOVA important? And which of the algorithms is applicable for a sufficient prediction without over-fitting the dataset? Therefore, this study broadly explores the power and the result characteristics of several machine learning techniques developed in recent years with regard to well-being and personality data.

Aside from the prediction of the well-being baseline, it is questioned how the actual well-being trajectory floats around the baseline and whether personality and demographics influence these short-term well-being oscillations. Two additional hypotheses are derived from this research question. Closely linked to the first hypothesis, it is expected that the individual well-being trajectory follows the baseline.

Hypothesis 3: Each individual’s well-being trajectory floats around the well-being baseline.

Related to the suggestion that the baseline is influenced and predictable by demographics and psychometric measures (compare Hypothesis 2), it is questioned whether the trajectory’s characteristics can be reduced to certain personality traits and demographics. Personality influences the way we deal with external triggers such as positive and negative affect. Only repeated measuring of well-being over a certain time frame allows for a comparison of different well-being trajectories. This study does not cover external influences themselves, but questions whether there is a linkage between personality and the well-being trajectory, which can be explained by machine learning. The fourth hypothesis is consequently stated as follows:

Hypothesis 4: Each individual’s future well-being trajectory can be approximated upon its well-being baseline, personality, demographics and historic well-being data.

To summarize, this study tests the capabilities of several machine learning algorithms in order to provide insights regarding the dependencies between personal well-being (dependent variable) and personality as well as demographics (independent variables). This research is conducted from two different perspectives. On the one hand, the first and second hypotheses cover the fairly constant mid-term well-being. On the other hand, hypotheses three and four refer to short-term changes within a person’s well-being trajectory.

4. Methodology

4.1 Participants (62% of all participants). Nevertheless, participants


with lower educational degrees are covered as well.
This study is based on an online survey with four
For histograms of the demographic variables see Fig-
sequential questionnaires and an overall number of
ure 4.1.
126 questions. A first dataset with 85 participants
was generated by Hall et al. (2013) during a four Generally, this study is not dependent on a statisti-
weeks period in February 2013. The participants cally representative sample of the society, since none
were asked by email to answer one questionnaire each of the analyses are based on input variable distribu-
Wednesday over the given period. Out of 85 initial tions. More important is the sample’s completeness
respondents from the first questionnaire in week one in order to cover as many different input variable
66 participants completed all four questionnaires settings as possible and fully represent the field of
entirely. Nine participants aborted after week two analysis. The age as well as the education variable
and another four participants after week three. From meets this requirement.
six participants only single values are missing.

Due to the small sample size it was decided to repeat


4.2 Apparatus and Materials
the survey during February 2014, one year after the In order to measure the independent and dependent
first series in order to avoid seasonal influences. An variables, empirical questionnaires have been used as
additional dataset with 343 respondents for the first described in the literature review. Next to the basic
questionnaire has been generated. The questions and demographics (gender, age and location), education,
the setting for the four questionnaires were equal to employment and health have been asked with a sin-
the once in 2013. Out of the 343 respondents, 296 gle question each as indicators for different life situ-
participants completed all four questionnaires (see ations. The personality traits are covered with three
Table 4.1). psychometric measures, namely the 44-item big five-
inventory test, the 13-item maximizer vs. satisficer
Dataset
test and a 3-item fairness measure (John et al., 1991;
Measure 2013 2014 combined John & Srivastava, 1999; Nenkov et al., 2008; Schmitt
N 85 343 428 & Dörfel, 1999).1 Moreover, the dependent variable
Ncomplete 66 296 362 personal well-being is measured by the human flour-
32 female 133 female 165 female ishing index (HFI) each week (see Huppert & So,
Gender 197 male
34 male 163 male 2011). These 10 weekly-asked HFI questions are ran-
domized in order to reduce the chance of recognition
Table 4.1: Participants descriptive statistics by the participants. All psychological and HFI ques-
tions were asked by providing a five to nine point
As expected, participants’ demographics differ be- Likert scale (cf. Likert, 1974). A weather control
tween the two samples. While the sample in 2013 is question at the beginning of each questionnaire elim-
internationally well distributed over America, Asia inates external weather influences on the recorded
and Europe, the sample in 2014 exists of 86% Ger- well-being (cf. Kahneman & Krueger, 2006).
mans, 5% English, 2% Asians and 2% Americans.
Both samples are fairly well distributed between gen- In order to reduce participants’ workload, the three
ders, but characterized by a high number of partici- psychometric measures were distributed over four
pants at the ages between 18 and 35 (78% of all par- weeks. The questionnairesfor the four weeks con-
ticipants). Due to the universal background within tained the following constructs and single questions:
1 For
the big five personality test and the maximizer-satisficer
the social networks the surveys were distributed in,
measure the construct’s goodness of fit has been proven by
the participants’ majority has a university degree confirmatory factor analyses.


[Bar charts of participant counts by dataset (2013, 2014): (a) Participants by Location (non disclosed, USA, Europe, Asia); (b) Participants by Gender (female, male, other, non disclosed); (c) Participants by Age (< 18, 18−25, 26−35, 36−45, 46−55, 56−65, 66−75, 76 +); (d) Participants by Education (from no formal education to post-doctoral education)]

Figure 4.1: Participants’ demographic structure

1. First week: Weather control question, 13-item maximizer vs. satisficer test, 10-dimensional human flourishing index, demographics

2. Second week: Weather control question, 44-item big five inventory test, 10-dimensional human flourishing index

3. Third week: Weather control question, 3-item fairness test, 10-dimensional human flourishing index

4. Fourth week: Weather control question, 10-dimensional human flourishing index

The surveys were conducted with SurveyMonkey², an online survey application. The gathered data was processed in Microsoft Excel and analyzed with R³, an open-source math and statistics language. R was chosen since it is a “flexible and powerful language that many data analysts are now using” (Beaujean, 2013, p. 1). The R code was implemented using RStudio⁴, an open-source R editor.

² See http://www.surveymonkey.com
³ See http://www.r-project.org
⁴ See http://www.rstudio.com

4.3 Data Retrieval Procedure

Participants were invited via posts in social networks, emails and personal contacts to register with their email address in a SurveyMonkey registration form during the weeks before the first survey was sent. Registration was necessary in order to gather a sufficient number of participants answering the following four questionnaires on the specific Wednesdays in February. As an incentive, each participant had the chance to win one of two 30 Euro Amazon vouchers. Moreover, for each participant completing all four surveys, 20 cents were given to the Unicef Childhood Foundation for their projects in Syria. This combination of lottery and donation incentive was chosen to attract egoistically as well as altruistically motivated participants. The lottery incentive is well established and has been found to increase response rates significantly (e.g. Deutskens, de Ruyter, Wetzels, & Oosterveld, 2004). In contrast, the literature on donations to charity is inconsistent and still mainly based on paper surveys. Some studies reported a significant influence of donations on the response rate (cf. Deutskens et al., 2004; Robertson & Bellenger, 1978). Other research rejected the influence (cf. Furse & Stewart, 1982; Hubbard & Little, 1988).

To ensure anonymity, each registered participant was assigned a personal random identification code, which allows the four questionnaires per participant to be matched anonymously, without the participants’ email addresses or names. In addition, the participants were able to use this code to access their personal reports anonymously from a KIT webserver after completing the survey.

Each Wednesday morning in February (starting on 5 February 2014), SurveyMonkey automatically sent emails with an individual survey link tied to the participant’s code to the list of registered participants. The email contained the individual link to the current Wednesday questionnaire, a link to unregister from the mailing list in order to abort further participation, and a note that the survey must be answered by the end of the given Wednesday. The email was sent at 1:00am CET to ensure that American and Asian participants had sufficient time to answer the questionnaires. Additional reminders were sent at 3:30pm and 8:30pm CET. Answers were accepted until Thursday morning, 6:00am, for European participants and correspondingly for other time zones.

In order to provide a benefit for the participants, a personal report including the well-being trajectory and the psychometric measures was generated for each participant one week after the final questionnaire was completed. The participants were informed via email how to access their personal report from a password-protected webserver at Karlsruhe Institute of Technology (KIT) using their personal code. Reports were completely anonymous.

4.4 Analysis Procedure

The gathered survey data was analyzed using different statistical and machine learning approaches. In total, 13 independent variables (six demographic and seven psychometric measures, calculated from the single items as described in the literature) and 4 HFI data points were calculated per participant and then scaled to the range between zero and one for the descriptive analyses. For the machine learning algorithms, the data was further normalized to zero mean and a standard deviation of one per variable. An overview of the data dimensionality is given in Figure 4.2.

The following analyses were conducted exclusively on the described 13 + 4 variables. The four HFI variables were averaged for some analyses (in particular regarding the first and second hypotheses). Several incomplete responses had to be removed from the dataset. Participants were included in the analyses only if all 13 independent variables were available and the HFI values for at least three weeks could be calculated from the ten measured values. Responses with incomplete independent variables or fewer than three HFI data points were excluded. Regarding the personality measures, two thirds of the single items per measure had to be available for the mean calculation. Before the basic characteristics of the selected machine learning algorithms are outlined in this chapter, the challenge of combining the two different datasets from 2013 and 2014 is addressed.

4.4.1 Comparison of Datasets

Since this study is based on two different datasets (2013 and 2014), which are combined in order to gain a sufficient number of participants for further analysis, the feasibility of this combination was tested first. In order to test possible influences of the differences between the two datasets, a dummy variable for the 2013 and 2014 dataset was introduced. After


Figure 4.2: Independent and dependent variables

controlling for variables highly correlated with well-being, such as neuroticism and extraversion (p < 0.001), this dummy variable “Dataset” had no significant influence on the well-being index averaged for each participant over the four weeks (p > 0.38). The null hypothesis of no influence by dataset cannot be rejected. See Figure 4.3 for the detailed ANOVA / linear model. Levene’s test for homogeneity of variance was conducted for ’dataset’ and provides no significant evidence against the null hypothesis of equal variances (p > 0.31).

However, there is a shift in means between the two datasets of 0.067 (∼7% of the entire scale between 0

Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   1  1.0175 0.3138
      360

               Df Sum Sq Mean Sq F value   Pr(>F)
Neuroticism     1  4.648   4.648 322.456  < 2e-16 ***
Extraverted     1  0.749   0.749  51.942 3.63e-12 ***
Agreeableness   1  0.148   0.148  10.251  0.00149 **
Optimisim       1  0.104   0.104   7.209  0.00760 **
Conscientious   1  0.379   0.379  26.291 4.90e-07 ***
Maximizer       1  0.110   0.110   7.643  0.00601 **
Fairness        1  0.011   0.011   0.774  0.37956
Health          1  0.116   0.116   8.072  0.00476 **
Age             1  0.021   0.021   1.454  0.22876
Location        3  0.059   0.020   1.354  0.25684
Gender          1  0.104   0.104   7.211  0.00760 **
Education       1  0.006   0.006   0.419  0.51763
Job             1  0.032   0.032   2.252  0.13437
Dataset         1  0.011   0.011   0.744  0.38906
Residuals     345  4.973   0.014
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure 4.3: ANOVA Type-I


and 1), corresponding to 0.48 SD. The power of the t-test for ’dataset’ is therefore 0.54, which is not sufficient to rule out a type-II error. The dependent variable’s (HFI) mean and standard deviation for both datasets are given in Table 4.2.

           Dataset
Measure    2013      2014      combined
µ_HFI      0.4940    0.5617    0.5493
SD_HFI     0.2035    0.1915    0.1954

Table 4.2: Descriptive statistics for dataset comparison

Furthermore, the analysis shows no significant differences in variances between the datasets when compared with a variance F-test. This applies to the well-being index averaged for each participant over the four weeks (p = 0.44) as well as to the data of the individual weeks (p = 0.23).

To summarize, no evidence for a strong influence of the dummy variable ’dataset’ was found. Consequently, the datasets are used jointly in the following analyses.

4.4.2 Algorithms and Methods used

In this study, different machine learning and statistical methods were used to test the proposed hypotheses (see chapter 3 for details). First of all, descriptive statistics provide an overview of the available dataset. This is especially important for testing the first hypothesis, whether subjective well-being follows an individual baseline. Secondly, machine learning was used to test the second hypothesis and predict subjective well-being (measured as the human flourishing index) from the 13 predictor variables. Kernel smoothing as well as support vector machines and neural network algorithms were implemented and tested on the prediction problem. Additionally, feature selection algorithms were deployed in order to identify the predictors’ importance. Table 4.3 provides a list of the applied algorithms.

Due to the high number of algorithms and different configurations, details and parameter sets are presented together with their findings in the corresponding subsections of the results chapter. A general introduction, background and related literature regarding the algorithms’ application in the social sciences is also given in chapter 2.4. Machine learning is not only used to solve the prediction problem resulting from hypothesis two; specific algorithms also contribute to the evaluation of hypotheses three and four. Consequently, some of the algorithms mentioned above were used in different contexts, contributing to the evaluation of the hypotheses.

4.4.3 Cross Validation and Testing

For prediction problems in which accuracy is the predominant target, over-fitting the data is the most crucial concern (Cawley & Talbot, 2010). Over-fitting refers to the fact that powerful algorithms might fit the sample so precisely that the computed results are no longer generally valid when tested on new data. The algorithms should explain the structural variance within the data, but should not fit the entire variance within the training set, including non-structural variance, also referred to as noise (Kuhn & Johnson, 2013).

The most common solution to prevent over-fitting is cross-validation. This study uses k-fold cross-validation for testing different algorithms. The sample is thereby divided into k equal subsamples, of which k − 1 subsamples are used for training and one for testing. Training and testing are repeated k times, so that each subsample is used once as a testing sample. By this procedure, the accuracy of each training round is always tested on data points not used for training in that round. Finally, the results of the k training-testing loops are averaged in order to obtain one performance measure for the applied set of parameters (e.g. root-mean-squared error or R²) (Arlot & Celisse, 2010; Berrueta, Alonso-Salces, & Héberger, 2007; Bouckaert & Frank, 2004; Kuhn, 2008).

To avoid possible influences of the random division into k subsamples, repeated k-fold cross-validation conducts the described process (folding the sample and testing each fold) several times. Repeated k-fold cross-validation, initially proposed by Burman (1989), increases the reproducibility, even if the variance over the cross-validation increases (cf. Braga-Neto &


Category             Algorithms
Kernel Smoothing     K-nearest neighbor with local average smoother
                     Generalized additive model using loess (local linear regression)
                     Generalized additive model using splines (linear regression after non-parametric transformation of inputs)
                     Non-parametric regression (local linear regression with varied kernels)
                     Support vector regression
Neural Networks      Neural network with standard backpropagation learning
                     Neural network with scaled conjugate gradient learning (SCG)
                     Extreme learning machine (ELM)
Feature Selection    Lasso regression
                     Elastic net regression
                     Lazy lasso regression

Table 4.3: Applied algorithms

Dougherty, 2004; Burman, 1989). This study’s results are based on two-times repeated 10-fold cross-validation, unless explicitly stated otherwise. Deviations occur if an algorithm’s low resource consumption allows for more repetitions in order to enhance precision, or if resource limitations require fewer repetitions and folds in order to deliver results in reasonable time at all.

In this study, cross-validation is conducted with the caret⁵ package in R for most algorithms (cf. Kuhn et al., 2014). The caret package already includes implementations of common algorithms, but also allows defining custom models and parameter sets. The package splits the data, loops over the given parameter sets for each fold and repetition, and passes the training data together with the current parameters to the original learning algorithm in each loop. The procedure is summarized in Figure 4.4 (Kuhn, 2008).

⁵ Short for “Classification and Regression Training.”

Define sets of model parameter values to evaluate
for each parameter set do
    for each resampling iteration do
        Hold out specific samples
        [Optional] Pre-process the data
        Fit the model on the remainder
        Predict the hold-out samples
    end
    Calculate the average performance across hold-out predictions
end
Determine the optimal parameter set
Fit the final model to all the training data using the optimal parameter set

Figure 4.4: Caret cross-validation procedure (Kuhn, 2014)

The comparison of different algorithms and parameter sets is based either on the root mean squared error RMSE or on the coefficient of determination R². In caret, the latter is calculated as the squared correlation coefficient between the predictions and the dependent test values, since the number of predictors is unknown for certain algorithms, so that an adjustment is not possible (Kuhn, 2008). Since the number of data points within this study (N_complete = 362) is much higher than the maximal number of predictors (p_max = 13), the differences between R² and adjusted R² can be neglected (cf. Yin & Fan, 2001). For the comparison of different algorithms under k-fold cross-validation, R² is nevertheless not suitable, since the testing set consists of only 100/k percent

of the entire data. RMSE is therefore more accurate and is used in most cases for comparison in this study. The dependent variable HFI is standardized to zero mean and standard deviation one for all analyses in order to ensure comparability. Consequently, the RMSE can be interpreted as the root of the mean residual sum of squares well known from linear regression and ANOVA analyses. Hence, the squared RMSE corresponds to the share of variance not explained by the model. If R² values are given for comparison, they refer to the adjusted R² calculated by the Wherry formula (cf. Yin & Fan, 2001).
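The relation between RMSE, explained variance and the adjustment can be illustrated with a short calculation. The following is only a sketch: it assumes the common form of the adjustment, 1 − (1 − R²)(n − 1)/(n − p − 1), which may differ from the exact Wherry variant used in the study, and the function names are hypothetical. The inputs are the cross-validated RMSE of the linear benchmark reported in Section 5.2 and the sample dimensions stated above.

```python
def r_squared_from_rmse(rmse_standardized):
    """R^2 implied by an RMSE measured on a target with SD = 1."""
    return 1.0 - rmse_standardized ** 2


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 in the assumed form 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Inputs taken from the text: RMSE = 0.678 (standardized HFI),
# n = 362 complete participants, p = 13 predictors.
r2 = r_squared_from_rmse(0.678)
adj_r2 = adjusted_r_squared(r2, n=362, p=13)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

With 362 data points and 13 predictors the adjustment lowers R² by less than 0.02 under this assumed formula, which is in line with the statement that the difference can be neglected.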

Besides cross-validation, bootstrapping is a well-known alternative for validation (cf. Efron, 1979). The procedure does not split the dataset as, for example, k-fold cross-validation does, but instead draws a sample with replacement of the same size as the available dataset, fits the model on this training sample and tests it on the remaining points. Due to the replacement, on average 63.2% of the data points are used for training. Sampling, fitting and testing are repeated several times to obtain an averaged accuracy result (Kohavi, 1995). Bootstrapping has been tested on several of the applied algorithms, but was not found to result in significantly different accuracy. Additionally, the newer bootstrap .632+ validation method introduced by Efron and Tibshirani (1997) did not provide reliable results, since the available dataset is too small for the accuracy smoothing they propose, such that over-fitting occurred.
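The 63.2% figure follows from the probability that a given point is drawn at least once in n draws with replacement, 1 − (1 − 1/n)^n, which approaches 1 − 1/e for large n. A minimal sketch, assuming n = 362 as in this study and using hypothetical helper names, compares the formula with a small simulation:

```python
import random


def expected_unique_fraction(n):
    """Expected share of distinct points in a bootstrap sample of size n."""
    return 1.0 - (1.0 - 1.0 / n) ** n


def bootstrap_split(n, rng):
    """Draw n indices with replacement; indices never drawn form the test set."""
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


rng = random.Random(42)
n = 362
fractions = []
for _ in range(200):
    train, test = bootstrap_split(n, rng)
    fractions.append(len(set(train)) / n)

simulated = sum(fractions) / len(fractions)
print(f"formula: {expected_unique_fraction(n):.4f}, simulated: {simulated:.4f}")
```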

The caret package also supports parallel computing via the R package doMC. Using two to four cores simultaneously reduces time consumption accordingly. The technique has been used widely in the following analyses.
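The repeated k-fold procedure described in this section can be sketched as follows. This is an illustration in plain Python rather than the caret implementation used in the study: the data is synthetic, the model is a single-predictor least-squares line, and all function names are hypothetical.

```python
import math
import random


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def repeated_kfold_rmse(xs, ys, k=10, repeats=2, seed=1):
    """Average hold-out RMSE over `repeats` independent k-fold splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        for test_idx in kfold_indices(len(xs), k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(xs)) if i not in held_out]
            a, b = fit_line([xs[i] for i in train_idx],
                            [ys[i] for i in train_idx])
            sq_err = [(ys[i] - (a + b * xs[i])) ** 2 for i in test_idx]
            scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(scores) / len(scores)


# Synthetic data (not the study's data): one predictor plus Gaussian noise.
data_rng = random.Random(0)
xs = [data_rng.gauss(0, 1) for _ in range(362)]
ys = [-0.6 * x + data_rng.gauss(0, 0.7) for x in xs]
cv_rmse = repeated_kfold_rmse(xs, ys, k=10, repeats=2)
print(f"two-times repeated 10-fold RMSE: {cv_rmse:.3f}")
```

Each fold's model is scored only on points it has not seen, and the per-fold RMSE values are averaged into one performance measure, mirroring the loop structure in Figure 4.4.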

5. Results

5.1 Descriptive Analysis

Descriptive statistics were reviewed in order to achieve a greater understanding of the well-being data. Firstly, the four weeks of HFI data were averaged per participant and compared to the weekly data. Correlation analysis shows that each week’s HFI is highly correlated with the averaged HFI. The correlation coefficient between the four weeks’ HFI and the averaged HFI lies between 0.89 and 0.94. The correlation matrix (see Table 5.1) also indicates higher correlation coefficients for consecutive weeks’ HFIs (0.80 − 0.85) compared to other pairs of weeks (0.71 − 0.78). These scores support previous findings, for example by Lucas, Diener, and Suh (1996), who reported a correlation coefficient of 0.77 for a test-retest well-being survey over four weeks.

HFI        Week 1    Week 2    Week 3    Week 4
Average    0.90      0.94      0.93      0.89
Week 1     -         0.82      0.76      0.71
Week 2               -         0.85      0.78
Week 3                         -         0.80
Week 4                                   -

Table 5.1: Weekly HFI correlation matrix

Given the comparably high correlation coefficients, it can be concluded that the amplitude within each participant’s HFI is rather small compared to the overall scale of well-being (between zero and one). This finding is supported by simple linear regressions between the averaged HFI per participant and each participant’s weekly well-being values. Each regression includes the averaged HFI with an intercept as the independent variable and one week’s HFI value as the dependent variable. As shown in Table 5.2, the averaged HFI per participant accounts for 83.66% of the variance within the weekly HFI data.

Weekly HFI    Variance explained by HFI average
Week 1        79.96%
Week 2        88.72%
Week 3        86.21%
Week 4        79.76%
Average       83.66%

Table 5.2: Explained variance of weekly HFI by the HFI average

The high percentage of explained variance indicates a larger deviation between participants than within each participant’s HFI trajectory. This can also be seen in the standard deviations (see Table 5.3). The averaged standard deviation within each participant’s HFI values (0.077) is 2.5 times smaller than the standard deviation between participants’ averaged HFI values (0.1954).

                               Dataset
Measure                        2013      2014      combined
Avg. SD within participant     0.0787    0.0765    0.0769
SD between participants        0.2035    0.1915    0.1954
Ratio                          2.59      2.50      2.54

Table 5.3: Standard deviation between and within participants’ HFI trajectory

Figure 5.1 provides a descriptive impression of the HFI distribution¹, in which the data is sorted by the averaged HFI per participant. The solid dark line indicates the averaged HFI per participant; the error bars cover each participant’s single weekly values from minimum to maximum. The sample is well distributed over the whole well-being scale from zero to one with an average of 0.55, as presented in the density plot. The small peaks at zero and one result from special characteristics of the HFI, which has several input constellations leading to extremes at zero and one (see chapter 2.3).

For each individual HFI data point, the hour of the day was recorded in order to control for possible influences. Except for a slight decrease for late evenings after midnight, no significant influence was observed. Moreover, the lower averages during nights are based on only a few values with high variance and are hence not further considered as outliers.

¹ Participants with n ≥ 3 HFI data points included.
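The comparison of within- and between-participant standard deviations in Table 5.3 can be reproduced along the following lines. This is a sketch under assumptions: the helper name and the example trajectories are hypothetical, and the sample standard deviation is used, which may differ from the estimator applied in the study. Participants with fewer than three HFI data points are skipped, as in the text.

```python
import statistics


def sd_within_and_between(trajectories, min_points=3):
    """Return (average within-participant SD, between-participant SD, ratio)."""
    usable = [t for t in trajectories if len(t) >= min_points]
    within = statistics.mean([statistics.stdev(t) for t in usable])
    between = statistics.stdev([statistics.mean(t) for t in usable])
    return within, between, between / within


# Hypothetical weekly HFI values for four participants (not study data).
example = [
    [0.20, 0.25, 0.30, 0.25],
    [0.55, 0.60, 0.50],
    [0.80, 0.85, 0.90, 0.85],
    [0.40, 0.45],  # fewer than three data points, therefore excluded
]
within, between, ratio = sd_within_and_between(example)
print(f"within = {within:.3f}, between = {between:.3f}, ratio = {ratio:.2f}")

# Ratio implied by the combined values reported in Table 5.3.
reported_ratio = 0.1954 / 0.0769
print(f"reported ratio = {reported_ratio:.2f}")
```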


[Plot: HFI index (0.00 to 1.00) over participants sorted by averaged HFI, next to the density of the HFI index]

Figure 5.1: HFI distribution and density

In order to check for multicollinearity, a graphical representation of the correlation matrix for all variables in the dataset is given in Figure 5.2. None of the input variables is highly correlated with another (|r| ≤ 0.44 ∀ bivariate correlations). The strongest correlation was found between age and education. Additionally, the condition number of the input matrix is 15.6, indicating moderate, but no strong dependencies (Belsley, 1991; Mason & Perreault Jr., 1991). As a result, multicollinearity is not considered further, so that multivariate models can be applied without prior feature reduction (cf. also Kuhn & Johnson, 2013).
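The bivariate part of this check amounts to finding the largest absolute pairwise correlation among the predictors. The sketch below uses hypothetical toy columns, not the study data, and hypothetical function names; it does not compute the condition number, which would additionally require an eigenvalue or singular value decomposition.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def max_abs_correlation(columns):
    """Return (largest |r|, pair of column names) over all column pairs."""
    best_r, best_pair = 0.0, None
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = abs(pearson(a, b))
        if r > best_r:
            best_r, best_pair = r, (name_a, name_b)
    return best_r, best_pair


# Toy predictor columns (hypothetical values for illustration only).
columns = {
    "age": [20, 25, 30, 35, 40, 45],
    "education": [1, 2, 2, 3, 3, 4],
    "health": [3, 1, 4, 1, 5, 2],
}
strongest_r, strongest_pair = max_abs_correlation(columns)
print(strongest_pair, round(strongest_r, 2))
```

In the study, the corresponding maximum over all 13 predictors is reported as 0.44 for age and education; the toy columns above are deliberately more strongly correlated.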

[Heat map: absolute bivariate correlations (color scale 0.0 to 1.0) between Neuroticism, Extraversion, Agreeableness, Optimism, Conscientious, Maximizer, Fairness, Health, Age, Location, Gender, Education and Job]

Figure 5.2: Correlation matrix (absolute values)


5.2 Generalized Linear Model

For the evaluation of advanced machine learning algorithms, the generalized linear model (GLM) is an important benchmark. Therefore, a GLM including all 13 predictors and the averaged HFI as dependent variable is fitted with 10-times repeated 10-fold cross-validation in order to ensure a highly reliable and repeatable result. The GLM is a generalization of the standard linear regression² that allows for non-normally distributed dependent variables (cf. Nelder & Wedderburn, 1972), which is not necessary in this case, since the HFI variable has been normalized. However, the GLM is available in the caret R package for cross-validation (cf. Kuhn, 2008) and is consequently used instead of a standard linear model, with similar results. The optimization results in an R² of 0.54 and an RMSE³ of 0.68, as given in Figure 5.3. The non-cross-validated standard linear model fitted to the entire dataset reaches an only slightly better RMSE of 0.66, so that over-fitting is an unfounded concern for this model. The results are similar for the two combined datasets: an RMSE of 0.67 is achieved for 2013 and an RMSE of 0.69 for 2014.

² Also referred to as ordinary least squares (OLS).
³ Referring to the normalized data with SD = 1.

Generalized Linear Model

358 samples
 13 predictors

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)

Summary of sample sizes: 322, 322, 322, 322, 322, 322, ...

Resampling results

  RMSE   Rsquared  RMSE SD  Rsquared SD
  0.678  0.537     0.0834   0.108

Figure 5.3: GLM fitted with caret package

Compared to the SD of the averaged HFI (normalized to SD = 1), the GLM predicts the dependent variable 32% more accurately than a simple average prediction. Each predictor’s importance, measured by the absolute value of the t-statistic, is given in Figure 5.4.

[Bar chart: variable importance on a scale from 0 to 100, in descending order: Neuroticism, Extraversion, Conscientious, Health, Gender, Optimism, Maximizer, Age, Job, Agreeableness, Location, Fairness, Education]

Figure 5.4: Variable importance in GLM (t-statistic)

These results support previous research identifying neuroticism and extraversion as by far the most important factors (DeNeve & Cooper, 1998; Haslam et al., 2009; Steel et al., 2008), followed by conscientiousness and self-reported physical health. For the regression coefficients see Figure 5.5.

[Coefficient plot: regression coefficient values (axis from −0.4 to 0.2) for (Intercept), Neuroticism, Extraversion, Agreeableness, Optimism, Conscientious, Maximizer, Fairness, Health, Age, Location, Gender, Education and Job]

Figure 5.5: GLM regression coefficients with standard error bars

As expected, neuroticism is negatively and extraversion positively correlated with the HFI. Among the categorical variables, gender is negatively correlated, meaning that male participants tend to show lower well-being than female participants. Education, as well as fairness, location, age and job, has no significant influence (p > 0.1). Remarkable is the comparably strong negative correlation of personally perceived health: the healthier participants judge themselves to be, the lower the measured well-being index.

In order to test for possible interactions, the GLM was fitted with linear interaction terms. The non-cross-validated fit has an RMSE of 0.55 (compare the GLM without interactions: RMSE = 0.66) with a significant, positive interaction term for optimism * age (p < 0.05). However, if the GLM with interactions is 10-times repeated 10-fold cross-validated, the accuracy drops to RMSE = 0.83. Consequently, the interaction terms do not explain structural variance but rather over-fit the data.

Generalized Linear Model

358 samples
 13 predictors

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)

Summary of sample sizes: 323, 322, 322, 322, 322, 322, ...

Resampling results

  RMSE   Rsquared  RMSE SD  Rsquared SD
  0.999  0.0242    0.21     0.0321

Coefficients:
               Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   -0.003128    0.052704   -0.059    0.9527
Neuroticism    0.039829    0.069191    0.576    0.5652
Extraverted   -0.059986    0.059779   -1.003    0.3163
Agreeableness  0.044546    0.060848    0.732    0.4646
Optimism       0.004190    0.056109    0.075    0.9405
Conscientious -0.044208    0.059889   -0.738    0.4609
Maximizer      0.078849    0.059527    1.325    0.1862
Fairness      -0.064784    0.054269   -1.194    0.2334
Health        -0.020235    0.059328   -0.341    0.7333
Age           -0.053825    0.063193   -0.852    0.3949
Location       0.089265    0.053402    1.672    0.0955 .
Gender         0.002496    0.059480    0.042    0.9666
Education     -0.011433    0.062806   -0.182    0.8557
Job           -0.007346    0.055374   -0.133    0.8945
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure 5.6: GLM for participants’ in-person well-being variance

The results mentioned so far refer to the general well-being prediction problem with the averaged well-being index per person as dependent variable. Besides this well-being prediction problem resulting from the second hypothesis, the GLM is also applied to provide basic knowledge regarding hypotheses three and four, aiming for an understanding of the in-person well-


being variance (see chapter 3). The results (see Figure 5.6) indicate that no linear dependence exists between the 13 predictor variables and the dependent variable, which is the normalized standard deviation between the four HFI measures per participant. None of the predictors is significant (p > 0.05), and the overall 10-times repeated 10-fold cross-validated model explains less than 1% of the variance within the participants’ HFI standard deviation (RMSE = 0.999).

A similar analysis was conducted on the slope of each participant’s well-being trajectory. To do so, each participant’s four HFI data points were separately fitted with a linear regression. The regression coefficient indicating the slope was then normalized and used as dependent variable within the GLM. However, the resulting GLM does not explain any variance between the participants’ well-being slopes from the 13 predictor variables (RMSE > 1). None of the predictors had a significant influence (p > 0.05).

5.3 Kernel Smoothing Algorithms

The following kernel smoothing algorithms are applied to solve the general prediction problem resulting from the second hypothesis, with the per-participant averaged HFI as the dependent variable and the 13 demographic and personality variables as predictors. All variables are normalized to zero mean and standard deviation one.

5.3.1 K-nearest Neighbor

The simplest kernel method is a uniform kernel, which includes the k nearest neighbors of the requested point in the analysis. For the k-nearest neighbor algorithm, the values of the dependent variable for these k neighbors within the training set are averaged. In R, the algorithm is implemented using the kknn package (cf. Hechenbichler & Schliep, 2004; Schliep & Hechenbichler, 2014).

The implemented algorithm allows for an adjustment of the metric by which the distances to the k nearest neighbors are calculated. By using the Minkowski distance, the l1 (Manhattan) metric as well as the l2 (Euclidean) metric and gradations in between can be applied through a distance parameter (1 for the Manhattan and 2 for the Euclidean metric; cf. Hechenbichler & Schliep, 2004). Furthermore, differing kernels including the Gaussian, the Epanechnikov and the standard uniform kernel, also referred to as the rectangular kernel, can be applied and compared.

The results show a slight superiority of the Euclidean metric for all kernels, which is why the l1 metric is not considered further. The prediction accuracy is best for the Epanechnikov kernel at k = 22 (RMSE_Epan. = 0.792). The Gaussian kernel and the uniform kernel perform best for k = 12 (RMSE_Gaus. = 0.794 and RMSE_Uni. = 0.796). Figure 5.7 provides a graphical representation. Nevertheless, all results are significantly worse than the GLM (RMSE = 0.678). These results already indicate that a static local structure might not be present within the data.

However, the importance of the variables differs from the GLM’s variable importance. As seen in Figure 5.8, neuroticism gains even more importance, while the demographics lose influence on the dependent variable HFI.

5.3.2 Non-parametric Regression

Non-parametric regression refers to algorithms which calculate a local linear regression within a kernel environment instead of averaging the nearest neighbors. Three different non-parametric regression algorithms have been tested, namely a Generalized Additive Model using LOESS, a Generalized Additive Model using Splines and Nonparametric Regression (see Hayfield & Racine, 2013).

5.3.2.1 LOESS

The LOESS (locally weighted scatterplot smoothing) algorithm (see Cleveland, 1979; Cleveland & Devlin, 1988) fits a linear or quadratic regression within a k-nearest neighbor environment with a uniform shape. The kernel’s size is defined by the parameter α, the proportion of training data points included in each kernel. For α = 1 all training points are included in every kernel, while α = 0.25 takes the 25% nearest points of the entire training data into the kernel. With a linear local fit, LOESS consequently turns into a GLM for α = 1.

[Plot: RMSE (repeated cross-validation, axis from about 0.79 to 0.82) against the number of neighbors (20 to 80) for the rectangular, Epanechnikov and Gaussian kernels]

Figure 5.7: RMSE for k-nearest neighbor using Euclidean metric
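The kernel-weighted k-nearest neighbor average described in Section 5.3.1 can be illustrated with a minimal Python sketch. It is not the kknn implementation used in the study: the function names, the kernel formulas and the scaling of distances by the (k+1)-th neighbor are assumptions made for illustration only.

```python
import math

def minkowski(a, b, p=2.0):
    """Minkowski distance: p=1 gives the Manhattan, p=2 the Euclidean metric."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

# Kernel weights as a function of the scaled distance u (illustrative forms).
KERNELS = {
    "rectangular": lambda u: 1.0,
    "epanechnikov": lambda u: max(0.0, 0.75 * (1.0 - u * u)),
    "gaussian": lambda u: math.exp(-0.5 * u * u),
}

def knn_predict(train_x, train_y, query, k, kernel="rectangular", p=2.0):
    """Kernel-weighted average of the responses of the k nearest neighbors."""
    ranked = sorted(
        ((minkowski(x, query, p), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    neighbors = ranked[:k]
    # Assumption: distances are scaled by the distance of the (k+1)-th neighbor.
    ref = ranked[k][0] if len(ranked) > k else neighbors[-1][0]
    ref = ref if ref > 0 else 1.0
    weights = [KERNELS[kernel](d / ref) for d, _ in neighbors]
    total = sum(weights)
    if total == 0.0:  # degenerate case: fall back to the plain average
        return sum(y for _, y in neighbors) / len(neighbors)
    return sum(w * y for w, (_, y) in zip(weights, neighbors)) / total
```

With the rectangular kernel this reduces to the plain average of the k nearest responses; the other kernels give closer neighbors more weight.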

[Bar chart: variable importance (0 to 100); predictors listed from top to bottom: Neuroticism, Extraversion, Conscientious, Health, Agreeableness, Maximizer, Optimism, Age, Fairness, Education, Location, Gender, Job]

Figure 5.8: Variance importance for k-nearest neighbor using Euclidean metric

[Plot: RMSE (repeated cross-validation) against the span (0.4 to 1.0)]

Figure 5.9: RMSE for gamLoess
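To make the role of the span α concrete, the following Python sketch computes a local-linear LOESS estimate at a single point for one predictor, weighting the neighbors with the tri-cube function (1 − u³)³. It is a simplified illustration and not the gamLoess code used in the study; the function names and the rounding of the neighborhood size are assumptions.

```python
import math

def tricube(u):
    """Tri-cube weight (1 - |u|^3)^3 inside the neighborhood, 0 outside."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess_point(xs, ys, x0, span=0.75):
    """Local-linear LOESS estimate at x0 for a single predictor."""
    n = len(xs)
    k = max(2, min(n, math.ceil(span * n)))  # neighborhood size (assumed rounding)
    idx = sorted(range(n), key=lambda i: abs(xs[i] - x0))[:k]
    d_max = max(abs(xs[i] - x0) for i in idx) or 1.0
    w = [tricube(abs(xs[i] - x0) / d_max) for i in idx]
    if sum(w) == 0.0:  # all neighbors equally far away: weight them equally
        w = [1.0] * len(idx)
    sw = sum(w)
    # Weighted least squares for a straight line, in closed form.
    mx = sum(wi * xs[i] for wi, i in zip(w, idx)) / sw
    my = sum(wi * ys[i] for wi, i in zip(w, idx)) / sw
    sxx = sum(wi * (xs[i] - mx) ** 2 for wi, i in zip(w, idx))
    sxy = sum(wi * (xs[i] - mx) * (ys[i] - my) for wi, i in zip(w, idx))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x0 - mx)
```

A small span follows local curvature closely, whereas a span near one uses almost all training points and therefore behaves much like a single global linear fit.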

The distance calculation for the neighborhood definition is conducted with the tri-cube weight function: (1 − (distance/max(distance))³)³ (Hastie, 2013).

The algorithm is implemented using the caret package’s gamLoess model. GamLoess implements the LOESS algorithm separately for each independent variable within a Generalized Additive Model (GAM; cf. Hastie & Tibshirani, 1986; Wood, 2004). Due to high computational costs, only the linear regression was conducted. As seen in Figure 5.9, the accuracy converges towards the GLM’s accuracy of 0.678 when α is close to one. However, no increase in accuracy can be observed when α is reduced. This result is in line with the previously mentioned low accuracy of the k-nearest neighbor algorithm. Noticeable is the RMSE drop for α = 0.32,⁴ which equals approximately 103 training points included in the local regression. Even this configuration (RMSE = 0.753) does not outperform the GLM.

⁴ For 10-fold cross-validation with 90% ∗ 362 training points ∗ 0.32 = 103.

5.3.2.2 Splines

A different smoothing can also be achieved using splines (cf. de Boor, 2001). Instead of using kernels, the independent variables are continuously transformed using splines before being integrated into the GAM. The model is tuned upon the degrees of freedom (df) parameter, which controls the degrees of freedom for the spline function (the more degrees of freedom, the higher the adaption to local structures). Two degrees of freedom lead to a fit with linear regression (cf. also Hastie et al., 2009). Analogous to the gamLoess algorithm, the results demonstrate that an adaption to local structures does not increase the model’s accuracy. The best fit is achieved for df = 2, the linear model which was already tested with the GLM (see Figure 5.10).

Although a small improvement using splines was expected and not achieved, these results are not surprising: splines fit each independent variable within the GAM independently and are not capable of modeling interdependencies (cf. Hastie et al., 2009).
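The limitation that an additive model cannot represent interdependencies between predictors can be illustrated with a toy example that is unrelated to the study data. The Python sketch below assumes a response that is a pure interaction y = x1 · x2 on a balanced 2 × 2 grid and fits the best additive model f1(x1) + f2(x2); all names are hypothetical.

```python
from itertools import product
from statistics import mean

levels = (-1.0, 1.0)
grid = list(product(levels, levels))
y = {pt: pt[0] * pt[1] for pt in grid}  # pure interaction, no main effects

def additive_fit(y, grid):
    """Least-squares additive fit for a balanced two-way layout:
    grand mean plus one effect per level of each predictor."""
    grand = mean(y.values())
    f1 = {a: mean(y[(a, b)] for b in levels) - grand for a in levels}
    f2 = {b: mean(y[(a, b)] for a in levels) - grand for b in levels}
    return {(a, b): grand + f1[a] + f2[b] for a, b in grid}

fit = additive_fit(y, grid)
rmse = mean((y[pt] - fit[pt]) ** 2 for pt in grid) ** 0.5
print(fit, rmse)
```

In this constructed case every additive component is zero, so the RMSE equals the standard deviation of the response, no matter how flexible the per-variable smoothers are.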

[Plot: RMSE (repeated cross-validation, axis from 0.8 to 1.8) against the degrees of freedom (0 to 50)]

Figure 5.10: RMSE for gamSplines

5.3.2.3 npreg

The most advanced kernel smoothing algorithm applied in this study is computed upon the np package in R (see Hayfield & Racine, 2007, 2013). The npreg function computes a kernel for each independent variable and applies a local linear regression within the kernel. The optimal kernel parameters are optimized in a data-driven way, independently for each independent variable, so that a different bandwidth results for each of the independent variables (Hayfield & Racine, 2007, 2013). One of the most important advantages of this algorithm is that continuous as well as categorical, unordered variables (as present in this study) can be included in the regression (Racine & Li, 2004). The algorithm is consequently capable of predicting upon mixed datasets.

The algorithm can be computed with a Gaussian, an Epanechnikov or a uniform kernel for continuous input data. Categorical data is handled with an Aitchison-Aitken or Li-Racine kernel (Aitchison & Aitken, 1976; Titterington, 1980). For this study, the categorical predictors (location, job and gender) were fitted upon the Aitchison-Aitken kernel only.

For each cross-validation run, the kernel bandwidth for each input variable is computed via Kullback-Leibler cross-validation (Hurvich, Simonoff, & Tsai, 1998) or least-squares cross-validation (Li & Racine, 2004; Racine & Li, 2004), which is applied to compare algorithms upon RMSE in this study. In contrast, the Kullback-Leibler cross-validation compares different bandwidths upon the Akaike information criterion (AIC), which compares the goodness of fit with the model’s complexity. As a result of bandwidth selection and parameter comparison, two nested cross-validations with correspondingly high computational costs have to be performed in order to test each bandwidth specification on several folds. Since the algorithm was not available for the caret cross-validation package, a custom model was implemented.

The algorithm moreover uses either local-linear regression (ll) or the local-constant estimator (lc) by Nadaraya (1964) and Watson (1964). The latter is an average smoother, similar to the k-nearest neighbor smoother, but in contrast computes different bandwidths and scale factors for each independent variable (Hayfield & Racine, 2007, 2013).

[Two panels: RMSE (repeated cross-validation, axis from about 0.70 to 0.80) by continuous kernel type (uniform, Epanechnikov, Gaussian) for the ll and lc regression estimators]

Figure 5.11: RMSE for npreg with least-squares cross-validation (left) and Kullback-Leibler cross-validation (right)

The results (see Figure 5.11) show that the local-linear regression is more accurate than the local-constant estimator and reaches the GLM performance for the Epanechnikov kernel with least-squares cross-validation (RMSE = 0.682; RMSE SD = 0.065). The uniform kernel with local-linear regression and Kullback-Leibler cross-validation does not reach sufficient accuracy (RMSE > 5), so the corresponding data point is not included in the chart.

Besides the models’ accuracy, the variance between several cross-validation loops is an important aspect to evaluate the model’s prediction capability. Figure 5.12 shows the RMSE density plots upon 10-fold cross-validated model selection. The Epanechnikov kernel provides the smallest variance between CV runs, followed by the Gaussian and then the uniform kernel. For the local-constant estimator the variance is even smaller compared to the local-linear regression, but the latter performs better regarding the RMSE mean (see Figure 5.12).

[Density plots of the RMSE (0.4 to 1.0) for the Gaussian, uniform and Epanechnikov kernels, each with the ll and the lc estimator]

Figure 5.12: RMSE density plot for 10-fold cross-validation runs (kernel bandwidth selection upon least-squares cross-validation)

The algorithm has also been tested with higher kernel orders (kernel order = 2 and 4), but no accuracy gains could be realized. Consequently, the following analyses apply second-order Epanechnikov kernels only.

Due to the variable bandwidth and scale estimations for the independent variables, npreg usually allows for an advanced analysis of the predictors’ importance. Since the npreg algorithm does not predict the averaged well-being data more precisely than the GLM in this case, the variables’ importance just reflects the GLM predictor importance. However, the graphical representation in Figure 5.13 presents the partial, almost linear (kernel bandwidths ≫ n) regressions. The predictors were abbreviated to simplify the analysis.⁵

⁵ Abbreviations: N - Neuroticism, E - Extraversion, A - Agreeableness, O - Optimism, C - Conscientious, M - Maximizer, F - Fairness, H - Health, Age - Age, L - Location, G - Gender, Edu - Education, J - Job.

High dimensionality of the input data masks several non-linear linkages of certain independent variables. If less important independent variables are removed from the analysis, these linkages become visible. Figure 5.15 shows selected subsets of independent variables with the performance measures reached. All calculations were conducted upon least-squares cross-validation with local-linear regression within Epanechnikov kernels to fit the bandwidths and two-times repeated 8-fold cross-validation to evaluate the performance. Due to the computational costs, only a limited number of repetitions and subsets could be tested.

Certain subsets of the input data achieve almost as good an accuracy as the original model including all independent variables. This applies to the RMSE as well as the RMSE standard deviation. For example, the subset including the big five personality traits, health and the maximizer vs. satisficer test achieved an error of RMSE = 0.691, which is only about one percent worse than the best full model fit. A graphical representation of the dependencies within this subset fit is given in Figure 5.16. The fact that subsets of the independent variables reach similar accuracy leads to the conclusion that the correlation between the predictors has an influence when fitted locally.

The maximizer-satisficer measure has been found to have a U-shaped partial influence in many subsets, even if the overall model fits almost linearly (very large kernel bandwidth; see Figure 5.13). In contrast to the intuitive suggestion that maximizers have lower well-being than satisficers, maximizers seem to be happier than the average. This is further supported when age, as the predictor most correlated with the maximizer-satisficer variable, is included in the model (see Figure 5.17). Directly compared to the predictors conscientiousness and agreeableness, the maximizer-satisficer predictor explains less variance than conscientiousness (higher RMSE), but more than agreeableness (compare Figure 5.14).

The U-shaped relationship between age and well-being discussed in the literature (cf. Blanchflower & Oswald, 2008) could not be observed within the dataset. The overall model shows a small positive linear influence of age, but those results are not obtained from time series-based measurement and are consequently not corrected for influences by different cohorts.

Moreover, the negative influence of self-perceived health already identified by the GLM was confirmed by non-parametric regression. None of the calculated predictor subsets showed a positive influence of a healthy lifestyle, although a positive correlation has widely been discussed in the literature (Diener et al., 1999; Jorm & Ryan, 2014; Lacey et al., 2008).

An interesting observation was made when the predictors were reordered. The algorithm results in different accuracies for different predictor orders (cf. Figure 5.15), which are stable during cross-validation. The algorithm calculates different bandwidths for different predictor orders.

[Partial regression plots of HFI against each predictor; panels: N, E, A, O, C, M, F, H, Age, L, G, Edu, J]

Figure 5.13: npreg predictors’ partial regression influence

Resampling results across tuning parameters :

subset RMSE Rsquared RMSE SD Rsquared SD


N E 0.735 0.459 0.0628 0.145
N E C 0.697 0.507 0.0745 0.135
N E M 0.715 0.491 0.0583 0.139
N E A 0.726 0.471 0.0596 0.131

Tuning parameter ’ regtype ’ was held constant at a value of ll


Tuning parameter ’ ckertype ’ was held constant at a value of epanechnikov
Tuning parameter ’ ckerorder ’ was held constant at a value of 2
Tuning parameter ’ bwmethod ’ was held constant at a value of cv . ls

Figure 5.14: npreg accuracy for reduced predictor dimensionality


Resampling results across tuning parameters :

subset RMSE Rsquared RMSE SD Rsquared SD


N E A O C H M F Age L G Edu J 0.702 0.51 0.0773 0.127
M H E A C N O F Age L G Edu J 0.762 0.513 0.285 0.131
N E A O C H M 0.691 0.522 0.0682 0.118
N E A O C 0.703 0.509 0.0695 0.128
O C E A N 0.701 0.513 0.0688 0.13
M N C E 0.692 0.522 0.0683 0.119
N E H M 0.708 0.499 0.0749 0.139
E M N H 0.709 0.498 0.0754 0.14
M E N H 0.708 0.499 0.0749 0.14
N E H 0.723 0.478 0.085 0.152
N E 0.728 0.471 0.0757 0.147
N H M 0.748 0.446 0.0713 0.133
N M H 0.744 0.452 0.0758 0.138
Age M N H E 0.704 0.505 0.0737 0.139
M H E A C N O F Age 0.7 0.512 0.0794 0.121
Age G J L 1 0.0367 0.0859 0.0373
N H 0.766 0.421 0.0829 0.138
N M 0.774 0.412 0.0584 0.11
N A 0.768 0.418 0.0638 0.114
N A E 0.724 0.477 0.0676 0.132

Tuning parameter ’ regtype ’ was held constant at a value of ll


Tuning parameter ’ nmulti ’ value of epanechnikov
Tuning parameter ’ ckerorder ’ was held constant at a value of 2
Tuning parameter ’ bwmethod ’ was held constant at a value of cv . ls

Figure 5.15: npreg accuracy for reduced predictor dimensionality
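The comparisons drawn from Figure 5.15 are simple relative differences between RMSE values. The short Python sketch below reproduces this arithmetic for a few rows of the table. The reference values 0.678 (GLM) and 0.682 (best full npreg fit) are taken from the text; using 0.682 as the baseline for the "one percent" statement is an assumption, and the variable names are hypothetical.

```python
GLM_RMSE = 0.678              # GLM accuracy reported in the text
BEST_FULL_NPREG_RMSE = 0.682  # best full npreg fit reported in the text

# Selected rows of Figure 5.15: predictor subset -> cross-validated RMSE.
subsets = {
    "N E A O C H M F Age L G Edu J": 0.702,
    "M H E A C N O F Age L G Edu J": 0.762,
    "N E A O C H M": 0.691,
    "M N C E": 0.692,
    "N E": 0.728,
    "Age G J L": 1.0,
}

def relative_gap(rmse, reference):
    """Relative RMSE difference to a reference model, in percent."""
    return 100.0 * (rmse / reference - 1.0)

for name, rmse in subsets.items():
    print(f"{name:32s} vs. best npreg: {relative_gap(rmse, BEST_FULL_NPREG_RMSE):6.1f} %"
          f"   vs. GLM: {relative_gap(rmse, GLM_RMSE):6.1f} %")
```

The same function also quantifies the order effect: the two full-model rows differ only in the order of the predictors, yet their RMSE values differ by several percent.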


[Partial regression plots of HFI against each predictor; panels: N, E, A, O, C, H, M]

Figure 5.16: npreg predictors’ partial regression influence for reduced predictor dimensionality (1)

[Partial regression plots of HFI against each predictor for five predictor subsets: (1) Maximizer, Neuroticism, Conscientious, Extraversion; (2) Neuroticism, Extraversion, Age; (3) Maximizer, Neuroticism, Conscientious, Agreeableness, Extraversion, Optimism, Age; (4) Maximizer, Neuroticism, Age; (5) Neuroticism, Extraversion, Conscientious]

Figure 5.17: npreg predictors’ partial regression influence for reduced predictor dimensionality (2)


5.3.3 Support Vector Machines (SVM)

This study’s data was then tested for prediction accuracy using support vector machines (SVM; cf. Vapnik et al., 1997). SVMs solve kernel smoothing problems by minimizing the error bounds of a linear regression within a local kernel environment. Therefore, they do not differ significantly from the kernel smoothing algorithms previously mentioned. Consequently, it is not remarkable that the SVM does not provide any added value for the prediction of personal well-being upon the 13 predictor variables.

Nevertheless, the obtained results are as follows: The SVM has only been tested with a Gaussian kernel, which is parameterized by a bandwidth parameter Sigma specifying the inverse kernel width:⁶ the larger the Sigma chosen, the smaller the kernel. Moreover, the SVM implementation allows specification of the cost of constraint violation via a parameter C, which is set to 1 by default and varied between 0.7 and 1.3 within this analysis. Due to the computational costs, results are calculated upon five-times repeated 10-fold cross-validation only.

⁶ Gaussian kernel defined as k(x, x′) = exp(−σ ∗ ||x − x′||²).

The results in Figure 5.18 indicate that a large kernel leading to a linear model performs best. The cost of constraint violation does not have an influence on this result. When the kernel width is reduced with increasing Sigma, the RMSE increases and approaches the dependent variable’s standard deviation, which is normalized to one. For extremely small σ, however, the performance drops again. Reducing the influence of points at the far end of the predictor dimension space is consequently found to be beneficial for accuracy.

[Two panels: RMSE (repeated cross-validation) against Sigma for cost values 0.7, 0.9, 1, 1.1 and 1.3; (a) full Sigma range (0.0 to 0.8), (b) zoom into small Sigma (0.01 to 0.04)]

Figure 5.18: RMSE accuracy for support vector machine

5.4 Neural Network Algorithms

5.4.1 Stuttgart Neural Network Simulator (SNNS)

The neural networks applied in this study are implemented using the Stuttgart Neural Network Simulator (SNNS) package in R (see Bergmeir & Benitez, 2012, 2013). In order to perform the same cross-validated analyses as for the aforementioned algorithms, a custom model was built to integrate a fully customizable version of the SNNS into the caret package (Kuhn, 2008).

[Two panels: RMSE (repeated cross-validation) against the number of hidden nodes per layer (0 to 500) for one to five hidden layers; (a) neural network with SCG learning algorithm, (b) neural network with standard backpropagation learning algorithm (learning rate = 0.1 and maximum difference = 0)]

Figure 5.19: RMSE accuracy for feedforward neural network

The SNNS allows for a variety of different learning algorithms, of which standard backpropagation (SBP), the most common learning algorithm, also referred
to as vanilla backpropagation (Rojas, 1996; Rumelhart, Hinton, & Williams, 1986), and scaled conjugate gradient (SCG) (Møller, 1993) have been applied. Both perform supervised learning for feedforward neural networks, but differ in the optimization routine. While SBP uses the first derivative of the goal function, SCG optimizes upon the second derivative, which is computationally more expensive, but generally “finds a better way to the (local) minimum” (Zell et al., 2013, p. 210). SCG is a combination of a conjugate gradient approach and ideas of the Levenberg-Marquardt algorithm (Bergmeir & Benitez, 2012; Marquardt, 1963). Regarding the different learning algorithms’ performance and accuracy, no clear ranking exists in the literature so far. Consequently, comparable studies usually apply and compare several different learning algorithms in order to find the algorithms fitting the data best.

Due to the characteristics of neural computing, the dependent and independent variables have been normalized to zero mean and standard deviation one. The categorical variables (e.g. gender, age, education) were consequently transformed to numeric variables. The neural network has been constructed with one to five hidden layers and 20 to 1000 nodes on each layer. For standard backpropagation the parameters have been kept fixed at a level best for accuracy, but associated with rather high computational costs, which is acceptable due to the small sample: the learning rate at a low level of 0.1 and the maximum output difference at zero.

The accuracy achieved with the different learning algorithms is given in Figure 5.19. None of the tested network layouts and none of the applied learning algorithms reaches better performance than the GLM. The neural network with four hidden layers and 40 hidden nodes each performed best and reached a minimum RMSE of 0.765 for the SCG learning function and an RMSE of 0.763 for the standard backpropagation learning function. Both learning functions provide very similar results.

5.4.2 Extreme Learning Machine (ELM)

Standard feedforward neural networks as implemented by the Stuttgart Neural Network Simulator (see previous chapter) generally face issues of slow learning speed (backpropagation) and customizable learning functions with a high number of crucial parameters to set. A new method for fitting neural networks has therefore been developed: Extreme Learning Machines (ELM) fit single-hidden-layer feedforward neural networks upon mathematical, non-iterative solving only (Huang, Chen, & Siew, 2006). The input weights for each hidden node are randomly chosen and not adapted, so that training is omitted. Training is only applied to the weights for the output calculation, which is computationally less costly and can consequently “run thousands times faster than [. . . ] conventional methods” (Rajesh & Prakash, 2011, p. 35). By an increase of the number of hidden nodes with random input weights, the ELM is theoretically as powerful as conventional neural networks and capable of approximating “any continuous target functions” (Rajesh & Prakash, 2011, p. 880).

The elmNN package in R (see Gosso, 2013) allows for the training of ELMs with different activation functions (e.g. sigmoid function for standard neural networks). For this study five activation functions have been tested for the hidden and the output nodes: sigmoid (sig), slightly steeper tan-sigmoid (tansig), stepwise zero / one function hard-limit (hardlim), stepwise minus one / one function symmetric hard-limit (hardlims) and a pure linear function (purelin). For a comparison of the activation functions with different numbers of hidden nodes see Figure 5.20. The pure linear activation function obviously explains the same variance as the GLM and leads once more to the best fitting model.

[Two panels: RMSE (repeated cross-validation) against the number of hidden units for the activation functions sig, hardlim, hardlims, tansig and purelin; (a) full range (0 to 5000 hidden units), (b) zoom for small number of hidden nodes (20 to 100)]

Figure 5.20: RMSE accuracy for extreme learning machine (ELM)

All tests have been conducted with 20-times repeated 10-fold cross-validation. Since the hidden nodes’ input weights were randomly set, a sufficient number of repeated analyses has to be performed in order to achieve a valid accuracy result.

Since the tansig, hardlim and hardlims activation functions were found to show a still decreasing RMSE with increasing number of nodes at 5000 hidden nodes, a single five-times repeated 10-fold cross-validated analysis has been conducted for 12000 hidden nodes. However, it was still found that the sigmoid based activation functions do not outperform the GLM (see Figure 5.21).

Extreme Learning Machine

358 samples
13 predictors

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 5 times)
Summary of sample sizes: 324, 322, 322, 322, 322, 322, ...
Resampling results across tuning parameters:

actfun    RMSE   Rsquared  RMSE SD  Rsquared SD
sig       0.957  0.283     0.103    0.124
hardlim   0.724  0.472     0.0863   0.134
hardlims  0.727  0.469     0.0857   0.137
tansig    0.8    0.388     0.0786   0.121
purelin   0.676  0.531     0.0792   0.136

Tuning parameter 'nhid' was held constant at a value of 12000

Figure 5.21: Cross-validation results for extreme learning machine (ELM) for 12,000 hidden nodes

Due to its computational efficiency in combination with comparable accuracy, the ELM was applied to test for possible structures within each participant’s well-being trajectory proposed by this study’s fourth hypothesis (compare hypotheses 3 and 4). As already obtained from the GLM analysis, no variance between the participants’ internal standard deviation and internal regression coefficient (slope) of the linear trajectory smoothing could be explained (see Figure 5.22). All models upon the tested parameter sets result in a higher RMSE than the sample’s standard deviation (RMSE > 1).

[Two panels: RMSE (1.00 to 1.25) against the number of hidden units (20 to 100) for the activation functions sig, hardlim, hardlims, tansig and purelin; (a) dependent variable: standard deviation (RMSE from repeated cross-validation), (b) dependent variable: regression coefficient (RMSE from bootstrap)]

Figure 5.22: RMSE accuracy for ELM in trajectory prediction problem

5.5 Feature Selection Algorithms

The algorithms selected here do not aim for an accurate prediction of the dependent variable. Instead, feature selection algorithms evaluate the importance of certain predictors for the output variable (see literature review). The deployed kernel smoothing algorithms (see chapter 5.3) indicate that certain independent variables within this study do not have an important influence on well-being. To evaluate this in detail, two different feature selection algorithms were applied.

5.5.1 Lasso and Elastic Net Regression

The lasso regression is a basic feature selection algorithm for generalized linear models (GLM). In comparison to other algorithms using regularization, the lasso algorithm limits the sum of the absolute coefficients (l1 norm) to a constant and therefore results in coefficients being
actually zero (Tibshirani, 1996). Hastie et al. (2009) provided a good description of possible regularization norms and a comparison of feature selection algorithms. The lasso regression is parameterized by the fraction of the full model coefficients’ l1 norm, defining a maximum threshold for the sum of the current regression coefficients’ l1 norm. A fraction of one consequently results in the full GLM, while a fraction of zero forces all coefficients to zero. The algorithm is implemented using the lars and elasticnet packages in R (Hastie & Efron, 2013; Zou & Hastie, 2013) and is five-times repeated 10-fold cross-validated to ensure sufficient reproducibility. Figure 5.23 outlines the lasso regression path and accuracy.

[Two panels: lasso path of the standardized coefficients against |beta|/max|beta| (left; legend as far as recoverable: 1 Neuroticism, 2 Extroverted, 4 Optimism, 5 Conscientious, 7 Fairness, 8 Health, 10 Location, 11 Gender, 13 Job) and RMSE (repeated cross-validation, about 0.70 to 0.90) against the fraction of the full solution (right)]

Figure 5.23: Lasso regression path (left) and RMSE accuracy (right)

As expected, the RMSE of the model approaches the GLM accuracy for the full solution. From the RMSE plot, a small improvement over the GLM can be observed if the fraction is set to 0.9, so that fairness and education are not part of the model. These variables explain no structural variance in the linear model and hence overfit the data. The lasso path includes neuroticism as the first, extraversion as the second and conscientiousness as the third variable.

Further developments of the lasso regression led to alternative norms for coefficient regularization. The Elastic Net Regression (Zou & Hastie, 2005) allows for continuous adjustment of the regularization norm including l1 and l2 norm by the parameter lambda. However, for this study the elastic net regression including a parameterization for ridge regression did not provide an improvement in accuracy or feature selection.

5.5.2 Lazy Lasso Regression

The lazy lasso algorithm has been developed by Vidaurre et al. (2011) in order to combine kernel smoothing with lasso regression. The combination allows fitting non-linear functions upon the locally most important independent variables only. Since the algorithm implements the lasso algorithm mentioned before, it actually zeroes unimportant regression coefficients by fitting the local lasso regression with the lars R package (see Hastie & Efron, 2013). However, since the lazy lasso algorithm is not yet available as an R package, a simple version with a uniform kernel has been implemented. The implementation follows the abstract algorithm given in Figure 5.24 by Vidaurre et al. (2011).

Additionally, the algorithm is cross-validated using the caret package in order to test different parameter sets. The parameters include the bandwidth parameter t for the uniform k-nearest neighbor kernel
(number of neighbors included) and a stopping parameter k, which defines the number of consecutive loops to be calculated without performance improvement until the algorithm aborts. For each iteration, the distances for the kernel calculation are weighted parameter-wise with the regression coefficients from the previous iteration. The first iteration starts without weighting. Vidaurre et al. (2011) argued that this approach “attaches more importance to relevant variables” (p. 539), because distances by irrelevant predictors are neglected. In order to parameterize the distance adjustment, the calculation of δj is as follows:

\[ \delta_j = p \cdot \frac{|\beta_j|^d}{\sum_{j'=1}^{p} |\beta_{j'}|^d} \]

This allows for a scaling of the adjustment’s power by the distance adaption parameter d. For d = 1, δ is equal to the relative predictor weight as proposed by Vidaurre et al. (2011); for d = 0, δ equals 1 for each predictor, so that no adjustment of the kernel to the predictor weight takes place.

As the algorithm performs feature selection upon the lasso regression, a criterion to define the number of predictors included in the local linear regression is necessary. Upon the residual standard error for each step of the lars path, Mallows’ Cp statistic is calculated. Predictors are included in the final model as long as Cp is larger than the total number of predictors multiplied by a bias factor, which is bias = 1 for the standard configuration, but may be parameterized. A larger bias factor results in a less complex model, a smaller bias factor includes more predictor variables.

Due to feature selection, the accuracy achieved by the model is not comparable with the prediction models mentioned previously. However, the results from the parametric optimization can be seen in Figure 5.25. As expected, the kernel smoothing demonstrates once more that the best model is


Algorithm 1 lazy lasso

Input: training data set D with p variables and n data items
Input: bandwidth τ and stopping criterion parameter κ
Input: weight function g(·) and distance function d(·)
Input: point x^(l), whose response is to be predicted
Output: set of coefficients β̂^(l) and estimated response ŷ^(l)
Initialization:
    δ_j := 1, for j = 1, ..., p
    overallBest := ∞; toStop := 0
repeat
    Calculate all distances d_i := d_δ(x^(i), x^(l)), for i = 1, ..., n
    w^(l) := sqrt(g_τ(d))
    W^(l) := n × n diagonal matrix, W_ii^(l) = w_i^(l), for i = 1, ..., n
    Z := W^(l) X
    v := W^(l) y
    path := LARS(Z, v)
    β* := best(path; Z, v)
    δ_j := p |β_j| / Σ_{j'=1}^{p} |β_{j'}|, for j = 1, ..., p
    score := evaluate(β*; Z, v)
    if score ≥ overallBest then
        toStop := toStop + 1
    else
        toStop := 0
        overallBest := score
        β̂^(l) := β*
    end if
until toStop = κ
ŷ^(l) := (x^(l))^T β̂^(l)

Figure 5.24: Lazy Lasso Algorithm
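
The distance adaption step can be illustrated with a short Python sketch (Python is used here for illustration only; the study itself was implemented in R). The function names, the weighted Euclidean form of the distance and the toy data are assumptions made for this example and are not taken from the study's code. `adaption_weights` evaluates the δ_j formula with the distance adaption parameter d, and `uniform_knn_weights` applies a uniform kernel over the t nearest training points.

```python
import math


def adaption_weights(beta, d=1.0):
    """delta_j = p * |beta_j|**d / sum_j' |beta_j'|**d (d is the distance adaption parameter)."""
    p = len(beta)
    powered = [abs(b) ** d for b in beta]
    total = sum(powered)
    if total == 0.0:
        # Assumption: fall back to no weighting if every coefficient is zero.
        return [1.0] * p
    return [p * value / total for value in powered]


def weighted_distance(x, y, delta):
    """Assumed form of d_delta: Euclidean distance with per-predictor weights delta."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(delta, x, y)))


def uniform_knn_weights(X, x_query, delta, t):
    """Uniform kernel: weight 1 for the t nearest training points, 0 otherwise."""
    dists = [weighted_distance(row, x_query, delta) for row in X]
    nearest = set(sorted(range(len(X)), key=lambda i: dists[i])[:t])
    return [1.0 if i in nearest else 0.0 for i in range(len(X))]


# Toy data (hypothetical): four training points with two predictors.
X_toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 5.0], [3.0, 3.0]]
unweighted = uniform_knn_weights(X_toy, [0.0, 0.0], [1.0, 1.0], t=2)
second_ignored = uniform_knn_weights(X_toy, [0.0, 0.0], [1.0, 0.0], t=2)
```

With d = 0 every predictor keeps the weight 1, so the kernel is not adapted; with d = 1 the weights are the relative coefficient magnitudes scaled to sum to p. In the toy data, setting the second weight to zero changes which neighbors fall into the kernel, which is the effect the weighting is meant to achieve.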

[Plots not reproduced. Both panels show RMSE (repeated cross-validation) against the bandwidth (50 to 200). Panel (a), d = 1: one curve per bias factor (0.3, 0.6, 1, 1.5, 2, 3, 5). Panel (b), bias = 1: one curve per distance adaption factor (0, 0.3, 0.7, 1).]

Figure 5.25: RMSE accuracy for lazy lasso regression
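
The selection rule based on Mallows' Cp can be sketched as follows. The statistic itself is the standard definition, Cp = RSS/σ² − n + 2ν, with ν the number of predictors in the model. The stopping rule is one possible reading of the description in the text (walk along the path while Cp exceeds the number of predictors times the bias factor); the function names, the assumption that path step k contains k predictors, and the toy numbers are hypothetical.

```python
def mallows_cp(rss, sigma2, n, df):
    """Mallows' Cp for a model with residual sum of squares rss and df predictors."""
    return rss / sigma2 - n + 2 * df


def select_step(rss_path, sigma2, n, p, bias=1.0):
    """Return the first path step whose Cp is no longer above p * bias.

    Assumption: step k of the path contains k predictors (no variable is dropped).
    """
    threshold = p * bias
    for step, rss in enumerate(rss_path):
        if mallows_cp(rss, sigma2, n, df=step) <= threshold:
            return step
    return len(rss_path) - 1  # threshold never reached: keep the full model


# Toy path (hypothetical numbers): RSS after 0, 1, 2, 3 and 4 predictors.
rss_toy = [100.0, 60.0, 45.0, 41.0, 40.0]
standard = select_step(rss_toy, sigma2=1.0, n=50, p=4, bias=1.0)
restrictive = select_step(rss_toy, sigma2=1.0, n=50, p=4, bias=5.0)
```

In this toy path the standard configuration stops after two predictors, while the larger bias factor stops after one, in line with the statement that a larger bias factor yields a less complex model.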

achieved for large kernels approaching the generalized linear model. The stopping parameter k was tested for values k = 5 and k = 8 without noticeable differences, so that it is fixed to k = 5 for all further analyses.

The bias factor was, as expected, found to reduce the number of predictors included in the local linear regressions and consequently to reduce the accuracy when increased. Contrary to original expectations, the distance adaption factor d had a rather small influence on the model's accuracy. For medium-sized kernels (30 - 80 points), models with little distance scaling actually fitted the testing points better than the proposed distance scaling with d = 1. Moreover, those models generally included fewer variables on average.

In order to evaluate the predictors' importance, the final local regression coefficients for each testing point are saved[7] and allow for later statistical analysis, for example counting the regressions with coefficients unequal to zero for each participant or summing the absolute regression coefficients by parameter. However, since the best performing model has a large kernel, those feature selection results are similar to the variable importance identified by the GLM. Hence, the assessment of the local predictor importance has been conducted on models with 30 to 80 points per kernel, even if those were not performing best in terms of accuracy. Figure 5.26 provides an overview of the predictor weights depending on the bias factor. Neuroticism is the predominant predictor, gaining even more importance if the restriction is tightened (higher bias). Extraversion and conscientiousness were found to be the second most important predictors. However, their influence decreases when the kernel size is reduced and the prediction is consequently based on fewer neighbors. This is different than expected, because a local analysis usually increases the relative importance of generally less important variables. Even for kernels with fewer than 30 points (< 10% of the sample size), neuroticism is the only important predictor. Extraordinarily increased weights for other predictors are not observed. However, the unrestricted model (bias = 0) for small kernels weights all predictors relatively equally, with five to 15 percent of the total predictor weight[8]. As seen in Figure 5.26, this includes an increased weight for the location variable. However, this has to be treated with caution, because the underlying sample is not representative in this regard. Moreover, the gender variable is comparably important in the unrestricted

[7] The nominalTrainWorkflow function in the R caret package had to be adapted in order to return additional data from the prediction.
[8] Note in this regard that the lars algorithm called for each local kernel environment individually shifts the training points to zero mean and variance one for each predictor.

[Plots not reproduced. Both panels show the predictor weights (in %) against the bias factor (1 to 5), with one series per predictor: Neuroticism, Extraversion, Agreeableness, Optimism, Conscientious, Maximizer, Fairness, Health, Age, Location, Gender, Education, Job. Panel (a): t ∈ [30, 80]; panel (b): t ∈ [100, 200].]

Figure 5.26: Lazy lasso predictor weights
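
As an illustration of how predictor weights of this kind can be derived from the saved local coefficients, the following minimal Python sketch sums the absolute coefficients per predictor over all local regressions and expresses each sum as a share of the total. The function name, the data layout (one coefficient list per testing point) and the toy coefficients are assumptions for this example; the study's own computation is not shown in the text.

```python
def predictor_weights(local_coefs, names):
    """Share (in %) of each predictor in the summed absolute local coefficients."""
    totals = [0.0] * len(names)
    for coefs in local_coefs:
        for j, b in enumerate(coefs):
            totals[j] += abs(b)
    grand_total = sum(totals)
    if grand_total == 0.0:
        return {name: 0.0 for name in names}
    return {name: 100.0 * t / grand_total for name, t in zip(names, totals)}


# Toy coefficients (hypothetical): three local regressions, three predictors.
toy_coefs = [[-0.6, 0.2, 0.0], [-0.4, 0.0, 0.2], [-0.5, 0.1, 0.0]]
toy_weights = predictor_weights(toy_coefs, ["Neuroticism", "Extraversion", "Health"])
```

Because the lars algorithm standardizes the predictors within each local kernel (see footnote 8), absolute coefficients are comparable across predictors, which is what makes a share of this kind meaningful.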


model with large kernel drops weight when fitted locally.

Since the lasso regression zeros unimportant predictors when called with sufficient restriction via the bias variable, an analysis of the number of coefficients unequal to zero per predictor over all testing points is promising. Again, neuroticism, extraversion and conscientiousness stand out as the most often included predictors, followed by health and the maximizer-satisficer measure (see Figure 5.27). When fitted locally with small kernel sizes, however, the differences between predictors are less distinct. For an average number of 2.5 predictors, neuroticism is included in only 40% of all locally fitted regressions with small kernels (30 - 80 points), while it is included in over 65% of the regressions with larger kernels. Correspondingly, variables not important in larger kernels are included more often in local regressions with smaller kernels. Nevertheless, this is likely to result from over-fitting the data, since those small kernels result in significantly lower cross-validated accuracy (see Figure 5.27).

In general, differences in the predictors' order with regard to the frequency of coefficients unequal to zero are not observed between different kernel sizes. This once more supports that the high predictor weight of the location for small kernels is due to irregularities

[Plots not reproduced. One series per predictor (Neuroticism, Extraversion, Agreeableness, Optimism, Conscientious, Maximizer, Fairness, Health, Age, Location, Gender, Education, Job), plotted against the average number of predictors; left panels: t ∈ [30, 80], right panels: t ∈ [100, 200].
(a) Number of local regressions (in %): measure relative to total number of regressions.
(b) Share of average number of predictors (in %): measure relative to total number of regressions corrected with total number of predictors per regression.]

Figure 5.27: Lazy lasso: percentage of local lasso regressions with predictor coefficient unequal to zero
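
The two measures in Figure 5.27 can be sketched in Python as follows. The function names and the toy coefficients are hypothetical, and the second measure is only one possible reading of the panel (b) description: the inclusion count of a predictor divided by the total number of non-zero coefficients over all local regressions.

```python
def inclusion_counts(local_coefs, n_predictors, tol=1e-12):
    """Number of local regressions in which each predictor has a non-zero coefficient."""
    return [sum(1 for coefs in local_coefs if abs(coefs[j]) > tol)
            for j in range(n_predictors)]


def inclusion_share(local_coefs, names, tol=1e-12):
    """Panel (a): percentage of local regressions that include each predictor."""
    counts = inclusion_counts(local_coefs, len(names), tol)
    return {name: 100.0 * c / len(local_coefs) for name, c in zip(names, counts)}


def average_model_size(local_coefs, tol=1e-12):
    """Average number of non-zero coefficients per local regression."""
    counts = inclusion_counts(local_coefs, len(local_coefs[0]), tol)
    return sum(counts) / len(local_coefs)


def share_of_model_size(local_coefs, names, tol=1e-12):
    """Assumed reading of panel (b): inclusion share corrected for the model size."""
    counts = inclusion_counts(local_coefs, len(names), tol)
    total = sum(counts)
    return {name: (100.0 * c / total if total else 0.0) for name, c in zip(names, counts)}
```

For three toy regressions in which the first predictor is always included, the second twice and the third once, `inclusion_share` gives 100%, about 67% and about 33%, the average model size is 2, and `share_of_model_size` gives 50%, about 33% and about 17%.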


Predictors (by group)                        RMSE contribution    Variance explained
                                             to full model        as single predictor

Most important predictors (Group 1)          0.40
    Neuroticism                                                   41 %
    Extraversion                                                  22 %
    Conscientiousness                                             15 %

Moderately important predictors (Group 2)    0.04                 8 - 12 %
    Maximizer, Health, Gender,
    Agreeableness, Optimism

Less important predictors (Group 3)          0                    0 - 8 %
    Age, Fairness, Job,
    Education, Location

Table 5.4: Predictor importance by group. Note: Numbers in the second column indicate the difference between the RMSE of the model including the group as predictors and the model including the more important groups only; analysis conducted with npreg algorithm.

in the dataset. However, the variables can be clustered into three groups by importance. On the one hand, they are fairly constant regarding predictor weights and the frequency of coefficients unequal to zero. On the other hand, the identified groups correspond with the finding from the npreg algorithm mentioned before (see Table 5.4). Neuroticism, extraversion and conscientiousness explain by far most of the variance; neuroticism alone already explains around 40% if fitted with non-parametric regression. Extraversion and conscientiousness add another ∼10% of explained variance after controlling for neuroticism. The second group includes the maximizer-satisficer scale, health, optimism, agreeableness and gender. Especially for large kernels, the second group accounts for significantly more predictor weight than the remaining variables. Together with the first group, those variables explain approximately 47% of the variance between the averaged HFI per participant. The third group contains the remaining predictors fairness, education, job, location and age, which were found to have a rather small influence and explain very little variance after controlling for groups one and two. Within the third group, age and fairness are the most relevant predictors. This division into three clusters is supported by the findings of the npreg algorithm and furthermore corresponds with the separation in the linear lasso regression on the whole dataset.

While the lazy lasso algorithm is capable of effective feature selection and interpretation, it does not allow for an overall picture of a single predictor's influence as, for example, the npreg algorithm does. The kernel smoothing selects local environments around the predicted test points, but does not currently save the bandwidth information needed to compute the complete partial influence plot. Changes of local predictor importance along the predominant regression line of neuroticism could, for example, be subject to further research.

5.6 Accuracy Comparison

The proposed hypotheses two and four aim for a prediction of each participant's well-being baseline and the corresponding well-being trajectory based on the psychometric and demographic input variables. Different machine learning approaches have been tested. However, the algorithms do not achieve higher accuracy than the generalized linear model (GLM). A comparison of the algorithms applied and the accuracy achieved is given in Figure 5.28.

Since this study's sample is comparably small for the number of predictors included in the prediction models, an accuracy test for a reduced sample size is advised in order to test for possible accuracy advantages from larger datasets. This test has been conducted for the neural network model (see chapter 5.4.1). This model was adapted to loop over different subsets of the sample and apply the cross-validated neural network algorithm to these subsets. Subsets including 50% to 100% of the original dataset were tested. The neural network was built with the two


[Bar chart not reproduced; the plotted values are listed below. The chart carries a reference label "GLM accuracy".]

Model                                   Explained variance    Accuracy (RMSE)
GLM                                     0.54                  0.68
K-nearest neighbor                      0.37                  0.79
Local linear regression - LOESS         0.53                  0.68
Local linear regression - Splines       0.53                  0.68
Local linear regression - NPREG         0.53                  0.68
Support Vector Regression               0.51                  0.70
Neural Network (SCG)                    0.41                  0.76
Neural Network (Backpropagation)        0.42                  0.76
Extreme Learning Machine (linear)       0.54                  0.68
Extreme Learning Machine (hardlim)      0.48                  0.72
Extreme Learning Machine (sigmoid)      0.08                  0.96

Figure 5.28: Accuracy comparison between deployed algorithms for well-being baseline prediction
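
The two values reported per model in Figure 5.28 are related: if the dependent variable is standardized to unit variance, as the interpretation of the RMSE as a share of the standard deviation in chapter 6 implies, the explained variance is approximately 1 − RMSE². The Python sketch below checks this relation against the values read from the figure; the function and variable names are chosen for this illustration only.

```python
def explained_variance(rmse):
    """Explained variance for a unit-variance dependent variable: 1 - RMSE^2."""
    return 1.0 - rmse ** 2


# (explained variance, RMSE) pairs as listed in Figure 5.28.
figure_values = {
    "GLM": (0.54, 0.68),
    "K-nearest neighbor": (0.37, 0.79),
    "Local linear regression - LOESS": (0.53, 0.68),
    "Support Vector Regression": (0.51, 0.70),
    "Neural Network (SCG)": (0.41, 0.76),
    "Neural Network (Backpropagation)": (0.42, 0.76),
    "Extreme Learning Machine (hardlim)": (0.48, 0.72),
    "Extreme Learning Machine (sigmoid)": (0.08, 0.96),
}

# Largest gap between the reported explained variance and 1 - RMSE^2.
max_gap = max(abs(ev - explained_variance(rmse)) for ev, rmse in figure_values.values())
```

The reported pairs agree with the relation up to the rounding of the RMSE to two decimals, which also explains why models with the same rounded RMSE can show slightly different explained variances.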

best performing parameter sets identified before (see chapter 5.4.1): three hidden layers with 100 nodes each and four layers with 40 nodes each. The results indicate that further increases of the sample size do not promise large accuracy improvements (see Figure 5.29). The RMSE curve already flattens for training sets larger than 80% of the available data (362 points).

As further evidence, the same analysis has been conducted with the npreg algorithm (see chapter 5.3.2.3). However, due to computational costs, not the full 13-variable predictor set but only the seven most important predictors[9] have been fitted. The results in Figure 5.30 support the implications previously mentioned. An extension of the dataset does not automatically lead to higher prediction accuracy. On the contrary, the npreg algorithm already almost reaches the maximum accuracy achieved in this study with 60% of the training data.

[9] Big five traits, health and maximizer vs. satisficer measure.
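
The subset analysis described above can be sketched in Python as a loop over subset fractions with a cross-validated RMSE per subset. This is a sketch under loud assumptions: the model is a one-predictor least-squares line used as a stand-in for the neural network and npreg models, the data are synthetic, and all function names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor (stand-in model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def cv_rmse(xs, ys, folds=5):
    """k-fold cross-validated RMSE; fold f holds out points f, f + folds, ..."""
    n = len(xs)
    squared_error = 0.0
    for f in range(folds):
        test_idx = range(f, n, folds)
        held_out = set(test_idx)
        train = [i for i in range(n) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_error += sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test_idx)
    return math.sqrt(squared_error / n)


def learning_curve(xs, ys, fractions=(0.5, 0.6, 0.7, 0.8, 0.9, 1.0), seed=0):
    """Cross-validated RMSE on random subsets containing the given fractions of the data."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    curve = {}
    for frac in fractions:
        keep = order[: int(round(frac * len(xs)))]
        curve[frac] = cv_rmse([xs[i] for i in keep], [ys[i] for i in keep])
    return curve


# Synthetic data (hypothetical): 362 points, one predictor, noisy linear response.
rng = random.Random(1)
xs_demo = [rng.gauss(0.0, 1.0) for _ in range(362)]
ys_demo = [-0.6 * x + rng.gauss(0.0, 0.8) for x in xs_demo]
curve_demo = learning_curve(xs_demo, ys_demo)
```

A curve that stays roughly flat across the fractions indicates, as in Figures 5.29 and 5.30, that additional observations of the same kind would add little accuracy for the model in question.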


[Plot not reproduced. RMSE (repeated cross-validation) against the percentage of used data (0.2 to 1.0), with one curve each for 3 and 4 hidden layers.]

Figure 5.29: RMSE accuracy gains with increased number of training points for neural network

[Plot not reproduced. RMSE (repeated cross-validation) against the percentage of used data (0.2 to 1.0).]

Figure 5.30: RMSE accuracy gains with increased number of training points for npreg

6. Evaluation

This study's rationale is the assumption that machine-learning techniques may contribute to the understanding and prediction of subjective well-being. As specified in the results chapter, the algorithms applied do not provide higher prediction accuracies than the general linear model for the available dataset. However, the obtained results allow for a detailed analysis of the proposed hypotheses and deepen the understanding of well-being's internal structure and dependencies.

6.1 Hypotheses

6.1.1 Existence of Well-Being Baseline (Hypothesis 1)

According to the first hypothesis, it is assumed that the available dataset would underpin the well-being baseline theory (cf. Headey & Wearing, 1991). As described, the measured standard deviation within each participant's well-being trajectory (four weeks) was 2.5 times smaller than the standard deviation between participants' well-being averages. Moreover, the averaged well-being value per participant explained 84% of the variance within the weekly individual well-being values on average (see linear regressions). Those high percentages of explained variance indicate that well-being is fairly stable over the analyzed period within the dataset. However, this analysis is obviously limited to the study period of four weeks, so that it has to be questioned whether stable subjective well-being would be confirmed by a longer study period and a higher frequency of well-being measurement. Considering this limitation, the proposed well-being baseline has been found and the theory by Headey and Wearing (1991) is supported. The first hypothesis is therefore accepted.

6.1.2 Predictability of Well-Being Baseline (Hypothesis 2)

Hypothesis 2, addressing the predictability of the identified well-being baseline, is this study's main aspect. As outlined in the results chapter, several different machine-learning algorithms have been applied to test for non-linear linkages between demographics as well as personality and the well-being baseline. However, none of the algorithms achieved higher accuracy than the linear model when appropriately tested with sufficient cross-validation. Three possible causes would explain the obtained findings: (1) Firstly, the algorithms applied might not be able to fit the existing structure within the data sufficiently. (2) Secondly, the existing dataset is too small to distinguish between structural variance and noise, so that cross-validation prevents existing structures from being found. However, the accuracy analysis for smaller subsets does not indicate large accuracy gains from larger samples. (3) And thirdly, the linkages between personality as well as demographics and well-being are fairly linear and consequently well described by the generalized linear model (see chapter 5.2). Nevertheless, non-parametric regression approaches on subsets of the predictor space found non-linear structures within the data, for example for the maximizer-satisficer test. These might result from the reduction of independent variables, since the remaining variables found to be non-linear will embody additional variance previously explained by predictors which are then excluded from the model.

The best fitting models achieved a cross-validated accuracy of RMSE = 0.68, which corresponds to 68% of the dependent variable's standard deviation (46% of the variance). The model consequently explains 32% of the standard deviation (54% of the variance), which cannot be regarded as sufficient prediction. However, this finding is in line with the accuracy achieved in the feasibility study by Hall et al. (2013). For a full list of the accuracy results see Figure 5.28.

According to the algorithms performed, neuroticism is the predominant variable, followed by extraversion and conscientiousness, which is in accordance with the existing literature. As novel measures in well-being literature, the maximizer-satisficer scale and the participants' fairness perception have been tested for influences. The former is found to provide reasonable contribution to the well-being
baseline explanation, particularly if analyzed by non-parametric algorithms, since a local U-shaped curve has been found in some analyses. The latter did not explain additional variance and should consequently not be considered as relevant any further. The same applies to most of the demographic variables, except for gender and age. The participant's education, employment and location did not provide any added value, whereby it has to be noted that this study's sample is not sufficiently representative with regard to location. Generally, the predictors can be clustered by importance into three groups as outlined in Table 5.4.

6.1.3 Characterization of Well-Being Trajectory (Hypothesis 3 & 4)

While the existence of a well-being baseline per participant could at least partially be confirmed, each participant's well-being trajectory was not found to follow certain rules. However, the trajectory obviously floats around the mentioned baseline, as constructed using the average over the four weeks considered in this study. The third hypothesis can therefore be accepted, but does not provide large scientific value. A prediction of the trajectory, on the other hand, was not possible. All tested approaches to explain the variance of the trajectories' standard deviations and linear slopes between participants failed. The models explained less than 1% of the variance in the dataset. Thus, the fourth hypothesis could not be confirmed. The within-person well-being trajectory's standard deviation is not dependent on the considered personality factors and demographic variables.

Algorithms tested include a simple generalized linear model and the extreme learning machine. Both each participant's in-trajectory standard deviation and each participant's in-trajectory linear regression coefficient have been tested as dependent variables for explained variance by the 13 independent variables (psychometrics and demographics). No variance between participants could be explained.

6.2 Further Findings

Besides the proposed hypotheses, the applied machine-learning algorithms provided additional findings. To be mentioned in this regard is the negative correlation found between physical health and well-being, which conflicts with common well-being literature (cf. Diener et al., 1999; Jorm & Ryan, 2014; Lacey et al., 2008). First assumptions that the influence might result from the fact that more important variables, for example neuroticism[1] and conscientiousness, are controlled first could not be confirmed. Even with health as the only predictor, a negative influence was found.

[1] Neuroticism and health are relatively highly correlated (r = 0.33).

Moreover, the U-shaped dependency between age and well-being widely discussed in literature (cf. Blanchflower & Oswald, 2008; Clark & Oswald, 2006) was not observed. However, the slightly positive correlation found corresponds with the findings by Blanchflower and Oswald (2008) before accounting for different cohorts. Moreover, Blanchflower and Oswald (2008) controlled for several variables not present in this study, such as income and children. Distinct consideration of different cohorts has not been conducted in this study; instead, the U-shape would have become visible through the applied kernel smoothing algorithms if it were present in the data. However, this study demonstrates that age is generally by far less important than psychometric predictors.

The maximizer-satisficer scale has been found to be a reliable addition to the important big five personality traits (especially neuroticism, extraversion and conscientiousness). The predictor is only moderately correlated with the big five (r < 0.3 ∀ big five) and explains the same amount of variance as openness (∼10%). However, the identified U-shaped internal structure should be considered in further analysis, because fitting with linear models is not adequate to map those dependencies.

Furthermore, a significant influence of the gender variable was observed, indicating that male participants experience lower well-being values than female participants. This contradicts previous well-being studies (e.g. Diener et al., 1991, 1999) and was moreover not observed within the smaller 2013 dataset for the feasibility study by Hall et al. (2013). However, no conclusive findings about gender have been made to
date, so that further research is advised in order to identify the independent effect of gender on personal well-being.

6.3 Limitations

Empirical research is exposed to certain limitations. Most important in this regard is the selection of participants, which has been conducted via social networks and direct contact. This study's participants might therefore not necessarily represent the optimal sociological sample. However, an effort has been made to achieve a widespread sample, since the completeness of the data is more important than statistical representativeness in order to evaluate determinants and their importance for well-being prediction. Moreover, the underlying sample is a combination of two different sources using the same questionnaires and procedures on different participants with an interval of one year. Possible influences of this combination have not been found, but can also not be precluded generally.

The conducted measurement of well-being is furthermore limited, since this study is based on four single well-being measures over a period of four weeks only. Activities, events in the participants' lives and other short-term influences are not measured, such that those influences widely reported in literature (see e.g. Diener et al., 1999; Veenhoven, 1984) are not covered by this study (cf. also the Day Reconstruction Method, Kahneman et al., 2004). The well-being trajectory obtained from each participant is consequently a sequence of snapshots only and does not reflect the entire well-being curve. However, temporal analysis of the participants' responses did not result in any significant well-being differences over the hour of the reporting day, and the measurements have been conducted on Wednesdays, so that short-term variation is avoided. Moreover, the differences between in-trajectory and between-participant well-being variance allow for an identification of the well-being baseline. Nevertheless, four data points might not be enough to gather a well descriptive image of each participant's well-being trajectory.

The obtained well-being data is furthermore subjective, since the participants rate their own perception on provided Likert scales. Consequently, the well-being measured has no demand for objectivity and the results do not indicate whether a certain participant is actually well in different regards. Instead, the individual perception regarding the participant's own well-being is measured according to the state of research. Consequently, well-being is measured on an individual scale per participant in this study and is not intended to contribute to generalized national measures, such as a gross national well-being score.

The well-being prediction problems dealt with in the study are limited to demographic and psychometric predictors. These are the most often discussed and have been found to have the highest influence on subjective well-being (Diener et al., 1999). However, many other correlates of subjective well-being have been identified (see e.g. Veenhoven, 1984), but are not covered by this study. Considering that the applied machine learning algorithms explained only roughly half of the variance, it has to be assumed that other factors account for a significant proportion of the variance, too, especially if analyzed with non-parametric tools in combination with the important predictors identified by this study. Significant correlations identified in literature and not sufficiently considered by this study include for example religion, culture and marital status.

7. Implications and Further Research

Machine learning has been shown to provide reasonable insight into the structure and dependencies of well-being. Even though it was not possible to explain higher proportions of the variance between participants' well-being than with the general linear model, the findings on predictor importance and the applicability of several machine learning algorithms add to the scientific goal of a robust well-being definition. According to these findings, ordinary least squares regression provides the most appropriate prediction approximation based on psychometric and demographic variables.

However, since complex non-parametric regression algorithms based on psychometric and demographic variables do not achieve accuracies larger than 54% of explained variance, it can be concluded that other dependencies independent of the controlled variables must exist. Most important in this regard are probably the participant's life situation and activities, which are not covered adequately in this study. Moreover, it has to be questioned whether a four-week period is sufficient to obtain the mid-term well-being baseline assumed to be predominantly dependent on psychometrics and demographics. Further development of mobile applications and social networks might add to future well-being data collection on a larger scale, especially for long-term studies with high-frequency data retrieval. The human flourishing index seems to be a valuable tool in this regard.

This study's initial question about the reasons why some people judge the glass half full and others half empty can consequently not be answered in a more profound way than already described in literature. Further research on well-being prediction is advised to broaden the predictor space, including for example participants' important life events and a more general perspective on the social background. It has to be questioned whether this information is accessible via anonymous online questionnaires. However, the independence found between HFI values and the hour of the day supports the use of online surveys. Obtained results are comparable, even if the participants choose the hour of the day for completing the questionnaire themselves.

Concerning the predictors' importance relevant for a well-being definition, further attention to the maximizer-satisficer scale is recommended. This predictor has been found to explain a significant amount of variance, which is even more relevant when analyzed locally. A U-shaped structure is observed, indicating that satisficers as well as extreme maximizers benefit from higher well-being than the average participant. Generally, the predominant importance of neuroticism is confirmed, followed by extraversion and conscientiousness.

The applied non-parametric machine learning algorithms significantly improved the developed picture of the well-being dependencies' internal structures. Today, most analyses of social problems do not challenge significances found by variance analysis and linear regression for underlying non-parametric structures, although those would probably add value to the ongoing scientific discussion. The applied methods, even if developed for big data assessment, have been shown to reveal interesting and new facets of this study's well-being prediction problem on comparably small datasets. The topic of 'small data' analysis, including small samples with high dimensionality, recently evolved from the increased availability of individual, personal data gained for example from smart phones and social media activity. Social data availability will simplify the understanding of dependencies and underlying structures, but it will also demand easy-to-use, well-interpretable, but nevertheless powerful analysis procedures. It is consequently proposed that non-parametric tools and feature selection methods should be further developed and more often be utilized in order to question popular, but simple regression results.

References
Aitchison, J., & Aitken, C. G. G. (1976). Multivariate Binary Discrimination by the Kernel Method. Biometrika, 63(3), 413–420.
Anthony, M., & Bartlett, P. L. (2009). Neural network learning: Theoretical foundations. Cambridge, England: Cambridge University Press.
Aristotle. (2002). Nicomachean Ethics (S. Broadie & C. Rowe, Eds.). Oxford, England: Oxford University Press.
Arlot, S., & Celisse, A. (2010). A Survey of Cross-Validation Procedures for Model Selection. Statistics Surveys, 4, 40–79.
Basak, D., Pal, S., & Patranabis, D. C. (2007). Support Vector Regression. Neural Information Processing – Letters and Reviews, 11(10), 203–224.
Beaujean, A. A. (2013). Factor Analysis using R. Practical Assessment, Research & Evaluation, 18(4), 1–11.
Belsley, D. A. (1991). A Guide to Using the Collinearity Diagnostics. Computer Science in Economics and Management, 4, 33–50.
Bentler, P. M. (2007). On tests and indices for evaluating structural models. Personality and Individual Differences, 42(5), 825–829.
Bergmeir, C., & Benitez, J. M. (2012). Neural Networks in R using the Stuttgart Neural Network Simulator: RSNNS. Journal of Statistical Software, 46(7), 1–26.
Blanchflower, D. G., & Oswald, A. J. (2004). Well-Being over Time in Britain and the USA. Journal of Public Economics, 88, 1359–1386.
Blanchflower, D. G., & Oswald, A. J. (2008). Is well-being U-shaped over the life cycle? Social Science and Medicine, 66(8), 1733–49.
Bouckaert, R., & Frank, E. (2004). Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. In H. Dai, R. Srikant, & C. Zhang (Eds.), Advances in knowledge discovery and data mining (pp. 3–12). Berlin, Heidelberg, Germany: Springer.
Braga-Neto, U. M., & Dougherty, E. R. (2004). Is Cross-Validation Valid for Small-Sample Microarray Classification? Bioinformatics, 20(3), 374–80.
Brickman, P., & Campbell, D. T. (1971). Hedonic Relativism and Planning the Good Society. In M. H. Appley (Ed.), Adaptation-level theory (pp. 287–305). New York: Academic Press.
Brunstein, J. C., Schultheiss, O. C., & Grässmann, R. (1998). Personal Goals and Emotional Well-Being: The Moderating Role of Motive Dispositions. Journal of Personality and Social Psychology, 75(2), 494–508.
Burman, P. (1989). A Comparative Study of Ordinary Cross-Validation, v-Fold Cross-Validation and the Repeated Learning-Testing Methods. Biometrika, 76(3), 503–514.
Bergmeir, C., & Benitez, J. M. (2013). Pack- Campbell, A., Converse, P. E., & Rodgers, W. L.
age ‘RSNNS’ (R package manual Ver- (1976). The Quality of American Life: Per-
sion 0.4-4). R-project.org. Retrieved ceptions, Evaluations, and Satisfactions. New
from http://cran.r-project.org/web/ York, NY: Russell Sage Foundation.
packages/RSNNS/index.html Cattell, R. B. (1947). Conformation and Clarification
Berrueta, L. A., Alonso-Salces, R. M., & Héberger, of primary Personality Factors. Psychometrika,
K. (2007). Supervised pattern recognition in 12 (3), 197–220.
food analysis. Journal of chromatography A, Cawley, G. C., & Talbot, N. L. C. (2010). On Over-
1158 , 196–214. fitting inModel Selection and Subsequent Se-
Blanchflower, D. G. (2001). Unemployment, Well- lection Bias in Performance Evaluation. The
Being, and Wage Curves in Eastern and Cen- Journal of Machine Learning Research, 11 ,
tral Europe. Journal of the Japanese and In- 2079–2107.
ternational Economies, 15 (4), 364–402. Chittaranjan, G., Blom, J., & Gatica-Perez, D.

59
60 References

(2011). Who’s Who with Big-Five: Analyzing de Boor, C. (2001). A Practical Guide to Splines
and Classifying Personality Traits with Smart- (Revised ed.). New York, NY: Springer.
phones. In 15th Annual International Sym- DeNeve, K. M., & Cooper, H. (1998). The Happy
posium on Wearable Computers (pp. 29–36). Personality: A Meta-Analysis of 137 Personal-
IEEE. ity Traits and Subjective Well-Being. Psycho-
Clark, A. E. (2003). Unemployment as a So- logical Bulletin, 124 (2), 197–229.
cial Norm: Psychological Evidence from Panel Deutskens, E., de Ruyter, K., Wetzels, M., & Oost-
Data. Journal of Labor Economics, 21 (2), 323– erveld, P. (2004). Response Rate and Response
351. Quality of Internet-Based Surveys: An Exper-
Clark, A. E., Frijters, P., & Shields, M. A. (2008). imental Study. Marketing Letters, 15 (1), 21–
Relative Income, Happiness, and Utility: An 36.
Explanation for the Easterlin Paradox and Diener, E. (1984). Subjective Well-Being. Psycho-
Other Puzzles. Journal of Economic Litera- logical Bulletin, 95 (3), 542–575.
ture, 46 (1), 95–144. Diener, E. (1994). Assessing Subjective Well-Being:
Clark, A. E., & Oswald, A. J. (2006). Progress and Opportunities. Social Indicators
The curved relationship between sub- Research, 31 , 103–157.
jective well-being and age (Tech. Rep. Diener, E. (2000). Subjective Well-Being: The Sci-
No. 26). Paris: PSE. Retrieved from ence of Happiness and a Proposal for a National
http://halshs.archives-ouvertes.fr/ Index. American Psychologist, 55 (1), 34–43.
halshs-00590404/ Diener, E. (2013). The Remarkable Changes in the
Cleveland, W. S. (1979). Robust Locally and Science of Subjective Well-Being. Perspectives
Smoothing Weighted Regression Scatterplots. on Psychological Science, 8 (6), 663–666.
Journal of the American Statistical Associa- Diener, E., & Chan, M. Y. (2011). Happy Peo-
tion, 74 (368), 829–836. ple Live Longer: Subjective Well-Being Con-
Cleveland, W. S., & Devlin, S. J. (1988). Lo- tributes to Health and Longevity. Applied Psy-
cally Weighted Regression: An Approach to chology: Health and Well-Being, 3 (1), 1–43.
Regression Analysis by Local Fifing. Journal of Diener, E., Emmons, R., Larsen, R., & Griffin, S.
the American Statistical Association, 83 (403), (1985). Satisfaction with Life Scale. Journal of
596–610. Personality Assessment, 49 (1), 71–75.
Collins, J. M., & Clark, M. R. (1993). An Appli- Diener, E., & Lucas, R. E. (1999). Value as a Moder-
cation of the Theory of Neural Computation ator in Subjective Well-being. Journal of Per-
To the Prediction of Workplace Behavior: An sonality, 67 (1), 158–184.
Illustration and Assessment of Network Anal- Diener, E., Sandvik, E., & Pavot, W. (1991). Happi-
ysis. Personnel Psychology, 46 (3), 503–524. ness is the Frequency, not the Intensity of Pos-
Cortes, C., & Vapnik, V. (1995). Support-Vector itive Versus Negative Affect. In N. Schwarz,
Networks. Machine Leaming, 20 , 273–297. F. Strack, & M. Argyle (Eds.), Subjective well-
Cummins, R. (2009). Subjective Wellbeing, Home- being: An interdisciplinary perspective (pp.
ostatically Protected Mood and Depression: 119–139). Oxford, England: Pergamon Press.
A Synthesis. Journal of Happiness Studies, Diener, E., Sandvik, E., Seidlitz, L., & Diener, M.
11 (1), 1–17. (1993). The Relationship between Income and
Davies, J. C. (1962). Towards a Theory of Revo- Subjective Well-being: Relative or Absolute?
lution. American Sociological Review , 27 (1), Social Indicators Research, 28 , 195–223.
5–19. Diener, E., & Seligman, M. (2004). Beyond Money:
Deaton, A. (2007). Income, Aging, Health and Well- Toward an Economy of Well-Being. Psycholog-
being around the World: Evidence from the ical Science in the Public Interest, 5 (1), 1–31.
Gallup World Poll (Tech. Rep.). Cambridge, Diener, E., & Suh, E. (1997). Measuring Quality of
MA: NBER. Life: Economic, Social, and Subjective Indica-

60
References 61

tors. Social Indicators Research, 40 , 189–216. tional approach to the study of personality and
Diener, E., Suh, E., Lucas, R. E., & Smith, H. L. emotion. Journal of Personality, 54 , 371–384.
(1999). Subjective Well-Being: Three Decades Enquete-Kommission. (2013). Schlussbericht:
of Progress. Psychological Bulletin, 125 (2), Wachstum, Wohlstand, Lebensqualität –
276–302. Wege zu nachhaltigem Wirtschaften und
Diener, E., Suh, E., Smith, H. L., & Shao, L. (1995). gesellschaftlichem Fortschritt in der Sozialen
National Differences in Reported Subjective Marktwirtschaft (Report, Germany). Berlin:
Well-Being: Why Do They Occur? Social In- Deutscher Bundestag.
dicators Research, 34 (1), 7–32. Frey, B. S., & Stutzer, A. (2002). Happiness and
Diener, E., & Tay, L. (2013). A Scientific Review of Economics: How the Economy and Institutions
the Remarkable Benefits of Happiness for Suc- Affect Human Well-Being. In Contemporary
cessful and Healthy Living (Report of the Well- sociology. Princeton, NJ: Princeton University
Being Working Group). Royal Government of Press.
Bhutan. Friedman, J. H. (2006). Recent Advances in Predic-
Dodge, R., Daly, A., Huyton, J., & Sanders, L. tive (Machine) Learning. Journal of Classifica-
(2012). The Challenge of Defining Wellbeing. tion, 23 (2), 175–197.
International Journal of Well-being, 2 (3), 222– Fujita, F., & Diener, E. (2005). Life Satisfaction
235. Set Point: Stability and Change. Journal of
Dolan, P., Peasgood, T., & White, M. (2008). Do Personality and Social Psychology, 88 (1), 158–
we really know what makes us happy? A re- 64.
view of the economic literature on the factors Fujita, F., Diener, E., & Sandvik, E. (1991). Gender
associated with subjective well-being. Journal Differences in Negative Affect and Well-Being:
of Economic Psychology, 29 , 94–122. The Case for Emotional Intensity. Journal of
Easterlin, R. (1974). Does Economic Growth Im- Personality and Social Psychology, 61 (3), 427–
prove the Human Lot? Some Empirical Evi- 434.
dence. In P. A. David & M. W. Reder (Eds.), Furse, D. H., & Stewart, D. W. (1982). Mone-
Nations and households in economic growth: tary Incentives Versus Promised Contribution
Essays in honor of moses abramovitz (pp. 89– to Charity: New Evidence on Mail Survey Re-
125). New York, NY: Academic Press. sponse. Journal of Marketing Research, 19 (3),
Easterlin, R. (1995). Will Raising the Incomes of 375–380.
All Increase the Happiness of All? Journal of Goldberg, L. R. (1993). The Structure of Pheno-
Economic Behavior and Organization, 27 , 35– typic Personality Traits. American Psycholo-
47. gist, 48 (1), 26–34.
Efron, B. (1979). Bootstrap Methods: Another Gosso, A. (2013). Package ‘elmNN’ (R pack-
Look at the Jackknife. The Annals of Statis- age manual Version 1.0). R-project.org. Re-
tics, 7 (1), 1–26. trieved from http://cran.r-project.org/
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. web/packages/elmNN/elmNN.pdf
(2004). Least Angle Regression. The Annals of Hainmueller, J., & Hazlett, C. (2013). Kernel Reg-
Statistics, 32 (2), 407–451. ularized Least Squares: Reducing Misspecifi-
Efron, B., & Tibshirani, R. (1997). Improvements cation Bias with a Flexible and Interpretable
on Cross-Validation: The .632 + Bootstrap Machine Learning Approach. Political Analy-
Method. Journal of the American Statistical sis, 2013 , 1–26.
Association, 92 (438), 548–560. Hall, M., Caton, S., & Weinhardt, C. (2013). Well-
Ellison, C. G. (1991). Religious Involvement and being’s Predictive Value. In A. A. Ozok &
Subjective Well-Being. Journal of Health and P. Zaphiris (Eds.), Proceedings of the 15th In-
Social Behavior , 32 (1), 80–99. ternational Conference on Human-Computer
Emmons, R. A., & Diener, E. (1986). An interac- Interaction (HCII) (pp. 13–22). Berlin, Ger-

61
62 References

many: Springer. Headey, B. W., & Wearing, A. J. (1991). Subjective


Hall, M., Kimbrough, S. O., Haas, C., Weinhardt, Well-Being: A Stocks and Flows Framework.
C., & Caton, S. (2012). Towards the Gamifi- In N. Schwarz, F. Strack, & M. Argyle (Eds.),
cation of Well-Being Measures. In 2nd Work- Subjective wellbeing – an interdisciplinary per-
shop on Analyzing and Improving Collabora- spective (pp. 49–73). Oxford, England: Perga-
tive eScience with Social Networks (eSon), Pro- mon Press.
ceedings of the 8th IEEE International Con- Hechenbichler, K., & Schliep, K. (2004). Weighted k-
ference on eScience (eScience 2012) (pp. 1–8). Nearest-Neighbor Techniques and Ordinal Clas-
Chicago, IL: IEEE. sification (Discussion Paper). Munich, Ger-
Harter, J. K., Schmidt, F. L., & Keyes, C. L. M. many: LMU.
(2003). Well-being in the Workplace and its Heckerman, D. (1996). A Tutorial on Learning With
Relationship to Business Outcomes: A Review Bayesian Networks (Tech. Rep.). Redmond,
of the Gallup Studies. In C. L. M. Keyes & WA: Microsoft Research.
J. Haidt (Eds.), Flourishing: The positive per- Hofmann, T., Schölkopf, B., & Smola, A. J. (2008).
son and the good life (pp. 205–224). Wash- Kernel Methods in Machine Learning. The An-
ington D.C.: American Psychological Associa- nals of Statistics, 36 (3), 1171–1220.
tion. Huang, G.-B., Chen, L., & Siew, C.-K. (2006). Uni-
Haslam, N., Whelan, J., & Bastian, B. (2009). Big versal approximation using incremental con-
Five traits mediate associations between val- structive feedforward networks with random
ues and subjective well-being. Personality and hidden nodes. IEEE Transactions on Neural
Individual Differences, 46 (1), 40–42. Networks, 17 (4), 879–92.
Hastie, T. (2013). Package ‘gam’ (R pack- Hubbard, R., & Little, E. L. (1988). Promised
age manual Version 1.09). R-project.org. Contribution to Charity and Mail Survey Re-
Retrieved from http://cran.r-project.org/ sponses - Replication with Extension. Public
web/packages/gam/gam.pdf Opinion Quarterly, 52 , 223–230.
Hastie, T., & Efron, B. (2013). Package ‘lars’ (R Huppert, F., & So, T. T. C. (2011). Flourishing
package manual Version 1.2). R-project.org. Across Europe: Application of a New Concep-
Retrieved from http://cran.r-project.org/ tual Framework for Defining Well-Being. Social
web/packages/lars/lars.pdf Indicators Research, 110 (3), 837–861.
Hastie, T., & Tibshirani, R. (1986). Generalized Ad- Hurvich, C. M., Simonoff, J. S., & Tsai, C.-l. (1998).
ditive Models. Statistical Science, 1 (3), 297– Smoothing parameter selection in nonparamet-
318. ric regression using an improved Akaike infor-
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). mation criterion. Journal of the Royal Statis-
The Elements of Statistical Learning (2nd ed.). tical Society. Series B (Methodological), 60 (2),
New York, NY: Springer. 271–293.
Hayfield, T., & Racine, J. S. (2007). The John, O. P., Donahue, E. M., & Kentle, R. L. (1991).
np Package. RNews, 27 (5), 1-32. Re- The Big Five Inventory – Versions 4a and 54.
trieved from http://cran.r-project.org/ (Questionnaire). Berkeley, CA: University of
web/packages/np/vignettes/np.pdf California, Institute of Personality and Social
Hayfield, T., & Racine, J. S. (2013). Package ‘np’ (R Research.
package manual Version 0.50-1). R-project.org. John, O. P., Naumann, L. P., & Soto, C. J. (2008).
Retrieved from http://cran.r-project.org/ Paradigm Shift to the Integrative Big Five
web/packages/np/np.pdf Trait Taxonomy. In O. P. John, R. W. Robins,
Headey, B. W., Veenhoven, R., & Wearing, A. J. & L. A. Pervin (Eds.), Handbook of personality:
(1991). Top Down Versus Bottom Up Theo- Theory and research (3rd ed., pp. 114–158).
ries of Subjective Well-being. Social Indicators New York, NY: Guilford Press.
Research, 24 , 81–100. John, O. P., & Srivastava, S. (1999). The Big

62
References 63

Five Trait Taxonomy: History, Measurement, Social Indicators Research, 17 (1), 1–17.
and Theoretical Perspectives. In O. P. John, Li, Q., & Racine, J. S. (2004). Cross-Validation
R. W. Robins, & L. A. Pervin (Eds.), Hand- Local Linear Nonparametric Regression. Sta-
book of personality: Theory and research (pp. tistica Sinica, 14 , 485–512.
102–138). New York, NY: Guilford Press. Likert, R. (1974). The Method of Constructing an
Jorm, A. F., & Ryan, S. M. (2014). Cross-National Attitute Scale. In G. M. Maranell (Ed.), Scal-
and Historical Differences in Subjective Well- ing: A sourcebook for behavioral scientists (pp.
Being. International Journal of Epidemiology, 233–243). Chicago, IL: Aldine.
43 (1), 1–11. Lucas, R. E., Clark, A. E., Georgellis, Y., & Diener,
Kahneman, D., & Krueger, A. B. (2006). De- E. (2004). Unemployment Alters the Set-Point
velopments in the Measurement of Subjective for Life Satisfaction Andrew. Psychological Sci-
Well-Being. Journal of Economic Perspectives, ence, 15 (1), 8–13.
20 (1), 3–24. Lucas, R. E., Diener, E., & Suh, E. (1996). Discrimi-
Kahneman, D., Krueger, A. B., Schkade, D. A., nant Validity of Well-Being Measures. Journal
Schwarz, N., & Stone, A. A. (2004). A Survey of Personality and Social Psychology, 71 (3),
Method for Characterizing Daily Life Experi- 616–628.
ence: The Day Reconstruction Method. Sci- Luttmer, E. F. P. (2005). Neighbors as Negatives:
ence, 306 (5702), 1776–80. Relative Earnings and Well-Being. The Quar-
Kohavi, R. (1995). A study of Cross-Validation and terly Journal of Economics, 120 (3), 963–1002.
Bootstrap for Accuracy Estimation and Model Lyubomirsky, S., King, L., & Diener, E. (2005). The
Selection. In Proceedings of the 14th interna- Benefits of Frequent Positive Affect: Does Hap-
tional joint conference on artificial intelligence piness Lead to Success? Psychological Bulletin,
- volume 2 (pp. 1137–1145). San Francisco, 131 (6), 803–55.
CA: Morgan Kaufmann Publishers Inc. Marquardt, D. W. (1963). An Algorithm for Least-
Kuhn, M. (2008). Building Predictive Models in R Squares Estimation of Nonlinear Parameters.
Using the caret Package. Journal of Statistical Journal of the Society for Industrial and Ap-
Software, 28 (5), 1–26. plied Mathematics, 11 (2), 431–441.
Kuhn, M. (2014). A Short Introduction Martı́nez, L. G., Rodrı́guez-dı́az, A., Licea, G.,
to the caret Package (R package in- & Castro, J. R. (2010). Big Five Pat-
troduction). R-project.org. Retrieved terns for Software Engineering Roles Using
from http://cran.r-project.org/web/ an ANFIS Learning Approach with RAM-
packages/caret/vignettes/caret.pdf SET. In G. Idorov, A. Hernández Aguirre, &
Kuhn, M., & Johnson, K. (2013). Applied Predictive C. A. Reyes Garcia (Eds.), Advances in Soft
Modeling. New York, NY: Springer. Computing (pp. 428–439). Berlin, Heidelberg,
Kuhn, M., Wing, J., Weston, S., Williams, A., Germany: Springer.
Keefer, C., Engelhardt, A., . . . Mayer, Z. Mason, C. H., & Perreault Jr., W. D. (1991).
(2014). Package ‘caret’ (R package man- Collinearity, Power, and Interpretation of Mul-
ual Version 6.0-22). R-project.org. Re- tiple Regression Analysis. Journal of Marketing
trieved from http://cran.r-project.org/ Research, 28 (3), 268–280.
web/packages/caret/caret.pdf McCrae, R. R., & Costa Jr., P. T. (1985). Updat-
Lacey, H. P., Fagerlin, A., Loewenstein, G., Smith, ing Norman’s ”Adequate Taxonomy”: Intelli-
D. M., Riis, J., & Ubel, P. a. (2008). Are they gence and Personality Dimensions in Natural
really that happy? Exploring scale recalibra- Language and in Questionnaires. Journal of
tion in estimates of well-being. Health psychol- Personality and Social Psychology, 49 (3), 710–
ogy, 27 (6), 669–75. 721.
Larsen, R., Diener, E., & Emmons, R. (1985). An McKee-Ryan, F., Song, Z., Wanberg, C. R., &
evaluation of subjective well-being measures. Kinicki, A. J. (2005). Psychological and

63
64 References

Physical Well-Being During Unemployment: A Research in Personality, 41 (3), 700–706.


Meta-Analytic Study. The Journal of Applied Racine, J. S., & Li, Q. (2004). Nonparametric esti-
Psychology, 90 (1), 53–76. mation of regression functions with both cate-
Minbashian, A., Bright, J. E. H., & Bird, K. D. gorical and continuous data. Journal of Econo-
(2009). A Comparison of Artificial Neural Net- metrics, 119 (1), 99–130.
works and Multiple Regression in the Con- Rajesh, R., & Prakash, J. S. (2011). Extreme Learn-
text of Research on Personality and Work Per- ing Machines - A Review and State-of-the-art.
formance. Organizational Research Methods, International Journal of Wisdom Based Com-
13 (3), 540–561. puting, 1 (1), 35–49.
Møller, M. F. (1993). A Scaled Conjugate Gradient Raykov, T. (1998). On the Use of Confirmatory Fac-
Algorithm for Fast Supervised Learning. Neu- tor Analysis in Personality Research. Personal-
ral Networks, 6 (4), 525–533. ity and Individual Differences, 24 (2), 291–293.
Nadaraya, E. A. (1964). On Estimating Regres- Read, S. J., Monroe, B. M., Brownstein, A. L., Yang,
sion. Theory of Probability and its Applica- Y., Chopra, G., & Miller, L. C. (2010). A
tions, 9 (1), 141–142. Neural Network Model of the Structure and
Nelder, J. A., & Wedderburn, R. W. M. (1972). Gen- Dynamics of Human Personality. Psychologi-
eralized Linear Models. Journal of the Royal cal Review , 117 (1), 61–92.
Statistical Society. Series A (General), 135 (3), Revelle, W. (2014). Package ‘psych’ (R pack-
370–384. age manual Version 1.4.3). R-project.org.
Nenkov, G. Y., Morrin, M., Ward, A., Hulland, J., Retrieved from http://cran.r-project.org/
& Schwartz, B. (2008). A short form of the web/packages/psych/psych.pdf
Maximization Scale: Factor structure, reliabil- Robertson, D. H., & Bellenger, D. N. (1978). A New
ity and validity studies. Judgment and Decision Method of Increasing Mail Survey Responses:
Making, 3 (5), 371–388. Contributions to Charity. Journal of Market-
Ng, Y.-K. (1997). A case for happiness, cardinal- ing, 15 , 632–633.
ism, and interpersonal comparability. The Eco- Rojas, R. (1996). The Backpropagation Algorithm.
nomic Journal , 197 (445), 1848-1858. In Neural networks (pp. 152–184).
Nilsson, N. J. (2005). Introduction to Ma- Rosseel, Y., Oberski, D., Byrnes, J., Vanbrabant,
chine Learning (Unpublished Textbook). Stan- L., Savalei, V., Merkle, E., . . . Barendse,
ford, CA: Robotics Laboratory, Department M. (2014). Package ‘lavaan’ (R pack-
of Computer Science, Stanford University. age manual Version 0.5-16). R-project.org.
Retrieved from http://robotics.stanford Retrieved from http://cran.r-project.org/
.edu/~nilsson/MLBOOK.pdf web/packages/lavaan/lavaan.pdf
Norman, W. T. (1963). Toward an adequate taxon- Rumelhart, D. E., Hinton, G. E., & Williams, R. J.
omy of personality attributes: Replicated fac- (1986). Learning Representations by Back-
tor structure in peer nomination personality Propagating Errors. Nature, 323 (9), 533–536.
ratings. The Journal of Abnormal and Social Ryan, R. M., & Deci, E. L. (2001). On Happiness
Psychology, 66 (6), 574–583. and Human Potential: A Review of Research
Oswald, A. J. (1997). Happiness and Economic Per- on Hedonic and and Eudaimonic Well-Being.
formance. The Economic Journal , 107 (445), Annual Review of Psychology, 52 , 141–166.
1815–1831. Ryff, C. D. (1989). Happiness Is Everything, or Is It?
Page, K. M., & Vella-Brodrick, D. a. (2008). The Explorations on the Meaning of Psychological
‘What’, ‘Why’ and ‘How’ of Employee Well- Well-Being. Journal of Personality and Social
Being: A New Model. Social Indicators Re- Psychology, 57 (6), 1069–1081.
search, 90 (3), 441–458. Ryff, C. D., & Keyes, C. L. M. (1995). The Structure
Quek, M., & Moskowitz, D. S. (2007). Testing Neu- of Psychological Well-Being revisited. Journal
ral Network Models of Personality. Journal of of Personality and Social Psychology, 69 (4),

64
References 65

719–27. Stevenson, B., Becker, G., Blanchflower, D. G.,


Samuel, R., Bergman, M. M., & Hupka-Brunner, Deaton, A., Easterlin, R., Graham, C., . . .
S. (2013). The Interplay between Educational Rayo, L. (2008). Economic Growth and Sub-
Achievement, Occupational Success, and Well- jective Well-being: Reassessing the Easerlin
Being. Social Indicators Research, 111 (1), 75– Paradox (Working Paper No. 14282). Cam-
96. bridge, MA: NBER. Retrieved from http://
Schliep, K., & Hechenbichler, K. (2014). www.nber.org/papers/w14282
Package ‘kknn’ (R package manual Ver- Stevenson, B., & Wolfers, J. (2013). Subjective
sion 1.2-5). R-project.org. Retrieved Well-Being and Income: Is There Any Evi-
from http://cran.r-project.org/web/ dence of Satiation? American Economic Re-
packages/kknn/kknn.pdf view , 103 (3), 598–604.
Schmitt, M., & Dörfel, M. (1999). Procedural injus- Stiglitz, J., Sen, A., & Fitoussi, J.-P. (2009).
tice at work, justice sensitivity, job satisfaction Report by the commission on the mea-
and psychosomatic well-being. European Jour- surement of economic performance
nal of Social Psychology, 29 , 443–453. and social progress (Report). Cam-
Schmitt, M., Gollwitzer, M., Maes, J., & Arbach, D. bridge, MA: CMEPSP. Retrieved from
(2005). Justice Sensitivity. European Journal http://www.stiglitz-sen-fitoussi.fr/
of Psychological Assessment, 21 (3), 202–211. documents/rapport_anglais.pdf
Schwartz, B., Ward, A., Monterosso, J., Stone, A. a., Schwartz, J. E., Broderick, J. E., &
Lyubomirsky, S., White, K., & Lehman, Deaton, A. (2010). A snapshot of the age distri-
D. R. (2002). Maximizing Versus Satisficing: bution of psychological well-being in the United
Happiness Is a Matter of Choice. Journal States. Proceedings of the National Academy
of Personality and Social Psychology, 83 (5), of Sciences of the United States of America,
1178–1197. 107 (22), 9985–90.
Scitovsky, T. (1976). The joyless economy: An Suh, E., Diener, E., & Fujita, F. (1996). Events and
inquiry into human satisfaction and consumer Subjective Well-Being: Only Recent Events
dissatisfaction (17th ed.). New York, NY: Ox- Matter. Journal of Personality and Social Psy-
ford University Press. chology, 70 (5), 1091–102.
Sheldon, K. M., & Hoon, T. H. (2006). The mul- Telfer, E. (1980). Happiness. New York, NY: St.
tiple determination of well-being: Independent Martin’s Press.
effects of positive traits, needs, goals, selves, Thinley, J. (2011). Gross National Happiness: A
social supports, and cultural contexts. Journal Holistic Paradigm for Sustainable Development
of Happiness Studies, 8 (4), 565–592. (Speech). New Delhi, India: Indian Parlament.
Shields, M. A., & Price, S. W. (2005). Exploring the Tibshirani, R. (1996). Regression Shrikage and Selec-
Economic and Social Determinants of Psycho- tion via the Lasso. Journal of the Royal Statis-
logical Well-Being and Perceived Social Sup- tical Society. Series B (Methodological), 58 (1),
port in England. Journal of the Royal Statis- 267–288.
tical Society. Series A (Statistics in Society), Titterington, D. M. (1980). A Comparative Study of
168 (3), 513–537. Kernel-Based Density Estimates for Categori-
Şimşek, O. F., & Koydemir, S. (2012). Linking Meta- cal Data. Technometrics, 22 (2), 259–268.
traits of the Big Five to Well-Being and Ill- Vapnik, V., Golowich, S. E., & Smola, A. (1997).
Being: Do Basic Psychological Needs Matter? Support Vector Method for Function Approx-
Social Indicators Research, 112 (1), 221–238. imation, Regression Estimation, and Signal
Steel, P., Schmidt, J., & Shultz, J. (2008). Refin- Processing. In M. C. Mozer, M. I. Jordan,
ing the Relationship Between Personality and & T. Petsche (Eds.), Advances in Neural In-
Subjective Well-Being. Psychological Bulletin, formation Processing Systems 9 (pp. 281–287).
134 (1), 138–61. Boston, MA: MIT Press.

65
66 References

Veenhoven, R. (1984). Conditions of Happiness. Dor- ware manual Version 4.2). Stuttgart, Ger-
drecht, Netherlands: D. Reidel Publishing. many: University of Stuttgart. Retrieved
Veenhoven, R. (2010). Greater Happiness for a from http://www.ra.cs.uni-tuebingen.de/
Greater Number. Journal of Happiness Stud- downloads/SNNS/SNNSv4.2.Manual.pdf
ies, 11 (5), 605–629. Zou, H., & Hastie, T. (2005). Regularization and
Veenhoven, R. (2013). World Database of Variable Selection via the Elastic Net. Journal
Happiness. Retrieved 20.11.2013, from of the Royal Statistical Society. Series B (Sta-
http://worlddatabaseofhappiness.eur tistical Methodological), 67 (2), 301–320.
.nl/hap_cor/cor_fp.htm Zou, H., & Hastie, T. (2013). Package
Vidaurre, D., Bielza, C., & Larrañaga, P. (2011). ‘elasticnet’ (R package manual Ver-
Lazy lasso for local regression. Computational sion 1.1). R-project.org. Retrieved
Statistics, 27 (3), 531–550. from http://cran.r-project.org/web/
Vittersø, J. (2001). Personality traits and subjective packages/elasticnet/elasticnet.pdf
well-being: emotional stability, not extraver-
sion, is probably the important predictor. Per-
sonality and Individual Differences, 31 , 903–
914.
Waldron, S. (2010). Measuring Subjective Wellbeing
in the UK (Tech. Report). London, England:
Office for National Statistics, UK.
Waterman, A. S. (1993). Two Conceptions of
Happiness: Contrasts of Personal Expressive-
ness (Eudaimonia) and Hedonic Enjoyment.
Journal of Personality and Social Psychology,
64 (4), 678–691.
Watson, G. S. (1964). Smooth Regression Analy-
sis. Sankhyã: The Indian Journal of Statistics,
Series A, 26 (4), 359–372.
Witter, R. A., Okun, M. A., Stock, W. A., & Haring,
M. J. (1984). Education and Subjective Well-
Being: A Meta-Analysis. Educational Evalua-
tion and Policy Analysis, 6 (2), 165–173.
Wood, S. N. (2004). Stable and Efficient Estima-
tion Parameter Multiple Smoothing Models for
Generalized Additive. Journal of the American
Statistical Association, 99 (467), 673–686.
Wright, T. A., & Cropanzano, R. (2000). Psycho-
logical Weil-Being and Job Satisfaction as Pre-
dictors of Job Performance. Journal of Occu-
pational Health Psychology, 5 (1), 84–94.
Yin, P., & Fan, X. (2001). Estimating R2 Shrink-
age in Multiple Regression: A Comparison of
Different Analytical Methods. The Journal of
Experimental Education, 69 (2), 203–224.
Zell, A., Mamier, G., Vogt, M., Mache, N.,
Hübner, R., Döring, S., . . . Gatter, J.
(2013). SNNS - User Manual (Computer soft-

66
