Diaby et al. 2014
Soc. Netw. Anal. Min. (2014) 4:227
DOI 10.1007/s13278-014-0227-z

ORIGINAL ARTICLE

M. Diaby · E. Viennet · T. Launay
Abstract This paper presents content-based recommender systems which propose relevant jobs to Facebook and LinkedIn users. These systems have been developed at Work4, the Global Leader in Social and Mobile Recruiting. The profile of a social network user contains two types of data: user data and user friend data; furthermore, the profiles of our users and the descriptions of our jobs consist of text fields. The first experiments suggest that predicting the interests of users for jobs using basic similarity measures together with the data collected by Work4 can be improved upon. The next experiments then propose a method to estimate the importance of users' and jobs' different fields in the task of job recommendation; taking these weights into account allows us to significantly improve the recommendations. The third part of this paper analyzes social recommendation approaches, assessing their suitability for job recommendation to Facebook and LinkedIn users. The last experiments focus on machine learning algorithms to improve the results obtained with basic similarity measures. Experiments with support vector machines (SVM) show that a supervised learning procedure increases the performance of our content-based recommender systems; it yields the best results in terms of AUC in comparison with the other investigated methodologies, such as Matrix Factorization and Collaborative Topic Regression.

Keywords Job recommendation · Facebook · LinkedIn · Content-based recommender system · Social recommender systems · SVM

M. Diaby (✉) · E. Viennet
Université Paris 13, Sorbonne Paris Cité, L2TI, 93430 Villetaneuse, France
e-mail: [email protected]; [email protected]

E. Viennet
e-mail: [email protected]

M. Diaby · T. Launay
R&D Department of Work4, 3 Rue Moncey, 75009 Paris, France

1 Introduction

Since the beginning of the 2000s, three famous social networks have been launched: LinkedIn in 2003, Facebook in 2004 and Twitter in 2006; each of them counts millions or even billions of users around the world (Facebook 2014; LinkedIn 2014; Brain 2014). A social network site is often defined as a web-based service that allows users to construct profiles (public or semi-public), to share connections with other users and to view and traverse lists of connections made by others in the system (Boyd and Ellison 2008).

Personal information posted by users of a social network (which may involve personal descriptions, posts, ratings, likes, but also social links) can be exploited by a recommender system in order to propose or advertise relevant items to them; as a result, social network data are becoming very attractive to many companies around the world.

Recommender systems are often defined as software that elicits the interests or preferences of individual consumers for products, either explicitly or implicitly, and makes recommendations accordingly (Xiao and Benbasat 2007). They have many practical applications (Linden et al. 2003; Bennett et al. 2007) that help users deal with information overload, which is why they have become an active research field for two decades.

This paper presents online recommender systems that extract social network users' interests for job offers and then make recommendations to them accordingly; it is focused on Facebook and LinkedIn. A variant of those
recommender systems is used by Work4, the Global Leader in Social and Mobile Recruiting. Due to privacy concerns, the proposed recommender systems only use the data that social network users explicitly granted access to; they also deal with noisy, missing and unstructured data from social network users and job descriptions.

The contributions of this paper are fourfold:

1. the first series of experiments on real-world data from Work4 reveals that the Cosine similarity used to recommend jobs to social network users (Sect. 4.2) gives results that can be improved;
2. we estimate the importance of each field of users and jobs in the task of job recommendation (Sect. 4.3); applying these importance weights improves the results obtained with Cosine similarity (Sect. 4.4);
3. our third series of experiments shows that basic social recommendation methods failed to improve our results; however, using Rocchio's method to enrich users' vectors with their related jobs' data improves the results (Sects. 4.5 and 4.6);
4. our experiments on machine learning show that the use of support vector machines significantly improves our results (Sect. 4.7). This method outperforms two methods from the literature (Sect. 4.8): a collaborative filtering (CF) method and the collaborative topic regression (CTR) method proposed by Wang and Blei (2011).

The rest of this paper is organized as follows: Sect. 2 summarizes work done on recommender systems over the last two decades; Sects. 3 and 4, respectively, present the proposed recommender systems and the obtained results; and Sect. 5 sums up and concludes.

2 Related Work

Recommender systems help users deal with information overload by recommending items that would be of interest to them. There has been a lot of work on designing recommender systems during the last two decades; Amazon.com (Linden et al. 2003) and Netflix (Bennett et al. 2007) are two popular applications of recommender systems.

2.1 Recommender systems

Recommender systems (Adomavicius and Tuzhilin 2005; Jannach et al. 2011) are mainly related to information retrieval (Baeza-Yates and Berthier 1999; Salton et al. 1975), machine learning (de Campos 2010; Salakhutdinov and Mnih 2008a), data mining (Séguela 2012; Han 1996) and other research fields beyond the scope of this study. Adomavicius and Tuzhilin (2005) classified recommender systems into two main groups: rating-based systems and preference-based filtering techniques.

Rating-based recommender systems focus on predicting the absolute values of the ratings that individual users would give to unseen items. For instance, someone who rated the movies "Star Wars" 9/10 and "The Matrix" 7/10 would rate "X-Men Origins" 6/10.

Preference-based filtering techniques predict the correct relative order of items for a given user; this is also known as top-k recommendation. For instance, let us assume the following preferences for a given user: iPad 3 ≻ Galaxy S III ≻ Galaxy Tab 2; using the features of items and the opinions of other users, a preference-based system can predict that, after the iPhone 5 release, the user's new preferences would be: iPhone 5 ≻ iPad 3 ≻ Galaxy S III ≻ Galaxy Tab 2.

This paper is focused on rating-based recommender systems; therefore, the term recommender system refers to a rating-based recommender system in the rest of this document.

Recommender systems are generally classified into three or four categories (Bobadilla et al. 2013; Kazienko et al. 2011; Adomavicius and Tuzhilin 2005; Balabanovic and Shoham 1997): content-based methods, collaborative filtering, demographic filtering systems and hybrid approaches.

Content-based recommender systems (Lops et al. 2011; Pazzani and Billsus 2007) use the ratings that users gave to items in the past to define their profiles (Adomavicius and Tuzhilin 2005); if users have associated descriptions, this information is also taken into account. Content-based systems use the description of items to extract their profiles. In this paper, we develop content-based recommender systems that use both users' and jobs' textual descriptions to make recommendations.

In contrast to content-based systems, collaborative filtering methods use the opinions of a community of users similar to the active user to recommend items to him (Jannach et al. 2011; Salakhutdinov and Mnih 2008b; Lemire and Maclachlan 2005): rating predictions for an item involve the known ratings given by similar users. Wang and Blei (2011) classified collaborative filtering methods into two main groups: matrix factorization methods and neighborhood methods. Neighborhood methods determine a set of users most similar to the active user and then determine the interest of the active user for items by aggregating the interests of these similar users. Matrix factorization (Salakhutdinov and Mnih 2008b; Wang and Blei 2011) is a latent factor model in which users and items are represented in a low-dimensional space; the new representations of users and items are commonly computed by minimizing the regularized squared error.

Recommendations are made in demographic filtering systems by using users' personal attributes (age, gender,
income, country, survey responses, etc.) (Bobadilla et al. 2013; Kazienko et al. 2011; Gao et al. 2007).

Hybrid recommender systems combine two or more types of recommender systems into a single model; they generally yield better results than simple recommendation techniques (Adomavicius and Tuzhilin 2005) but are much more complex to design. In the particular case of the combination of a content-based system and a collaborative filtering technique, there are mainly four combinations (Adomavicius and Tuzhilin 2005):

1. aggregating the two recommendations of a content-based and a collaborative filtering system (Claypool et al. 1999);
2. adding content-based characteristics to a collaborative filtering system: similarities between users or items are computed using content-based data (Balabanovic and Shoham 1997);
3. adding collaborative filtering to a content-based system: dimensionality reduction techniques are applied to users' profiles obtained by combining content and collaborative data (Nicholas and Nicholas 1999);
4. developing a single unifying recommendation model that uses both content-based and collaborative data (Wang and Blei 2011).

A new emerging category of recommender systems is social recommender systems, which use both the active user's opinion and the opinions of his social connections (friends on a social network, users with behavior or interests similar to the active user's) to make recommendations to him. This category of recommender systems is gaining popularity with the rapid growth of social networks in recent years. In the literature, many methods have been developed to use both users' and their friends' data to make recommendations: the combination of matrix factorization and friendship links (Aranda et al. 2007), and models that use matrix factorization while making sure that the latent vectors of users are as close as possible either to the weighted sum of those of their friends or to those of their friends individually (Ma et al. 2011). Currently, the results about social recommender systems seem mixed (Kantor 2009): some papers reported that social recommender systems are no more accurate than classic ones except in special cases (Golbeck 2006), while others argued that they yield better results than the classic ones (Groh and Ehmig 2007).

2.2 Data representation

In recommender systems, the textual description of a document (user or item) is generally represented as a vector in which each component represents the importance of the associated term for the document. This vector is generally constructed using weighting functions and the "bag-of-words" model, and by filtering out terms that are unimportant for the task of recommendation. The main assumption of the "bag-of-words" model is that the relative order of terms in a document has a minor importance in text categorization or classification tasks. There are many ways to filter out unimportant terms:

• define a list of stop words that will automatically be removed from the corpus. Note that a list of stop words depends on the problem one is solving; a list of all stop words for a specific recommendation task is unknown most of the time;
• filter out the very high and very low frequency terms (Séguela 2012), which requires defining two thresholds: one for high-frequency terms and one for low-frequency ones. To set these two thresholds, one needs to conduct experiments on one's datasets;
• filter out some grammatical categories of words, which requires determining the language and the nature of the words in the corpus; this makes this method slower than the previous two.

Weighting functions calculate the importance of a term for a given document; they are generally classified into three main categories (Séguela 2012): local functions, global functions and combinations of local and global weighting functions.

Local weighting functions only calculate the weight of a given term inside a given document; the most often mentioned local weighting function in the literature is the normalized Term Frequency (TF), defined as follows:

    TF_{t,d} = f_{t,d} / max_k f_{k,d}    (1)

where f_{t,d} is the frequency of the term t in the document d.

Two other methods are the boolean weight (Bool) and the Log Term Frequency (LTF), defined as follows:

    Bool_{t,d} = 1 if t ∈ d, 0 otherwise    (2)

    LTF_{t,d} = log(1 + f_{t,d})    (3)

Global weighting functions use the whole corpus (set of documents) to calculate the weight of a given term. The first global weighting function we can cite is the Inverse Document Frequency (IDF):

    IDF_t = 1 + log(N / n_t)    (4)

where N is the total number of documents in the corpus and n_t is the number of documents that contain the term t.
Another global weighting function is the Entropy, defined as follows:

    Entropy_t = 1 + Σ_d [ p_{t,d} log(p_{t,d}) / log(N) ]    (5)

where p_{t,d} = f_{t,d} / Σ_k f_{t,k} is the probability that the term t belongs to the document d and f_{t,d} is defined in (1).

In the literature, combinations of a local weighting function and a global weighting function generally give better results than local weighting functions alone (Salton et al. 1975). TF–IDF is the most famous combination; it is defined as follows:

    TF-IDF_{t,d} = TF_{t,d} × IDF_t    (6)

where t is a term and d is a document.

Another combination is Log-Entropy, defined as follows:

    Log-Entropy_{t,d} = LTF_{t,d} × Entropy_t    (7)

We use TF–IDF as the weighting function in our proposed recommender systems for two reasons: a comparison of different weighting functions shows that we obtain almost the same performance with each of them, and TF–IDF is known to yield good results in information retrieval (Salton et al. 1975).

2.3 Similarity functions

Recommender systems use various similarity functions to compute the similarity between users, between items or between users and items: some similarity functions are heuristic while others are models learnt from the underlying data using machine learning techniques.

Two well-known similarity measures (Adomavicius and Tuzhilin 2005; Jannach et al. 2011) are the cosine similarity and the Pearson correlation coefficient (PCC).

Cosine similarity is mostly used in content-based recommender systems: it yields better results in item–item filtering systems (Jannach et al. 2011). It measures the cosine of the angle between two vectors and is defined as follows:

    cos(u, v) = Σ_{k=1}^{K} u_k v_k / ( sqrt(Σ_{k=1}^{K} u_k²) · sqrt(Σ_{k=1}^{K} v_k²) )    (8)

where u and v are the vectors of users or items and K is the number of dimensions of u and v.

PCC is mainly used in collaborative filtering techniques and is defined as follows:

    PCC(u, v) = Σ_{k=1}^{K} (u_k − ū)(v_k − v̄) / ( sqrt(Σ_{k=1}^{K} (u_k − ū)²) · sqrt(Σ_{k=1}^{K} (v_k − v̄)²) )    (9)

where u and v are the vectors of users or items, and ū and v̄ are the mean values of u and v, respectively.

In the literature, we can find other similarity functions like the mean squared difference (a dissimilarity measure) (Shardanand and Maes 1995) and the Gaussian and Exponential similarity functions (Séguela 2012) (based on the mean squared difference), defined as follows:

    Gaussian(u, v) = exp( − Σ_{k=1}^{K} (u_k − v_k)² / (2σ²) )    (10)

    Exponential(u, v) = exp( − sqrt(Σ_{k=1}^{K} (u_k − v_k)²) / σ )    (11)

where σ is a standard deviation parameter to be set.

Classic similarity measures (Cosine similarity, PCC, ...) can work on some specific problems but do not work on others: Cosine similarity yields better results in item–item filtering systems (Jannach et al. 2011), but in content-based recommender systems (see Sect. 2.1), if the user term space is not completely equal to the item term space, the similarities computed between users and items using Cosine similarity or PCC could be close to 0. In the literature, similarity models learnt from the underlying data have been successfully used because they neatly fit the problems to be solved; Bayesian networks (Pazzani and Billsus 1997) and SVMs (Diaby et al. 2013; Joachims 1998) are two examples of methods used by researchers.

2.4 Performance metrics

As a reminder, there are two main groups of recommender systems (Adomavicius and Tuzhilin 2005): rating-based systems and preference-based filtering techniques. The group of a recommender system determines the set of performance metrics we can use to assess its performance.

A rating-based system is a predictive system (see Sect. 2.1); many performance metrics have been used to assess the performance of predictive systems (Omary and Mtenzi 2010); among them we can cite the Precision, Recall, F_b measure, RMSE (root mean square error) and MAE (mean absolute error).

The Precision refers to the capacity of a predictive system to be precise (in the prediction of different classes), while the Recall refers to its ability to find all elements of a specific class; they are defined as follows:

    P(c) = (number of items correctly assigned to c) / (number of items assigned to c)    (12)

    R(c) = (number of items correctly assigned to c) / (number of items that belong to c)    (13)

where P(c) and R(c) are, respectively, the Precision and Recall for the class c ∈ C and C is the set of classes.
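A minimal sketch of these two metrics (Eqs. 12 and 13), with hypothetical labels where class 1 means "the user matches the job"; this is an illustration only, not the evaluation code used in this work:

```python
def precision_recall(predicted, actual, c):
    """Precision (Eq. 12) and Recall (Eq. 13) for one class c,
    given predicted and actual class labels for the same items."""
    assigned = [i for i, p in enumerate(predicted) if p == c]   # items assigned to c
    members = [i for i, a in enumerate(actual) if a == c]       # items that belong to c
    correct = [i for i in assigned if actual[i] == c]           # correctly assigned to c
    p = len(correct) / len(assigned) if assigned else 0.0
    r = len(correct) / len(members) if members else 0.0
    return p, r

# hypothetical toy labels
predicted = [1, 1, 0, 1, 0]
actual = [1, 0, 0, 1, 1]
p, r = precision_recall(predicted, actual, 1)  # p = 2/3, r = 2/3
```

As the next paragraphs note, a system can trade one of these quantities for the other, which is what the F_b measure and the threshold-free AUC-ROC are designed to capture.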
A predictive system can have a high Recall with a low Precision or vice versa; that is why the F_b measure (Rijsbergen 1979) has been designed to take into account both the Recall and the Precision; F_1 is the most often mentioned in the literature. F_b for a class c is defined as follows:

    F_b(c) = (1 + b²) · P(c) · R(c) / (b² · P(c) + R(c)).

The global Precision, Recall and F_b can be computed as the average or weighted sum of the performance of the different classes (Séguela 2012).

To compute the above performance metrics for a recommender system, we need to set a threshold: a given item is recommended to a given user if his interest for this item is greater than the threshold. Setting thresholds is sometimes tedious; that is why we use the AUC-ROC (also known as AUC) as the performance metric for our recommender systems. AUC-ROC is the area under the ROC (receiver operating characteristic) curve (Omary and Mtenzi 2010), obtained by plotting the TP rate (fraction of true positives) as a function of the FP rate (fraction of false positives); it is used in binary classification tasks. The higher the AUC of a classifier, the better the system. Note that the minimum value of the AUC is 0 while its maximum value is 1, but the AUC of a classifier that randomly assigns the different classes is close to 0.5. If the AUC of a system is below 0.5, one can invert each of its predictions to obtain an AUC greater than 0.5.

The MAE and RMSE (Ma et al. 2011) are generally used in regression recommendation problems and are defined as follows:

    MAE = (1/|C|) Σ_{i,j ∈ C} |r_{ij} − r̂_{ij}|    (14)

    RMSE = sqrt( (1/|C|) Σ_{i,j ∈ C} (r_{ij} − r̂_{ij})² )    (15)

where r_{ij} and r̂_{ij} are the original and predicted ratings, respectively, and C is the test set; in regression problems r_{ij} and r̂_{ij} have continuous values while their values are discrete in classification problems.

For top-K recommender systems, also known as preference-based filtering recommender systems (in which the system computes a list of K items to be recommended to each user), there are some interesting metrics like MAP@K (Mean Average Precision) and NDCG@K (Normalized Discount Cumulative Gain).

Sometimes in the literature we find an adaptation of predictive systems' performance metrics to top-K recommendations, like the Recall@K (Recall for the top K items recommended to users) used in Wang and Blei (2011); one can also plot the Recall@K as a function of the number (K) of recommended items and eventually compute the area under this curve.

The MAP metric (Aiolli 2013) has been used in the MSD (Million Song Dataset) challenge¹; it is defined as follows:

    MAP@K = (1/|U|) Σ_{u ∈ U} (1/K) Σ_{k=1}^{K} (C_{uk} / k) · 1_{uk}    (16)

where U is the set of users, C_{uk} is the number of correct items among the top k items recommended (ranked in descending order of the ratings) to the user u, and

    1_{uk} = 1 if the item at rank k is correct (for the user u), 0 otherwise.

The discount cumulative gain (DCG) is defined as follows:

    DCG(b)@K = (1/|U|) Σ_{u ∈ U} Σ_{k=1}^{K} r_{uk} / max(1, log_b(k))    (17)

where r_{uk} is the rating that the user u gave to the item at position k in the ranked list of K items recommended to u.

The normalized discount cumulative gain (NDCG) (Ravikumar et al. 2011) is then defined as follows:

    NDCG(b)@K = DCG(b)@K / IdealDCG(b)@K    (18)

where IdealDCG(b)@K is the DCG(b)@K for the ideal ranking of the top K items (for each user, his top K items are ranked in descending order of the ratings he gave to them).

3 Social network-based job recommender systems

This section presents the description of our proposed job recommender systems: we first present the modeling of our social network users and job offers and then describe the proposed recommender systems.

3.1 Document modeling

The efficiency of a recommender system generally depends on how users and items are represented. Our social network users are defined by two types of data: user data and their social connection data.

• user data are both the data that users post to social networks and the data recorded while they were interacting with the platform. Publications, comments, likes, and time spent reading or viewing resources are some examples of interaction data.

¹ http://www.kaggle.com/c/msdchallenge.
Fig. 1 An example of a Facebook profile; we note three fields (work, education, places lived) with their sub-fields: company name, position, start date, end date, location, class of, college/university/school name, description and concentrations

• social connection data are the data of users' connections on a social network (for instance, the list of friends of a social network user).

The proposed recommender systems (see Sects. 3.2, 3.4, 3.5 and 3.7) only use user data and the description of jobs to predict users' interests for jobs, while the methods proposed in Sect. 3.6 also use the social connection data. Our Facebook users have authorized Work4's applications to access the data of 5 fields: Work, Education, Quote, Bio and Interests; LinkedIn users have only authorized 3 fields: Headline, Educations, Positions; our job descriptions have 3 fields: Title, Description, Responsibilities. Figures 1, 2 and 3, respectively, show an example of a Facebook profile, a LinkedIn profile and a job description.

For each document (user or job), we extract a vector for each of its fields (which contain textual information) using the "bag-of-words" model and TF-IDF as the weighting function; we also filter out stop words using lists defined by Work4. The vector of a document is computed as a weighted sum of the vectors of its different fields; each weight is the importance of the associated field. Figure 4 shows how we aggregate the vectors of the different fields for Facebook users, LinkedIn users and jobs.

The users' term space is not completely equal to the jobs' term space: users generally use an informal vocabulary (in contrast to job descriptions), and users' data contain many typos, abbreviations, some teen text terms, etc. To mitigate this issue, we filter out some terms (stop words) and we use the data contained in only some sub-fields of users' fields. Thus we only use the data in the:

• position and description sub-fields of the Facebook Work field,
• concentrations and description sub-fields of the Facebook Education field,
• Facebook Quote, Bio and Interests fields and the LinkedIn Headline field (they have no sub-field),
• position and description sub-fields of the LinkedIn Positions field,
• degree, fieldOfStudy, notes and activities sub-fields of the LinkedIn Educations field.

It is also important to note that the vectors of users and jobs depend on their languages. We assign the language "Other" to all documents (users and jobs) whose languages are different from English and French, since we are focused on only these two languages.

3.2 First recommender system: Engine-1

In Engine-1 the TF-IDF vectors of users and jobs are computed (following the method described in Sect. 3.1) by assuming that all the fields have the same importance on the recommendation scores (a^0_w = a^0_e = a^0_b = a^0_q = a^0_i = 1, a^1_h = a^1_e = a^1_p = 1 and b_t = b_d = b_r = 1). We measure the interest of a user for a given job by computing the cosine similarity (8) of the user's vector and the vector of the job. One weakness of this system is that the assumption that all the fields have the same importance is probably false. If the users' term space is quite different from the jobs' term space, Engine-1 will fail to make proper recommendations; this could be another weakness of this system, but
Fig. 2 An example of a LinkedIn profile with positions and educations fields and their sub-fields

we only use the data in some sub-fields to mitigate this issue (as described in Sect. 3.1).

3.3 Importance of fields

We know that it is unlikely that all the users' or jobs' fields have the same importance on the recommendation scores: some might be more important than others.

In order to determine the importance of the different fields of users and jobs, we propose the optimization problem below. The vector u(a) of a user u using the importance vector a of users' fields is defined as the weighted sum of his field vectors; formally it is defined as follows:

    u(a) = Σ_{f=1}^{f_u^0} a^0_f u^0_f + Σ_{f=1}^{f_u^1} a^1_f u^1_f    (19)

where a = (a^0, a^1), a^0 = (a^0_1, ..., a^0_{f_u^0}) and a^1 = (a^1_1, ..., a^1_{f_u^1}) are, respectively, the importance of Facebook and LinkedIn users' fields, f_u^0 and f_u^1 are, respectively, the numbers of Facebook and LinkedIn users' fields in the training set, u^0_f and u^1_f are, respectively, the vectors of the Facebook and LinkedIn field f for the user u, and finally a^0_f and a^1_f are the importance of the Facebook and LinkedIn field f.

The vector v(b) of a job v using the importance vector b of jobs' fields is defined as the weighted sum of its field vectors; formally it is defined as follows:

    v(b) = Σ_{f=1}^{f_j} b_f v_f    (20)

where b = (b_1, ..., b_{f_j}), f_j is the number of jobs' fields in the training set, v_f is the vector of the field f for the job v and b_f is the importance of the field f.
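To make Eqs. (19) and (20) concrete, the sketch below aggregates per-field TF-IDF vectors into one document vector and scores a user–job pair with the cosine similarity (8), as Engine-1 does with every importance weight set to 1. The field names, vocabulary size and numbers are hypothetical; this is an illustration, not Work4's code:

```python
import math

def weighted_profile(field_vectors, importance, dim):
    """Weighted sum of a document's field vectors (Eqs. 19-20):
    each field vector is scaled by the importance of its field."""
    profile = [0.0] * dim
    for field, vec in field_vectors.items():
        w = importance.get(field, 1.0)  # Engine-1 uses 1.0 for every field
        for i, x in enumerate(vec):
            profile[i] += w * x
    return profile

def cosine(u, v):
    """Cosine similarity (Eq. 8)."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

# hypothetical 3-term vocabulary; field vectors hold TF-IDF weights
user = weighted_profile({"work": [0.9, 0.0, 0.1], "education": [0.4, 0.2, 0.0]},
                        {"work": 1.0, "education": 1.0}, dim=3)
job = weighted_profile({"title": [1.0, 0.0, 0.0], "description": [0.5, 0.0, 0.3]},
                       {"title": 1.0, "description": 1.0}, dim=3)
score = cosine(user, job)  # Engine-1-style interest score
```

Learning the importance vectors a and b instead of fixing them to 1 amounts to fitting the `importance` dictionaries above to the training data, which is the optimization problem this section introduces.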
4 Results
Sects. 3.2, 3.3, 3.4, 3.5, 3.6 and 3.7). The first part is dedicated to the description of our datasets, while the second part presents the importance of the different fields and the results of the different proposed recommender systems.

We use AUC-ROC (see Sect. 2.4) as the performance metric for our different experiments and we compute 95 % empirical confidence intervals (2.5 % quantile, 97.5 % quantile) using bootstrapping (100 runs per experiment).

4.1 Description of datasets

We evaluate the performance of our job recommender systems (see Sects. 3.2, 3.4, 3.5, 3.6 and 3.7) on 6 datasets collected by the company Work4. Each entry in our datasets is a 3-tuple (u, v, y) where u and v are the vectors of a given user and job, respectively, and y ∈ {0, 1} is their associated label; label 1 denotes a match between the user and the job, while the label is 0 when the job does not correspond to the user. Here are the descriptions of the 6 collected datasets:

1. Candidate: users can use Work4's applications to apply to jobs; we assume that users only apply to the jobs that are relevant for them (label = 1); this dataset contains application data.
2. Feedback: contains the feedback from users of Work4's applications.
3. Random: this dataset contains user–job couples that have been randomly drawn from Work4's databases and manually annotated.
4. Review: it contains recommendations made by Work4's systems that have been manually validated by two different teams of the company.
5. Validation: it contains recommendations made by Work4's systems that have been manually validated by only one team of the company.
6. ALL: a sixth dataset has been artificially created: it is the union of the 5 previous datasets.

As explained above, the entries in the Review and Validation datasets are obtained by manually annotating some recommendations made by our systems. To annotate a recommendation of a job to a user, the annotators have all the available information about the job (title, company, industry, ...) and the information the user has explicitly authorized our applications to access (education background, work history, age, etc.), and they can annotate the recommendation as follows:

• 1: the user matches the job,
• 0: the user does not match the job; in this case, they can justify their decision by:
  – experience mismatch: user's and job's required experience do not match,
  – languages mismatch: user's and job's languages do not match,
  – countries mismatch: user's and job's countries do not match.
  Recommending an internship job to a person who currently has a full-time position or who has been graduated for years is generally considered by the annotators as an experience mismatch.
• −1: cannot decide if the user matches the job or not; these entries are not used in this study.

The proposed Engines (see Sect. 3) in this study cannot detect some of these mismatches, like the experience mismatch, but Work4 is using advanced versions of the Engines that can detect these subtle mismatches.

We assumed in the Candidate dataset that users only apply to jobs that are relevant for them; this assumption could be false in some situations. The following examples show some situations in which our assumption is not correct:

• people who are actively looking for a job can apply to several jobs at the same time even if they do not match their profiles;
• people who are changing careers can apply to jobs that do not match their profiles.

It is worth noting that each job is associated to a job page which generally represents a page of a company (in this context, the pages are sets of job offers published by the same company). Hence jobs from the same job page are generally similar since they are likely from the same company. Table 1 shows the basic statistics from our datasets while Table 2 shows the percentage of empty fields in each dataset. We can notice that most social network users do not completely fill the corresponding forms; this problem is more severe for Facebook than for LinkedIn profiles. Our recommender systems must thus deal with incomplete (and very noisy) data.

After filtering out stop words using lists defined by Work4, we obtained a dictionary with 218,533 terms (English terms, French terms and other languages' terms); we focus on English and French. We use the TF-IDF scores to select the most interesting terms in users' and jobs' profiles as done in Wang and Blei (2011) and Blei and Lafferty (2009); Figure 5 shows the cumulative weights of users' and jobs' terms.

4.2 Engine-1

Figure 6 shows the results of Engine-1 on the different datasets for all users, Facebook users and LinkedIn users; we can notice that the results on the Feedback dataset are different from the others but not significantly so (see the associated confidence intervals). We also notice that the results are bad (AUC < 0.5) for Facebook users and all users on the ALL
Soc. Netw. Anal. Min. (2014) 4:227 Page 11 of 17 227
Table 1 Basic statistics from our datasets: number of instances, proportion of labels 0/1 and instances linked to Facebook/LinkedIn users

Table 2 Percentage of empty fields (%) in each dataset

Fields              ALL    Candidate  Feedback  Random  Review  Validation
Facebook
  Bio               66.5   81.3       61.8      85.0    69.4    48.4
  Education         81.6   88.2       52.7      81.4    78.3    73.9
  Interests         76.9   79.1       90.9      94.5    79.7    72.7
  Quotes            93.6   87.7       92.7      95.9    97.5    98.8
  Work              60.8   73.3       23.6      80.1    45.5    46.2
LinkedIn
  Educations        14.1   12.9       8.4       12.5    13.1    14.0
  Headline          0.3    1.6        0.0       0.3     0.4     0.2
  Positions         0.2    3.8        0.0       1.0     0.2     0.1
Jobs
  Description       0.0    0.0        0.0       0.5     0.0     0.0
  Responsibilities  57.6   52.8       78.6      74.7    65.4    82.0
  Title             0.0    0.0        0.0       0.0     0.0     0.1

Our Facebook users do not completely fill in the fields that are interesting for our systems. It is worth noting that the percentage of empty Responsibilities fields for jobs is very high; this is due to the fact that this field's information is sometimes merged with the Description field.

dataset; these bad results are due to the fact that the users' term space is not exactly equal to the jobs' term space, despite the fact that we selected the most interesting subfields for social network users. Sect. 5 explains these results in detail. If we consider the AUC-ROC scores for Facebook users and LinkedIn users separately, we notice that we obtain better results for LinkedIn users (see Fig. 6): the LinkedIn users' term space seems closer to the jobs' term space than the Facebook one.

4.3 Importance of fields

To compute the importance of users' and jobs' fields on the recommendation scores, we solve the constrained optimization problem stated in Sect. 3.3 using the function minimize (from the Scipy.optimize module) with the SLSQP (Sequential Least Squares Programming) method. We set the costs c0 and c1 as in Anand et al. (2010): c0 = 1/n0 and c1 = 1/n1, where n0 and n1 are the numbers of entries with label 0 and label 1 in the training sample, respectively.

Figure 7 shows the importance of Facebook users', LinkedIn users' and jobs' fields, with their confidence intervals. It suggests that the important fields for the task of job recommendation are:

• the Work field for Facebook users;
• the Headline field for LinkedIn users;
• the Title field for jobs.

These results seem to make sense, since the Work field contains useful information to determine the interests of Facebook users for jobs, the Headline field sums up LinkedIn users' careers, and the Title field contains the needed
information about a given job to globally determine whether it is relevant or not.

4.4 Comparison between Engine-1 and Engine-2

Figure 8 depicts the performance of Engine-2 for Facebook and LinkedIn users; we note that we obtain a higher performance for LinkedIn users than for Facebook users, as with Engine-1 (see Sect. 4.2).

We compare Engine-1 to Engine-2 to see whether the application of the importance of users' and jobs' fields improves our results or not. Figure 9 shows that the application of the importance of fields (computed in Sect. 4.3) significantly improves our results on all of our datasets except the Random dataset, but the results on this dataset are not significant (see the associated confidence intervals). The performance of Engine-2 is better than that of Engine-1, but is still not sufficient for us.

To compare Engine-2 to Engine-5, which uses relevance feedback, we vary the proportion of the datasets used as feedback sets from 0 to 0.6; for each user in the feedback sets, we enrich his vector with the vectors of his linked jobs as explained in Sect. 3.7; we also set α = β = γ = 1. Table 4 shows that the use of relevance feedback drastically improves our results; this is very interesting and shows that if we have a sufficient set of feedback for a user (jobs that match the user or not), we can make accurate job recommendations to him. Unfortunately, this currently has no direct application at Work4, since our users generally do not give any feedback about the jobs we recommend to them.

4.7 Comparison between Engine-2 and Engine-3

To learn our linear SVM model, we set the costs of the different classes c0 and c1 as previously (see Sect. 4.3). In our context, the ideal procedure of splitting our datasets into
Table 3 Social recommendation: comparison between Engine-2, Engine-4a, Engine-4b and Engine-4c in terms of AUC; as a reminder, users' social vectors/scores = a × (user's vectors/scores) + (1 − a) × (users' friends' vectors/scores)

Methods     Importance a of users' data (from 0 to 1)
            0            0.2          0.4          0.6          0.8          1.0
ALL
Engine-2    0.47 ± 0.01
Engine-4a   0.30 ± 0.01  0.38 ± 0.01  0.39 ± 0.01  0.41 ± 0.01  0.43 ± 0.01  0.47 ± 0.01
Engine-4b   0.30 ± 0.01  0.33 ± 0.01  0.37 ± 0.01  0.40 ± 0.01  0.43 ± 0.01  0.47 ± 0.01
Engine-4c   0.30 ± 0.01  0.34 ± 0.01  0.37 ± 0.01  0.41 ± 0.01  0.43 ± 0.01  0.47 ± 0.01
Validation
Engine-2    0.74 ± 0.01
Engine-4a   0.53 ± 0.01  0.63 ± 0.02  0.67 ± 0.01  0.70 ± 0.01  0.73 ± 0.02  0.74 ± 0.01
Engine-4b   0.53 ± 0.01  0.59 ± 0.02  0.65 ± 0.01  0.71 ± 0.01  0.74 ± 0.02  0.74 ± 0.01
Engine-4c   0.54 ± 0.02  0.61 ± 0.01  0.67 ± 0.01  0.72 ± 0.02  0.74 ± 0.01  0.74 ± 0.01
Review
Engine-2    0.69 ± 0.01
Engine-4a   0.52 ± 0.01  0.61 ± 0.01  0.63 ± 0.01  0.66 ± 0.01  0.68 ± 0.01  0.69 ± 0.01
Engine-4b   0.52 ± 0.01  0.57 ± 0.01  0.61 ± 0.01  0.65 ± 0.01  0.67 ± 0.01  0.69 ± 0.01
Engine-4c   0.53 ± 0.01  0.59 ± 0.01  0.63 ± 0.01  0.66 ± 0.01  0.67 ± 0.01  0.69 ± 0.01

We can note that we obtain the highest AUCs for a = 1 (no use of social data) for the social recommendation methods. Section 5 explains in detail why some AUCs are below 0.5
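The blending rule recalled in the caption of Table 3 can be sketched as follows. This is an illustrative reimplementation, not Work4's code: the `cosine` helper, the dense NumPy vectors and the mean aggregation of friends' vectors are assumptions (the Engine-4 variants blend either vectors or scores; this sketch shows the vector-level flavor).

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two term-weight vectors."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    return float(u @ v / (nu * nv)) if nu and nv else 0.0

def social_score(user_vec, friend_vecs, job_vec, a):
    """Vector-level social blending: mix the user's own vector with an
    aggregate of his friends' vectors before comparing it to the job,
    i.e. social vector = a * user + (1 - a) * mean(friends)."""
    if len(friend_vecs):
        friends = np.mean(friend_vecs, axis=0)
    else:
        friends = np.zeros_like(user_vec)
    social_vec = a * user_vec + (1.0 - a) * friends
    return cosine(social_vec, job_vec)
```

With a = 1 the score falls back to the purely content-based one, which is exactly where Table 3 reports the highest AUCs.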
Table 4 Relevance feedback: comparison between Engine-2 and Engine-5 in terms of AUC

Methods     Proportion of the datasets used as feedback sets
            0            0.2          0.4          0.6
ALL
Engine-2    0.47 ± 0.01
Engine-5    0.47 ± 0.01  0.92 ± 0.01  0.95 ± 0.01  0.97 ± 0.00
Validation
Engine-2    0.74 ± 0.01
Engine-5    0.74 ± 0.01  0.95 ± 0.01  0.98 ± 0.01  0.99 ± 0.00
Review
Engine-2    0.69 ± 0.01
Engine-5    0.69 ± 0.01  0.95 ± 0.01  0.97 ± 0.01  0.99 ± 0.01

We can note that the use of relevance feedback drastically increases the performance of our recommendation engines
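The enrichment of a user's vector with his linked jobs (Sect. 3.7, with α = β = γ = 1) is, in spirit, Rocchio's relevance-feedback update (Rocchio 1971). The sketch below uses the classical centroid form; whether Engine-5 sums or averages the feedback vectors is an assumption here.

```python
import numpy as np

def enrich_profile(user_vec, matched_jobs, unmatched_jobs,
                   alpha=1.0, beta=1.0, gamma=1.0):
    """Rocchio-style relevance feedback: move the user's vector towards
    the centroid of the jobs he matched and away from the centroid of
    the jobs he did not match (alpha = beta = gamma = 1 in Table 4's
    experiment)."""
    def centroid(vecs):
        return np.mean(vecs, axis=0) if len(vecs) else np.zeros_like(user_vec)
    return (alpha * user_vec
            + beta * centroid(matched_jobs)
            - gamma * centroid(unmatched_jobs))
```

The enriched vector is then compared to job vectors exactly as before (e.g., with cosine similarity), which is why a sufficiently large feedback set lifts the AUC so sharply in Table 4.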
Engine-3 outperforms Engine-2 on these datasets; these results show that it is possible to learn from our data a similarity function that yields better results than the cosine similarity.

4.8 Comparison between Engine-2 and Engine-3 and two methods of the literature

In this section, we compare Engine-2 and Engine-3 to two methods of the literature. The first method is a simple Collaborative Filtering (CF) based on matrix factorization and the second method is the Collaborative Topic Regression (CTR) proposed by Wang and Blei (2011). We use the code provided by Wang and Blei (2011) to compare our methods to CF and CTR. For the CTR, we use the 25,000 terms for users and the 25,000 terms for jobs with the highest TF-IDF weights (this represents more than 60% of the total inertia for users and jobs; see Fig. 5). Here are the parameters we use for CTR and CF:

• the confidence parameter for label 1: a = 1;
• the confidence parameter for label 0: b = 0.01 (label 0 can be interpreted in two ways: the user does not match the job, or we simply do not know);
• the number of latent dimensions: num_factors = 50;
• the maximum number of iterations: max_iter = 50.
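Under these parameters, the CF baseline amounts to a confidence-weighted matrix factorization. The loop below is an illustrative gradient-descent stand-in, not the code released by Wang and Blei (2011) (which uses different update rules); the learning rate and regularization constants are assumptions.

```python
import numpy as np

def factorize(R, num_factors=50, lr=0.05, reg=0.1, a=1.0, b=0.01,
              max_iter=50, seed=0):
    """Confidence-weighted matrix factorization for binary user-job data:
    cells labelled 1 get confidence a, cells labelled 0 get the much
    lower confidence b (a 0 may mean "does not match" or "unknown")."""
    rng = np.random.default_rng(seed)
    n_users, n_jobs = R.shape
    U = 0.1 * rng.standard_normal((n_users, num_factors))
    V = 0.1 * rng.standard_normal((n_jobs, num_factors))
    C = np.where(R == 1, a, b)              # per-cell confidence weights
    for _ in range(max_iter):
        E = C * (R - U @ V.T)               # confidence-weighted residuals
        U, V = (U + lr * (E @ V - reg * U),
                V + lr * (E.T @ U - reg * V))
    return U, V
```

Scoring a user-job pair is then the dot product of the corresponding rows of U and V; ranking the jobs of a user by this score yields the AUCs reported for CF in Table 5.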
Table 5 Comparison between Engine-2, Engine-3, CF and CTR using tenfold cross-validation; CF and CTR are the methods proposed and implemented by Wang and Blei (2011)

Methods     ALL           Validation    Review
Engine-2    0.47 ± 0.01   0.74 ± 0.01   0.69 ± 0.01
Engine-3    0.93 ± 0.01   0.89 ± 0.01   0.89 ± 0.01
CF          0.67 ± 0.03   0.73 ± 0.03   0.84 ± 0.03
CTR         0.80 ± 0.02   0.78 ± 0.04   0.88 ± 0.01

Table 5 shows that Engine-3 outperforms CTR on our three biggest datasets. Not surprisingly, we obtain better results for CTR than for CF: CTR is an improvement of CF (Wang and Blei 2011); we can also note that CF outperforms our Engine-2. The main weakness of CTR is the fact that it only uses the data of the jobs relevant for users, while our Engine-3 uses the data of both relevant and non-relevant jobs.

5 Discussion and future work

We conducted many experiments on real-world data provided by Work4 in order to test and improve our proposed recommender systems. We used the AUC as the performance metric for our different Engines.

The first series of experiments concludes that Facebook users and LinkedIn users seem to have a vocabulary (the set of terms in their profiles) somewhat different from the vocabulary of job descriptions. However, the LinkedIn users' vocabulary seems closer to the vocabulary of jobs than the Facebook one (see Fig. 6). On the ALL dataset, some Engines obtain an AUC below 0.5; this could be explained by the combination of the following facts:

• the ALL dataset includes all the entries of the Candidate dataset [applications of users to jobs (label = 1)],
• 97% of the entries in the Candidate dataset come from Facebook users (see Table 1),
• users' vocabulary is quite different from the vocabulary of jobs (this problem is more severe for Facebook than for LinkedIn profiles),
• and Facebook users' profiles are less filled in than LinkedIn ones (see Table 2).

To sum up, the poor performance of some Engines in predicting the applications of users to jobs (entries with label 1) leads them to obtain an AUC below 0.5.

Not surprisingly, we find out that the most relevant fields for job recommendation are Work (for Facebook users), Headline (for LinkedIn users) and Title (for jobs). For instance, given a Facebook or LinkedIn user and a job description, if one tries to tell whether this user matches this job, one will probably first compare the user's work history (Facebook) or headline (LinkedIn) field information to the title of the job. The annotators of Work4 manually reviewed the entries in the Review and Validation datasets (see Sect. 4.1); they probably used the information in these fields to make their reviews.

The application of the importance of the different fields (see Fig. 7) (Engine-2) significantly improves the results (compared to Engine-1), but the results could still be improved. Unfortunately, the use of basic methods of social recommendation (Engine-4) failed to improve our results (compared to Engine-2), and we could not use more complex methods of social recommendation due to the nature of our data. However, the use of relevance feedback drastically improves the results; this is very interesting and shows that we can improve the performance of heuristic-based job recommender systems by enriching users' profiles with their feedback.

Our linear SVM-based recommender system (Engine-3) outperforms the cosine-based methods (Engine-1 and Engine-2); it also outperforms the CTR and CF proposed by Wang and Blei (2011).

We have assumed in this paper that users only apply to jobs relevant for them (in the Candidate dataset); this assumption could be partially false for many reasons (users were testing Work4's applications when they applied to jobs, etc.); we will try to precisely measure the validity of this assumption in our future work.

Users' term space seems not to be completely equal to the jobs' term space; to solve this issue, we recently used an ontology, the O*NET-SOC taxonomy (http://www.onetcenter.org/taxonomy.html), which defines the set of occupations across the world of work, to develop a new taxonomy-based vector model for social network users and jobs' descriptions suited to the task of job recommendation.

We are applying this vector model to predict the audience of jobs posted by social network users. We are also developing a much more sophisticated vector model for social network users and jobs that is based on ontologies and will contain users' experience/the experience required by jobs for different occupations, their keywords and their related occupations. In this model, we will use stemming, lemmatization and new lists of stop words to better preprocess users' and jobs' textual documents.

Acknowledgments This work is supported by Work4, ANRT (Association Nationale de la Recherche et de la Technologie, the French National Research and Technology Association), the French FUI Project AMMICO and the project Open Food System. The authors thank Guillaume Leseur, software architect at Work4, Benjamin Combourieu and all the Work4Engines team. We thank all the
anonymous reviewers who spent their time and energy reviewing this paper; thanks for your reviews and helpful comments and remarks.

References

Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
Aiolli F (2013) Efficient top-n recommendation for very large scale binary rated datasets. In: Proceedings of the 7th ACM Conference on Recommender Systems, ser. RecSys '13. ACM, New York, NY, USA, pp 273–280
Anand A, Pugalenthi G, Fogel GB, Suganthan PN (2010) An approach for classification of highly imbalanced data using weighting and undersampling. Amino Acids 39(5):1385–1391
Aranda J, Givoni I, Handcock J, Tarlow D (2007) An online social network-based recommendation system. Department of Computer Science, University of Toronto, Technical Report
Balabanovic M, Shoham Y (1997) Fab: content-based, collaborative recommendation. Commun ACM 40(3):66–72
Baeza-Yates RA, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley Longman Publishing Co., Inc, Boston, MA, USA
Bennett J, Lanning S (2007) The Netflix prize. In: KDD Cup and Workshop in conjunction with KDD
Blei DM, Lafferty JD (2009) Topic models. In: Srivastava AN, Sahami M (Eds) Text mining: classification, clustering, and applications. CRC Press
Bobadilla J, Ortega F, Hernando A, Gutiérrez A (2013) Recommender systems survey. Knowl Based Syst 46:109–132
Boyd DM, Ellison NB (2008) Social network sites: definition, history, and scholarship. J Comput Mediat Commun 13(1):210–230
Brain S (2014) Twitter statistics. Available at http://www.statisticbrain.com/twitter-statistics/
Chang C-C, Lin C-J (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):27:1–27:27. Software available at http://www.csie.ntu.edu.tw/cjlin/libsvm
Claypool M, Gokhale A, Miranda T, Murnikov P, Netes D, Sartin M (1999) Combining content-based and collaborative filters in an online newspaper
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
Cover TM (1965) Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans Electron Comput EC-14(3):326–334
de Campos LM, Fernández-Luna JM, Huete JF, Rueda-Morales MA (2010) Combining content-based and collaborative recommendations: a hybrid approach based on Bayesian networks. Int J Approx Reason 51(7):785–799
Diaby M, Viennet E, Launay T (2013) Toward the next generation of recruitment tools: an online social network-based job recommender system. In: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013, pp 821–828
Facebook (2014) Company info. Available at http://newsroom.fb.com/company-info/
Gao F, Xing C, Du X, Wang S (2007) Personalized service system based on hybrid filtering for digital library. Tsinghua Sci Technol 12(1):1–8
Golbeck J (2006) Generating predictive movie recommendations from trust in social networks. In: Proceedings of the 4th International Conference on Trust Management, ser. iTrust'06. Springer, Berlin, Heidelberg, pp 93–104
Groh G, Ehmig C (2007) Recommendations in taste related domains: collaborative filtering vs. social filtering. In: Proceedings of the 2007 International ACM Conference on Supporting Group Work, ser. GROUP '07. ACM, New York, NY, USA, pp 127–136
Han J (1996) Data mining techniques. SIGMOD Rec 25(2):545
Jannach D, Zanker M, Felfernig A, Friedrich G (2011) Recommender systems: an introduction. Cambridge University Press, Cambridge
Joachims T (1998) Text categorization with support vector machines: learning with many relevant features. In: Proceedings of the 10th European Conference on Machine Learning, ser. ECML '98. Springer, London, UK, pp 137–142
Kantor PB (2009) Recommender systems handbook. Springer, New York; London
Kazienko P, Musiał K, Kajdanowicz T (2011) Multidimensional social network in the social recommender system. IEEE Trans Syst Man Cybern Part A 41(4):746–759
Lemire D, Maclachlan A (2005) Slope one predictors for online rating-based collaborative filtering. In: Proceedings of SIAM Data Mining (SDM'05)
Linden G, Smith B, York J (2003) Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80
LinkedIn (2014) About LinkedIn. Available at http://press.linkedin.com/about
Lops P, de Gemmis M, Semeraro G (2011) Content-based recommender systems: state of the art and trends. In: Ricci F, Rokach L, Shapira B, Kantor PB (Eds) Recommender systems handbook. Springer, Berlin, pp 73–105
Ma H, Zhou D, Liu C, Lyu MR, King I (2011) Recommender systems with social regularization. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, ser. WSDM '11. ACM, New York, NY, USA, pp 287–296
Nicholas ISC, Nicholas CK (1999) Combining content and collaboration in text filtering. In: Proceedings of the IJCAI 99 Workshop on Machine Learning for Information Filtering, pp 86–91
Omary Z, Mtenzi F (2010) Machine learning approach to identifying the dataset threshold for the performance estimators in supervised learning. Int J Infonomics (IJI) 3
Pazzani MJ, Billsus D (2007) Content-based recommendation systems. In: Brusilovsky P, Kobsa A, Nejdl W (Eds) The adaptive web. Springer, Berlin, Heidelberg, pp 325–341
Pazzani M, Billsus D (1997) Learning and revising user profiles: the identification of interesting web sites. Mach Learn 27:313–331
Ravikumar P, Tewari A, Yang E (2011) On NDCG consistency of listwise ranking methods. Available at http://www.cs.utexas.edu/users/ai-lab/?RTY11
Rijsbergen CJV (1979) Information retrieval, 2nd edn. Butterworth-Heinemann, Newton, MA, USA
Rocchio JJ (1971) Relevance feedback in information retrieval. In: Salton G (Ed) The SMART retrieval system: experiments in automatic document processing, ser. Prentice-Hall Series in Automatic Computation. Prentice-Hall, Englewood Cliffs, NJ, ch. 14, pp 313–323
Salakhutdinov R, Mnih A (2008a) Bayesian probabilistic matrix factorization using Markov chain Monte Carlo
Salakhutdinov R, Mnih A (2008b) Probabilistic matrix factorization
Salton G, Wong A, Yang CS (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620
Séguela J (2012) Fouille de données textuelles et systèmes de recommandation appliqués aux offres d'emploi diffusées sur le web. Ph.D. dissertation, Conservatoire National des Arts et Métiers (CNAM), Paris, France
Shardanand U, Maes P (1995) Social information filtering: algorithms for automating "word of mouth". In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ser. CHI '95. ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, pp 210–217
Wang C, Blei DM (2011) Collaborative topic modeling for recommending scientific articles. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD '11. ACM, New York, NY, USA, pp 448–456
Xiao B, Benbasat I (2007) E-commerce product recommendation agents: use, characteristics, and impact. MIS Q 31(1):137–209

Emmanuel Viennet received his Ph.D. in computer science in 1992 from Université Paris 11 (France). He is currently a professor at Université Paris 13 and the Ph.D. supervisor of Mamadou Diaby. His research focuses on data mining methods for social networks and multimedia data analysis. He collaborates with the company Work4 on the Engines project, an R&D project which aims at developing the next generation of recruitment tools in social networks.