N2VSCDNNR: A Local Recommender System


Based on Node2vec and Rich Information Network
Jinyin Chen, Yangyang Wu, Lu Fan, Xiang Lin, Haibin Zheng, Shanqing Yu, Qi Xuan, Member, IEEE

Abstract—Recommender systems are becoming more and more important in our daily lives. However, traditional recommendation methods are challenged by data sparsity and efficiency, as the numbers of users, items, and interactions between the two in many real-world applications increase rapidly. In this work, we propose a novel clustering recommender system based on node2vec technology and a rich information network, namely N2VSCDNNR, to address these challenges. In particular, we use a bipartite network to construct the user-item network, and represent the interactions among users (or items) by the corresponding one-mode projection network. In order to alleviate the data sparsity problem, we enrich the network structure according to user and item categories, and construct the one-mode projection category network. Then, considering the data sparsity problem in the network, we employ node2vec to capture the complex latent relationships among users (or items) from the corresponding one-mode projection category network. Moreover, considering the dependency on parameter settings and the information loss problem in clustering methods, we use a novel spectral clustering method, based on dynamic nearest-neighbors (DNN) and a novel automatic cluster number determination (ADCN) method that determines the cluster centers via a normal distribution, to cluster the users and items separately. After clustering, we propose the two-phase personalized recommendation to realize the personalized recommendation of items for each user. A series of experiments validate the outstanding performance of our N2VSCDNNR over several advanced embedding-based and side-information-based recommendation algorithms. Meanwhile, N2VSCDNNR has lower time complexity than the baseline methods in online recommendations, indicating its potential to be widely applied in large-scale systems.

Index Terms—Spectral Clustering, Node2vec, Recommender System, Bipartite Network, Projection Network.

arXiv:1904.12605v1 [cs.IR] 12 Apr 2019

This work is partially supported by Zhejiang Natural Science Foundation (LY19F020025), Integration and Application of Human-computer Fusion Intelligent Image Recognition Algorithm (2018B10063), National Natural Science Foundation of China (61502423, 61572439), Engineering Research Center of Cognitive Healthcare of Zhejiang Province, Zhejiang Science and Technology Plan Project (LGF18F030009, 2017C33149), Zhejiang Outstanding Youth Fund (LR19F030001), and Key Technologies, System and Application of Cyberspace Big Search, Major Project of Zhejiang Lab (2019DH0ZX01). (Corresponding author: Qi Xuan.)

J. Chen, Y. Wu, X. Lin, H. Zheng, S. Yu and Q. Xuan are with the Institute of Cyberspace Security, and the College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China (e-mail: {chenjinyin, 2111603080, 201403080215, 201303080231, yushanqing, xuanqi}@zjut.edu.cn).

J. Chen, S. Yu and Q. Xuan are also with the Zhejiang Lab, Hangzhou 311121, China.

L. Fan is with the Hong Kong Polytechnic University, 11 Yuk Choi Road, Hung Hom, Kowloon, Hong Kong SAR (e-mail: [email protected]).

I. INTRODUCTION

Due to the fast development of E-commerce, more and more items are now sold online. Although convenient, online shopping is becoming more time-consuming as the diversity of items increases. Recommender systems [1], [2] have thus been developed to help people find the items they are interested in and save their time in the searching process. Recommender systems can efficiently avoid information overload, a problem caused by the overwhelming growth of data, as they can efficiently predict [3]–[5] the likely preferences of users and recommend related items to facilitate further decisions.

One of the most critical issues for recommender systems is data sparsity. With the increasing scale of the system, the number of items often reaches millions or even billions, making it much less likely that two users focus on the same items. The common strategy to alleviate the data sparsity problem is clustering-based recommendation [6], also known as local recommendation. Clustering-based recommendation approaches tackle the sparsity challenge by compressing the sparse network into a series of subsets. They are quite general and can be easily implemented across domains. The first clustering-based recommendation method was proposed by Ungar et al. [7], where the authors proposed a statistical clustering model and determined suitable parameters by comparing different methods. In recent years, clustering-based recommendation methods [8], [9] have attracted much attention from researchers. Typically, there are two kinds of clustering-based methods, i.e., single-set clustering and co-clustering. Single-set clustering methods, such as user clustering [10]–[12] and item clustering [13], cluster the variables separately, while co-clustering recommenders [14]–[16] cluster users and items simultaneously. By comparison, single-set clustering methods are more feasible for exploiting side information, while co-clustering methods focus on transaction information [17]. In this work, we thus propose a single-set clustering-based recommendation framework to integrate both network topology and side information.

Recently, network analysis technologies have become more and more popular for complex systems [18]. Such technologies can alleviate the data sparsity problem [19] to a certain extent and capture latent information beyond explicit features. In recommender systems, it is natural to represent user-item relationships as bipartite networks. However, relatively few network analysis methods are designed to directly analyze bipartite networks [20]. In this paper, similar to [21], we first construct the corresponding one-mode projection category networks, which contain only users and only items, respectively. Then we apply a special network embedding method on these one-mode networks. In particular, we employ node2vec [22], an advanced network representation learning algorithm, to automatically extract low-dimensional vectors for nodes. This

method can capture both local and global structural information. Moreover, to overcome the information loss in the process of projection and representation, we integrate category information to build a more informative network.

In particular, we first construct two basic bipartite networks, i.e., the user-item network and the item-category network, based on which we further generate a user-category network. Then, we transform the three bipartite networks into two one-mode projection networks. Next, based on the two related one-mode projection networks, we use the node2vec algorithm [22] to map each node into a vector, and use our SCDNN [23] to cluster users and items separately, aiming to deal with the information loss problem of the projection. Finally, we propose the two-phase personalized recommendation to realize the personalized recommendation of items for each user. Compared with our previous work [23], the main contributions of this paper are as follows.
• We first propose a novel network embedding based clustering recommender system integrating item category as side information, namely N2VSCDNNR, which can alleviate the data sparsity problem to a certain extent.
• Then, we propose a novel automatic cluster number determination method in SCDNN, which uses the normal distribution to extract the information of the data points and determines the cluster centers based on the confidence interval principle.
• Finally, our experimental results demonstrate the outstanding performance of N2VSCDNNR over several advanced embedding-based and side-information-based recommendation algorithms; meanwhile, N2VSCDNNR has relatively lower time complexity than the others, making it suitable for online recommendations.

The remainder of the paper is organized as follows. In Sec. II, we present the related works on clustering-based and embedding-based recommender systems. In Sec. III, we propose our N2VSCDNNR, which is based on both network embedding and clustering algorithms while integrating item category as side information. In Sec. IV, we compare our N2VSCDNNR with the previous version N2VSCDNN [23] and several advanced network embedding or side information based recommendation algorithms on multiple real-world datasets. Finally, we conclude the paper in Sec. V.

II. RELATED WORKS

A. Clustering-based Recommender Systems

During the past two decades, a large number of studies on recommendation have emerged. Many recommendation methods suffer from sparsity and scalability problems. Clustering-based recommendation methods have thus been widely developed to overcome such shortcomings to a certain extent, where users (items) are grouped into multiple classes, providing a novel way to identify the nearest-neighbors.

Most clustering-based recommendation methods compute the similarity based on rating data, and then employ some basic clustering algorithm, such as the K-means method, to generate the user (item) groups. Sarwar et al. [24] proposed the bisecting K-means clustering algorithm to divide the users into multiple clusters. In this method, the nearest-neighbors of a target user are selected based on the partition that the user belongs to. Puntheeranurak and Tsuji [25] proposed a hybrid recommender system, where they clustered users by adopting a fuzzy K-means clustering algorithm. The recommendation results for both original and clustered data are combined to improve the traditional collaborative filtering (CF) algorithms. Rana proposed a dynamic recommender system (DRS) to cluster users via an evolutionary algorithm. Wang clustered users by using the K-means algorithm and then estimated the absent ratings in the user-item matrix to predict the preferences of a target user. Ji et al. [26] paid more attention to discovering the implicit similarity among users and items, where the authors first clustered user (or item) latent factor vectors into user (or item) cluster-level factor vectors. After that, they compressed the original approximation into a cluster-level rating pattern based on the cluster-level factor vectors.

B. Embedding-based Recommender Systems

In network science, an important question is how to properly represent the network information. Network representation learning, which learns low-dimensional representations for the nodes or links in a network, can benefit a wide range of real-world applications, such as recommender systems [27]–[33].

Recently, the DeepWalk algorithm [34] was proposed to transform each node in a network into a vector automatically, taking full advantage of the information in the random walk sequences of the network. Another network representation learning algorithm based on a simple neural network is the LINE algorithm [35], which can be applied to large-scale directed weighted networks. Moreover, Grover and Leskovec [22] suggested that increasing the flexibility in searching for adjacent nodes is the key to enhancing network feature learning. They thus proposed the node2vec algorithm, which learns low-dimensional representations for nodes by optimizing a neighborhood-preserving objective. It designs a flexible neighborhood sampling and a flexible biased random walk procedure that can explore neighborhoods through breadth-first sampling (BFS) [36] or depth-first sampling (DFS) [37]. It defines a second-order random walk with two parameters guiding the walk: one controls how fast the walk explores, and the other controls how fast it leaves the neighborhood of the starting node. These two parameters allow the search to interpolate between BFS and DFS and thereby reflect an affinity for different notions of node equivalence.

Then, with the development of network embedding, a number of embedding-based recommender systems were proposed in recent years. For instance, Palumbo et al. [28] proposed the entity2rec algorithm to learn user-item relatedness from knowledge graphs, so as to realize item recommendation. Kiss and Filzmoser [31] proposed a method that maps the users and items to the same two-dimensional embedding space to make recommendations. Swami [29] introduced a heterogeneous representation learning model, called Metapath2vec++, which uses meta-path-based random walks to construct the heterogeneous neighbors of a node, then leverages a heterogeneous skip-gram model to perform node embeddings, and further

[Figure 1: the pipeline from the database (check-in and rating records) through the user-item, item-category and user-category bipartite networks, one-mode projection, node2vec vectors, clustering, and the recommendation algorithm; panels A-H.]

Fig. 1. The framework of N2VSCDNNR.

make recommendations based on the network representation. Gao et al. [27] proposed a network embedding method for bipartite networks, namely BiNE. It generates node sequences that can well preserve the long-tail distribution of nodes in bipartite networks by performing biased random walks purposefully, and the authors make recommendations with the generated network representation. Wen et al. [30] proposed an embedding-based recommendation method, in which a network embedding method first maps each user into a low-dimensional space, and the user vectors are then incorporated into a matrix factorization model for recommendation.

III. THE FRAMEWORK OF N2VSCDNNR

In this paper, in order to recommend items to users more accurately, we propose N2VSCDNNR, with its whole framework shown in Fig. 1, which consists of the following four steps.
1) Construct two bipartite networks, i.e., the user-item network and the item-category network, based on which we further generate a user-category network. Then, compress the three bipartite networks by one-mode projection to generate the user-user projection network and the item-item projection network, as shown in Fig. 1 A, B, and C.
2) Apply the node2vec algorithm to generate the user vectors and the item vectors according to the user-user projection network and the item-item projection network, respectively, as shown in Fig. 1 D.
3) Use the SCDNN algorithm to cluster the user vectors and the item vectors into multiple clusters, respectively, as shown in Fig. 1 E.
4) Use the two-phase personalized recommendation to recommend suitable items to users. First, recommend item clusters to each user cluster based on the K-means method, as shown in Fig. 1 F; second, realize the personalized recommendation of items for each user in the user cluster, as shown in Fig. 1 G and H.

A. One-mode Projection of Bipartite Networks

In recommender systems, constructing user-item bipartite networks based on user-item relationships is ubiquitous. However, there is always a data sparsity problem, i.e., some users have very few records, making the constructed bipartite network insufficient to capture the real relationships between users and items. We thus introduce item categories to alleviate the sparsity problem. In particular, the one-mode projections of bipartite networks are performed by the following four steps.
1) Build two bipartite networks, i.e., the user-item bipartite network and the item-category bipartite network, as shown in Fig. 1 A.
2) Build the user-category bipartite network by integrating the user-item bipartite network and the item-category bipartite network, as shown in Fig. 1 A, where the weight between a user and a category is the total number of times that the user checked items in this category.
3) Project the user-item network into two separate networks, i.e., a user-user network and an item-item network, where the weight CK_ij is the number of common neighbors between user (or item) i and user (or item) j in the corresponding bipartite network. Similarly, we obtain another user-user network and item-item network from the two corresponding bipartite networks with categories, respectively, with the weight denoted by CA_ij. This process is shown in Fig. 1 B.

4) For either users or items, the two projection networks are integrated into one network, as shown in Fig. 1 C, where the link weight W_ij between user (or item) i and user (or item) j is defined as

    W_ij = CK_ij × CA_ij.    (1)

This indicates that, compared with the traditional one-mode projection network based only on user-item relationships, our method can naturally integrate more information about items or users.

Fig. 2. An example of a multi-scale dataset, with data points a and c located in the sparse cluster and b and d in the dense cluster.

B. Network Representation Using Node2vec

Though we enrich the network structure according to item categories, it is difficult to capture appropriate network features using traditional network analysis methods. We thus adopt node2vec to learn continuous feature representations of the nodes in a network. Owing to its flexible neighborhood sampling strategy, node2vec can learn rich representations of a network and meanwhile reduce the effect of data sparsity on the recommendation algorithm. Here, we use it to automatically capture the network features of the generated projection networks and transform each user (or item) into a vector.

C. Clustering Users and Items by SCDNN

After transforming the users and items into vectors, the next important step is clustering them. In this paper, we use the SCDNN method, which is based on dynamic nearest-neighbors (DNNs) and an automatic cluster number determination algorithm, to cluster the users (items) into several clusters based on the related user (item) vectors, as described in the following two aspects.

First, we construct the DNNs similarity matrix. Many real-world datasets are multi-scale, i.e., the distribution densities of the users (items) in different clusters are quite different. Many clustering methods have difficulty obtaining good results on such multi-scale datasets.

Considering the above problem, Zelnik and Perona [38] proposed a novel spectral clustering method, called the self-tuning spectral clustering (STSC) method, where local-scale parameters are adopted. Its Gaussian similarity function is defined as:

    S_ij = exp( −d²(i, j) / (2 σ_i σ_j) ),    (2)

where σ_i = d(i, t) is the distance between data point i and its t-th nearest-neighbor.

In Fig. 2, based on the definition of σ, we can see that σ_c > σ_b > 0 and σ_a σ_c > σ_a σ_b > 0. Then, based on Eq. (2), we have S_ab < S_ac, consistent with the facts here. However, assuming d(d, b) = d(d, a) in Fig. 2, based on the definition of the local-scale parameters, we find that σ_a > σ_b > 0 and σ_a σ_d > σ_b σ_d > 0. Thus, according to Eq. (2), the similarity between a and d is larger than that between b and d, which is however inconsistent with the facts.

Therefore, we find that the similarity can indeed be affected by the density difference between data points. The density of data point i can be calculated as follows:

    ρ_i = Σ_j f_i(d(i, j)),    (3)

    f_i(x) = 1 if x ∈ H_i, and 0 if x ∉ H_i,    (4)

where H_i is the set containing the first p shortest distances between i and the other data points.

Considering the above analysis, we propose a novel similarity function, namely the DNNs similarity function, defined by:

    S_ij = exp( −d²(i, j) / (2 [max{σ_i, σ_j}]²) ) if j ∈ T_i, and 0 if j ∉ T_i,    (5)

    σ_i = Σ_{j ∈ T_i} d(i, j) / |T_i|.    (6)

The local-scale parameter σ_i of data point i is determined by the average distance between data point i and its DNNs j ∈ T_i, defined as:

    T_i = { j ∈ N_i | d(i, j) < min_{k ∈ J_i} d(k, i) },    (7)

    J_i = { j ∈ N_i | |ρ_i − ρ_j| > θ },    (8)

where N_i is the initial neighbor set [39] of data point i, and θ represents the density difference threshold.

In Fig. 2, we find that b and d are both located in the dense cluster, while a and c are both located in the sparse cluster. Assuming that |ρ_a − ρ_b| > θ, |ρ_a − ρ_d| > θ, |ρ_a − ρ_c| < θ, and |ρ_d − ρ_b| < θ, based on the definition of the DNN set, we can conclude that a ∉ T_b, a ∉ T_d, c ∈ T_a, and b ∈ T_d. Therefore, according to Eqs. (5) and (6), we get S_ac > S_ab = 0 and S_bd > S_ad = 0, which are now consistent with the facts, indicating the effectiveness of this new definition of similarity.

Second, for automatically determining the cluster number, we use the automatic cluster number determination (ADCN) method. Based on the fast density clustering algorithm [40], the cluster centers in our proposed ADCN method are automatically determined by constructing a normal distribution for the density-distance mapping to identify all the cluster centers.
Fig. 3. (a) The original data distribution; (b) the density-distance (ρ − δ) distribution; and (c) the density distribution of γ, for a simple dataset with cluster centers N1, N2 and N3.

Definition 3: The minimum distance of data point i is defined as the minimum distance between it and the data points of higher density:

    δ_i = min{D_h(i)} if ρ_i ≠ max(ρ), and max(δ) if ρ_i = max(ρ),    (9)

where the set D_h(i) is composed of the distances between i and the data points that have higher density than data point i.

Now, based on Eqs. (3) and (9), we can calculate the density-distance (ρ − δ) distribution. As an example, Fig. 3 (a) and (b) give the original data distribution and the ρ − δ distribution, respectively, for a simple dataset, where we can see that the cluster centers, represented by N1, N2 and N3, have relatively larger ρ and δ than the other data points.

Based on the density-distance distribution, we further introduce a variable γ_i for each data point i, defined as

    γ_i = ρ_i × δ_i,    (10)

to determine the cluster centers more automatically. The density distribution of γ is shown in Fig. 3 (c), where we can see that it is close to a normal distribution. The cluster centers have relatively larger γ than the other data points, and thus can be considered exceptional data points. Suppose the mean value of γ_i is µ and the variance is ω². Based on the pauta criterion [41], the probability that γ_i falls into the confidence interval [µ − 3ω, µ + 3ω] is 99.73%. Since the number of users or items in a recommender system is typically quite large, we enlarge the confidence interval to [µ − 5ω, µ + 5ω], so that the probability that γ_i falls into this interval is close to 99.99999999%. In this case, we have high confidence that almost all the data points are contained in the interval [µ − 5ω, µ + 5ω], while the exceptions, i.e., those with γ_i > µ + 5ω, are cluster centers. In particular, we use the following two steps to automatically determine the cluster centers: 1) calculate the mean value µ and the variance ω² of γ; 2) treat the data points with γ_i > µ + 5ω as the cluster centers, e.g., N1, N2, N3 in Fig. 3 (c).

D. Two-phase Personalized Recommendation

After clustering users and items according to their vectors with the SCDNN algorithm, we first recommend item clusters to user clusters, and then further realize the personalized recommendation of items for each user. We call this the two-phase personalized recommendation, described as follows.
1) First, we use the number of user-item relationships between each item cluster and the target user cluster to quantify the item cluster. Then, we use a basic clustering method, the K-means method, to divide all item clusters into two classes, and recommend the item clusters in the class with the larger average weight to the user cluster.
2) Based on the above cluster-level recommendation results, some traditional recommendation algorithms are adopted to recommend items in the related item clusters to the users in each user cluster, based on the related rating records.

IV. EXPERIMENTS

The proposed method is tested on multiple real-world datasets. In this paper, the node2vec [22] method is implemented in Python, the SCDNN method is implemented in Matlab, and the conventional recommendation methods are implemented in R. In this section, we first introduce the datasets and the recommendation algorithms for comparison. Meanwhile, we also visualize the networks established in our framework to help readers better understand our N2VSCDNNR. Finally, we give the experimental results with explanations.

A. Datasets

The real-world datasets used in the experiments are described as follows, with their basic statistics summarized in TABLE I.
• Yelp: The Yelp dataset includes user reviews on Yelp from 11 cities across 4 countries. Here, two American cities, i.e., Pittsburgh and Madison, are chosen, and the reviews are utilized to define user-item interactions. There are 161 different categories of items for Pittsburgh, and 150 categories of items for Madison.
• Amazon: The Amazon dataset contains user reviews on Amazon over 26 types of goods. Here, the Musical Instruments dataset is chosen and the reviews are utilized to define user-item interactions. There are in total 912 categories of items.
• MovieLens: The MovieLens dataset contains user ratings of movies on MovieLens. The ratings are utilized to define user-item interactions. Note that it only contains the users with more than 20 ratings and demographic information. There are in total 50 categories of movies.
6

Fig. 4. Based on the one-mode projection of (a) the original user-item bipartite network, we obtain (b) the user-user projection network and (c) the item-item projection network.

TABLE I
STATISTICS OF THE FOUR DATASETS.

Dataset             #User   #Item    #Link     #Category
Yelp (Pittsburgh)   466     1,672    10,373    161
Yelp (Madison)      332     1,172    5,597     150
Amazon              6,831   32,054   71,661    912
MovieLens           943     1,682    100,000   50

B. Models and Algorithms for Comparison

In order to evaluate the proposed framework, we compare our method with the following two models.
• Original: We directly use the traditional recommendation algorithms.
• N2VSCDNN: This method was proposed in [23]. As the previous version of N2VSCDNNR, it is purely based on user-item interactions without any category information.

In the two-phase personalized recommendation, we use the following four popular recommendation algorithms.
• UBCF [42]: User-based collaborative filtering (UBCF) first finds the users similar to the target user and then recommends items based on these similar users.
• IBCF [43]: Item-based collaborative filtering (IBCF) is a kind of collaborative filtering algorithm based on the similarity between items, calculated using the ratings of these items.
• NMF [44]: Non-negative matrix factorization (NMF) is a group of algorithms based on multivariate analysis. It seeks to approximate the input matrix with the product of d-dimensional low-rank representations. In this paper, we set the dimension d = 40.
• Popular: It divides the subset of users and items according to certain rules or attributes, and then recommends the items with the highest popularity to the users.

To validate the effectiveness of our N2VSCDNNR, we choose two advanced embedding-based recommendation algorithms, as well as a side-information-based one [27], [45]–[48], for comparison, briefly described as follows.
• Metapath2vec++ [29]: This is the state-of-the-art method for embedding heterogeneous networks. The meta-path scheme chosen in our experiments is item-user-item.
• BiNE [27]: As a novel network representation method, it was proposed to learn representations for bipartite networks. It jointly models both the explicit relations and the high-order implicit relations when learning the representations of nodes.
• CoFactor [49]: This is a co-factorization model inspired by word2vec [50], [51], which jointly decomposes the user-item interaction matrix and the item-item co-occurrence matrix with shared item latent factors.

For each network representation learning method, we use the implementations released by the authors for our experiments, adopt the inner product kernel u_i^T v_j to estimate the preference of user u_i for item v_j, and evaluate the performance on the top-ranked results.

In this paper, we use 5-fold cross-validation to evaluate the performance of the methods, based on four basic measurements in Top-N recommendation, including precision, recall, Hit Rank (HR) and Average-Reciprocal Hit Rank (ARHR). As indicated in [52], when the parameter p in the ADCN method is 1% ∼ 2% of the dataset, we can obtain good clustering results. Therefore, we determine the cluster number with p = 2%. Moreover, to better reflect the network structure, we set the feature representation dimension d = 100, and set the in-out and return parameters both to 1 in the node2vec algorithm.

C. Network Visualization

In order to provide a more intuitive view of our method, we visualize the networks generated in the process of analyzing the Yelp (Madison) dataset as an example.

First, the user-item bipartite network is shown in Fig. 4 (a), where users and items are denoted by yellow and blue nodes, respectively. The node size is proportional to the node's degree in the network. Then, the bipartite network is compressed by one-mode projection to get the corresponding user projection network and item projection network, as shown in Fig. 4 (b) and (c), respectively.

TABLE II
THE AVERAGE RELATIVE IMPROVEMENTS OF PERFORMANCES (%) INTRODUCED BY N2VSCDNN AND N2VSCDNNR ON YELP (PITTSBURGH), BY ADOPTING THE NMF AND THE UBCF RECOMMENDATION ALGORITHMS.

Algorithm   Model        ARHR    HR      Precision   Recall
NMF         N2VSCDNNR    46.53   28.48   22.72       52.89
            N2VSCDNN     0.00    6.99    3.41        8.94
UBCF        N2VSCDNNR    93.84   71.40   77.92       81.45
            N2VSCDNN     43.73   50.59   42.00       57.75

TABLE III
THE AVERAGE RELATIVE IMPROVEMENTS OF PERFORMANCES (%) INTRODUCED BY N2VSCDNN AND N2VSCDNNR ON YELP (MADISON), BY ADOPTING THE NMF AND THE UBCF RECOMMENDATION ALGORITHMS.

Algorithm   Model        ARHR    HR      Precision   Recall
NMF         N2VSCDNNR    3.96    2.49    4.33        8.99
            N2VSCDNN     -5.69   -4.25   -2.99       -3.03
UBCF        N2VSCDNNR    68.58   24.57   40.47       15.91
            N2VSCDNN     40.26   22.84   36.94       12.68

TABLE IV
THE AVERAGE RELATIVE IMPROVEMENTS OF PERFORMANCES (%) INTRODUCED BY N2VSCDNN AND N2VSCDNNR ON AMAZON, BY ADOPTING THE NMF AND THE UBCF RECOMMENDATION ALGORITHMS.

Algorithm   Model        ARHR    HR      Precision   Recall
NMF         N2VSCDNNR    29.00   17.92   18.94       21.77
            N2VSCDNN     16.57   2.19    7.48        9.42
UBCF        N2VSCDNNR    93.84   71.40   77.92       81.45
            N2VSCDNN     43.73   50.59   42.00       57.75

Fig. 5. (a) The original user-item bipartite network with the user and item nodes clustered based on the corresponding node vectors. (b) The weighted bipartite network between user clusters and item clusters. (c) The user-item bipartite subnetwork obtained by only considering the user nodes in cluster U1. (d) The weighted bipartite network between user cluster U1 and all the item clusters.

Next, the node2vec algorithm is applied to transform the nodes in the user (item) projection category network into user (item) vectors, and the SCDNN algorithm is used to cluster the users (items), as shown in Fig. 5 (a). Based on the clustering, a weighted bipartite network is generated between the user (item) clusters, as shown in Fig. 5 (b), where user (item) clusters are denoted by yellow (blue) nodes. The node size is proportional to the number of users (items) in the cluster. Each link is weighted by the total number of relationships between the users (items) in the corresponding user (item) clusters.

Take user cluster U1 for example: the relationships between it and all the item clusters are shown in Fig. 5 (c)-(d), where we can see that U1 has relatively stronger relationships with the item clusters I2 and I7 than with the others. The K-means algorithm is then used to divide all item clusters into two classes and recommend those in the class with the larger average weight to the target user cluster U1, i.e., here we recommend I2 and I7 to U1. Finally, different recommendation algorithms are used to recommend items in I2 and I7 to each user in U1, according to their ratings.

D. Results

In this section, we first compare the results obtained by the three models, including Original, N2VSCDNN and N2VSCDNNR, based on the four basic recommendation algorithms, including NMF, UBCF, IBCF and Popular, as shown in Fig. 6. We can see that, in general, our method N2VSCDNNR behaves better than N2VSCDNN, and both behave better than Original, in almost all cases, for any basic recommendation algorithm and any performance metric. Such improvements are relatively significant for Yelp (Pittsburgh), Yelp (Madison), and Amazon, by adopting the NMF and UBCF recommendation algorithms. However, they are significant for MovieLens only when UBCF is adopted. Meanwhile, when comparing the four basic recommendation algorithms, NMF behaves the best, UBCF follows, while IBCF and Popular behave the worst, for any model on any dataset.

In the following, we thus mainly focus on the NMF and UBCF recommendation algorithms, and try to reveal the relative performance improvements obtained by using N2VSCDNN and N2VSCDNNR, respectively, compared with the Original model. The results are presented in TABLES II-V. Overall, larger relative improvements are obtained by using our N2VSCDNNR in most cases, for each performance metric. Consistent with the intuitive pictures in Fig. 6, for the three datasets Yelp (Pittsburgh), Yelp (Madison), and Amazon, such improvements are relatively significant for both the NMF and UBCF recommendation algorithms, while for MovieLens, such improvements are remarkable only when our N2VSCDNNR model and the UBCF recommendation algorithm are adopted together.

Quite impressively, when we adopt our N2VSCDNNR model with the UBCF recommendation algorithm, we can get huge improvements on any performance metric, e.g., they are even close to 100% for the Yelp (Pittsburgh) and Amazon datasets. This indicates that our model is especially
measurement. Such superiorities are quite significant for the filtering method. Although the improvements introduced by
first three datasets, i.e., Yelp (Pittsburgh), Yelp (Madison), and our N2VSCDNNR model seems relatively small when the
8
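The item-cluster selection step described above (split all item clusters into two classes by K-means on their link weights to the target user cluster, then keep the class with the larger average weight) can be sketched as follows. This is a minimal illustration, not the authors' code; the cluster names and weights are made up:

```python
# Sketch of the item-cluster selection step: split item clusters into two
# classes by 1-D K-means on their link weights to a user cluster, then
# keep the class with the larger average weight.

def two_means_1d(values, iters=100):
    """1-D K-means with k=2; returns a list of 0/1 labels."""
    c = [min(values), max(values)]  # initialize centroids at the extremes
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - c[0]) <= abs(v - c[1]) else 1 for v in values]
        for k in (0, 1):
            members = [v for v, l in zip(values, labels) if l == k]
            if members:
                c[k] = sum(members) / len(members)
    return labels

def select_item_clusters(weights):
    """weights: {item cluster name: link weight to the target user cluster}."""
    names, values = list(weights), list(weights.values())
    labels = two_means_1d(values)
    # average weight of each class; keep the heavier one
    avg = {k: sum(v for v, l in zip(values, labels) if l == k)
              / max(1, labels.count(k)) for k in (0, 1)}
    keep = max(avg, key=avg.get)
    return [n for n, l in zip(names, labels) if l == keep]

# Illustrative weights: U1 links strongly to I2 and I7, weakly to the rest.
w = {"I1": 3, "I2": 42, "I3": 5, "I4": 2, "I5": 4, "I6": 1, "I7": 37}
print(select_item_clusters(w))  # → ['I2', 'I7']
```

The per-user recommendation step then only needs to score the items inside the kept clusters, which is where the reduced online cost comes from.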

Fig. 6. The recommendation results of the three models (Original, N2VSCDNN, and N2VSCDNNR) combined with the four basic recommendation algorithms (NMF, UBCF, IBCF, and Popular), measured by ARHR, HR, Precision, and Recall, on the four datasets: (a) Yelp (Pittsburgh), (b) Yelp (Madison), (c) Amazon, and (d) MovieLens.
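As a reading aid for TABLES II-V, the reported "average relative improvement" of a metric is its percentage change with respect to the Original model. A trivial sketch, with illustrative numbers rather than values from the paper:

```python
# Relative improvement (%) of a metric over a baseline value, as reported
# in the improvement tables. Inputs here are illustrative only.

def rel_improvement(new, base):
    """Percentage change of `new` relative to `base`."""
    return 100.0 * (new - base) / base

# e.g., an HR rising from 0.20 (Original) to 0.25 is a 25% improvement,
# while a drop below the baseline yields a negative entry.
print(round(rel_improvement(0.25, 0.20), 2))  # → 25.0
```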

Fig. 7. The recommendation results of N2VSCDNNR-NMF and the three baselines (Metapath2vec++, BiNE, and CoFactor), measured by ARHR, HR, Precision, and Recall, on the four datasets: (a) Yelp (Pittsburgh), (b) Yelp (Madison), (c) Amazon, and (d) MovieLens.
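For reference, HR and ARHR, the ranking metrics reported throughout this section, are commonly computed from one held-out test item per user, as below. This is a generic sketch of the standard definitions, not the authors' evaluation code:

```python
# Generic sketch of Hit Ratio (HR) and Average Reciprocal Hit Rank (ARHR)
# for top-N recommendation, with a single held-out test item per user.

def hr_arhr(recommendations, held_out):
    """recommendations: {user: ranked top-N item list}
    held_out: {user: the single test item for that user}"""
    hits, reciprocal = 0, 0.0
    for user, items in recommendations.items():
        target = held_out[user]
        if target in items:
            hits += 1
            reciprocal += 1.0 / (items.index(target) + 1)  # rank is 1-based
    n_users = len(recommendations)
    return hits / n_users, reciprocal / n_users

# Toy example: u1 hits at rank 2, u2 misses, u3 hits at rank 1.
recs = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"], "u3": ["g", "h", "i"]}
test = {"u1": "b", "u2": "x", "u3": "g"}
hr, arhr = hr_arhr(recs, test)  # HR = 2/3, ARHR = (1/2 + 1) / 3 = 0.5
```

ARHR weights each hit by the reciprocal of its rank, so it rewards methods that place the held-out item near the top of the list rather than merely inside it.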

Although the improvements introduced by our N2VSCDNNR model seem relatively small when the NMF is adopted, they are still larger than those introduced by the N2VSCDNN model. This is mainly because the NMF recommendation algorithm itself behaves quite well even based on the Original model, and the potential for further improvement is thus relatively low.

Since the recommendation results obtained by N2VSCDNNR based on NMF are better than those based on the other basic recommendation algorithms, we further compare N2VSCDNNR-NMF with three advanced embedding- and side-information-based recommendation algorithms, i.e., Metapath2vec++, BiNE, and CoFactor. The results are presented in Fig. 7 and TABLE VI, where we can see that N2VSCDNNR-NMF outperforms all of the baseline methods in all cases, while Metapath2vec++ performs the worst in most cases. This may be because Metapath2vec++ treats the explicit and implicit relations equally and ignores their weights, which are useful for distinguishing the importance of the various relations.

TABLE V
THE AVERAGE RELATIVE IMPROVEMENTS OF PERFORMANCES (%) INTRODUCED BY N2VSCDNN AND N2VSCDNNR ON MOVIELENS, BY ADOPTING THE NMF AND THE UBCF RECOMMENDATION ALGORITHMS.

Algorithm  Model      ARHR   HR     Precision  Recall
NMF        N2VSCDNNR   4.16   1.22   5.60       5.07
NMF        N2VSCDNN   -2.77  -0.65  -2.17      -2.36
UBCF       N2VSCDNNR  56.32  27.40  54.74      78.25
UBCF       N2VSCDNN   -2.56  -0.55  -2.51      -2.88

TABLE VI
THE AVERAGE PERFORMANCES OF N2VSCDNNR-NMF AND THE THREE BASELINES ON THE FOUR DATASETS.

Dataset            Model           ARHR    HR      Precision  Recall
Yelp (Pittsburgh)  N2VSCDNNR       0.0644  0.2008  0.0153     0.0818
                   Metapath2vec++  0.0426  0.1164  0.0095     0.0326
                   BiNE            0.0526  0.1352  0.0115     0.0399
                   CoFactor        0.0602  0.1832  0.0144     0.0735
Yelp (Madison)     N2VSCDNNR       0.0686  0.2166  0.0166     0.1042
                   Metapath2vec++  0.0381  0.1214  0.0101     0.0341
                   BiNE            0.0481  0.1512  0.0128     0.0542
                   CoFactor        0.0657  0.2088  0.0160     0.0951
Amazon             N2VSCDNNR       0.0528  0.1237  0.0093     0.0608
                   Metapath2vec++  0.0223  0.0643  0.0038     0.0257
                   BiNE            0.0366  0.0875  0.0059     0.0389
                   CoFactor        0.0494  0.1141  0.0087     0.0575
MovieLens          N2VSCDNNR       1.2241  0.9350  0.2933     0.2865
                   Metapath2vec++  1.1761  0.8957  0.2810     0.2750
                   BiNE            0.7559  0.7108  0.1721     0.1236
                   CoFactor        1.1761  0.8957  0.2810     0.2750

E. Time Complexity

Now, let us analyze the time complexities of N2VSCDNNR and the baselines. We regard the procedures before personalized recommendation as pre-training, and focus only on the time complexity of online personalized recommendation. In particular, since N2VSCDNNR behaves the best when it is based on NMF, we give only the complexity of N2VSCDNNR-NMF for simplicity. Suppose the number of users is n, the number of items is m, the average number of items in the selected item clusters is Avg (Avg < m), and the number of user consumption records is C. In BiNE and Metapath2vec++, the window size is w and the number of iterations is Iter. The time complexities of all the considered recommender systems are presented in TABLE VII, where we can find that our N2VSCDNNR-NMF has much lower time complexity than the other recommender systems when making online recommendations, especially when we divide the items into more clusters while recommending only a small number of them to the target user cluster. This indicates that N2VSCDNNR could be more suitable for application in large-scale systems.

TABLE VII
THE TIME COMPLEXITY OF THE CONSIDERED RECOMMENDER SYSTEMS.

Algorithm       Time complexity
N2VSCDNNR-NMF   O(n · Avg)
Metapath2vec++  O(2(n + m) · w · Iter + n · m)
BiNE            O(2(n + m) · w · Iter + n · m)
CoFactor        O(n · m + C)

V. CONCLUSION

In this paper, we enrich the network structure based on item categories. Then, we establish one-mode user and item projection networks, and further use node2vec to transform each user (or item) node into a user (or item) vector. After that, we cluster the users (or items) based on these vectors using an improved spectral clustering algorithm, SCDNN, according to which we establish a bipartite cluster network. Based on this bipartite cluster network, for each user cluster, we keep the item clusters with the most frequent relationships with that user cluster. Finally, we use four different recommendation algorithms to recommend the items in these item clusters to each user in the user cluster. Experiments on four real-world datasets, in comparison with several advanced embedding- and side-information-based recommendation algorithms, validate the outstanding performance of our framework, in terms of both higher precision and higher recall. Moreover, we also analyze the time complexity of these recommendation algorithms, and find that our N2VSCDNNR has relatively lower time complexity than the others in online recommendation, indicating its potential to be widely applied in large-scale systems.

In the future, we are interested in utilizing more network representation methods, besides the node2vec algorithm, in recommender systems, and will also try to find the optimal parameters using optimization algorithms to obtain more comprehensive results.

REFERENCES

[1] X. Yang, C. Liang, M. Zhao, H. Wang, H. Ding, Y. Liu, Y. Li, and J. Zhang, “Collaborative filtering-based recommendation of online social voting,” IEEE Transactions on Computational Social Systems, vol. 4, no. 1, pp. 1–13, 2017.
[2] Y.-Y. Lo, W. Liao, C.-S. Chang, and Y.-C. Lee, “Temporal matrix factorization for tracking concept drift in individual user preferences,” IEEE Transactions on Computational Social Systems, vol. 5, no. 1, pp. 156–168, 2018.
[3] C. Fu, M. Zhao, L. Fan, X. Chen, J. Chen, Z. Wu, Y. Xia, and Q. Xuan, “Link weight prediction using supervised learning methods and its application to Yelp layered network,” IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 8, pp. 1507–1518, 2018.
[4] J. Chen, Y. Wu, X. Xu, Y. Chen, H. Zheng, and Q. Xuan, “Fast gradient attack on network embedding,” arXiv preprint arXiv:1809.02797, 2018.
[5] J. Chen, Z. Shi, Y. Wu, X. Xu, and H. Zheng, “Link prediction adversarial attack,” arXiv preprint arXiv:1810.01110, 2018.
[6] J. D. West, I. Wesley-Smith, and C. T. Bergstrom, “A recommendation system based on hierarchical clustering of an article-level citation network,” IEEE Transactions on Big Data, vol. 2, no. 2, pp. 113–123, 2016.
[7] L. H. Ungar and D. P. Foster, “Clustering methods for collaborative filtering,” in AAAI Workshop on Recommendation Systems, vol. 1, 1998, pp. 114–129.
[8] G.-R. Xue, C. Lin, Q. Yang, W. Xi, H.-J. Zeng, Y. Yu, and Z. Chen, “Scalable collaborative filtering using cluster-based smoothing,” in Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2005, pp. 114–121.
[9] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han, “Personalized entity recommendation: A heterogeneous information network approach,” in Proceedings of the 7th ACM International Conference on Web Search and Data Mining. ACM, 2014, pp. 283–292.
[10] I. Esslimani, A. Brun, A. Boyer et al., “A collaborative filtering approach combining clustering and navigational based correlations,” in WEBIST, 2009, pp. 364–369.
[11] K. Joseph, C. H. Tan, and K. M. Carley, “Beyond local, categories and friends: Clustering Foursquare users with latent topics,” in Proceedings of the 2012 ACM Conference on Ubiquitous Computing. ACM, 2012, pp. 919–926.
[12] C. Rana and S. K. Jain, “An evolutionary clustering algorithm based on temporal features for dynamic recommender systems,” Swarm and Evolutionary Computation, vol. 14, pp. 21–30, 2014.
[13] M. O’Connor and J. Herlocker, “Clustering items for collaborative filtering,” in Proceedings of the ACM SIGIR Workshop on Recommender Systems, vol. 128. UC Berkeley, 1999.
[14] T. George and S. Merugu, “A scalable collaborative filtering framework based on co-clustering,” in Fifth IEEE International Conference on Data Mining (ICDM’05). IEEE, 2005, 4 pp.
[15] Y. Zhang, M. Zhang, Y. Liu, S. Ma, and S. Feng, “Localized matrix factorization for recommendation based on matrix block diagonal forms,” in Proceedings of the 22nd International Conference on World Wide Web. ACM, 2013, pp. 1511–1520.
[16] B. Xu, J. Bu, C. Chen, and D. Cai, “An exploration of improving collaborative recommender systems via user-item subgroups,” in Proceedings of the 21st International Conference on World Wide Web. ACM, 2012, pp. 21–30.
[17] M. Deodhar and J. Ghosh, “A framework for simultaneous co-clustering and learning from complex data,” in Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2007, pp. 250–259.
[18] Q. Xuan, M. Zhou, Z.-Y. Zhang, C. Fu, Y. Xiang, Z. Wu, and V. Filkov, “Modern food foraging patterns: Geography and cuisine choices of restaurant patrons on Yelp,” IEEE Transactions on Computational Social Systems, vol. 5, no. 2, pp. 508–517, 2018.
[19] Q. Xuan, H. Fang, C. Fu, and V. Filkov, “Temporal motifs reveal collaboration patterns in online task-oriented networks,” Physical Review E, vol. 91, no. 5, p. 052813, 2015.
[20] Q. Xuan, F. Du, and T.-J. Wu, “Empirical analysis of internet telephone network: From user ID to phone,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 19, no. 2, p. 023101, 2009.
[21] T. Zhou, J. Ren, M. Medo, and Y.-C. Zhang, “Bipartite network projection and personal recommendation,” Physical Review E, vol. 76, no. 4, p. 046115, 2007.
[22] A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016, pp. 855–864.
[23] J. Chen, Y. Wu, L. Fan, X. Lin, H. Zheng, S. Yu, and Q. Xuan, “Improved spectral clustering collaborative filtering with node2vec technology,” in 2017 International Workshop on Complex Systems and Networks (IWCSN). IEEE, 2017, pp. 330–334.
[24] B. M. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Recommender systems for large-scale e-commerce: Scalable neighborhood formation using clustering,” in Proceedings of the Fifth International Conference on Computer and Information Technology, vol. 1, 2002, pp. 291–324.
[25] S. Puntheeranurak and H. Tsuji, “A multi-clustering hybrid recommender system,” in 7th IEEE International Conference on Computer and Information Technology (CIT 2007). IEEE, 2007, pp. 223–228.
[26] K. Ji, R. Sun, X. Li, and W. Shu, “Improving matrix approximation for recommendation via a clustering-based reconstructive method,” Neurocomputing, vol. 173, pp. 912–920, 2016.
[27] M. Gao, L. Chen, X. He, and A. Zhou, “BiNE: Bipartite network embedding,” in SIGIR, 2018, pp. 715–724.
[28] E. Palumbo, G. Rizzo, and R. Troncy, “Entity2rec: Learning user-item relatedness from knowledge graphs for top-N item recommendation,” in Proceedings of the Eleventh ACM Conference on Recommender Systems. ACM, 2017, pp. 32–36.
[29] Y. Dong, N. V. Chawla, and A. Swami, “metapath2vec: Scalable representation learning for heterogeneous networks,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2017, pp. 135–144.
[30] Y. Wen, L. Guo, Z. Chen, and J. Ma, “Network embedding based recommendation method in social networks,” in Companion Proceedings of The Web Conference 2018. International World Wide Web Conferences Steering Committee, 2018, pp. 11–12.
[31] L. Grad-Gyenge, A. Kiss, and P. Filzmoser, “Graph embedding based recommendation techniques on the knowledge graph,” in Adjunct Publication of the 25th Conference on User Modeling, Adaptation and Personalization. ACM, 2017, pp. 354–359.
[32] D. Wang, G. Xu, and S. Deng, “Music recommendation via heterogeneous information graph embedding,” in 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017, pp. 596–603.
[33] E. Palumbo, G. Rizzo, R. Troncy, E. Baralis, M. Osella, and E. Ferro, “Knowledge graph embeddings with node2vec for item recommendation,” in European Semantic Web Conference. Springer, 2018, pp. 117–120.
[34] B. Perozzi, R. Al-Rfou, and S. Skiena, “DeepWalk: Online learning of social representations,” in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2014, pp. 701–710.
[35] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, “LINE: Large-scale information network embedding,” in Proceedings of the 24th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2015, pp. 1067–1077.
[36] J. Yang and J. Leskovec, “Overlapping communities explain core–periphery organization of networks,” Proceedings of the IEEE, vol. 102, no. 12, pp. 1892–1902, 2014.
[37] K. Henderson, B. Gallagher, T. Eliassi-Rad, H. Tong, S. Basu, L. Akoglu, D. Koutra, C. Faloutsos, and L. Li, “RolX: Structural role extraction & mining in large graphs,” in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2012, pp. 1231–1239.
[38] L. Zelnik-Manor and P. Perona, “Self-tuning spectral clustering,” in Advances in Neural Information Processing Systems, 2005, pp. 1601–1608.
[39] T. Xiang and S. Gong, “Spectral clustering with eigenvector selection,” Pattern Recognition, vol. 41, no. 3, pp. 1012–1029, 2008.
[40] C. Jinyin, L. Xiang, Z. Haibing, and B. Xintong, “A novel cluster center fast determination clustering algorithm,” Applied Soft Computing, vol. 57, pp. 539–555, 2017.
[41] M. Zhang and H. Yuan, “The Pauta criterion and rejecting the abnormal value,” Journal of Zhengzhou University of Technology, vol. 18, no. 1, pp. 84–88, 1997.
[42] Z.-D. Zhao and M.-S. Shang, “User-based collaborative-filtering recommendation algorithms on Hadoop,” in 2010 Third International Conference on Knowledge Discovery and Data Mining. IEEE, 2010, pp. 478–481.
[43] B. M. Sarwar, G. Karypis, J. A. Konstan, J. Riedl et al., “Item-based collaborative filtering recommendation algorithms,” WWW, vol. 1, pp. 285–295, 2001.
[44] C. H. Ding, T. Li, and M. I. Jordan, “Convex and semi-nonnegative matrix factorizations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 1, pp. 45–55, 2010.
[45] P. K. Gopalan, L. Charlin, and D. Blei, “Content-based recommendations with Poisson factorization,” in Advances in Neural Information Processing Systems, 2014, pp. 3176–3184.
[46] I. Porteous, A. U. Asuncion, and M. Welling, “Bayesian matrix factorization with side information and Dirichlet process mixtures,” in AAAI, 2010.
[47] T. D. T. Do and L. Cao, “Coupled Poisson factorization integrated with user/item metadata for modeling popular and sparse ratings in scalable recommendation,” in AAAI, 2018, pp. 1–7.
[48] C. Hu, P. Rai, and L. Carin, “Non-negative matrix factorization for discrete data with hierarchical side-information,” in Artificial Intelligence and Statistics, 2016, pp. 1124–1132.
[49] D. Liang, J. Altosaar, L. Charlin, and D. M. Blei, “Factorization meets the item embedding: Regularizing matrix factorization with item co-occurrence,” in ACM Conference on Recommender Systems, 2016, pp. 59–66.
[50] O. Levy and Y. Goldberg, “Neural word embedding as implicit matrix factorization,” in Advances in Neural Information Processing Systems, 2014, pp. 2177–2185.
[51] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Advances in Neural Information Processing Systems, 2013, pp. 3111–3119.
[52] A. Rodriguez and A. Laio, “Clustering by fast search and find of density peaks,” Science, vol. 344, no. 6191, pp. 1492–1496, 2014.
