N2VSCDNNR: A Local Recommender System
Abstract—Recommender systems are becoming more and more important in our daily lives. However, traditional recommendation methods are challenged by data sparsity and efficiency, as the numbers of users, items, and interactions between the two increase fast in many real-world applications. In this work, we propose a novel clustering recommender system based on node2vec technology and rich information networks, namely N2VSCDNNR, to solve these challenges. In particular, we use a bipartite network to construct the user-item network, and represent the interactions among users (or items) by the corresponding one-mode projection network. In order to alleviate the data sparsity problem, we enrich the network structure according to user and item categories, and construct the one-mode projection category network. Then, considering the data sparsity problem in the network, we employ node2vec to capture the complex latent relationships among users (or items) from the corresponding one-mode projection category network. Moreover, considering the dependency on parameter settings and the information loss problem in clustering methods, we use a novel spectral clustering method, which is based on dynamic nearest-neighbors (DNN) and a novel automatically determining cluster number (ADCN) method that determines the cluster centers based on the normal distribution, to cluster the users and items separately. After clustering, we propose the two-phase personalized recommendation to realize the personalized recommendation of items for each user. A series of experiments validate the outstanding performance of our N2VSCDNNR over several advanced embedding and side-information based recommendation algorithms. Meanwhile, N2VSCDNNR seems to have lower time complexity than the baseline methods in online recommendation, indicating its potential to be widely applied in large-scale systems.

Index Terms—Spectral Clustering, Node2vec, Recommender System, Bipartite Network, Projection Network.

I. INTRODUCTION

items increases. Recommender systems [1], [2] thus have been developed to help people find the items they are interested in and to save their time in the searching process. Recommender systems can efficiently avoid information overload, a problem caused by the overwhelming growth in the amount of data, and can efficiently predict [3]–[5] the likely preferences of users and recommend related items to facilitate further decisions.

One of the most critical issues for recommender systems is data sparsity. With the increasing scale of the system, the number of items often reaches millions, even billions, making it much less likely that two users focus on the same items. The common strategy to alleviate the data sparsity problem is clustering-based recommendation [6], also known as local recommendation. Clustering-based recommendation approaches tackle the sparsity challenge by compressing the sparse network into a series of subsets. They are quite general and can easily be implemented across domains. The first clustering-based recommendation method was proposed by Ungar and Foster [7], where the authors proposed a statistical clustering model and determined suitable parameters by comparing different methods. In recent years, clustering-based recommendation methods [8], [9] have attracted much attention from researchers. Typically, there are two kinds of clustering-based methods, i.e., single-set clustering and co-clustering. Single-set clustering methods, such as user clustering [10]–[12] and item clustering [13], cluster the variables separately, while co-clustering recommenders [14]–[16] cluster users and items simultaneously. By comparison, single-set clustering methods are more feasible for exploiting side information, while co-clustering methods focus on transaction information [17].
method can capture both local and global structural information. Moreover, to overcome information loss in the process of projection and representation, we integrate category information to build a more informative network.
In particular, we first construct two basic bipartite networks, i.e., the user-item network and the item-category network, based on which we further generate a user-category network. Then, we transform the three bipartite networks into two one-mode projection networks. Next, based on the two related one-mode projection networks, we use the node2vec algorithm [22] to map each node into a vector, and use our SCDNN [23] to cluster users and items separately, aiming to deal with the information loss problem of the projection. Finally, we propose the two-phase personalized recommendation to realize the personalized recommendation of items for each user. Compared with our previous work [23], the main contributions of this paper are as follows.
• We first propose a novel network-embedding-based clustering recommender system, namely N2VSCDNNR, which integrates item categories as side information and can alleviate the data sparsity problem to a certain extent.
• Then, we propose a novel method in SCDNN for automatically determining the cluster number, which uses the normal distribution to extract the information of the data points and determines the cluster centers based on the confidence interval principle.
• Finally, our experimental results demonstrate the outstanding performance of N2VSCDNNR over several advanced embedding and side-information based recommendation algorithms; meanwhile, N2VSCDNNR has relatively lower time complexity than the others, making it suitable for online recommendation.
The remainder of the paper is organized as follows. In Sec. II, we present the related works on clustering-based and embedding-based recommender systems. In Sec. III, we propose our N2VSCDNNR, which is based on both network embedding and clustering algorithms while integrating item categories as side information. In Sec. IV, we compare our N2VSCDNNR with its previous version N2VSCDNN [23] and several advanced network embedding or side-information based recommendation algorithms on multiple real-world datasets. Finally, we conclude the paper in Sec. V.
II. RELATED WORKS

A. Clustering-based Recommender Systems
During the past two decades, a large number of studies on recommendation have emerged. Many recommendation methods suffer from sparsity and scalability problems. Clustering-based recommendation methods thus have been widely developed to overcome such shortcomings to a certain extent, where users (items) are grouped into multiple classes, providing a novel way to identify the nearest-neighbors.

Most clustering-based recommendation methods compute the similarity based on rating data, and then employ some basic clustering algorithm, such as the K-means method, to generate the user (item) groups. Sarwar et al. [24] proposed the bisecting K-means clustering algorithm to divide the users into multiple clusters; in this method, the nearest-neighbors of a target user are selected based on the partition that the user belongs to. Puntheeranurak and Tsuji [25] proposed a hybrid recommender system, where they clustered users by adopting a fuzzy K-means clustering algorithm; the recommendation results for both the original and the clustered data are combined to improve the traditional collaborative filtering (CF) algorithms. Rana and Jain [12] proposed a dynamic recommender system (DRS) to cluster users via an evolutionary algorithm. Wang clustered users by using the K-means algorithm and then estimated the absent ratings in the user-item matrix to predict the preferences of a target user. Ji et al. [26] paid more attention to discovering the implicit similarity among users and items: the authors first clustered user (or item) latent factor vectors into user (or item) cluster-level factor vectors, and then compressed the original approximation into a cluster-level rating pattern based on the cluster-level factor vectors.

B. Embedding-based Recommender Systems

In network science, an important question is how to properly represent the network information. Network representation learning, which learns low-dimensional representations for nodes or links in a network, can benefit a wide range of real-world applications, such as recommender systems [27]–[33].

Recently, the DeepWalk algorithm [34] was proposed to transform each node in a network into a vector automatically, taking full advantage of the information in random walk sequences over the network. Another network representation learning algorithm based on a simple neural network is the LINE algorithm [35], which can be applied to large-scale directed weighted networks. Moreover, Grover and Leskovec [22] suggested that increasing the flexibility in searching for adjacent nodes is the key to enhancing network feature learning. They thus proposed the node2vec algorithm, which learns low-dimensional representations for nodes by optimizing a neighborhood-preserving objective. It designs a flexible neighborhood sampling strategy, a biased random walk procedure that can explore neighborhoods through breadth-first sampling (BFS) [36] or depth-first sampling (DFS) [37]. It defines a second-order random walk with two parameters guiding the walk: one controls how fast the walk explores, and the other controls how fast it leaves the neighborhood of the starting node. These two parameters allow the search to interpolate between BFS and DFS and thereby reflect an affinity for different notions of node equivalence.

Then, with the development of network embedding, a number of embedding-based recommender systems were proposed in recent years. For instance, Palumbo et al. [28] proposed the entity2rec algorithm to learn user-item relatedness from knowledge graphs, so as to realize item recommendation. Grad-Gyenge et al. [31] proposed a method that maps the users and items into the same two-dimensional embedding space to make recommendations. Dong et al. [29] introduced a heterogeneous representation learning model, called metapath2vec++, which uses meta-path-based random walks to construct the heterogeneous neighbors of a node, leverages a heterogeneous skip-gram model to perform node embedding, and further makes recommendations based on the network representation.
Gao et al. [27] proposed a network embedding method for bipartite networks, namely BiNE, which purposefully performs biased random walks to generate node sequences that can well preserve the long-tail distribution of nodes in bipartite networks; the authors make recommendations with the generated network representation. Wen et al. [30] proposed an embedding-based recommendation method, in which a network embedding method first maps each user into a low-dimensional space, and the user vectors are then incorporated into a matrix factorization model for recommendation.
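To make the second-order biased walk described above concrete, the following is a minimal Python sketch (our illustration, not the reference implementation), assuming the graph is a dict that maps each node to the set of its neighbors:

```python
import random

def node2vec_walk(adj, start, length, p=1.0, q=1.0):
    """One second-order biased random walk in the style of node2vec [22].
    p: return parameter -- large p makes revisiting the previous node unlikely.
    q: in-out parameter -- q > 1 keeps the walk local (BFS-like),
       q < 1 pushes it outward (DFS-like)."""
    walk, prev = [start], None
    for _ in range(length - 1):
        cur = walk[-1]
        neighbors = list(adj[cur])
        if not neighbors:
            break
        weights = []
        for nxt in neighbors:
            if nxt == prev:                               # step back to the previous node
                weights.append(1.0 / p)
            elif prev is not None and nxt in adj[prev]:   # stays near the previous node
                weights.append(1.0)
            else:                                         # moves away from the previous node
                weights.append(1.0 / q)
        prev = cur
        walk.append(random.choices(neighbors, weights=weights)[0])
    return walk
```

The collected walks are then fed to a skip-gram model (e.g., gensim's Word2Vec) to produce the node vectors.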
[Fig. 1. The framework of N2VSCDNNR (panels A-H): check-in/rating records in the database are organized into user-item, item-category, and user-category bipartite networks, compressed by one-mode projection, embedded into user and item vectors by node2vec, clustered into user and item clusters, and finally fed to the recommendation algorithm.]

III. THE FRAMEWORK OF N2VSCDNNR

In this paper, in order to recommend items to the users more accurately, we propose N2VSCDNNR, whose whole framework, shown in Fig. 1, consists of the following four steps.
1) Construct two bipartite networks, i.e., the user-item network and the item-category network, based on which we further generate a user-category network. Then, compress the three bipartite networks by one-mode projection to generate the user-user projection network and the item-item projection network, as shown in Fig. 1 A, B, and C.
2) Apply the node2vec algorithm to generate the user vectors and the item vectors according to the user-user projection network and the item-item projection network, respectively, as shown in Fig. 1 D.
3) Use the SCDNN algorithm to cluster the user vectors and the item vectors into multiple clusters, respectively, as shown in Fig. 1 E.
4) Use the two-phase personalized recommendation to recommend suitable items to users: first, recommend item clusters to each user cluster based on the K-means method, as shown in Fig. 1 F; second, realize the personalized recommendation of items for each user in the user cluster, as shown in Fig. 1 G and H.

A. One-mode Projection of Bipartite Networks

In recommender systems, constructing user-item bipartite networks based on user-item relationships is ubiquitous. But there is always a data sparsity problem, i.e., some users have very few records, making the constructed bipartite network insufficient to capture the real relationships between users and items. We thus introduce item categories to effectively solve the sparsity problem. In particular, the one-mode projections of the bipartite networks are performed by the following steps (see the sketch after this list).
1) Build two bipartite networks, i.e., the user-item bipartite network and the item-category bipartite network, as shown in Fig. 1 A.
2) Build the user-category bipartite network by integrating the user-item bipartite network and the item-category bipartite network, as shown in Fig. 1 A, where the weight between a user and a category is the total number of times that the user checked items in this category.
3) Project the user-item network into two separate networks, i.e., a user-user network and an item-item network, where the weight CK_ij is the number of common neighbors between user (or item) i and user (or item) j in the corresponding bipartite network. Similarly, we obtain another user-user network and item-item network from the two corresponding bipartite networks with categories, respectively, with the weights denoted by CA_ij. This process is shown in Fig. 1 B.
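As an illustration of steps 2) and 3) above, here is a minimal sketch (ours; the function and variable names are assumptions, and edges are given as plain pairs):

```python
from collections import defaultdict
from itertools import combinations

def project(bipartite_edges):
    """One-mode projection of a bipartite network given as (left, right) pairs.
    Returns weights ck[(i, j)] = number of common right-side neighbors of
    the left-side nodes i and j (the CK_ij of step 3)."""
    members = defaultdict(set)            # right node -> set of left nodes
    for left, right in bipartite_edges:
        members[right].add(left)
    ck = defaultdict(int)
    for group in members.values():        # every shared right node adds 1
        for i, j in combinations(sorted(group), 2):
            ck[(i, j)] += 1
    return ck

def user_category_edges(user_item, item_category):
    """Weight between a user and a category = number of times the user
    checked items of that category (step 2 above)."""
    cats = defaultdict(set)               # item -> its categories
    for item, c in item_category:
        cats[item].add(c)
    w = defaultdict(int)
    for user, item in user_item:
        for c in cats[item]:
            w[(user, c)] += 1
    return w
```

Applying project() to the user-item edges yields the user-user weights CK_ij (or, with the pairs swapped, the item-item weights); applying it to the category-enriched bipartite edges yields CA_ij.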
vectors, as described from the following two aspects.

First, we construct the DNN similarity matrix. Many real-world datasets are multi-scale, i.e., the distribution densities of the users (items) differ considerably across clusters, and many clustering methods struggle to obtain good clustering results on such multi-scale datasets.

Considering the above problem, Zelnik-Manor and Perona [38] proposed a novel spectral clustering method, called self-tuning spectral clustering (STSC), where local-scale parameters are adopted. Its Gaussian similarity function is defined as:

S_ij = exp( −d²(i, j) / (2σ_i σ_j) ),   (2)

where σ_i = d(i, t) is the distance between the data point i and its t-th nearest-neighbor.
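A minimal numpy sketch of Eq. (2) (our illustration; t indexes the neighbor used as the local scale):

```python
import numpy as np

def stsc_similarity(X, t=7):
    """Self-tuning spectral clustering similarity (Eq. (2)).
    X: (n, d) data matrix; t: index of the neighbor used as local scale."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    sigma = np.sort(d, axis=1)[:, t]     # sigma_i = d(i, t-th nearest neighbor);
                                         # column 0 is the zero self-distance
    S = np.exp(-d**2 / (2.0 * np.outer(sigma, sigma)))
    np.fill_diagonal(S, 0.0)             # no self-similarity in the affinity matrix
    return S
```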
In Fig. 2, based on the definition of σ, we can see σ_c > σ_b > 0 and σ_a σ_c > σ_a σ_b > 0. Then, based on Eq. (2), we have S_ab < S_ac, consistent with the facts here. However, assuming d(d, b) = d(d, a) in Fig. 2, based on the definition of the local-scale parameters, we find σ_a > σ_b > 0 and σ_a σ_d > σ_b σ_d > 0. Thus, according to Eq. (2), the similarity between a and d is larger than that between b and d, which is inconsistent with the facts.

Instead, the local-scale parameter σ_i of data point i is determined by the average distance between the data point i and its DNNs j ∈ T_i, defined as:

T_i = { j ∈ N_i | d(i, j) < min_{k∈J_i} d(k, i) },   (7)

J_i = { j ∈ N_i | |ρ_i − ρ_j| > θ },   (8)

where N_i is the initial neighbor set [39] of the data point i, and θ represents the density difference threshold.

In Fig. 2, we can find that b and d are both located in the dense cluster, while a and c are both located in the sparse cluster. Assuming that |ρ_a − ρ_b| > θ, |ρ_a − ρ_d| > θ, |ρ_a − ρ_c| < θ, and |ρ_d − ρ_b| < θ, based on the definition of the DNN set, we can conclude that a ∉ T_b, a ∉ T_d, c ∈ T_a, and b ∈ T_d. Therefore, according to Eqs. (5) and (6), we get S_ac > S_ab = 0 and S_bd > S_ad = 0, which are now consistent with the facts, indicating the effectiveness of this new definition of similarity.
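Correspondingly, here is a sketch of the DNN sets in Eqs. (7)-(8) and the resulting local scales (ours). The density ρ comes from Eq. (3), which is not reproduced in this excerpt, so it enters as an input, and the neighborhood size k and threshold θ are assumed parameters:

```python
import numpy as np

def dnn_local_scales(d, rho, k=10, theta=0.1):
    """Dynamic nearest-neighbor local scales sigma_i.
    d:     (n, n) pairwise distance matrix.
    rho:   (n,) densities (Eq. (3), not reproduced in this excerpt).
    k:     size of the initial neighbor set N_i.
    theta: density difference threshold."""
    n = d.shape[0]
    sigma = np.empty(n)
    for i in range(n):
        Ni = np.argsort(d[i])[1:k + 1]                        # initial neighbors of i
        Ji = [j for j in Ni if abs(rho[i] - rho[j]) > theta]  # Eq. (8)
        cut = min(d[j, i] for j in Ji) if Ji else np.inf
        Ti = [j for j in Ni if d[i, j] < cut]                 # Eq. (7)
        # sigma_i: average distance between i and its DNNs;
        # fall back to the nearest-neighbor distance if T_i is empty
        sigma[i] = np.mean(d[i, Ti]) if Ti else np.sort(d[i])[1]
    return sigma
```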
Fig. 3. (a) The original data distribution; (b) the density-distance (ρ−δ) distribution; and (c) the density distribution of γ, for a simple dataset.
Definition 3: The minimum distance of data point i is defined as the minimum distance between it and the data points of higher density:

δ_i = min_{j ∈ D_h(i)} d(i, j),  if ρ_i ≠ max(ρ);   δ_i = max(δ),  if ρ_i = max(ρ),   (9)

where the set D_h(i) is composed of the data points that have higher density than data point i.

Now, based on Eq. (3) and Eq. (9), we can calculate the density-distance (ρ−δ) distribution. As an example, Fig. 3 (a) and (b) give the original data distribution and the ρ−δ distribution, respectively, for a simple dataset, where we can see that the cluster centers, represented by N1, N2 and N3, have relatively larger ρ and δ than the other data points.
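A sketch of Eq. (9) (ours, assuming a unique density maximum):

```python
import numpy as np

def min_distance_to_denser(d, rho):
    """delta_i (Eq. (9)): distance to the nearest point of higher density;
    the densest point gets the maximum delta instead."""
    n = d.shape[0]
    delta = np.zeros(n)
    for i in range(n):
        higher = np.where(rho > rho[i])[0]     # the set D_h(i)
        if higher.size:                        # case rho_i != max(rho)
            delta[i] = d[i, higher].min()
    top = np.argmax(rho)                       # assumes a unique density maximum
    delta[top] = delta.max()                   # case rho_i = max(rho): delta = max(delta)
    return delta
```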
Based on the density-distance distribution, we further intro- datasets. In this paper, the node2vec [22] method is imple-
duce a variable γi for each data point i, defined as mented in Python, the SCDNN method is implemented in Mat-
lab and the conventional recommendation methods are imple-
γi = ρi × δ i , (10) mented in R. In this section, we first introduce the datasets and
to determine the cluster centers more automatically. The the recommendation algorithms for comparison. Meanwhile,
density distribution of γ is shown in Fig. 3 (c), where we can we also visualize the networks established in our framework
see that it is close to a normal distribution. The cluster centers to help readers better understand our N2VSCDNNR. Finally,
have relatively larger γ than the other data points, which thus we give the experimental results with explanations.
can be considered as the exceptional data points. Suppose,
the mean value of γi is µ, and the variance is ω 2 . Based on A. Datasets
the pauta criterion [41], i.e., the probability that γi falls into
the confidence interval [µ − 3ω, µ + 3ω] is 99.73%. Since the The real-world datasets used in the experiments are de-
number of users or number of items in a recommender system scribed as follows, with their basic statistics summarized in
is typically quite large, we enlarge the confidence interval to TABLE I.
[µ − 5ω, µ + 5ω] so that the probability that γi falls into • Yelp: The Yelp dataset includes the user reviews on Yelp
this interval is close to 99.99999999%. In this case, we have from 11 cities across 4 countries. Here, two American
high confidence to consider that almost all the data points cities, i.e., Pittsburgh and Madison, are chosen, and the
are contained in the interval [µ − 5ω, µ + 5ω], while those reviews are utilized to define user-item interactions. There
exceptions, i.e., γi > µ + 5ω, are cluster centers. In particular, are 161 different categories of items for Pittsburgh, and
we use the following two steps to automatically determine the 150 categories of items for Madison.
cluster centers: 1) Calculate the mean value µ and the variance • Amazon: the Amazon dataset contains the user reviews
ω 2 of γ; 2) Treat the data points with γi > µ+5ω as the cluster on Amazon over 26 types of goods. Here, Musical
centers, e.g., N1 , N2 , N3 in Fig. 3 (c). Instruments dataset is chosen and the reviews are uti-
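Putting Eqs. (9)-(10) and the enlarged Pauta criterion together, the ADCN center selection reduces to a few lines (our sketch):

```python
import numpy as np

def adcn_centers(rho, delta):
    """Cluster centers are the outliers of gamma = rho * delta (Eq. (10))
    under the enlarged Pauta criterion gamma_i > mu + 5*omega."""
    gamma = rho * delta
    mu, omega = gamma.mean(), gamma.std()          # omega is the standard deviation
    return np.where(gamma > mu + 5.0 * omega)[0]   # indices of the cluster centers
```

The number of indices returned is the cluster number, so no k has to be preset.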
D. Two-phase Personalized Recommendation

After clustering the users and items according to their vectors with the SCDNN algorithm, we first recommend item clusters to user clusters, and then further realize the personalized recommendation of items for each user. We call this two-phase personalized recommendation, described as follows (see the sketch after this list).
1) First, we use the number of user-item relationships between each item cluster and the target user cluster to quantify the item cluster. Then, we use a basic clustering method, the K-means method, to divide all item clusters into two classes, and recommend the item clusters in the class with the larger average weight to the user cluster.
2) Based on the above clustering recommendation results, traditional recommendation algorithms are adopted to recommend items in the related item clusters to the users in each user cluster, based on the related rating records.
IV. EXPERIMENTS

The proposed method is tested on multiple real-world datasets. In this paper, the node2vec [22] method is implemented in Python, the SCDNN method is implemented in Matlab, and the conventional recommendation methods are implemented in R. In this section, we first introduce the datasets and the recommendation algorithms for comparison. Meanwhile, we also visualize the networks established in our framework to help readers better understand N2VSCDNNR. Finally, we give the experimental results with explanations.

A. Datasets

The real-world datasets used in the experiments are described as follows, with their basic statistics summarized in TABLE I.
• Yelp: The Yelp dataset includes the user reviews on Yelp from 11 cities across 4 countries. Here, two American cities, i.e., Pittsburgh and Madison, are chosen, and the reviews are utilized to define user-item interactions. There are 161 different categories of items for Pittsburgh, and 150 categories of items for Madison.
• Amazon: The Amazon dataset contains the user reviews on Amazon over 26 types of goods. Here, the Musical Instruments dataset is chosen, and the reviews are utilized to define user-item interactions. There are in total 912 categories of items.
• MovieLens: The MovieLens dataset contains the user ratings over movies on MovieLens; the ratings are utilized to define user-item interactions. Note that it only contains the users with more than 20 ratings and demographic information. There are in total 50 categories of movies.
Fig. 4. Based on the one-mode projection of (a) the original user-item bipartite network, we obtain (b) the user-user projection network and (c) the item-item
projection network.
TABLE II. The average relative improvements of performances (%) introduced by N2VSCDNN and N2VSCDNNR on Yelp (Pittsburgh), by adopting the NMF and the UBCF recommendation algorithms.

TABLE III. The average relative improvements of performances (%) introduced by N2VSCDNN and N2VSCDNNR on Yelp (Madison), by adopting the NMF and the UBCF recommendation algorithms.

Fig. 5. (a) The original user-item bipartite network with the user and item nodes clustered based on the corresponding node vectors. (b) The weighted bipartite network between user clusters and item clusters. (c) The user-item bipartite subnetwork obtained by only considering the user nodes in cluster U1. (d) The weighted bipartite network between the user cluster U1 and all the item clusters.

TABLE IV. The average relative improvements of performances (%) introduced by N2VSCDNN and N2VSCDNNR on Amazon, by adopting the NMF and the UBCF recommendation algorithms.
Algorithm   Model        ARHR    HR      Precision   Recall
NMF         N2VSCDNNR    29.00   17.92   18.94       21.77
            N2VSCDNN     16.57    2.19    7.48        9.42
UBCF        N2VSCDNNR    93.84   71.40   77.92       81.45
            N2VSCDNN     43.73   50.59   42.00       57.75
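The table entries are average relative improvements over the Original model; assuming the usual definition (the exact averaging procedure is not spelled out in this excerpt), they can be computed as:

```python
def avg_relative_improvement(model_scores, original_scores):
    """Average relative improvement (%) of a model over the Original model.
    Assumes paired per-run scores for one metric (e.g., ARHR or HR)."""
    pairs = zip(model_scores, original_scores)
    return 100.0 * sum((m - o) / o for m, o in pairs) / len(model_scores)
```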
Next, the node2vec algorithm is applied to transform the nodes in the user (item) projection category network into user (item) vectors, and the SCDNN algorithm is used to cluster the users (items), as shown in Fig. 5 (a), based on which a weighted bipartite network is generated between the user (item) clusters, as shown in Fig. 5 (b), where user (item) clusters are denoted by yellow (blue) nodes. The node size is proportional to the number of users (items) in the cluster, and each link is weighted by the total number of relationships between the users (items) in the corresponding user (item) clusters.

Take user cluster U1 for example: the relationships between it and all the item clusters are shown in Fig. 5 (c)-(d), where we can see that U1 has relatively stronger relationships with the item clusters I2 and I7 than with the others. The K-means algorithm is then used to divide all item clusters into two classes and recommend those in the class with the larger average weight to the target user cluster U1, i.e., here we recommend I2 and I7 to U1. Finally, different recommendation algorithms are used to recommend items in I2 and I7 to each user in U1, according to their ratings.
Fig. 6. The recommendation results of NMF-based recommender systems with proposed models and three baselines on the four datasets: (a) Yelp (Pittsburgh), (b) Yelp (Madison), (c) Amazon, and (d) MovieLens.

Fig. 7. The recommendation results of various baselines on the four datasets: (a) Yelp (Pittsburgh), (b) Yelp (Madison), (c) Amazon, and (d) MovieLens.

D. Results

In this section, we first compare the results obtained by the three models, including Original, N2VSCDNN, and N2VSCDNNR, based on the four basic recommendation algorithms, including NMF, UBCF, IBCF, and Popular, as shown in Fig. 6. We can see that, in general, our method N2VSCDNNR behaves better than N2VSCDNN, both of which behave better than Original, in almost all the cases, by adopting any basic recommendation algorithm and using any performance measurement. Such superiorities are quite significant for the first three datasets, i.e., Yelp (Pittsburgh), Yelp (Madison), and Amazon, by adopting the NMF and UBCF recommendation algorithms. However, they are significant for the MovieLens only when the UBCF is adopted. Meanwhile, when comparing the four basic recommendation algorithms, the NMF behaves the best, the UBCF follows, while the IBCF and the Popular behave the worst, using any model on any dataset.

In the following, we thus mainly focus on the NMF and UBCF recommendation algorithms, and try to reveal the relative performance improvements introduced by N2VSCDNN and N2VSCDNNR, respectively, compared with the Original model. The results are presented in Tables II-V. Overall, larger relative improvements are obtained by using our N2VSCDNNR in most cases, for each performance metric. Consistent with the intuitive picture of Fig. 6, for the three datasets Yelp (Pittsburgh), Yelp (Madison), and Amazon, such improvements are relatively significant for both the NMF and UBCF recommendation algorithms, while for the MovieLens, such improvements are remarkable only when our N2VSCDNNR model and the UBCF recommendation algorithm are adopted together.

Quite impressively, when we adopt our N2VSCDNNR model based on the UBCF recommendation algorithm, we can get huge improvements on every performance metric, e.g., they are even close to 100% for the Yelp (Pittsburgh) and Amazon datasets. This indicates that our model is especially useful to enhance the efficiency of the user-based collaborative filtering method. Although the improvements introduced by our N2VSCDNNR model seem relatively small when the
NMF is adopted, they are still larger than those introduced by the N2VSCDNN model. This is mainly because the NMF recommendation algorithm itself behaves quite well even based on the Original model, and the potential for further improvement is thus limited.
REFERENCES

[6] J. D. West, I. Wesley-Smith, and C. T. Bergstrom, "A recommendation system based on hierarchical clustering of an article-level citation network," IEEE Transactions on Big Data, vol. 2, no. 2, pp. 113–123, 2016.
[7] L. H. Ungar and D. P. Foster, "Clustering methods for collaborative filtering," in AAAI Workshop on Recommendation Systems, vol. 1, 1998, pp. 114–129.
[8] G.-R. Xue, C. Lin, Q. Yang, W. Xi, H.-J. Zeng, Y. Yu, and Z. Chen, "Scalable collaborative filtering using cluster-based smoothing," in Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2005, pp. 114–121.
[9] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han, "Personalized entity recommendation: A heterogeneous information network approach," in Proceedings of the 7th ACM International Conference on Web Search and Data Mining. ACM, 2014, pp. 283–292.
[10] I. Esslimani, A. Brun, A. Boyer et al., "A collaborative filtering approach combining clustering and navigational based correlations," in WEBIST, 2009, pp. 364–369.
[11] K. Joseph, C. H. Tan, and K. M. Carley, "Beyond local, categories and friends: Clustering foursquare users with latent topics," in Proceedings of the 2012 ACM Conference on Ubiquitous Computing. ACM, 2012, pp. 919–926.
[12] C. Rana and S. K. Jain, "An evolutionary clustering algorithm based on temporal features for dynamic recommender systems," Swarm and Evolutionary Computation, vol. 14, pp. 21–30, 2014.
[13] M. O'Connor and J. Herlocker, "Clustering items for collaborative filtering," in Proceedings of the ACM SIGIR Workshop on Recommender Systems, vol. 128. UC Berkeley, 1999.
[14] T. George and S. Merugu, "A scalable collaborative filtering framework based on co-clustering," in Fifth IEEE International Conference on Data Mining (ICDM'05). IEEE, 2005.
[15] Y. Zhang, M. Zhang, Y. Liu, S. Ma, and S. Feng, "Localized matrix factorization for recommendation based on matrix block diagonal forms," in Proceedings of the 22nd International Conference on World Wide Web. ACM, 2013, pp. 1511–1520.
[16] B. Xu, J. Bu, C. Chen, and D. Cai, "An exploration of improving collaborative recommender systems via user-item subgroups," in Proceedings of the 21st International Conference on World Wide Web. ACM, 2012, pp. 21–30.
[17] M. Deodhar and J. Ghosh, "A framework for simultaneous co-clustering and learning from complex data," in Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2007, pp. 250–259.
[18] Q. Xuan, M. Zhou, Z.-Y. Zhang, C. Fu, Y. Xiang, Z. Wu, and V. Filkov, "Modern food foraging patterns: Geography and cuisine choices of restaurant patrons on Yelp," IEEE Transactions on Computational Social Systems, vol. 5, no. 2, pp. 508–517, 2018.
[19] Q. Xuan, H. Fang, C. Fu, and V. Filkov, "Temporal motifs reveal collaboration patterns in online task-oriented networks," Physical Review E, vol. 91, no. 5, p. 052813, 2015.
[20] Q. Xuan, F. Du, and T.-J. Wu, "Empirical analysis of internet telephone network: From user ID to phone," Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 19, no. 2, p. 023101, 2009.
[21] T. Zhou, J. Ren, M. Medo, and Y.-C. Zhang, "Bipartite network projection and personal recommendation," Physical Review E, vol. 76, no. 4, p. 046115, 2007.
[22] A. Grover and J. Leskovec, "node2vec: Scalable feature learning for networks," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016, pp. 855–864.
[23] J. Chen, Y. Wu, L. Fan, X. Lin, H. Zheng, S. Yu, and Q. Xuan, "Improved spectral clustering collaborative filtering with node2vec technology," in 2017 International Workshop on Complex Systems and Networks (IWCSN). IEEE, 2017, pp. 330–334.
[24] B. M. Sarwar, G. Karypis, J. Konstan, and J. Riedl, "Recommender systems for large-scale e-commerce: Scalable neighborhood formation using clustering," in Proceedings of the Fifth International Conference on Computer and Information Technology, vol. 1, 2002, pp. 291–324.
[25] S. Puntheeranurak and H. Tsuji, "A multi-clustering hybrid recommender system," in 7th IEEE International Conference on Computer and Information Technology (CIT 2007). IEEE, 2007, pp. 223–228.
[26] K. Ji, R. Sun, X. Li, and W. Shu, "Improving matrix approximation for recommendation via a clustering-based reconstructive method," Neurocomputing, vol. 173, pp. 912–920, 2016.
[27] M. Gao, L. Chen, X. He, and A. Zhou, "BiNE: Bipartite network embedding," in SIGIR, 2018, pp. 715–724.
[28] E. Palumbo, G. Rizzo, and R. Troncy, "entity2rec: Learning user-item relatedness from knowledge graphs for top-N item recommendation," in Proceedings of the Eleventh ACM Conference on Recommender Systems. ACM, 2017, pp. 32–36.
[29] Y. Dong, N. V. Chawla, and A. Swami, "metapath2vec: Scalable representation learning for heterogeneous networks," in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2017, pp. 135–144.
[30] Y. Wen, L. Guo, Z. Chen, and J. Ma, "Network embedding based recommendation method in social networks," in Companion Proceedings of The Web Conference 2018. International World Wide Web Conferences Steering Committee, 2018, pp. 11–12.
[31] L. Grad-Gyenge, A. Kiss, and P. Filzmoser, "Graph embedding based recommendation techniques on the knowledge graph," in Adjunct Publication of the 25th Conference on User Modeling, Adaptation and Personalization. ACM, 2017, pp. 354–359.
[32] D. Wang, G. Xu, and S. Deng, "Music recommendation via heterogeneous information graph embedding," in 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017, pp. 596–603.
[33] E. Palumbo, G. Rizzo, R. Troncy, E. Baralis, M. Osella, and E. Ferro, "Knowledge graph embeddings with node2vec for item recommendation," in European Semantic Web Conference. Springer, 2018, pp. 117–120.
[34] B. Perozzi, R. Al-Rfou, and S. Skiena, "DeepWalk: Online learning of social representations," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2014, pp. 701–710.
[35] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei, "LINE: Large-scale information network embedding," in Proceedings of the 24th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2015, pp. 1067–1077.
[36] J. Yang and J. Leskovec, "Overlapping communities explain core-periphery organization of networks," Proceedings of the IEEE, vol. 102, no. 12, pp. 1892–1902, 2014.
[37] K. Henderson, B. Gallagher, T. Eliassi-Rad, H. Tong, S. Basu, L. Akoglu, D. Koutra, C. Faloutsos, and L. Li, "RolX: Structural role extraction & mining in large graphs," in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2012, pp. 1231–1239.
[38] L. Zelnik-Manor and P. Perona, "Self-tuning spectral clustering," in Advances in Neural Information Processing Systems, 2005, pp. 1601–1608.
[39] T. Xiang and S. Gong, "Spectral clustering with eigenvector selection," Pattern Recognition, vol. 41, no. 3, pp. 1012–1029, 2008.
[40] C. Jinyin, L. Xiang, Z. Haibing, and B. Xintong, "A novel cluster center fast determination clustering algorithm," Applied Soft Computing, vol. 57, pp. 539–555, 2017.
[41] M. Zhang and H. Yuan, "The Pauta criterion and rejecting the abnormal value," Journal of Zhengzhou University of Technology, vol. 18, no. 1, pp. 84–88, 1997.
[42] Z.-D. Zhao and M.-S. Shang, "User-based collaborative-filtering recommendation algorithms on Hadoop," in 2010 Third International Conference on Knowledge Discovery and Data Mining. IEEE, 2010, pp. 478–481.
[43] B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl, "Item-based collaborative filtering recommendation algorithms," in Proceedings of the 10th International Conference on World Wide Web (WWW). ACM, 2001, pp. 285–295.
[44] C. H. Ding, T. Li, and M. I. Jordan, "Convex and semi-nonnegative matrix factorizations," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 1, pp. 45–55, 2010.
[45] P. K. Gopalan, L. Charlin, and D. Blei, "Content-based recommendations with Poisson factorization," in Advances in Neural Information Processing Systems, 2014, pp. 3176–3184.
[46] I. Porteous, A. U. Asuncion, and M. Welling, "Bayesian matrix factorization with side information and Dirichlet process mixtures," in AAAI, 2010.
[47] T. D. T. Do and L. Cao, "Coupled Poisson factorization integrated with user/item metadata for modeling popular and sparse ratings in scalable recommendation," in AAAI, 2018, pp. 1–7.
[48] C. Hu, P. Rai, and L. Carin, "Non-negative matrix factorization for discrete data with hierarchical side-information," in Artificial Intelligence and Statistics, 2016, pp. 1124–1132.
[49] D. Liang, J. Altosaar, L. Charlin, and D. M. Blei, "Factorization meets the item embedding: Regularizing matrix factorization with item co-occurrence," in ACM Conference on Recommender Systems, 2016, pp. 59–66.