Abstract
Recommendation systems are attracting growing attention in many application fields,
especially e-commerce, social networks and tourism. A recommender system predicts a
user’s future preferences over the available items and recommends the top-ranked ones.
Because the internet gives people far too many options, recommendation systems have
become essential. Recommendations are typically derived from the ratings that particular
users give to numerous items, which are then used to recommend those items to other
users. Typical recommendation systems mainly employ content-based and collaborative
filtering techniques to find user preferences and provide final recommendations. However,
these systems commonly fail to account for how user preferences evolve across different
contextual factors. Context aware recommendation systems take various contextual
parameters into account and attempt to capture user preferences appropriately. Most
work in the recommender system domain focuses on maximizing recommendation accuracy
while ignoring other design objectives, such as the context of a user or an item.
Therefore, this paper proposes an effective deep learning based context aware
recommendation model that acts as an efficient recommender system with minimum
recommendation error. Initially, the dataset is pre-processed using the Natural Language
Tool Kit (NLTK) on the Python platform. After pre-processing, TF–IDF and a word embedding
model are applied to every pre-processed review to extract features and contextual
information. The extracted features are given as input to density based clustering, which
groups the negative, neutral and positive sentiments of user reviews. Finally, a deep
recurrent neural network (DRNN) is employed to get the most preferable user from every
cluster. The parameter values of the recurrent neural network model are initialized
through the fitness computation of the Bald Eagle Search (BES) algorithm. The proposed
model is implemented on the NYC Restaurant Rich Dataset using the Python programming
platform, and its performance is evaluated in terms of accuracy, precision and recall and
compared with existing models. The proposed recommendation model achieves 99.6%
accuracy, which is higher than that of the other machine learning models compared.
Keywords: Context aware recommendation, Web crawling, User preference vector,
Similarity measure, Deep recurrent neural network
Introduction
Nowadays, recommendation systems (RSs) are increasingly prominent and are used in many
web application areas. A recommendation system, also known as an information filtering
system, is a software tool that offers suggestions to users based on their needs, such as
what to purchase, which song to listen to or which book to read [1]. Information
overload is a difficult problem on the internet because of the explosive growth in the
amount of available data and in the number of visitors to web pages. Common examples
of recommendation systems are book recommendations on Amazon and movie
recommendations on Netflix [2, 3]. On Amazon.com, the online store is personalized for
every customer through book recommendations. Many recommendations are personalized
and delivered as a ranked list of items, so different consumers or consumer groups receive
different suggestions. Based on that ranking, RSs predict suitable products and services
for the users [4].
The most prominent strategies for recommendation systems are content-based filtering,
collaborative filtering, hybrid recommendation, knowledge-based filtering, the
demographic method and the model-based technique. Some researchers combine these
methods [5]. In content-based filtering, the fundamental process relies on descriptions
of consumers and of the items they need. A profile representing the items is maintained
and used to suggest them to a target user who previously liked similar items [6].
Collaborative filtering is the best-known method and is widely used in product, service
and travel recommendations [7]. It is also a common method for designing recommendation
systems. It uses a massive volume of data collected from past user behaviour and predicts
which items the users will like most [8].
The hybrid recommendation approach combines two or more recommendation methods to
enhance the quality of recommendations and overcome the limitations of conventional
recommendation methods; Netflix is the best-known example [9]. Knowledge-based filtering
recommends items based either on inferences about user preferences or on domain-specific
knowledge of how items relate to user preferences [10]. A demographic recommender system
offers recommendations based on the user’s demographic profile, such as gender, age and
nationality [11]. The model-based approach is a kind of collaborative filtering technique
that constructs a model from the ratings in a dataset. Information is extracted from the
dataset and used as a model to provide recommendations, which can deliver the benefits of
scalability and speed [7].
Most recommender systems overlook sequential information and concentrate mainly on
content information, even though sequential data offers more evidence about the
behaviour of the user [12–15]. Over the years, web services have become the standard
technology for sharing software, information and computing resources across a large
number of web pages. Web service recommendation is the process of recognizing useful
services and recommending them to end users [16]. Within web services, a web page
recommender is essential for websites. Knowledge representation and web integration are
the challenging problems in making effective web page recommendations, so many
recommender systems exploit web usage and domain knowledge through semantics [17].
The remainder of this paper is structured as follows: recent related works and the
problem definition are provided in the “Related works” section. The “Proposed methodology
for graph based recommendation model” section presents the details of the proposed
context aware recommendation model. The “Results and discussion” section illustrates
the simulation and the performance of the proposed model, and lastly, the “Conclusion”
section concludes the paper.
Related works
Some recent studies related to recommendation models are discussed in this section.
Web crawling
The main purpose of a web crawler is to extract information from web sources in a
customized way. Using a web crawler to obtain reviews from web sources is straightforward.
A crawler saves web pages and links them into local sources, which is a central phase of
search engines. In an integrated setting, gathering and analysing the whole content of the
web is the common goal of web crawlers. In this work, a Beautiful Soup based web crawling
technique is used to scrape data from websites [28].
Data about web pages are collected from several websites. Initially, the datasets are
pre-processed. This process eliminates unwanted noise from the dataset and directly
influences the computed output. The pre-processing step performs the following tasks:
(i) punctuation removal, (ii) removal of stop words such as prepositions and articles, and
(iii) conversion of upper case letters to lower case letters.
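The three pre-processing tasks can be sketched in a few lines of Python. This is a minimal illustration using only the standard library rather than NLTK; the function name `preprocess` and the tiny stop-word set are assumptions made for the example, not the list used in this work.

```python
import string

# Hypothetical, deliberately small stop-word set; a real pipeline would use a
# fuller list (for example the one shipped with NLTK).
STOP_WORDS = {"a", "an", "the", "in", "on", "at", "of", "to", "and", "is", "was"}

def preprocess(review):
    """Lower-case a review, strip punctuation, and drop stop words."""
    lowered = review.lower()                                    # task (iii)
    no_punct = lowered.translate(
        str.maketrans("", "", string.punctuation))              # task (i)
    return [tok for tok in no_punct.split()
            if tok not in STOP_WORDS]                           # task (ii)

tokens = preprocess("The pizza was GREAT, and the staff is friendly!")
print(tokens)
```

Lower-casing is applied first so that stop words such as "The" are matched regardless of their original capitalization.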
Stemming transforms all the words in a text to their stem, or root. The process removes
morphological affixes from words. For example, the words “stems”, “stemmed”, “stemmer”
and “stemming” share the common stem “stem”. Stemming recognizes the different forms of
a word and groups them together. Without stemming, the various forms of a single word
are treated as different words. Terms are converted to their stems by applying certain
heuristics that remove suffixes.
During stemming, suffixes such as ‘ed’, ‘ing’, ‘ly’ and ‘ment’ were removed from
every user review.
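A minimal sketch of this suffix-removal heuristic is given below. It only strips the four suffixes named above and is not the stemmer used in this work; the helper names and the minimum stem length of three characters are assumptions made for the example.

```python
# Suffixes named in the text, checked longest first.
SUFFIXES = ("ment", "ing", "ed", "ly")

def strip_suffix(word, min_stem=3):
    """Remove one known suffix if at least `min_stem` characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

def stem_review(tokens):
    """Apply the suffix heuristic to every token of a review."""
    return [strip_suffix(tok) for tok in tokens]

stems = stem_review(["stemmed", "stemming", "quickly", "payment", "red"])
print(stems)
```

Note that this naive rule maps "stemmed" and "stemming" to "stemm" rather than "stem"; full stemmers such as the Porter stemmer add further rules (for example, undoubling consonants) to reach the common root.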
Example for stemming word removal on user review:
Feature extraction
In a recommendation system, feature extraction plays an important role in reducing
the dimensionality of the input data, which preserves prediction accuracy and
improves time efficiency. In this recommendation model, relevant features are
extracted from the terms returned by the pre-processing phase using normalized
TF–IDF and a word embedding method.
where ni is the number of reviews containing term i and N is the total number of
reviews. TF represents the number of occurrences of each term in a review, while IDF
provides the length normalization. The pseudo code for the feature extraction process is
illustrated in Algorithm 1. The related-term matrix features acquired at this stage are fed
into the fuzzy based clustering.
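As an illustration, a normalized TF–IDF weighting can be sketched as follows. The sketch assumes the common form tf × log(N / ni) followed by L2 normalization of each review vector; the exact normalization used in this work may differ, and the function name `tfidf` and the toy reviews are hypothetical.

```python
import math
from collections import Counter

def tfidf(reviews):
    """Return one L2-normalized {term: weight} dict per tokenized review.

    Assumed weighting: tf(term, review) * log(N / n_i), where n_i is the
    number of reviews containing the term and N is the number of reviews.
    """
    n_reviews = len(reviews)
    doc_freq = Counter(term for review in reviews for term in set(review))
    vectors = []
    for review in reviews:
        counts = Counter(review)
        weights = {t: c * math.log(n_reviews / doc_freq[t])
                   for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: (w / norm if norm else 0.0)
                        for t, w in weights.items()})
    return vectors

reviews = [["good", "food", "good"], ["bad", "food"], ["good", "service"]]
vectors = tfidf(reviews)
print(vectors[0])
```

Terms that occur in fewer reviews receive larger weights: in the second toy review, "bad" outweighs "food" because "food" appears in two of the three reviews.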
Skip-Gram: Given the current word, the skip-gram method tries to predict the other words
in the same sentence. The current word, passed through the hidden projection layer, is
used as the input to predict the words within a certain range around it.
The formula for skip-gram method is,
Q = C × D + D × log2 (v) . (3)
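To make the windowed prediction concrete, the sketch below generates the (current word, context word) training pairs that a skip-gram model is trained on. It covers only pair generation, not the embedding training itself; the function name and the window size are assumptions made for the example.

```python
def skipgram_pairs(tokens, window=2):
    """Pair each word with every other word at most `window` positions away."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["great", "pizza", "friendly", "staff"], window=1)
print(pairs)
```

A larger window yields more pairs per word, which is why the window size (the range C in the formula above) drives the training cost.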
comprises both bag of words and word embedding model to extract the simple expres-
sions from the collections of user tips and identifying the valuable expressions.
In the word embedding model, a threshold based comparison is made to extract the
context from user tips. The similarities among contexts are also taken into account to
strengthen the relations among them. Consider a user tip T1 = {c1 = {e1, e2, e3}, c2 = {e2, e4}},
where c1 and c2 are the contexts for the expressions e1 and e2, respectively. If the
similarity value
$$\mathrm{Val}_{S_{1,2}} = \frac{L(c_1 \cap c_2)}{L(c_1)}$$
between the contexts c1 and c2 is higher than or equal to a threshold value Tv, then the
context c2 is appended to the context list of the tip T1. Likewise, if the similarity value
ValS2,1 between the contexts c2 and c1 is higher than Tv, then the context c2 is appended
to the context list of the tip T1. The threshold value Tv is set by the user when applying
the technique.
$$d(p, q) = \left[ \frac{(t_p - t_q)^2}{t_{scale}^2} + \frac{(h_p - h_q)^2}{h_{scale}^2} \right]^{1/2}, \tag{4}$$
where t denotes the time difference, h denotes the elevation, and tscale and hscale are
normalization scales that make the data points comparable along the t and h axes. The
function d(p, q) is therefore unitless, which gives an effective heuristic for determining
the two parameters. Here the Eps value is kept fixed for all points, so only MinPts is
modified, because it estimates the average point density in the search space.
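The scaled distance of Eq. 4 and the density test it feeds can be sketched as below. This is an illustrative fragment, not the clustering implementation used in this work: points are assumed to be (t, h) tuples, and `is_core_point` is a hypothetical helper that applies the usual density-based rule of counting neighbours within Eps.

```python
import math

def scaled_distance(p, q, t_scale, h_scale):
    """Unitless distance of Eq. 4 between two (t, h) points."""
    (tp, hp), (tq, hq) = p, q
    return math.sqrt(((tp - tq) / t_scale) ** 2 + ((hp - hq) / h_scale) ** 2)

def is_core_point(index, points, eps, min_pts, t_scale, h_scale):
    """True if at least `min_pts` points (including itself) lie within `eps`."""
    neighbours = sum(
        1 for other in points
        if scaled_distance(points[index], other, t_scale, h_scale) <= eps
    )
    return neighbours >= min_pts

points = [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0)]
d = scaled_distance(points[0], points[1], t_scale=1.0, h_scale=1.0)
print(d, is_core_point(0, points, eps=5.0, min_pts=2, t_scale=1.0, h_scale=1.0))
```

With Eps fixed, raising `min_pts` makes the core-point test stricter, which is the single knob the text describes tuning.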
The distance between each cluster centroid and data point is evaluated to update the
cluster centroid. This is performed using different similarity calculation techniques,
namely the Dice coefficient, Damerau–Levenshtein distance, Tversky index, Cosine
similarity and Jaro–Winkler distance [31], which are defined below.
• Dice’s coefficient
This similarity measure is used to estimate the similarity of user reviews. Its
mathematical formulation is shown in Eq. 5,
$$QS = \frac{2|A \cap B|}{|A| + |B|}, \tag{5}$$
where |A| and |B| represent the numbers of expressions available in the two reviews and
QS represents the quotient of similarity.
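Eq. 5 can be evaluated directly on the sets of expressions of two reviews. The sketch below is a minimal illustration; treating each review as a set of tokens, and returning 1.0 for two empty reviews, are assumptions made for the example.

```python
def dice_coefficient(a, b):
    """Dice quotient of similarity (Eq. 5) between two token collections."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # assumed convention: two empty reviews are identical
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))

qs = dice_coefficient(["good", "food", "service"], ["good", "service", "slow"])
print(round(qs, 4))
```

Here the two reviews share two of their three expressions each, so QS = 2·2 / (3 + 3).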
• Cosine similarity
This similarity method calculates the cosine of the angle between two vectors of an
inner product space. The cosine of 0° is 1, and it is less than 1 for any other angle.
Cosine similarity is mostly used in positive space, where the outcome is bounded in
[0, 1]. Cosine similarity [cos(θ)] can be computed using Eqs. 6 and 7,
$$\text{Similarity} = \cos\theta = \frac{K \cdot L}{\|K\| \, \|L\|}, \tag{6}$$
$$\cos\theta = \frac{\sum_{i=1}^{n} K_i L_i}{\sqrt{\sum_{i=1}^{n} K_i^2} \, \sqrt{\sum_{i=1}^{n} L_i^2}}. \tag{7}$$
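A direct transcription of Eq. 7 for two equal-length feature vectors is sketched below. Returning 0.0 when either vector has zero norm is an assumed convention for the example, since the cosine is undefined in that case.

```python
import math

def cosine_similarity(k, l):
    """Cosine of the angle between two equal-length vectors (Eq. 7)."""
    dot = sum(ki * li for ki, li in zip(k, l))
    norm_k = math.sqrt(sum(ki * ki for ki in k))
    norm_l = math.sqrt(sum(li * li for li in l))
    if norm_k == 0 or norm_l == 0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_k * norm_l)

print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # parallel vectors
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal vectors
```

For non-negative inputs such as TF–IDF weights, the result stays within [0, 1], in line with the bound stated above.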
• Tversky index
$$S(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha |X - Y| + \beta |Y - X|}, \tag{8}$$
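Eq. 8 can be sketched over sets of review terms as follows. The default weights α = β = 0.5 are an assumption for the example (the values used in this work are not stated here); with these weights the index coincides with the Dice coefficient, and with α = β = 1 it coincides with the Jaccard index.

```python
def tversky_index(x, y, alpha=0.5, beta=0.5):
    """Tversky index (Eq. 8) between two token collections."""
    set_x, set_y = set(x), set(y)
    common = len(set_x & set_y)
    denom = common + alpha * len(set_x - set_y) + beta * len(set_y - set_x)
    return common / denom if denom else 1.0  # assumed: empty sets are identical

x = ["good", "food", "service"]
y = ["food", "service", "slow"]
print(tversky_index(x, y))                      # Dice-equivalent weights
print(tversky_index(x, y, alpha=1.0, beta=1.0))  # Jaccard-equivalent weights
```

Choosing α ≠ β makes the measure asymmetric, so terms missing from one review can be penalized more heavily than terms missing from the other.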
• Jaro–Winkler distance
In this measure, |si| represents the length of string si, t denotes the number of
transpositions and m denotes the number of matching characters.
• Damerau–Levenshtein distance
It is also a string metric that estimates the edit distance between two
sequences. Informally, the Damerau–Levenshtein distance is the minimum number of
operations (substitution, transposition, deletion and insertion) required to
convert one word into another.
distance between the string a having i symbol prefix and the string b of j symbol
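The edit distance described above can be sketched with dynamic programming over prefixes of the two strings. The code below implements the restricted "optimal string alignment" variant, which is an assumption made for brevity: it counts adjacent transpositions as one operation but never edits a transposed pair again, so it can exceed the unrestricted Damerau–Levenshtein distance on some inputs.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = distance between the first i symbols of a and first j of b
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all i symbols
    for j in range(cols):
        d[0][j] = j                      # insert all j symbols
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

print(osa_distance("ca", "ac"), osa_distance("kitten", "sitting"))
```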
Fig. 2 Recurrent neural network architecture of venue recommendation based on user tags and tips
yt = Wy ht , (11)
$$S(t) = \frac{1}{1 + e^{-t}}, \tag{13}$$
To obtain the update, the input gate value i(t), produced by the sigmoid function, is
multiplied by the candidate value C̃(t); the product is the new information to be added
to the cell state.
The output gate determines which part of the cell state is output. The cell state is first
passed through the tanh layer and then multiplied by o(t). The result of this
multiplication is the output h(t) of the LSTM block at time t.
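The gate interactions described above can be sketched for a single scalar LSTM unit. This follows the standard LSTM formulation, including a forget gate, which is assumed here rather than taken from this paper's equations; the weight names in the dictionary are hypothetical.

```python
import math

def sigmoid(t):
    """Logistic function S(t) = 1 / (1 + e^(-t))."""
    return 1.0 / (1.0 + math.exp(-t))

def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell with weights in dict `w`."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])          # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])          # input gate
    c_tilde = math.tanh(w["wc"] * x + w["uc"] * h_prev + w["bc"])  # candidate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])          # output gate
    c = f * c_prev + i * c_tilde   # keep part of old state, add gated update
    h = o * math.tanh(c)           # tanh of cell state, scaled by output gate
    return h, c

# All-zero weights make every gate output 0.5 and the candidate 0.
weights = {name: 0.0 for name in
           ("wf", "uf", "bf", "wi", "ui", "bi",
            "wc", "uc", "bc", "wo", "uo", "bo")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=weights)
print(h, c)
```

In a real model the scalars become vectors and the weights become matrices, but the order of operations per time step is the same.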
The available data were divided into three non-overlapping sets for training,
validation and testing. The size of the training data varies depending on the scenario.
First, the LSTM model is trained on the Foursquare location recommendation datasets.
Validation is then used to choose the best parameters and assess the performance of the
proposed model. Finally, the same dataset is used for testing to verify performance and
accuracy. The weight and bias values of all three gates are updated by BES [32]
optimization. BES mimics the hunting behaviour of the bald eagle and evaluates the
consequences of every phase of the hunt. The hunting behaviour in BES is divided into
three stages: select, search and swoop.
Wi (t)
Fitness f (t) = max . (19)
In the select phase, BES finds and picks the best area, that is, the best bias, within the
chosen search space.
In the swooping stage, the bald eagle swoops from the best weight in the search space and
the best bias in the best area. Both are calculated as
$$(W, b)_{i,new} = \mathrm{rand} \cdot (W, b)_{best} + x_1(i) \cdot (b_i - c_1 \cdot P_{mean}) + y_1(i) \cdot (W_i - c_2 \cdot W_{best}).$$
Based on the above Eq. (23), the weight and bias values are updated in the RNN.
Model training
In this study, an end-to-end LSTM model is employed for recommendation prediction.
The related model parameter settings are listed in Table 2. The learning rate is
initially set to 0.002, and BES optimization is employed to adjust the
hyperparameters during model training. The batch size is set to 8, and the state
and hyperparameters of the proposed model are marginally adjusted during testing
for correct prediction.
Performance measures
The following performance measures are used in the simulation for performance analysis.
Accuracy, precision and recall are the performance parameters reported in the
experimental results.
$$\text{Precision} = \frac{TP}{TP + FP}. \tag{23}$$
$$\text{Recall} = \frac{TP}{TP + FN}. \tag{24}$$
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}. \tag{25}$$
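These three measures follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the counts in the example are hypothetical and are not results from this paper.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Share of actual positives that are recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

# Hypothetical counts, for illustration only.
tp, tn, fp, fn = 90, 95, 10, 5
print(precision(tp, fp), recall(tp, fn), accuracy(tp, tn, fp, fn))
```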
90.6, 95.2 and 97.8%. The accuracy values for the KNN, ANN, DNN, DAE, DAE-SR and proposed
DRNN-BES classifiers are 85.3, 90, 90.4, 92, 94.7 and 97.3%. Compared with the existing
techniques, the proposed approach performs better.
Table 4 and Fig. 4 show the performance values of KNN, ANN, DNN, DAE, DAE-SR and the
proposed DRNN-BES for Damerau–Levenshtein distance based similarity. For the KNN, ANN,
DNN, DAE, DAE-SR and proposed DRNN-BES classifiers, the precision values are 76.2, 79.2,
82.5, 83.8, 90.3 and 94.3%, the recall values are 79.2, 80.4, 86.4, 88.2, 89.3 and 93.6%,
and the accuracy values are 78.2, 82.4, 84.4, 86.4, 90.7 and 94%. The proposed approach
clearly performs better than the other classifiers.
Table 5 gives the performance values of KNN, ANN, DNN, DAE, DAE-SR and the proposed
DRNN-BES for the Tversky index, and Fig. 5 shows the graphical representation of the
analysis. For the KNN, ANN, DNN, DAE, DAE-SR and proposed DRNN-BES classifiers, the
precision values are 90.9, 93.5, 96.8, 97.2, 97 and 99.5% and the recall values are 92.8,
94.2, 95.7, 92.8, 98.8 and 99.3%. The accuracy values for the KNN, ANN, DNN, DAE, DAE-SR
and proposed DRNN-BES classifiers are 93.8, 95.9, 97.6, 98.1, 99 and 99.4%. Compared with
the existing techniques, the proposed approach performs better.
Table 6 and Fig. 6 show the performance values of KNN, ANN, DNN, DAE, DAE-SR and the
proposed DRNN-BES for Cosine similarity. For the KNN, ANN, DNN, DAE, DAE-SR and
proposed DRNN-BES classifiers, the precision values are 90, 92, 94, 94, 98.2 and 98.6%,
the recall values are 85, 88, 92, 93, 98.5 and 98.9%, and the accuracy values are 89, 93,
96, 95, 98.3 and 98.7%. The proposed approach clearly performs better than the other
classifiers.
Table 7 gives the performance values of KNN, ANN, DNN, DAE, DAE-SR and the proposed
DRNN-BES for Tversky index, and Fig. 7 shows the graphical representation of the
analysis. For the KNN, ANN, DNN, DAE, DAE-SR and proposed DRNN-BES classifiers, the
precision values are 79.2, 80.2, 88.4, 87.3, 91.4 and 95.7% and the recall values are 80.3,
83.6, 85.9, 88.2, 90.3 and 95.8%. The accuracy values for the KNN, ANN, DNN, DAE, DAE-SR
and proposed DRNN-BES classifiers are 81.2, 85.5, 87.2, 90.3, 93.6 and 95.8%. Compared
with the existing techniques, the proposed approach performs better. From the above
results, the Tversky index based performance is higher than that of the other similarity
measures.
Performance evaluation based on sentiment contextual information and training data size
In context aware recommendation models, the training data size is of notable importance.
Simulations were conducted with different training data sizes. The efficiency of the
proposed recommendation model increases as the size of the training data increases.
Table 8 shows the performance of the proposed DRNN along with DAE-SR and DAE using
different feature representations as the training data size changes. First, the complete
training data (100% of the total data) is used for model training with the different
feature representations. Then each training data file is trimmed by 20% and the same
simulations are repeated. The table values show that the proposed DRNN model achieves a
higher accuracy, up to 99.5%, than the DAE-SR and DAE based models. Table 9 compares the
accuracy obtained using various sentiments with context information. The table values
indicate that accuracy improves when contextual information is added. Moreover, Table 10
compares the training, validation and testing accuracies of the proposed DRNN model with
those of the other models.
In the proposed model, Cosine similarity is employed to determine the contextual
similarity of terms, which leads to fair recommendations. Moreover, the proposed DRNN is
hybridized with BES, an optimization algorithm. This hybrid architecture improves the
overall performance of the proposed model. Many optimization algorithms are available,
but BES was selected for hybridization with DRNN because it identifies solutions more
efficiently than other optimization algorithms. For this reason, combining BES with DRNN
yields better results than other existing algorithms. Existing models do not include any
optimization algorithm for optimal parameter selection, whereas this work combines BES
with DRNN to develop an efficient recommendation system.
Similarly, the alternative hypothesis, denoted Halt, is defined in Eq. 26 to oppose the
null hypothesis Hnull.
Specifically, five trials were taken from all models over a number of iterations to
conduct an ANOVA test. Additionally, a significance level of α = 0.05 and a confidence
interval (CI) of 95% are considered. Tables 11(a), (b) and 12(a), (b) display the inputs
selected for executing the ANOVA test based on the accuracy and recall metrics, and the
output values in the form of the f-ratio and p-value. From the test outcomes shown in
Tables 11(b) and 12(b), it can be confirmed that the difference in the mean value of
error is statistically significant; therefore the null hypothesis Hnull is rejected and
the alternative hypothesis Halt is accepted. In the ANOVA test for accuracy, the f-ratio
is 52.4082 and the p-value is 0. Therefore, the ANOVA test outcome is valid at p < 0.05.
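For reference, the f-ratio of a one-way ANOVA is the between-group mean square divided by the within-group mean square. The sketch below computes it from scratch; the three groups of trial values are hypothetical and are not the inputs listed in Tables 11 and 12.

```python
def anova_f_ratio(groups):
    """One-way ANOVA f-ratio for a list of groups of numeric observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical trial values for three models, for illustration only.
f_ratio = anova_f_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_ratio)
```

A large f-ratio means the group means differ by much more than the spread within each group; the p-value is then read from the F distribution with (k − 1, N − k) degrees of freedom.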
The f-ratio is 52.4082. The p-value is 0. The outcome at p < 0.05 is valid.
The f-ratio is 70.392. The p-value is 0. The outcome is not-valid at p < 0.05.
Conclusion
In this paper, an effective context aware recommendation model is proposed. Initially,
consumer feedback comments are extracted from online services via a web crawling
technique. At the start of the recommendation pipeline, pre-processing is carried out to
remove irrelevant words from the user reviews. After pre-processing, the TF–IDF vector
model is employed to extract relevant numerical features from the user tips. Further, a
word embedding model is employed to extract contextual information from the user tips.
Then, the density based clustering algorithm is executed to group similar sentiments of
user tips. Finally, the deep recurrent neural network model is employed to select the
possible user preference vectors from the clusters. The comparative analysis is performed
based on similarity measures, training data size and sentiment based contextual
information, using the metrics of accuracy, precision and recall. The proposed model
achieves an accuracy of up to 99.6% with the inclusion of contextual information and
outperforms the other deep learning models. In future, aspect based opinions need to be
considered to achieve fair recommendations with datasets from different domains.
NLTK: Natural Language Tool Kit; TF–IDF: Term Frequency–Inverse Document Frequency; RS: Recommendation systems;
IANFS: Improved Adaptive Neuro-Fuzzy Inference System; CLB: Collaboration based; CB: Content-based; GB: Grade-
based; DLMNN: Deep learning modified neural network; RA: Review analysis; CEP: Complex Event Processing; RNN:
Recurrent neural network; LSTM: Long Short Term Memory; KNN: K-nearest neighbours; ANN: Artificial neural networks;
DNN: Deep neural network; DAE: Denoising Auto encoder; DAE-SR: Denoising Auto encoder-Super Resolution; TP: True
positive; TN: True negative; FP: False positive; FN: False negative.
Authors’ contributions
VB has found the proposed algorithms and obtained the datasets for the research and explored different methods
discussed. SP contributed to the modification of study objectives and framework. Their rich experience was instrumental
in improving our work. All authors contributed to the editing and proofreading. Both authors read and approved the
final manuscript.
Author details
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai Campus, Chennai, India. 2 School
of Computer Science and Engineering, Vellore Institute of Technology, Chennai Campus, Chennai, India.
