Trip Recommendation With Multiple User Constraints by Integrating Point-of-Interests and Travel Packages
Trip Recommendation With Multiple User Constraints by Integrating Point-of-Interests and Travel Packages
Trip Recommendation With Multiple User Constraints by Integrating Point-of-Interests and Travel Packages
Abstract—With the advances of mobile communication Museum of Art and Times Square would be returned when the
techniques in recent years, numerous kinds of Location-Based targeted city is New York. Hence, researches on trip
Services (LBSs) have been developed and one popular recommendation that consider personal preference of users
application of LBSs is trip recommendation. Although there exist [13][14][17] have attracted extensive attentions. With the
already a number of studies on this topic in literatures, most of advancement of intelligent mobile devices, wireless networks
them focused on combining a set of point-of-interests (POIs, or and Web 2.0 techniques, a number of Location-Based Social
say attractions) as a trip based on user-specific constraints. In Network (LBSNs) services such as Gowalla, FourSquare and
another way, some few works discussed making recommendation Facebook Place have emerged. These LBSNs allow users to
in terms of travel packages, which have the benefits of lower cost
share their travel experiences with their friends via mobile
and higher convenience. However, no prior work explores to
devices. As a result, immense amounts of travel data which
integrate attractions and travel packages simultaneously for trip
recommendation. In fact, such a hybrid-style recommender can contains geo-spatial and user preference information have
provide higher benefits for users although there exist critical been accumulated. This data benefits people to explore
challenges here like the efficiency issue in such kind of real-time interesting attractions or trips for travel recommendation.
applications. In this paper, we propose a novel framework named To seize travel business opportunity, more and more travel
Package-Attraction-based Trip Recommender (PATR) to efficiently agencies provide and design travel packages for tourists.
recommend the personalized trips satisfying multiple constraints Travel packages have the benefits of lower cost and higher
by effectively combining attractions and packages. In PATR, a
convenience. For example, a package can provide some
Score Inference Model is proposed to infer the scores of
attractions and packages by taking user-based preference and
discounts of attraction tickets. Moreover, tourists do not need
temporal-based properties into account. Then, the Hybrid Trip- to spend so much time on surfing travel information for trip
Mine algorithm is proposed to efficiently discover the optimal planning and worry about transportation. Hence, it generates a
trip which satisfies the multiple user-specific constraints with new type of recommendation systems to recommend travel
both of attractions and packages considered simultaneously. packages to tourist [3][11][15]. As mentioned above, the rapid
Furthermore, we propose two pruning strategies based on Hybrid growth of online travel information brings a big challenge on
Trip-Mine, named Score Estimation (SE) and Score Bound how to choose the interesting attractions and arrange them into
Tightening (SBT), to further improve the execution efficiency and the most interesting trip with a proper order or straight buy
memory utilization. To the best of our knowledge, this is the first existing travel packages that satisfy their personalized
work on travel recommendation that considers attractions and requirements.
packages simultaneously. Through extensive experimental
evaluations, our proposed approaches were shown to deliver Take a scenario as an example. Suppose that Tom has only
excellent performance. 10 hours and 100 dollars to travel in New York, and he wants
to visit Metropolitan Museum of Art, Central Park, Broadway
Keywords—Trip Recommendation; Point-of-Interest; Travel Theatre and Times Square. The total travel time or cost of this
Package; Location-Based Social Network; Data Mining. schedule may exceed his constraints. If there exist a travel
package which includes Metropolitan Museum of Art, Central
I. INTRODUCTION Park and Broadway Theatre with lower cost and provided
transportation, Tom can choose this package and add Times
In recent years, due to the rise of living standards, travel Square to arrange his trip for reducing the travel time and cost.
has become one of the most important entertainment in our Although travel packages are cheaper and more convenient,
life. Before departing, trip planning is the most troublous task. they may not cover all attractions which tourists want to visit.
Most backpackers may search travel information around the Therefore, taking attractions into consideration would be more
targeted city via travel guide websites such as Lonely Planet customized for trip planning. We consider that a good trip
and Yahoo Travel. However, most of the discovered recommender should take both attractions and packages into
attractions may concentrate on famous attractions and they account simultaneously to recommend more better trips and
may not be satisfactory for everyone. For example, a number provide higher benefits for users. Hence, how to plan a trip
of information about famous attractions such as Metropolitan that not only considers attractions and packages together but
34
al. proposed Regularized Matrix Factorization (RMF) properties of attractions. In [17], Yin at el. proposed a travel
personalized recommender to recommend locations for users. path search system based the Panoramio geo-tag photos to
RMF recommender is a latent factor model for predicting the plan trip for tourists, not only for where to visit but also how
ratings of users to spots by using model-based CF. In [16], Ye to visit. Hsieh at el. [7] proposed a time-sensitive model based
et al. collected the data from FourSquare and Whrrl and on the information squeezing from Gowalla check-in data and
further recommended locations for users by fusing user used greedy algorithm to recommend trip routes. In [6], Hsieh
preference with social influence and geographical influence. at el. attempted to coordinate the preference lists from
However, this work does not consider the temporal property multiple users and construct a route which not only satisfies
which is an important factor in real applications. In [18], Yuan user preferences as many as possible, but also being popular
et al. proposed a unified framework which consists of the and reasonable. However, they could merely discover the
spatial and temporal characteristics of attractions. They split popular travel sequence while not the personalized travel
time in multiple slots and fill these slots with check-in values sequence. In [12], Lu et al. propose Personalized Trip
that users made at each specific hour of the day. Recommendation (PTR) framework to efficiently recommend
the personalized trips meeting multiple constraints of users by
C. Travel Package Recommendation mining user's check-in behaviors from Gowalla. PTR
There are few existing works on personalized travel considers both of user-based preferences and temporal-based
package recommendation. A travel package includes the properties to infer the scores of each attraction. Due to the
destination, hotels, price, landscapes, organized activities, etc. computation complexity, they further proposed Parallel Trip-
Building travel package recommendation systems is Mine+ to efficiently plan the trip under multiple constraints.
challenging because the personal data that can be used for
learning individual preferences are not conveniently available III. PROBLEM STATEMENT
and the values of travel packages created by travel companies In this section, we first define some terms used in this
can easily depreciate over time. Xie et al. [15] proposed a research work and then specify our research goal.
method of composite recommendation of points of interest for
tourists according to the tourist’s budget. This system Definition 1. Attraction (also called Point-Of-Interest,
recommends top-k packages for the user. But they target at POI). A = {a1, a2, ..., a|A|} denotes the collection of attractions.
generating travel routes for a tourist rather than recommending For each attraction a ∈ A, a has an attraction CATegory
the existing travel packages. Ge et al. [3] developed two cost- CAT(a), an Attraction Score AS(u, t, a), an Attraction Cost
aware latent factor models to recommend travel packages by AC(a) and a Stay Time ST(a), which represent the category of
considering both cost constraints and the tourist’s interests. attraction a, how interesting attraction a is for user u at time t,
Liu et al. [11] designed the Tourist-Area-Season Topic (TAST) how much money users often spend in attraction a and how
model to analyze the characteristics of travel packages and long users often stay in attraction a, respectively.
provide a personalized travel package recommendation. Definition 2. Package. P = {p1, p2, ..., p|P|} denotes the
However, the disadvantage of package recommendations may
be limited for the arrangement of travel agencies. collection of packages. For each package p ∈ P, p = ( ak ,
1
35
are (a1, a2), PC(p1) is 15, BT(p1) is 13:00, ET(p1) is 14:15 and Definition 10. Travel Time and Budget Constraint. The
DT(p1) is 75 minutes. The travel time between a1 and a2 TT(a1, constraints of travel time cT and budget cB are defined as how
a2) is 30. much time and money the user has in this trip, respectively.
a4
ST(a4) = 50 Definition 11. Valid Trip. Let au, cT and cB be the current
ST(a1) = 30 AC(a4) = 30
AC(a1) = 5 50 location of the user, the user-specific travel time constraint
20 and the user-specific budget constraint, respectively. A trip tp
a1
30 a5
ST(a5) = 40 = <e1, e2, ..., en> is called a Valid Trip if AT(e) BT(e), ∀e ∈
65 50 50 AC(a5) = 25
tp and e is a package, TPT(au, tp) is less than or equal to cT
30 30 60 and TPC(tp) is less than or equal to cB; otherwise, it is an
Package Begin End
Package
40
Cost Time Time Invalid Trip. In Fig. 1, the current location is au, the travel
40
40
p1=(a1,a2) 15 13:00 14:15 time constraint cT is 300 and the travel budget cB is 70. A trip
a2
30
p2=(a2,a4) 35 10:30 12:25 tp = <a3, p1> is a valid trip since TPT(au, tp) = 285 ≤ cT and
ST(a2) = 30
AC(a2) = 15 20 au p3=(a3,a5) 35 11:00 12:30 TPC(tp) = 35 ≤ cB. However, a trip tp’ = <p2, a3, a5> is not a
a3 20 Current Location valid trip since TPC(au, tp’) = 80 which is greater than cB.
ST(a3) = 30 cT = 300
AC(a3) = 20 cB = 70 Definition 12. Optimal Trip. Let au and t be the current
Figure 1. An example of a trip map network. location of the user u and the departure time of u, respectively.
A valid trip tp = <e1, e2, ..., en> is called the Optimal Trip if
Definition 6. Trip. A trip tp = <e1, e2, ..., en>, also denoted as there is no valid trip tp’ such that TPS(u, t, tp’) is greater than
n-trip, which is orderly composed of one or several trip TPS(u, t, tp). Notes that there may be not only one optimal trip
elements, where n indicates the number of elements (|tp| = n). in some constraint settings.
For each ex ∈ tp, ex may be an attraction or a package. We
respectively use ex.aF and ex.aL to represent the first and the Problem Formulation. With the above definitions, the main
last attraction in ex when ex is a package. If ex is an attraction, problem we address in this paper is formulated as follows.
ex.aF and ex.aL are equal to ex. Let AT(ex) be the Arrived Time Given a current location of the user and several user-specific
of ex which is defined as (1), where x > 1, ∀ex ∈ A ∨ P. For constraints such as travel time and travel budget, our goal is to
example, <a3, p1> is one of trips in Fig. 1. develop an efficient trip recommendation framework which
provides the optimal trip meeting multiple user-specific
BT (ex ), If ex ∈ P. constraints and effectively combine packages and attractions.
° We expect the recommender can efficiently, effectively and
AT (ex ) = ® AT (ex −1 ) + ST (ex −1 ) + TT (ex −1 , ex ), If ex , ex −1 ∈ A. (1)
°ET (e ) + TT (e .a , e ), If e ∈ A and e ∈ P. accurately return the trip answer.
¯ x −1 x −1 L x x x −1
Definition 7. Trip Score. Given a user u, a departure time t IV. PROPOSED METHOD
and a trip tp = <e1, e2, ..., en>, the Trip Score TPS(u, t, tp) is In this section, we first give the proposed system
defined as (2), which represents how interesting this trip is. framework of Package-Attraction-based Trip Recommender
Take <a3, p1> in Fig. 1 as an example. Since the departure (PATR). Next, two main techniques of PATR, including Score
time for user u from au is 10:00, AT(a3) is 10:20 and AT(p1) is Inference and Hybrid Trip Planning, are presented in detail.
13:00. Assume that AS(u, 10:20, a3) is 0.65 and PS(u, p1) is
1.1. Hence, the trip score is TPS(<a3, p1>) = 0.65 + 1.1 = 1.75.
A. System Framework
TPS (u , t , tp ) = ¦ AS (u, AT (e), e) + ¦ PS (u, e)
∀e∈tp A ∀e∈tp P
(2) In this paper, we aim to design a trip recommendation
system which can suggest the user a personalized trip
Definition 8. Trip Time. Let au be the start location and t be considering attractions and packages simultaneously and
the departure time. For a trip tp = <e1, e2, ..., en>, the Trip satisfying several constraints in real time. As Fig. 2 shows,
Time TPT(au, tp) is formulated as (3), which represents how PATR consists of two stages. The first stage is an offline data
long the users travel around this trip. For example, t is set as mining mechanism. In this stage, we can first obtain the
10:00, TPT(au, <a3, p1>) = ET(p1) + TT(a2, au) - t = 285 attraction, package, personal and social link information
(minutes). directly from the travel websites such as Gowalla and LBSN
websites. Then, we infer the user-based and temporal-based
AT (en ) + ST (en ) + TT (en , au ) − t , If en ∈ A scores for each attraction and each package by the proposed
TPT (au , tp) = ® (3)
¯ET (en ) + TT (en .a L , au ) − t , If en ∈ P inference model. The second stage is an online query
mechanism. In this stage, users need to input a departure time,
Definition 9. Trip Cost. Let tp = <e1, e2, ..., en> be a trip, the a start location, a budget constraint, a time constraint and a
Trip Cost TPC(tp) is formulated as (4), which represents how temporal preference via their handheld mobile devices when
much money the users need to spend around this trip. For they need to plan a trip. After receiving user’s request, we first
example, TPC(<a3, p1>) = AC(a3) + PC(p1) = 35. fuse the user-based attraction/package score and temporal-
based attraction/package score by a user-specific weight. Next,
TPC (tp ) = ¦ AC (e) + ¦ PC (e)
∀e∈tp A ∀e∈tp P
(4) Hybrid Trip-Mine algorithm is proposed to plan trip. The
algorithm not only considers both of the time constraint and
budget limitation at the same time, but also considers
36
attractions and packages simultaneously. Based on Hybrid Hence, to emphasize different category weights in a package,
Trip-Mine, we propose two pruning strategies for further we also take the stay time of each attraction in the package
improving the computational efficiency. Finally, the optimal into account. We use (5) to calculate the preference of each
trip can be recommended to the user. category for user u [12].
Travel Region Data Mining Mechanism |D ' (u , c )|
ps(u,c) =
Attraction
Information
Package
Information
Attraction and Package
Score Inference
arg c'∈C Max(| D ' (u , c ' ) |) (5)
1.Cost 2.Category
3.Check-in Log
1. Cost 2.Attractions
3. Arrangement
User-based Score where D' (u , c) = {ci | ci.u = u and CAT(ci.a) = c, ∀ci ∈ D}
• Attraction
4.Stay time 4. Stay time
• Package
Temporal-based Score
For a user u, the preference vector, denoted as PV(u), is
Personal Social Link
Information Information • Attraction obtained by u’s preference score of each category, which is
Check-in log • Package
defined as (6).
Online Query Mechanism
PV (u ) =< ps (u , c1 ), ps (u , c 2 ),..., ps (u , c|C | ) > (6)
Constraint Hybrid Trip-Mine
1.Departure Time Recommended Trip
2.Start Location Score Fusion For a user u and a package p , the User-based package Score
1.Attraction
3.Budge Constraint
4.Time Constraint 2.Package
Score Estimation
by RBP-P, denoted as USRBP-P(u, p), is defined as (7).
5.Temporal Preference
Score Bound Tightening
ps (u , a.cat ) × ST ( a )
US RBP − P (u , p ) = ¦ × p
∀a∈ p ¦ ST ( a' )
∀a '∈ p
(7)
Figure 2. System framework of PATR.
b) Friend-based CF with Package (FCF-P)
B. Attraction and Package Score Inference User-based package score may be inferred by the concept
For each attraction/package, PATR needs to understand of Collaborative Filtering (CF). Hence, the first proposed CF-
that how interesting the attraction/package is for a specific based strategy is Friend-based CF with Package (FCF-P) that
user at a specific time before recommending the trip. To considers the friendship information as social links to infer the
measure the attraction/package score, we propose an inference user-based package score. The basic idea of FCF-P is to infer
model which consists of two aspects: 1) User-based the score of a package for a user by considering the check-in
attraction/package Score (US), which is to measure how behaviors of the user’s friends. Note that the friendship
interesting the attraction/package is for a specific user. We between users can be collected directly from Gowalla. The
propose four strategies to infer US; and 2) Temporal-based User-based package Score by FCF-P, denoted as USFCF-P(u,
attraction/package Score (TS), which is to measure how p), is defined as (8), where FU indicates the set of all the
suitable users visit the attraction/package at a specific time. friends of u. The functions chk(u, a) and fri(u, v) equal to 1 if
Different attractions may have different suitable time periods. user u has checked-in attraction a and user u and v exist a
We propose a strategy to infer TS. Finally, US and TS are friend relationship, respectively. Otherwise, the two functions
fused by a user-specific weight parameter as the final score are equal to 0.
including Attraction Score (AS) and Package Score (PS).
chk (v, a ) × ST (a )
¦ ¦ × p × fri(u, v)
1) User-based Attraction/Package Score Inference ∀v∈FU ∀a∈ p ¦ ST (a' ) (8)
To infer the user-based attraction score US(u, a), we apply US FCF − P (u , p ) =
∀a '∈ p
37
chk(v,a)× ST(a) TS ( p ) = ¦ ∀a∈ p TS (a, AT (a, p))
¦ ¦ × p × sim(u,v) (15)
∀v∈SU ∀a∈ p ¦ ST(a')
∀a'∈ p (10)
US SCF − P (u , p) = For an attraction or a package, we have obtained the user-
∀v∈SU
¦ sim(u,v) based score and temporal-based score. In the next section, we
will describe how to fuse them when a user wants to plan a trip.
d) Similarity-Friend-based CF with Package (SFCF-P)
The last proposed CF-based strategy is Similarity-Friend- C. Trip Recommendation
based CF with Package (SFCF-P) which is the fusion of FCF- In PATR, the main goal of online query stage is to
P and SCF-P. In SFCF-P, user-based package score is inferred recommend the trip for users. This stage consists of three main
based on not only similar users but also friends. The User- processes: 1) Score Fusion. For an attraction or a package, the
based package Score by SFCF-P, denoted as USSFCF-P(u, p), is user-based score and temporal-based score are fused by a user-
defined as (11), where λ (limited from 0 to 1) is used to specific temporal factor; 2) Trip Planning. We propose the
control the weight between similarity and friendship. Hybrid Trip-Mine algorithm to plan trips under the user-
specific budget constraint and time constraint; and 3) Pruning
chk (v, a) × ST (a)
¦ ¦ × p × simfri(u, v) Strategy. Based on the proposed algorithm, we propose two
∀v∈SU ∀a∈ p ¦ ST (a' )
∀a '∈ p
pruning strategies for improving the computational efficiency.
US SFCF − P (u, p ) = (11)
¦ simfri(u, v) 1) Score Fusion
∀v∈SU After obtaining the user-based score and temporal-based
where simfri (u, v) = λ × sim(u , v) + (1 − λ ) × fri (u, v) score of each attraction and package, the main purpose of
PATR is to recommend a trip which not only satisfies user
2) Temporal-based Attraction/Package Score Inference preference but also match the suitable visiting time. Hence, the
In real travel experiences, different attractions should have two types of scores are fused as the final Attraction Score and
different suitable visiting time periods. For example, Times Package Score, denoted as AS(u, a, t) and PS(u, p) and defined
Square in New York is more suitable to be visited in the as (16) and (17), which represent the score of attraction a and
evening. Hence, we apply our previous strategies [12] to define package p for user u at time period t, respectively. The
24 Temporal-based attraction Scores (TS), denoted TS(a, t), in parameter α is set by users to control the weight between user-
a day, where t indicates a specific time period. TS is considered based score and temporal-based score.
from two aspects: 1) What time is more suitable for visiting this
attraction? This strategy is called Normalized-By-Time (NBT). AS (u , a, t ) = α * TS ( a, t ) + (1 − α ) * US (u , a ) (16)
The Temporal-based attraction Score by NBT, denoted as
PS (u , p ) = α * TS ( p ) + (1 − α ) *US (u , p ) (17)
TSNBT(a, t), is defined as (12), where |D”(a, t)| represents the
number of check-ins of attraction a during time period t, and T TABLE I. TEMPORAL-BASED SCORE AND USER-BASED SCORE.
indicates 24 time periods. 2) Which attractions are popular Temporal-based Score User-based Score
now? This strategy is called Normalized-By-Attraction (NBA). … 10:00 11:00 12:00 … Tom Amy Darren …
The Temporal-based attraction Score by NBA, denoted as {a1} … 0.5 0.6 0.6 … {a1} 0.5 0.7 0.6 …
TSNBA(a, t), is defined as (13), where A indicates the set of all
{a2} … 0.2 0.3 0.3 … {a2} 0.7 0.2 0.4 …
attractions.
{a3} … 0.8 0.6 0.6 … {a3} 0.5 0.5 0.6 …
|D" (a, t )| {a4} … 0.9 0.8 0.9 … {a4} 0.3 0.8 0.7 …
TS NBT (a,t ) = (12)
arg t'∈T Max(| D" (a, t ' ) |) {a5} … 0.6 0.8 0.8 … {a5} 0.6 0.5 0.8 …
38
tables are shown as Table I. Tom sets the preference weight of permutation. We put all valid trips generated from candidate
temporal-based score α as 0.5. Table II shows a sample trip. 2-sets into VT. There are 31 candidate 3-sets. By checking
This trip starts from au, travels to a3, p1 and finally backs to au. their trip cost, there are many invalid candidates, e.g., {a3, a5,
The third column of Table II shows the score of each p2}, {a3, a4, a5}, {a4, a5, p3}, etc. Hence, these candidates can
attraction and package. Tom arrives a3 at 10:20. Based on be pruned. Again, we still need to check the travel time of
Table I, TS(a3, 10:20) is 0.8 and US(Tom, a3) is 0.5. The two these candidate 3-sets. The algorithm stops when no candidate
scores of a3 are fused as 0.65 (α*0.8+(1-α)*0.5). By the same set is generated. Finally, we can obtain a list of trips ranked by
process, the score of p1 is 1.1. The trip score can be calculated the trip score. The optimal trip is <a3, a5, p1> since the trip
as 1.75 by sum of all scores. Besides, we need to check score is the highest from the valid trips set.
whether this trip is valid under Tom’s constraints. The trip
3) Pruning Strategy
time of this trip is TPT(Tom, <a3, p1>) = 285 minutes which
To improve the efficiency of Hybrid Trip-Mine algorithm,
meets Tom’s time constraint. The trip cost of this trip is
we design two pruning strategies to reduce the number of
TPC(<a3, p1>) = 35 USD also satisfies Tom’s budget
candidate generations.
constraint. Therefore, the trip < a3, p1> is valid for Tom.
Candidate 1-Sets. Candidate 2-Sets. Candidate 3-Sets. Candidate 4-Sets. Pruning Strategy 1. Score Estimation (SE). The first
1-Set Time Cost Score 2-Set Time Cost Score 3-Set Time Cost Score 4-Set Time Cost Score pruning strategy is Score Estimation (SE). Hybrid Trip-Mine
{a1} 90 5 0.5 {a1, a2} 150
{a1, a3} 150
20
25
?
?
{a1, a2, a3}
{a1, a2, a4}
?
?
40
50
?
?
{a1, a2, a3, a4}
{a1, a2, a3, a5}
?
?
70
65
?
?
needs a lot of computational cost to check whether a candidate
{a2} 90 15 0.45
trip is valid or not. Hence, we propose this pruning strategy to
…
{p2, p3}
Checking. Checking. <a2, a3, a1> 210 1.55 <a5, p1, a3> … … …
Trip Time Score Trip Time Score <a3, a1, a2> 210 1.7 <p1, a3, a5> not influence the final result.
…
<p3, p1> 285 2.4 <p2, p3> 2.3 <a3, a2, a1> 190 1.7 <p1, a5, a3>
ST(a1) = 30 50 a4 ST(a4) = 50
Cost = 5 Cost = 30
Figure 3. The process of Hybrid Trip-Mine algorithm. a1
39
from the highest temporal-based score and user-based score. from Gowalla and the experimental settings. Next, the
Hence, the maximum score of an attraction is 0.7. Finally, {a1, experimental evaluations are divided into two parts: 1)
a2, a4} can be pruned since the estimated upper bound score of Comparison of effectiveness by various package scoring
its superset is at most 1*0.7+0.55+0.5+0.6=2.35. strategies; and 2) Comparison of efficiency by various trip
Temporal-based Attraction Score User-based Attraction Score
planning methods. All the experiments were implemented in
Tom Amy Darren … Maximum Score
Java on an Intel i7 CPU 3.40GHz machine with 8 GB of
… 9:00 10:00 11:00 12:00 …
{a1} … 0.5 0.5 0.6 0.6 … {a1} 0.5 0.7 0.6 … {a1} 0.55
memory running Microsoft Windows 7.
{a2} … 0.3 0.2 0.3 0.3 … {a2} 0.7 0.2 0.4 … {a2} 0.5
{a3} … 0.7 0.8 0.6 0.6 … {a3} 0.5 0.5 0.6 … {a3} 0.65 A. Experiment Settings
{a4} … 0.8 0.9 0.8 0.9 … {a4} 0.3 0.8 0.7 … {a4} 0.6 We first describe a real check-in dataset obtained from
{a5} … 0.7 0.6 0.8 0.8 … {a5} 0.6 0.5 0.8 … {a5} 0.7 Gowalla. We collect the dataset from January 2009 to April
{p1} 0.9 {p1} 1.3 0.3 0.5 … {p1} 1.1 (Avg:0.55) 2011. The collected dataset consists of check-in data in New
{p2} 1.0 {p2} 1.0 0.5 0.7 … {p2} 1.0 (Avg:0.5) York (NY) and other cities. For the whole dataset, the number
{p3} 1.4 {p3} 1.2 0.6 0.3 … {p3} 1.3 (Avg:0.65) of users, attractions and attraction check-ins are 8,268,
447,409 (15,616 in NY) and 2,181,978 (271,228 in NY),
Figure 5. Maximum score estimation of each attraction.
respectively. Each user averagely checks-in 32.8 different
1-Set
Maximum
1-Set
Maximum attractions in NY and 231.1 different attractions in other cities.
Score Score
{a1} 0.55 {a5} 0.7 {a5} {p3} … {a3} … {p2} {a2} Besides, we collect packages from travel websites and our
{a2} 0.5 {p3} 0.65 (Avg)
…
developed web system [12]. We regard the average check-ins
{a3} 0.65 {a3} 0.65
{a5 p3} {a3 a4}{a3 p1}{a3 a1} … {p2 a2} of all attractions in each package as its rating. For a package,
{a4} 0.6 {a4} 0.6
we call a user checks-in it if at least one attraction in the
…
40
compute the ranking result relative to the ideal ranking list. C. Efficiency of Trip Planning
For each package, we take the average check-in frequency as We conduct an experiment for comparing the efficiency of
rating. The DCG (Discounted Cumulative Gain) is computed our proposed algorithms and pruning strategies on Gowalla
as (18), where reli is the average check-in frequency of and package datasets. We compare the efficiency of our
package at rank i in the recommended ranking list. Given the proposed fundamental Hybrid Trip-Mine to Hybrid Trip-Mine
ideal discounted cumulative gain IDCGp, then NDCG at p-th with pruning strategies, namely Score Estimation (SE) and
position can be computed as (19). Score Bound Tightening (SBT), in terms of execution time and
reli memory utilization.
DCG p = rel1 + ¦i =2
p
(18)
log 2 i
120 700
DCG p
Execution Time(sec.)
NDCG p = (19) 100 600
Memory(MB)
IDCG p 80
500
400
60
precision is often used to measure the prediction performance. 40
300
We examine the precision of recommended ranking list in our 200
20 100
experiments. The precision at p-th position can be computed 0 0
as (20), where TPp and FPp indicate the p-th package in the
200
300
400
500
600
700
800
900
1000
200
300
400
500
600
700
800
900
1000
ranking list that the testing user has ever checked-in one of
Number of Attractions (nA) Number of Attractions (nA)
attractions in package and the user has never checked-in any (a) Execution Time (b) Memory Utilization
attraction in package, respectively.
Figure 8. Impact of number of attractions.
TPp
precisionp = (20)
TPp + FPp 1) Impact of Number of Attractions nA
This experiment analyzes the impact of the two pruning
Fig. 7 shows that our proposed SFCF-P outperforms strategies, namely SE and SBT, based on Hybrid Trip-Mine
various package recommendation strategies in terms of NDCG when the number of attractions (nA) is varied from 200 to
and precision under rank 5, 10 and 15. As the results show, 1,000 in terms of execution time and memory utilization. The
both of the NDCG and precision of Random recommendation number of packages is fixed 40. Fig. 8(a) and 8(b) show that
strategy are very low. RBC-P, RBF-P and HIS-P perform well Hybrid Trip-Mine + SE + SBT outperforms the fundamental
since popular travel packages usually contain popular Hybrid Trip-Mine with varied the number of attractions. We
attractions and attract most of people. However, they do not observe that the execution time and memory utilization
consider user preference. Hence, the packages scores increase significantly with the number of attractions increases.
calculated by these methods may not be satisfied by everyone. The reason is that the number of possible trips exponentially
For our proposed strategies, RBP-P is not good since RBP-P increases when the number of attractions increases. Due to our
only considers the categories and stay time of attractions in problem is to recommend the optimal trip, our proposed
packages. Hence, RBP-P cannot differentiate which package is algorithms need more execution time and memory utilization
better when there are several packages consisting of to generate and check the more candidate sets consist of
attractions with the same category and similar stay time. FCF- attractions and packages.
P is not good too since users’ friends may have different
interests. We can observe that SCF-P is better than RBC-P,
1400
RBF-P and HIS-P in terms of NDCG and precision. The 800
Execution Time(sec.)
700 1200
reason is that SCF-P captures the user preferences based on
Memory(MB)
600 1000
the user check-in similarity. Similar users may have more 500 800
similar interests. Finally, SFCF-P outperforms SCF-P and 400
600
300
FCF-P since SFCF-P considers the check-in similarity and 200 400
friendship together for trip recommendation. 100 200
0 0
240
280
320
360
400
440
480
520
560
600
240
280
320
360
400
440
480
520
560
600
HITS-P
Random
SFCF-P
RBP-P
RBF-P
HITS-P
Random
SCF-P
FCF-P
RBC-P
41
travel time constraint. The reason is that the number of valid utilization. For the future work, we will design more efficient
trips increases, when the travel time constraint increases. optimal trip recommendation approaches and optimization
Hence, our proposed algorithms need to spend more time to strategies to further enhance the performance of trip planning.
discover the best one among all valid trips. In addition, we plan to consider more useful factors in detail
of packages and collect more real dataset to design more
3) Impact of Budget Constraint cB accurate model and verify our proposed approaches.
This experiment analyzes the impact of the two pruning
strategies, namely SE and SBT, based on Hybrid Trip-Mine
when the budget constraint (cB) is varied from 0 to 180 USD ACKNOWLEDGMENT
in terms of execution time and memory utilization. Fig. 10(a) This research was supported by National Science Council,
and 10(b) show that Hybrid Trip-Mine + SE + SBT Taiwan, R.O.C. under grant no. NSC101-2221-E-006-255-
outperforms the fundamental Hybrid Trip-Mine with varied MY3.
the budget constraint. We observe that the execution time
significantly increase at first while keep steady after the REFERENCES
budget constraint higher than 140 USD. The reason is that the
[1] J. Bao, Y. Zheng and M. F. Mokbel, "Location-Based and Preference-
number of candidate sets will be bounded by the number of Aware Recommendation Using Sparse Geo-Social Networking Data," in
attractions and the travel time constraint. Hence, the number ACM GIS, pp. 199-208, 2012.
of valid trips will not increase even if we keep increasing the [2] B. Berjani and T. Strufe, "A Recommendation System for Spots in
budget constraint. Location-Based Online Social Network," in SNS, 2011.
[3] Y. Ge, Q. Liu, H. Xiong, A. Tuzhilin and J. Chen, "Cost-Aware Travel
5 Tour Recommendation," in KDD, pp. 983-991, 2011.
600
Execution Time(sec.)
500
[4] Gowalla, http://gowalla.com
4
Memory(MB)
0
20
40
60
80
100
120
140
160
180
42