Trip Recommendation With Multiple User Constraints by Integrating Point-of-Interests and Travel Packages

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

2014 IEEE 15th International Conference on Mobile Data Management

Trip Recommendation with Multiple User Constraints


by Integrating Point-of-Interests and Travel Packages
Shih-Hsin Fang1, Eric Hsueh-Chan Lu2 and Vincent S. Tseng1
1
Institute of Computer Science and Information Engineering, National Cheng Kung University
No.1, University Rd., Tainan City 701, Taiwan (R.O.C.)
2
Dept. of Computer Science and Information Engineering, National Taitung University
No.684, Sec. 1, Zhonghua Rd., Taitung City, Taitung County 950, Taiwan (R.O.C.)
[email protected]; [email protected]; [email protected]

Abstract—With the advances of mobile communication Museum of Art and Times Square would be returned when the
techniques in recent years, numerous kinds of Location-Based targeted city is New York. Hence, researches on trip
Services (LBSs) have been developed and one popular recommendation that consider personal preference of users
application of LBSs is trip recommendation. Although there exist [13][14][17] have attracted extensive attentions. With the
already a number of studies on this topic in literatures, most of advancement of intelligent mobile devices, wireless networks
them focused on combining a set of point-of-interests (POIs, or and Web 2.0 techniques, a number of Location-Based Social
say attractions) as a trip based on user-specific constraints. In Network (LBSNs) services such as Gowalla, FourSquare and
another way, some few works discussed making recommendation Facebook Place have emerged. These LBSNs allow users to
in terms of travel packages, which have the benefits of lower cost
share their travel experiences with their friends via mobile
and higher convenience. However, no prior work explores to
devices. As a result, immense amounts of travel data which
integrate attractions and travel packages simultaneously for trip
recommendation. In fact, such a hybrid-style recommender can contains geo-spatial and user preference information have
provide higher benefits for users although there exist critical been accumulated. This data benefits people to explore
challenges here like the efficiency issue in such kind of real-time interesting attractions or trips for travel recommendation.
applications. In this paper, we propose a novel framework named To seize travel business opportunity, more and more travel
Package-Attraction-based Trip Recommender (PATR) to efficiently agencies provide and design travel packages for tourists.
recommend the personalized trips satisfying multiple constraints Travel packages have the benefits of lower cost and higher
by effectively combining attractions and packages. In PATR, a
convenience. For example, a package can provide some
Score Inference Model is proposed to infer the scores of
attractions and packages by taking user-based preference and
discounts of attraction tickets. Moreover, tourists do not need
temporal-based properties into account. Then, the Hybrid Trip- to spend so much time on surfing travel information for trip
Mine algorithm is proposed to efficiently discover the optimal planning and worry about transportation. Hence, it generates a
trip which satisfies the multiple user-specific constraints with new type of recommendation systems to recommend travel
both of attractions and packages considered simultaneously. packages to tourist [3][11][15]. As mentioned above, the rapid
Furthermore, we propose two pruning strategies based on Hybrid growth of online travel information brings a big challenge on
Trip-Mine, named Score Estimation (SE) and Score Bound how to choose the interesting attractions and arrange them into
Tightening (SBT), to further improve the execution efficiency and the most interesting trip with a proper order or straight buy
memory utilization. To the best of our knowledge, this is the first existing travel packages that satisfy their personalized
work on travel recommendation that considers attractions and requirements.
packages simultaneously. Through extensive experimental
evaluations, our proposed approaches were shown to deliver Take a scenario as an example. Suppose that Tom has only
excellent performance. 10 hours and 100 dollars to travel in New York, and he wants
to visit Metropolitan Museum of Art, Central Park, Broadway
Keywords—Trip Recommendation; Point-of-Interest; Travel Theatre and Times Square. The total travel time or cost of this
Package; Location-Based Social Network; Data Mining. schedule may exceed his constraints. If there exist a travel
package which includes Metropolitan Museum of Art, Central
I. INTRODUCTION Park and Broadway Theatre with lower cost and provided
transportation, Tom can choose this package and add Times
In recent years, due to the rise of living standards, travel Square to arrange his trip for reducing the travel time and cost.
has become one of the most important entertainment in our Although travel packages are cheaper and more convenient,
life. Before departing, trip planning is the most troublous task. they may not cover all attractions which tourists want to visit.
Most backpackers may search travel information around the Therefore, taking attractions into consideration would be more
targeted city via travel guide websites such as Lonely Planet customized for trip planning. We consider that a good trip
and Yahoo Travel. However, most of the discovered recommender should take both attractions and packages into
attractions may concentrate on famous attractions and they account simultaneously to recommend more better trips and
may not be satisfactory for everyone. For example, a number provide higher benefits for users. Hence, how to plan a trip
of information about famous attractions such as Metropolitan that not only considers attractions and packages together but

978-1-4799-5705-7/14 $31.00 © 2014 IEEE 33


DOI 10.1109/MDM.2014.10
also satisfies user’s preference in terms of multiple travel also propose two pruning strategies for further enhancing the
constraints is an interesting and challenging problem. recommendation efficiency.
In this work, we aim to address two main issues: 1) How to 4) We conduct a series of experiments to evaluate the
evaluate the degree of interestingness for each attraction and performance of our approaches. The results show superior
package? and 2) How to efficiently and effectively performance in terms of effectiveness and efficiency.
recommend a trip by integrating attractions and packages The remainder of this paper is organized as follows. We
based on the multiple user-specific constraints? For the first briefly review the related work in Section II. In Section III, we
issue, we try to extract the personal preference and social formulate the aimed problem. In Section IV, we will introduce
relations of a tourist from the generated LBSN data. We the proposed system framework in detail. In Section V, we
consider that these two factors may influence the interesting perform an empirical performance evaluation. Finally, in
degrees of attractions for the tourist. Because a travel package Section VI, we summarize our conclusions and future work.
consists of multiple attractions, we consider that the category
and stay time of each attraction in the package may influence
the interesting degree of package. For the second issue, in II. RELATED WORK
general, users would request real-time response when they In this section, we briefly classify the relevant previous
query the travel recommendation system. With massive studies into four categories: 1) Location-based social network;
amounts of attractions, how to recommend a trip efficiently is 2) Travel attraction recommendation; 3) Travel package
a critical issue in such a real-time query system. Besides, recommendation; and 4) Trip recommendation.
another challenge of trip recommendation is how to combine
attractions and packages effectively. For example, it should be
A. Location-Based Social Network (LBSN)
avoided to arrange in a trip the attraction and package which
contain the overlapped attraction together. Therefore, how to Location-Based Social Networks (LBSNs) have been
recommend a trip effectively is also a critical issue for trip growing rapidly in recent years such as Gowalla [4],
recommendation by integrating attractions and packages. FourSquare and Facebook Places. In these LBSNs, users can
establish friendship links to other users and share their
In this paper, we aim at developing a novel data mining location-related information in anywhere at any time. “Check-
based framework named Package-Attraction-based Trip in” is one of the main services on LBSNs, and users can
Recommender (PATR) to efficiently recommend a perform check-in actions that pin the geographical information
personalized trip with multiple constraints and effectively of current location which is also called spot. The check-in
combine attractions and packages. PATR consists of two records contain the name of spot, spatial and temporal
phases. In the first phase, we provide an Score Inference information and the category of spot. For example, John
Model to infer the personalized score of each attraction and checks-in the spot of Central Park in New York which belongs
each package by considering user-based preference and to “City Park” at 9:00AM. The check-in records are also
temporal-based properties. In the second phase, according to propagated and showed on the pages of their friends.
the multiple user-specific constraints including a travel time Therefore, the real-life experiences can be easily shared and
constraint, a budget constraint and some control parameters, discussed in the virtual world, it enable a new type of
we propose Hybrid Trip-Mine algorithm to discover the interaction between the virtual and the physical world. We
optimal trip efficiently and combining both attractions and target Gowalla as our research LBSN source.
packages effectively. To improve the efficiency of optimal trip
planning, we further propose two strategies named Score B. Travel Attraction Recommendation
Estimation (SE) and Score Bound Tightening (SBT) to reduce
the number of redundant candidate trip generations. To the The rapid proliferation of LBSNs has motivated a wealth
best of our knowledge, this is the first work on trip of studies that focus on location recommendation. Before
recommendation that integrates attractions and packages planning a trip, we should consider the popularity and suitable
simultaneously. Finally, through extensive experimental visiting time of each attraction. In [8], Huang et al. proposed a
evaluations based on a real data crawled from Gowalla, our personalized recommendation system by using Bayesian
approaches were shown to deliver excellent performance in network techniques. In [10], Kim at al. proposed the TripTip
terms of effectiveness and efficiency. We summarize the system to recommend users the place they would most likely
contributions of this paper in the following: want to visit by considering the similarity between previous
visiting places and the next place. In [5], Horozov et al. used
1) We study the problem of trip recommendation by location as key criterion for generating more useful
integrating attractions and packages. This has not been recommendation results based on the proposed enhanced
explored in the research community. Collaborative Filtering (CF) method. In [1], Bao et al.
2) We propose an score inference model for inferring the proposed a category-based similarity computation method to
interesting score of each attraction and each package by make recommendations instead of using traditional CF-based
method. The idea of this work is local experts have high
taking user-based preference and temporal-based properties
expertise in user’s preferred categories and the venues in the
into consideration. geospatial range of the user. However, most of them rely on
3) We propose the Hybrid Trip-Mine algorithm for information defined by domain experts while this kind of
effectively combining attractions and packages. Besides, we information is rare on the world. Besides, information defined
by domain expert may be easily out-of-date. In [2], Berjani et

34
al. proposed Regularized Matrix Factorization (RMF) properties of attractions. In [17], Yin at el. proposed a travel
personalized recommender to recommend locations for users. path search system based the Panoramio geo-tag photos to
RMF recommender is a latent factor model for predicting the plan trip for tourists, not only for where to visit but also how
ratings of users to spots by using model-based CF. In [16], Ye to visit. Hsieh at el. [7] proposed a time-sensitive model based
et al. collected the data from FourSquare and Whrrl and on the information squeezing from Gowalla check-in data and
further recommended locations for users by fusing user used greedy algorithm to recommend trip routes. In [6], Hsieh
preference with social influence and geographical influence. at el. attempted to coordinate the preference lists from
However, this work does not consider the temporal property multiple users and construct a route which not only satisfies
which is an important factor in real applications. In [18], Yuan user preferences as many as possible, but also being popular
et al. proposed a unified framework which consists of the and reasonable. However, they could merely discover the
spatial and temporal characteristics of attractions. They split popular travel sequence while not the personalized travel
time in multiple slots and fill these slots with check-in values sequence. In [12], Lu et al. propose Personalized Trip
that users made at each specific hour of the day. Recommendation (PTR) framework to efficiently recommend
the personalized trips meeting multiple constraints of users by
C. Travel Package Recommendation mining user's check-in behaviors from Gowalla. PTR
There are few existing works on personalized travel considers both of user-based preferences and temporal-based
package recommendation. A travel package includes the properties to infer the scores of each attraction. Due to the
destination, hotels, price, landscapes, organized activities, etc. computation complexity, they further proposed Parallel Trip-
Building travel package recommendation systems is Mine+ to efficiently plan the trip under multiple constraints.
challenging because the personal data that can be used for
learning individual preferences are not conveniently available III. PROBLEM STATEMENT
and the values of travel packages created by travel companies In this section, we first define some terms used in this
can easily depreciate over time. Xie et al. [15] proposed a research work and then specify our research goal.
method of composite recommendation of points of interest for
tourists according to the tourist’s budget. This system Definition 1. Attraction (also called Point-Of-Interest,
recommends top-k packages for the user. But they target at POI). A = {a1, a2, ..., a|A|} denotes the collection of attractions.
generating travel routes for a tourist rather than recommending For each attraction a ∈ A, a has an attraction CATegory
the existing travel packages. Ge et al. [3] developed two cost- CAT(a), an Attraction Score AS(u, t, a), an Attraction Cost
aware latent factor models to recommend travel packages by AC(a) and a Stay Time ST(a), which represent the category of
considering both cost constraints and the tourist’s interests. attraction a, how interesting attraction a is for user u at time t,
Liu et al. [11] designed the Tourist-Area-Season Topic (TAST) how much money users often spend in attraction a and how
model to analyze the characteristics of travel packages and long users often stay in attraction a, respectively.
provide a personalized travel package recommendation. Definition 2. Package. P = {p1, p2, ..., p|P|} denotes the
However, the disadvantage of package recommendations may
be limited for the arrangement of travel agencies. collection of packages. For each package p ∈ P, p = ( ak ,
1

ak2 , ..., ak n ) is orderly composed of several attractions, where


D. Trip Recommendation n indicates the number of attractions in p (|p| = n). p has a
Trip recommendation is another important issue in travel Package Score PS(u, p), a Package Cost PC(p), a Begin Time
related topics. A travel recommender should consider not only BT(p) and an End Time ET(p), which represent how
the interesting of attractions and packages but also user- interesting package p is for user u, how much money of
specific constraints to construct a popular and reasonable package p, a constant beginning time of package p and a
travel trip for users. In [19], Zheng et al. designed a HITS- constant ending time of package p, respectively. The Duration
based Inference Model to infer the degree of user’s travel Time of p, denoted DT(p), can be calculated by ET(p) – BT(p).
experience and location interest from users’ GPS trajectories.
Definition 3. Check-In. Let ci = (u, t, a) be a check-in record
This paper compared HITS-based Inference Model with the
which means user ci.u checks-in attraction ci.a at time ci.t. D
two baseline methods, namely Rank-By-Count (RBC) and
= {ci1, ci2, …, ci|D|} denotes a check-in database that contains
Rank-By-Frequency (RBF), in terms of recommendation
|D| check-in records.
accuracy. When we regard an attraction as a package, the three
methods can be extended to measure how interesting a Definition 4. Route. R = {r1, r2, ..., r|R|} denotes the collection
package is. However, this work did not consider any user of routes. For each r ∈ R, r = (ai , aj), where ai, aj ∈ A and i ≠
constraint and personal preference. In [14], Vansteenwegen at j. In this paper, the measurement of each route r is defined as
el. designed a web-based tourist expert system named City the Travel Time TT(ai , aj). The travel time is approximately
Trip Planner to provide city trips after capturing the tourist’s estimated by using the average travel speed in routine profile.
constraints and interests through a small questionnaire. In [13],
Lu et al. proposed Trip-Mine algorithm for efficient planning Definition 5. Travel Map. M = (A, R) denotes the map of
of the optimal trip with the highest score under a user-specific travel region. Take Fig. 1 as an example. There is one current
travel time constraint. For improving the mining efficiency, location au of user, five attractions, i.e., a1 to a5, three
three optimization mechanisms including Attraction Sorting, packages, i.e., p1 to p3, and 15 routes in the trip map. The
Lower Bound Checking, and Score Checking are designed. departure time of user at au is 10:00. For attraction a1, ST(a1)
However, the above studies did not consider the temporal is 30 and AC(a1) is 5. For package p1, the involved attractions

35
are (a1, a2), PC(p1) is 15, BT(p1) is 13:00, ET(p1) is 14:15 and Definition 10. Travel Time and Budget Constraint. The
DT(p1) is 75 minutes. The travel time between a1 and a2 TT(a1, constraints of travel time cT and budget cB are defined as how
a2) is 30. much time and money the user has in this trip, respectively.

a4
ST(a4) = 50 Definition 11. Valid Trip. Let au, cT and cB be the current
ST(a1) = 30 AC(a4) = 30
AC(a1) = 5 50 location of the user, the user-specific travel time constraint
20 and the user-specific budget constraint, respectively. A trip tp
a1
30 a5
ST(a5) = 40 = <e1, e2, ..., en> is called a Valid Trip if AT(e)  BT(e), ∀e ∈
65 50 50 AC(a5) = 25
tp and e is a package, TPT(au, tp) is less than or equal to cT
30 30 60 and TPC(tp) is less than or equal to cB; otherwise, it is an
Package Begin End
Package
40
Cost Time Time Invalid Trip. In Fig. 1, the current location is au, the travel
40
40
p1=(a1,a2) 15 13:00 14:15 time constraint cT is 300 and the travel budget cB is 70. A trip
a2
30
p2=(a2,a4) 35 10:30 12:25 tp = <a3, p1> is a valid trip since TPT(au, tp) = 285 ≤ cT and
ST(a2) = 30
AC(a2) = 15 20 au p3=(a3,a5) 35 11:00 12:30 TPC(tp) = 35 ≤ cB. However, a trip tp’ = <p2, a3, a5> is not a
a3 20 Current Location valid trip since TPC(au, tp’) = 80 which is greater than cB.
ST(a3) = 30 cT = 300
AC(a3) = 20 cB = 70 Definition 12. Optimal Trip. Let au and t be the current
Figure 1. An example of a trip map network. location of the user u and the departure time of u, respectively.
A valid trip tp = <e1, e2, ..., en> is called the Optimal Trip if
Definition 6. Trip. A trip tp = <e1, e2, ..., en>, also denoted as there is no valid trip tp’ such that TPS(u, t, tp’) is greater than
n-trip, which is orderly composed of one or several trip TPS(u, t, tp). Notes that there may be not only one optimal trip
elements, where n indicates the number of elements (|tp| = n). in some constraint settings.
For each ex ∈ tp, ex may be an attraction or a package. We
respectively use ex.aF and ex.aL to represent the first and the Problem Formulation. With the above definitions, the main
last attraction in ex when ex is a package. If ex is an attraction, problem we address in this paper is formulated as follows.
ex.aF and ex.aL are equal to ex. Let AT(ex) be the Arrived Time Given a current location of the user and several user-specific
of ex which is defined as (1), where x > 1, ∀ex ∈ A ∨ P. For constraints such as travel time and travel budget, our goal is to
example, <a3, p1> is one of trips in Fig. 1. develop an efficient trip recommendation framework which
provides the optimal trip meeting multiple user-specific
­BT (ex ), If ex ∈ P. constraints and effectively combine packages and attractions.
° We expect the recommender can efficiently, effectively and
AT (ex ) = ® AT (ex −1 ) + ST (ex −1 ) + TT (ex −1 , ex ), If ex , ex −1 ∈ A. (1)
°ET (e ) + TT (e .a , e ), If e ∈ A and e ∈ P. accurately return the trip answer.
¯ x −1 x −1 L x x x −1

Definition 7. Trip Score. Given a user u, a departure time t IV. PROPOSED METHOD
and a trip tp = <e1, e2, ..., en>, the Trip Score TPS(u, t, tp) is In this section, we first give the proposed system
defined as (2), which represents how interesting this trip is. framework of Package-Attraction-based Trip Recommender
Take <a3, p1> in Fig. 1 as an example. Since the departure (PATR). Next, two main techniques of PATR, including Score
time for user u from au is 10:00, AT(a3) is 10:20 and AT(p1) is Inference and Hybrid Trip Planning, are presented in detail.
13:00. Assume that AS(u, 10:20, a3) is 0.65 and PS(u, p1) is
1.1. Hence, the trip score is TPS(<a3, p1>) = 0.65 + 1.1 = 1.75.
A. System Framework
TPS (u , t , tp ) = ¦ AS (u, AT (e), e) + ¦ PS (u, e)
∀e∈tp  A ∀e∈tp  P
(2) In this paper, we aim to design a trip recommendation
system which can suggest the user a personalized trip
Definition 8. Trip Time. Let au be the start location and t be considering attractions and packages simultaneously and
the departure time. For a trip tp = <e1, e2, ..., en>, the Trip satisfying several constraints in real time. As Fig. 2 shows,
Time TPT(au, tp) is formulated as (3), which represents how PATR consists of two stages. The first stage is an offline data
long the users travel around this trip. For example, t is set as mining mechanism. In this stage, we can first obtain the
10:00, TPT(au, <a3, p1>) = ET(p1) + TT(a2, au) - t = 285 attraction, package, personal and social link information
(minutes). directly from the travel websites such as Gowalla and LBSN
websites. Then, we infer the user-based and temporal-based
­ AT (en ) + ST (en ) + TT (en , au ) − t , If en ∈ A scores for each attraction and each package by the proposed
TPT (au , tp) = ® (3)
¯ET (en ) + TT (en .a L , au ) − t , If en ∈ P inference model. The second stage is an online query
mechanism. In this stage, users need to input a departure time,
Definition 9. Trip Cost. Let tp = <e1, e2, ..., en> be a trip, the a start location, a budget constraint, a time constraint and a
Trip Cost TPC(tp) is formulated as (4), which represents how temporal preference via their handheld mobile devices when
much money the users need to spend around this trip. For they need to plan a trip. After receiving user’s request, we first
example, TPC(<a3, p1>) = AC(a3) + PC(p1) = 35. fuse the user-based attraction/package score and temporal-
based attraction/package score by a user-specific weight. Next,
TPC (tp ) = ¦ AC (e) + ¦ PC (e)
∀e∈tp  A ∀e∈tp  P
(4) Hybrid Trip-Mine algorithm is proposed to plan trip. The
algorithm not only considers both of the time constraint and
budget limitation at the same time, but also considers

36
attractions and packages simultaneously. Based on Hybrid Hence, to emphasize different category weights in a package,
Trip-Mine, we propose two pruning strategies for further we also take the stay time of each attraction in the package
improving the computational efficiency. Finally, the optimal into account. We use (5) to calculate the preference of each
trip can be recommended to the user. category for user u [12].
Travel Region Data Mining Mechanism |D ' (u , c )|
ps(u,c) =
Attraction
Information
Package
Information
Attraction and Package
Score Inference
arg c'∈C Max(| D ' (u , c ' ) |) (5)
1.Cost 2.Category
3.Check-in Log
1. Cost 2.Attractions
3. Arrangement
User-based Score where D' (u , c) = {ci | ci.u = u and CAT(ci.a) = c, ∀ci ∈ D}
• Attraction
4.Stay time 4. Stay time
• Package

Temporal-based Score
For a user u, the preference vector, denoted as PV(u), is
Personal Social Link
Information Information • Attraction obtained by u’s preference score of each category, which is
Check-in log • Package
defined as (6).
Online Query Mechanism
PV (u ) =< ps (u , c1 ), ps (u , c 2 ),..., ps (u , c|C | ) > (6)
Constraint Hybrid Trip-Mine
1.Departure Time Recommended Trip
2.Start Location Score Fusion For a user u and a package p , the User-based package Score
1.Attraction
3.Budge Constraint
4.Time Constraint 2.Package
Score Estimation
by RBP-P, denoted as USRBP-P(u, p), is defined as (7).
5.Temporal Preference
Score Bound Tightening
ps (u , a.cat ) × ST ( a )
US RBP − P (u , p ) = ¦ × p
∀a∈ p ¦ ST ( a' )
∀a '∈ p
(7)
Figure 2. System framework of PATR.
b) Friend-based CF with Package (FCF-P)
B. Attraction and Package Score Inference User-based package score may be inferred by the concept
For each attraction/package, PATR needs to understand of Collaborative Filtering (CF). Hence, the first proposed CF-
that how interesting the attraction/package is for a specific based strategy is Friend-based CF with Package (FCF-P) that
user at a specific time before recommending the trip. To considers the friendship information as social links to infer the
measure the attraction/package score, we propose an inference user-based package score. The basic idea of FCF-P is to infer
model which consists of two aspects: 1) User-based the score of a package for a user by considering the check-in
attraction/package Score (US), which is to measure how behaviors of the user’s friends. Note that the friendship
interesting the attraction/package is for a specific user. We between users can be collected directly from Gowalla. The
propose four strategies to infer US; and 2) Temporal-based User-based package Score by FCF-P, denoted as USFCF-P(u,
attraction/package Score (TS), which is to measure how p), is defined as (8), where FU indicates the set of all the
suitable users visit the attraction/package at a specific time. friends of u. The functions chk(u, a) and fri(u, v) equal to 1 if
Different attractions may have different suitable time periods. user u has checked-in attraction a and user u and v exist a
We propose a strategy to infer TS. Finally, US and TS are friend relationship, respectively. Otherwise, the two functions
fused by a user-specific weight parameter as the final score are equal to 0.
including Attraction Score (AS) and Package Score (PS).
chk (v, a ) × ST (a )
¦ ¦ × p × fri(u, v)
1) User-based Attraction/Package Score Inference ∀v∈FU ∀a∈ p ¦ ST (a' ) (8)
To infer the user-based attraction score US(u, a), we apply US FCF − P (u , p ) =
∀a '∈ p

our previously proposed strategies [12] namely Ranking-By- ¦ fri(u, v)


Preference (RBP), Friend-based CF (FCF), Similarity-based ∀v∈FU

CF (SCF) and Similarity-Friend-based CF (SFCF), where u c) Similarity-based CF with Package (SCF-P)


and a be a user and an attraction, respectively. To infer the
user-based package score US(u, p), we extend our previous As mentioned above, the preference vector of user u PV(u)
idea to propose four novel strategies named Ranking-By- can be obtained by considering the personal check-in history
Preference with Package (RBP-P), Friend-based CF with of u. We can further use PV(u) to calculate the similarity
Package (FCF-P), Similarity-based CF with Package (SCF-P) between u and other users. Let u and v be two users, the
and Similarity-Friend-based CF with Package (SFCF-P), similarity between u and v can be measured by the cosine
where p be a package. The goals of user-based attraction and similarity measurement as shown in (9). Based on the
package score inference are to calculate US(u, a) and US(u, p), similarity, the second proposed CF-based strategy is
respectively. Similarity-based CF with Package (SCF-P) that considers the
user similarities as social links to infer the user-based package
a) Ranking-By-Preference with Package (RPB-P) score. The User-based package Score by SCF-P, denoted as
We think that there is a strong relationship between user USSCF-P(u, p), is defined as (10), where SU indicates the set of
preferences and attraction categories [12]. For a user u and a top-m similar users of u.
package p, the idea of Ranking-By-Preference with Package
¦
|C |
ps(u , c i ) × ps (v, ci )
(RBP-P) is to accumulate the total preference score of each sim( PV (u ), PV (v)) = i =1
(9)
¦ ¦
|C | |C |
attraction category in p. Besides, a package may have its own ps(u , ci ) 2 × ps (v, ci ) 2
i =1 i =1
main attraction or topic. We consider that the more important
the attraction is, the much stay time hold in the package.

37
chk(v,a)× ST(a) TS ( p ) = ¦ ∀a∈ p TS (a, AT (a, p))
¦ ¦ × p × sim(u,v) (15)
∀v∈SU ∀a∈ p ¦ ST(a')
∀a'∈ p (10)
US SCF − P (u , p) = For an attraction or a package, we have obtained the user-
∀v∈SU
¦ sim(u,v) based score and temporal-based score. In the next section, we
will describe how to fuse them when a user wants to plan a trip.
d) Similarity-Friend-based CF with Package (SFCF-P)
The last proposed CF-based strategy is Similarity-Friend- C. Trip Recommendation
based CF with Package (SFCF-P) which is the fusion of FCF- In PATR, the main goal of online query stage is to
P and SCF-P. In SFCF-P, user-based package score is inferred recommend the trip for users. This stage consists of three main
based on not only similar users but also friends. The User- processes: 1) Score Fusion. For an attraction or a package, the
based package Score by SFCF-P, denoted as USSFCF-P(u, p), is user-based score and temporal-based score are fused by a user-
defined as (11), where λ (limited from 0 to 1) is used to specific temporal factor; 2) Trip Planning. We propose the
control the weight between similarity and friendship. Hybrid Trip-Mine algorithm to plan trips under the user-
specific budget constraint and time constraint; and 3) Pruning
chk (v, a) × ST (a)
¦ ¦ × p × simfri(u, v) Strategy. Based on the proposed algorithm, we propose two
∀v∈SU ∀a∈ p ¦ ST (a' )
∀a '∈ p
pruning strategies for improving the computational efficiency.
US SFCF − P (u, p ) = (11)
¦ simfri(u, v) 1) Score Fusion
∀v∈SU After obtaining the user-based score and temporal-based
where simfri (u, v) = λ × sim(u , v) + (1 − λ ) × fri (u, v) score of each attraction and package, the main purpose of
PATR is to recommend a trip which not only satisfies user
2) Temporal-based Attraction/Package Score Inference preference but also match the suitable visiting time. Hence, the
In real travel experiences, different attractions should have two types of scores are fused as the final Attraction Score and
different suitable visiting time periods. For example, Times Package Score, denoted as AS(u, a, t) and PS(u, p) and defined
Square in New York is more suitable to be visited in the as (16) and (17), which represent the score of attraction a and
evening. Hence, we apply our previous strategies [12] to define package p for user u at time period t, respectively. The
24 Temporal-based attraction Scores (TS), denoted TS(a, t), in parameter α is set by users to control the weight between user-
a day, where t indicates a specific time period. TS is considered based score and temporal-based score.
from two aspects: 1) What time is more suitable for visiting this
attraction? This strategy is called Normalized-By-Time (NBT). AS (u , a, t ) = α * TS ( a, t ) + (1 − α ) * US (u , a ) (16)
The Temporal-based attraction Score by NBT, denoted as
PS (u , p ) = α * TS ( p ) + (1 − α ) *US (u , p ) (17)
TSNBT(a, t), is defined as (12), where |D”(a, t)| represents the
number of check-ins of attraction a during time period t, and T TABLE I. TEMPORAL-BASED SCORE AND USER-BASED SCORE.
indicates 24 time periods. 2) Which attractions are popular Temporal-based Score User-based Score
now? This strategy is called Normalized-By-Attraction (NBA). … 10:00 11:00 12:00 … Tom Amy Darren …
The Temporal-based attraction Score by NBA, denoted as {a1} … 0.5 0.6 0.6 … {a1} 0.5 0.7 0.6 …
TSNBA(a, t), is defined as (13), where A indicates the set of all
{a2} … 0.2 0.3 0.3 … {a2} 0.7 0.2 0.4 …
attractions.
{a3} … 0.8 0.6 0.6 … {a3} 0.5 0.5 0.6 …
|D" (a, t )| {a4} … 0.9 0.8 0.9 … {a4} 0.3 0.8 0.7 …
TS NBT (a,t ) = (12)
arg t'∈T Max(| D" (a, t ' ) |) {a5} … 0.6 0.8 0.8 … {a5} 0.6 0.5 0.8 …

|D" (a, t )| {p1} 0.9 {p1} 1.3 0.3 0.5 …


TS NBA (a,t ) = {p2} 1.0 {p2} 1.0 0.5 0.7 …
arg a'∈A Max(| D" (a' , t ) |) (13) {p3} 1.4 {p3} 1.2 0.6 0.3 …
where D" (a, t ) = {ci | ci.a = a and ci.t = t , ∀ci ∈ D}
TSNBT and TSNBA are fused by the harmonic mean since we TABLE II. AN EXAMPLE OF TRIP.
want both of the two values are relative high. Therefore, we
Trip Element Duration Score
can ensure that time period t is really suitable for visiting au 10:00
attraction a and attraction a is suitable for visiting during time a3 10:20-10:50 (0.5)*0.8+(1-0.5)*0.5 = 0.65
period t. The final TS(a, t) can be defined as (14). p1 13:00-14:15 (0.5)*0.9+(1-0.5)*1.3 = 1.1

TS (a, t ) × TS NBA (a, t ) au 14:45


TS (a, t ) = 2 × NBT (14)
TS NBT (a, t ) + TS NBA (a, t ) 2) Trip Planning
As mentioned above, each attraction should have its own To clearly explain our design, we use Fig. 1 as an example.
suitable visiting time periods. So we can also apply TS(a, t) to Suppose that Tom is a tourist and he has 300 minutes and 70
calculate Temporal-based package Score, denoted TS(p). For a USD to travel. He starts from au at 10:00 and a1 to a5 represent
package p, because the beginning time and arrived time of five different attractions and p1 to p3 represent three different
every attraction a in p, denoted AT(a, p), are constant, TS(p) packages. Each attraction/package has the stay/duration time
can be defined as (15). and cost. For two attractions, the approximate travel time is
provided. The User-based Score and Temporal-based Score

38
tables are shown as Table I. Tom sets the preference weight of permutation. We put all valid trips generated from candidate
temporal-based score α as 0.5. Table II shows a sample trip. 2-sets into VT. There are 31 candidate 3-sets. By checking
This trip starts from au, travels to a3, p1 and finally backs to au. their trip cost, there are many invalid candidates, e.g., {a3, a5,
The third column of Table II shows the score of each p2}, {a3, a4, a5}, {a4, a5, p3}, etc. Hence, these candidates can
attraction and package. Tom arrives a3 at 10:20. Based on be pruned. Again, we still need to check the travel time of
Table I, TS(a3, 10:20) is 0.8 and US(Tom, a3) is 0.5. The two these candidate 3-sets. The algorithm stops when no candidate
scores of a3 are fused as 0.65 (α*0.8+(1-α)*0.5). By the same set is generated. Finally, we can obtain a list of trips ranked by
process, the score of p1 is 1.1. The trip score can be calculated the trip score. The optimal trip is <a3, a5, p1> since the trip
as 1.75 by sum of all scores. Besides, we need to check score is the highest from the valid trips set.
whether this trip is valid under Tom’s constraints. The trip
3) Pruning Strategy
time of this trip is TPT(Tom, <a3, p1>) = 285 minutes which
To improve the efficiency of Hybrid Trip-Mine algorithm,
meets Tom’s time constraint. The trip cost of this trip is
we design two pruning strategies to reduce the number of
TPC(<a3, p1>) = 35 USD also satisfies Tom’s budget
candidate generations.
constraint. Therefore, the trip < a3, p1> is valid for Tom.
Candidate 1-Sets. Candidate 2-Sets. Candidate 3-Sets. Candidate 4-Sets. Pruning Strategy 1. Score Estimation (SE). The first
1-Set Time Cost Score 2-Set Time Cost Score 3-Set Time Cost Score 4-Set Time Cost Score pruning strategy is Score Estimation (SE). Hybrid Trip-Mine
{a1} 90 5 0.5 {a1, a2} 150
{a1, a3} 150
20
25
?
?
{a1, a2, a3}
{a1, a2, a4}
?
?
40
50
?
?
{a1, a2, a3, a4}
{a1, a2, a3, a5}
?
?
70
65
?
?
needs a lot of computational cost to check whether a candidate
{a2} 90 15 0.45
trip is valid or not. Hence, we propose this pruning strategy to

{a1, a2, a5} ? 45 ? {a1, a2, a3, p3} 75


{a3} 70 20 0.65
{a4} 150 30 0.6
{a1, p1} 20 {a1, a2, p2} 55 {a1, a2, a4, a5} 75
reduce the number of candidates and further decrease the

{a1, p2} ? 40 ? {a1, a2, a4, p3} 85


{a5} 120 25 0.6
{a1, p3} ? 40 ? {a3, a5, p1} ? 60 ? {a1, a2, a5, p3} 80 number of permutation generations. Recall the goal of this
{p1} 285 15 1.1
{p2} 195 35 1.0
{a2, a3} 130 35 ? {a3, a5, p2} 80 {a1, a3, a4, a5} 80 paper is to recommend the optimal trip meeting multiple

{a3, p1, p2} 70 {a2, a3, a4, a5} 90


{p3} 190 35 1.3 {a5, p3} 60 {a4, a5, p1} 70 constraints and effectively combine packages and attractions.
{p1, p2} 50 {a4, a5, p3} 90 We can use the score of the so far best trip to be a score
{p1, p3} 285 50 2.4 {a4, p1, p3} 80
{p2, p3} 70 {a5, p1, p2} 75
threshold. This threshold can be used to determine that
{a1, a3}
whether the candidate should be pruned or not. For a candidate
{a1, p3} {a1, a2 , a3} {a3, a5, p1} {a1, a2, a3, a5}
Checking. Checking. Checking. Checking. Optimal Trip Checking. set, if the estimated upper bound score of its superset is still
Trip Score
<a1, a3> 1.05
Trip
<a1, p3>
Time Score
Trip Time Score Trip Time Score Trip Time Score lower than the highest score currently, the candidate set would
<a1, a2, a3> 190 1.55 <a3, a5, p1> 285 2.45 <a1, a2, a3, a5> 290 1.85
<a3, a1> 1.2 <p3, a1> 240 1.85
<a1, a3, a2> 210 1.55 <a3, p1, a5> <a1, a2, a5, a3> 310 never join the optimal trip generation. Hence, the candidate set

{p1, p3} <a2, a1, a3> 210 1.55 <a5, a3, p1> <a1, a3, a2, a5> 320 can be prune to reduce the number of candidates, and it does

{p2, p3}
Checking. Checking. <a2, a3, a1> 210 1.55 <a5, p1, a3> … … …
Trip Time Score Trip Time Score <a3, a1, a2> 210 1.7 <p1, a3, a5> not influence the final result.

<p3, p1> 285 2.4 <p2, p3> 2.3 <a3, a2, a1> 190 1.7 <p1, a5, a3>
ST(a1) = 30 50 a4 ST(a4) = 50
Cost = 5 Cost = 30
Figure 3. The process of Hybrid Trip-Mine algorithm. a1

Although a number of studies had been proposed to solve 65

the problem of trip planning, they cannot be directly applied to 30 50


our problem by taking package into count. Hence, we propose 30
a mining algorithm named Hybrid Trip-Mine for solving the
new trip planning problem by integrating attractions and a2 30
au
packages. Hybrid Trip-Mine is designed as an Apriori-based ST(a2) = 30
Current Location
Cost = 15
algorithm. To conveniently explain the strategy of candidate Tc = 300
attraction set generation, we use Fig. 3 as an example. Let VT Bc = 70
be the set of all valid trips. Initially, VT = ∅. We first generate
Figure 4. Lower bound checking for attraction set {a1, a2, a4}.
8 candidate 1-sets and check whether both the trip time and
trip cost satisfy the two constraints. In this example, all of the Before applying Score Estimation pruning strategy, we
candidate 1-sets are valid. Hence, we put all candidate 1-sets first use Lower Bound Checking [13] to calculate the lower
into VT. Next, two valid 1-sets are joined as a candidate 2-set. bound of travel time for each candidate set. Take Fig. 3 and
There are 28 candidate 2-sets. For each candidate 2-set, we Fig. 4 as an example. Suppose that the currently highest score
should do repeated attraction checking, time checking and is 2.4 in the candidate 2-sets and we observe the candidate 3-
budget checking. We observe that {a1, p1} is invalid because set {a1, a2, a4} is valid. As Fig. 4 shows, the sum of stay time
the trip has attraction a1 which is repeated in package p1. for the three attractions is 30+30+50=110. The routes (a4, a1)
Hence, {a1, p1} can be pruned and does not join the candidate and (a1, a2) will be selected as the minimal moving travel time
3-set generation. For the time checking, we need to generate between attractions which is 30+50=80, and the routes (au, a1)
their permutations and check whether any permutation and (au, a2) will be selected as the minimal moving travel time
satisfies the time constraint. A package would split time into between start location to attractions which is 30+30=60.
two segments, and we must to check whether the time of each Finally, the travel time lower bound of {a1, a2, a4} is
segment is valid. Take {a1, p3} as an example. <a1, p3> is not 110+80+60=250. The travel time constraint is 300, and the
valid because AT(p3) > BT(p3), but {a1, p3} cannot be pruned remaining time is 300-250=50. In Fig. 1, we observe the
since <p3, a1> is valid. Another case is {p2, p3}, the time of minimum stay time of attraction is 30. Hence, we estimate the
these package is overlap, i.e., AT(p3) > BT(p3). For these remaining time can add ‫ہ‬50 / 30‫ = ۂ‬1 attraction at most. As Fig.
candidate 2-sets, we need to calculate the trip score of every 5 shows, the maximum score of an attraction can be combined

39
from the highest temporal-based score and user-based score. from Gowalla and the experimental settings. Next, the
Hence, the maximum score of an attraction is 0.7. Finally, {a1, experimental evaluations are divided into two parts: 1)
a2, a4} can be pruned since the estimated upper bound score of Comparison of effectiveness by various package scoring
its superset is at most 1*0.7+0.55+0.5+0.6=2.35. strategies; and 2) Comparison of efficiency by various trip
Temporal-based Attraction Score User-based Attraction Score
planning methods. All the experiments were implemented in
Tom Amy Darren … Maximum Score
Java on an Intel i7 CPU 3.40GHz machine with 8 GB of
… 9:00 10:00 11:00 12:00 …
{a1} … 0.5 0.5 0.6 0.6 … {a1} 0.5 0.7 0.6 … {a1} 0.55
memory running Microsoft Windows 7.
{a2} … 0.3 0.2 0.3 0.3 … {a2} 0.7 0.2 0.4 … {a2} 0.5
{a3} … 0.7 0.8 0.6 0.6 … {a3} 0.5 0.5 0.6 … {a3} 0.65 A. Experiment Settings
{a4} … 0.8 0.9 0.8 0.9 … {a4} 0.3 0.8 0.7 … {a4} 0.6 We first describe a real check-in dataset obtained from
{a5} … 0.7 0.6 0.8 0.8 … {a5} 0.6 0.5 0.8 … {a5} 0.7 Gowalla. We collect the dataset from January 2009 to April
{p1} 0.9 {p1} 1.3 0.3 0.5 … {p1} 1.1 (Avg:0.55) 2011. The collected dataset consists of check-in data in New
{p2} 1.0 {p2} 1.0 0.5 0.7 … {p2} 1.0 (Avg:0.5) York (NY) and other cities. For the whole dataset, the number
{p3} 1.4 {p3} 1.2 0.6 0.3 … {p3} 1.3 (Avg:0.65) of users, attractions and attraction check-ins are 8,268,
447,409 (15,616 in NY) and 2,181,978 (271,228 in NY),
Figure 5. Maximum score estimation of each attraction.
respectively. Each user averagely checks-in 32.8 different
1-Set
Maximum
1-Set
Maximum attractions in NY and 231.1 different attractions in other cities.
Score Score
{a1} 0.55 {a5} 0.7 {a5} {p3} … {a3} … {p2} {a2} Besides, we collect packages from travel websites and our
{a2} 0.5 {p3} 0.65 (Avg)

developed web system [12]. We regard the average check-ins
{a3} 0.65 {a3} 0.65
{a5 p3} {a3 a4}{a3 p1}{a3 a1} … {p2 a2} of all attractions in each package as its rating. For a package,
{a4} 0.6 {a4} 0.6
we call a user checks-in it if at least one attraction in the

{a5} 0.7 {p1} 0.55(Avg) Maximum


{p1} 0.55 (Avg) {a1} 0.55 Maximum {a3 a4 p1} … score=0.5 package had been checked-in by the user. The number of
score=0.7
{p2} 0.5 (Avg) {p2} 0.5 (Avg) Maximum packages and package check-ins are 40 and 71,751,
{p3} 0.65 (Avg) {a2} 0.5 score=0.65
respectively. Each user averagely checks-in 8.68 different
(a) Candidate 1-sets (b) Score Sorting (c) Candidate Set Generation packages in NY. For each attraction, the stay time and
attraction cost are estimated based on the category of
Figure 6. Score Bound Tightening.
attraction. For each package, due to our system is proposed to
Pruning Strategy 2. Score Bound Tightening (SBT). recommend a trip within a day, so we need to split each
According to the Score Estimation, a candidate can be pruned package into several short packages and simulate cost and
if its estimated upper bound score is lower than the currently time of them. The stay time and package cost are estimated
highest score. Hence, the more accurate upper bound score we based on the categories of included attractions in the package
estimate, the more number of candidates can be pruned. To with few discount (default discount is 70% and 5 USD fee).
perform this idea, the second pruning strategy we propose is The travel time between attractions is proportionally estimated
named Score Bound Tightening (SBT). Due to Hybrid Trip- based on the distance of them. Both the user-based score and
Mine is designed as a breadth-first search and level-by-level temporal-based score are inferred by our proposed strategies.
algorithm, every 1-set can be viewed as the root of a subtree as
Fig. 6(c) shows. Based on the candidate generation strategy, B. Effectivenessof Package Scoring
each 1-set just combine with the following 1-sets as a To compare the effectiveness of various package scoring
candidate 2-set, e.g., in Fig. 6(a), {a3} just combine with {a4}, strategies, we equally divide users into 5 groups (4 groups of
{a5}, etc. We observe that candidate sorting based on the users for training and 1 group of users for testing). For the
maximum score benefits Hybrid Trip-Mine to tighten the 80% training users (4 groups), the preference vectors are
estimated upper bound score. As Fig. 6(b) shows, the extracted by their whole check-in data. For the 20% testing
candidate 1-sets are sorted according to their maximum score. users, we remove their check-ins in NY and extract their
Although the global maximum score of all attractions is 0.7, preference vectors by their check-in data in other cities. The
we can use 0.65 as a maximum score of an attraction when similarity between training users and testing users can be
generating all of the candidate n-sets start with {a3} since {a3} further calculated based on these preference vectors. And the
never combine with {a5}, where n  2. Hence, Score scores of packages in NY can be inferred for testing users
Estimation can obtain more accurate upper bound score of based on the similarity or friendship between users. Hence,
candidate n-sets. {p2} is another example. {p2} only need to these scores can be used to form a ranking list of packages in
consider {a2} when generating candidate 2-sets. Thus, the NY to the testing user. The effectiveness of recommendations
maximum score can be further tightened to 0.5 by this strategy. can be measured by comparing the ranking list with ground
The lower the estimated upper bound score, the more truth. We compare our proposed strategies RBP-P (Ranking-
candidates can be pruned. Thus, this strategy can significantly By-Preference with Package), FCF-P (Friend-based CF with
improve the efficiency of the optimal trip discovery. Package), SCF-P (Similarity-based CF with Package) and
SFCF-P (Similarity-Friend-based CF with Package) with
V. EXPERIMENTAL EVALUATIONS Random recommendation and the three strategies HITS-P,
In this section, we conduct a series of experiments to RBF-P and RBC-P we adjust from [19] in terms of two well-
evaluate the effectiveness and efficiency of our proposed known measurements, namely NDCG (Normalized
PATR. We first describe the experimental real data crawled Discounted Cumulative Gain) [9] and precision. NDCG is
used to evaluate the performance of a ranking result and

40
compute the ranking result relative to the ideal ranking list. C. Efficiency of Trip Planning
For each package, we take the average check-in frequency as We conduct an experiment for comparing the efficiency of
rating. The DCG (Discounted Cumulative Gain) is computed our proposed algorithms and pruning strategies on Gowalla
as (18), where reli is the average check-in frequency of and package datasets. We compare the efficiency of our
package at rank i in the recommended ranking list. Given the proposed fundamental Hybrid Trip-Mine to Hybrid Trip-Mine
ideal discounted cumulative gain IDCGp, then NDCG at p-th with pruning strategies, namely Score Estimation (SE) and
position can be computed as (19). Score Bound Tightening (SBT), in terms of execution time and
reli memory utilization.
DCG p = rel1 + ¦i =2
p
(18)
log 2 i
120 700
DCG p

Execution Time(sec.)
NDCG p = (19) 100 600

Memory(MB)
IDCG p 80
500
400
60
precision is often used to measure the prediction performance. 40
300
We examine the precision of recommended ranking list in our 200
20 100
experiments. The precision at p-th position can be computed 0 0
as (20), where TPp and FPp indicate the p-th package in the

200
300
400
500
600
700
800
900
1000

200
300
400
500
600
700
800
900
1000
ranking list that the testing user has ever checked-in one of
Number of Attractions (nA) Number of Attractions (nA)
attractions in package and the user has never checked-in any (a) Execution Time (b) Memory Utilization
attraction in package, respectively.
Figure 8. Impact of number of attractions.
TPp
precisionp = (20)
TPp + FPp 1) Impact of Number of Attractions nA
This experiment analyzes the impact of the two pruning
Fig. 7 shows that our proposed SFCF-P outperforms strategies, namely SE and SBT, based on Hybrid Trip-Mine
various package recommendation strategies in terms of NDCG when the number of attractions (nA) is varied from 200 to
and precision under rank 5, 10 and 15. As the results show, 1,000 in terms of execution time and memory utilization. The
both of the NDCG and precision of Random recommendation number of packages is fixed 40. Fig. 8(a) and 8(b) show that
strategy are very low. RBC-P, RBF-P and HIS-P perform well Hybrid Trip-Mine + SE + SBT outperforms the fundamental
since popular travel packages usually contain popular Hybrid Trip-Mine with varied the number of attractions. We
attractions and attract most of people. However, they do not observe that the execution time and memory utilization
consider user preference. Hence, the packages scores increase significantly with the number of attractions increases.
calculated by these methods may not be satisfied by everyone. The reason is that the number of possible trips exponentially
For our proposed strategies, RBP-P is not good since RBP-P increases when the number of attractions increases. Due to our
only considers the categories and stay time of attractions in problem is to recommend the optimal trip, our proposed
packages. Hence, RBP-P cannot differentiate which package is algorithms need more execution time and memory utilization
better when there are several packages consisting of to generate and check the more candidate sets consist of
attractions with the same category and similar stay time. FCF- attractions and packages.
P is not good too since users’ friends may have different
interests. We can observe that SCF-P is better than RBC-P,
1400
RBF-P and HIS-P in terms of NDCG and precision. The 800
Execution Time(sec.)

700 1200
reason is that SCF-P captures the user preferences based on
Memory(MB)

600 1000
the user check-in similarity. Similar users may have more 500 800
similar interests. Finally, SFCF-P outperforms SCF-P and 400
600
300
FCF-P since SFCF-P considers the check-in similarity and 200 400
friendship together for trip recommendation. 100 200
0 0
240
280
320
360
400
440
480
520
560
600

240
280
320
360
400
440
480
520
560
600

NDCG@5 NDCG@10 NDCG@15 P@5 P@10 P@15


0.5 0.5 Time Constraint (cT) Time Constraint (cT)
0.45 0.45 (a) Execution Time (b) Memory Utilization
0.4 0.4
0.35 0.35 Figure 9. Impact of travel time constraint.
0.3 0.3
0.25 0.25 2) Impact of Travel Time Constraint cT
0.2 0.2
0.15 0.15 This experiment analyzes the impact of the two pruning
0.1 0.1 strategies, namely SE and SBT, based on Hybrid Trip-Mine
0.05 0.05 when the travel time constraint (cT) is varied from 240 to 600
0 0
minutes in terms of execution time and memory utilization.
SFCF-P
SCF-P
FCF-P
RBP-P
RBC-P
RBF-P

HITS-P
Random
SFCF-P

RBP-P

RBF-P

HITS-P
Random
SCF-P
FCF-P

RBC-P

Fig. 9(a) and 9(b) show that Hybrid Trip-Mine + SE + SBT


NDCG@k precision@k
outperforms the fundamental Hybrid Trip-Mine with varied
the travel time constraint. We observe that the execution time
Figure 7. NDCG and precision of various scoring strategies. and memory utilization significantly increase by increasing the

41
travel time constraint. The reason is that the number of valid utilization. For the future work, we will design more efficient
trips increases, when the travel time constraint increases. optimal trip recommendation approaches and optimization
Hence, our proposed algorithms need to spend more time to strategies to further enhance the performance of trip planning.
discover the best one among all valid trips. In addition, we plan to consider more useful factors in detail
of packages and collect more real dataset to design more
3) Impact of Budget Constraint cB accurate model and verify our proposed approaches.
This experiment analyzes the impact of the two pruning
strategies, namely SE and SBT, based on Hybrid Trip-Mine
when the budget constraint (cB) is varied from 0 to 180 USD ACKNOWLEDGMENT
in terms of execution time and memory utilization. Fig. 10(a) This research was supported by National Science Council,
and 10(b) show that Hybrid Trip-Mine + SE + SBT Taiwan, R.O.C. under grant no. NSC101-2221-E-006-255-
outperforms the fundamental Hybrid Trip-Mine with varied MY3.
the budget constraint. We observe that the execution time
significantly increase at first while keep steady after the REFERENCES
budget constraint higher than 140 USD. The reason is that the
[1] J. Bao, Y. Zheng and M. F. Mokbel, "Location-Based and Preference-
number of candidate sets will be bounded by the number of Aware Recommendation Using Sparse Geo-Social Networking Data," in
attractions and the travel time constraint. Hence, the number ACM GIS, pp. 199-208, 2012.
of valid trips will not increase even if we keep increasing the [2] B. Berjani and T. Strufe, "A Recommendation System for Spots in
budget constraint. Location-Based Online Social Network," in SNS, 2011.
[3] Y. Ge, Q. Liu, H. Xiong, A. Tuzhilin and J. Chen, "Cost-Aware Travel
5 Tour Recommendation," in KDD, pp. 983-991, 2011.
600
Execution Time(sec.)

500
[4] Gowalla, http://gowalla.com
4
Memory(MB)

400 [5] T. Horozov, N. Narasimhan and V. Vasudevan, "Using Location for


3 Personalized POI Recommendations in Mobile Environments," in
300 SAINT, pp. 124-129, 2006.
2
200
[6] H.-P. Hsieh and C.-T. Li, "Constructing Routes with User Preference
1 100 from Location Check-in Data," in ACM UbiComp, pp. 195-198, 2013.
0 0 [7] H.-P. Hsieh, C.-T. Li and S.-D. Lin, "Exploiting Large-Scale Check-in
0
20
40
60
80
100
120
140
160
180

0
20
40
60
80
100
120
140
160
180

Data to Recommend Time-Sensitive Routes," in ACM UrbComp, pp. 55-


Budget Constraint (cB) Budget Constraint (cB) 62, 2012.
(a) Execution Time (b) Memory Utilization [8] Y. Huang and L. Bian, "A Bayesian Network and Analytic Hierarchy
Process Based Personalized Recommendations for Tourist Attractions
Figure 10. Impact of budget constraint. over the Internet," ESWA, 36(1), pp. 933-943, 2009.
[9] K. Järvelin and J. Kekäläinen, "Cumulated Gain-Based Evaluation of IR
Techniques," ACM TOIS, 20(4), pp. 422-446, 2002.
VI. CONCLUSIONS AND FUTURE WORK
[10] J. Kim, H. Kim and J.-H. Ryu, "TripTip: A Trip Planning Service with
In this paper, we have proposed a novel approach named Tag-based Recommendation," in CHI, pp. 3467-3472, 2009.
Package-Attraction-based Trip Recommender (PATR) for not [11] Q. Liu, Y. Ge, Z. Li, E. Chen and H. Xiong, "Personalized Travel
only efficiently recommending the personal trips but also Package Recommendation," in ICDM, pp. 407-416, 2011.
effectively combining attractions and packages with several [12] E. H.-C. Lu, C.-Y. Chen and V. S. Tseng, "Personalized Trip
constraints. In PATR, we have proposed a Score Inference Recommendation with Multiple Constraints by Mining User Check-in
model to infer the scores of attractions and packages with both Behaviors," in ACM GIS, pp. 209-218, 2012.
user-based preference and temporal-based properties [13] E. H.-C. Lu, C.-Y. Lin, and V. S. Tseng, "Trip-Mine: An Efficient Trip
Planning Approach with Travel Time Constraints," in MDM, pp. 152-
considered. Then, we have proposed the Hybrid Trip-Mine 161, 2011.
algorithm for efficiently trip planning. Finally, we have further [14] P. Vansteenwegen, W. Souffriau, G.V. Berghe and D.V. Oudheusden,
proposed two pruning strategies: 1) Score Estimation (SE) for "The City Trip Planner: An Expert System for Tourists," ESWA, 38(6),
evaluating the upper bound score of each candidate and pp. 6540–6546, 2011.
reducing the number of candidates. 2) Score Bound Tightening [15] M. Xie, L. V.S. Lakshmanan and P. T. Wood, "Breaking out of the Box
(SBT) for precipitately pruning the candidates with more of Recommendations: From Items to Packages," in ACM RecSys,
accurate upper bound of trip scores. To the best of our pp.151–158, 2010.
knowledge, this is the first work on trip recommendation that [16] M. Ye, P. Yin, W.-C. Lee and D.-L. Lee, "Exploiting Geographical
integrates attractions and packages simultaneously with Influence for Collaborative Point-of-Interest Recommendation," in
SIGIR, pp. 325-334, 2011.
consideration of multiple user-specific constraints.
[17] H. Yin, C. Wang, N. Yu and L. Zhang, "Trip Mining and
To evaluate the performance of the proposed approaches, Recommendation from Geo-tagged Photos," in ICMEW, pp. 540-545,
2012.
we have conducted a series of experiments based on the real
dataset Gowalla. The experimental results show that the [18] Q. Yuan, G. Cong, Z. Ma, A. Sun and N. Magnenat-Thalmann, "Time-
Aware Point-of-Interest Recommendation," in SIGIR, pp. 363-372,
proposed algorithm Hybrid Trip-Mine achieves significantly 2013.
high efficiency in trip planning. Besides, Hybrid Trip-Mine [19] Y. Zheng and X. Xie, "Learning Travel Recommendations from User
integrates the two proposed pruning strategies to achieve Generated GPS Traces," ACM TIST, 2(1), 2011.
superior performs in terms of execution time and memory

42

You might also like