From: AAAI Technical Report SS-99-06. Compilation copyright © 1999, AAAI (www.aaai.org). All rights reserved.
A Route Advice Agent that Models Driver Preferences
Seth Rogers and Claude-Nicolas Fiechter and Pat Langley
DaimlerChrysler Research and Technology Center
1510 Page Mill Road, Palo Alto, CA 94304–1135
+1 650 845 2533/2504/2532
frogers, fiechter,
[email protected]
Abstract
Generating satisfactory routes for driving is a challenging task because the desirability of a particular route depends on many factors and varies from driver to driver. Current route advice systems present a single route to the driver based on static evaluation criteria, with little or no recourse if the driver finds this solution unsatisfactory. In this paper, we propose a more flexible approach and its implementation in the Adaptive Route Advisor. Our route advice agent interacts with the driver to generate routes that he or she finds satisfactory, uses these interactions to build a model of the driver’s preferences, and then uses the model to generate better routes for that driver in future interactions. As the preference model becomes more accurate, the need for interaction decreases and the agent’s autonomy increases. We also present a pilot study on using route selections to construct a personalized model of driver preferences.
Introduction¹
The state of the art in computer technology has advanced to the point where systems for generating driving directions between two points are commonplace. There are several web sites offering street-level driving directions, and several in-car systems available as an option on purchased or rented cars. The availability of vector road map digital representations (digital maps) and large-capacity, fast processors enables this technology, but little attention has been paid to ensuring that the interface is flexible enough to deliver satisfactory routes to users who have different preferences.
Current systems for route advice compute solutions using a shortest path algorithm to find the minimal-cost route from the origin to the destination. Some systems fix the cost as the estimated travel time, while others allow the user to choose among the shortest path, the quickest, or the “most scenic” one. In all cases, the system then describes the route to the user with little or no recourse if the driver finds the route unsatisfactory. These systems disregard the fact that driving occurs in a rich environment where many factors influence the desirability of a particular route. For example, some drivers may prefer the shortest route as long as it does not have too many turns, or the fastest route as long as it does not go on the highway. The relative importance of these factors varies among individuals, and drivers may not know themselves what they value most in routes.
In this paper, we describe the Adaptive Route Advisor, an adaptive user interface (Langley 1997) agent that recommends routes from a source address on a road network to a destination address. Given a routing task, the Route Advisor interacts with the driver to generate a route that he or she finds satisfactory. Initially, the agent proposes a small number of possible routes, taking into account the driver’s preferences if known. The driver can then request solutions that differ from the initial options along some dimension. For instance, the driver may request a route that is simpler, even if it means a longer route. The driver and the Route Advisor continue to work together, generating and evaluating different solutions, until the driver is satisfied. During this process, the agent unobtrusively collects information about the driver’s choices and uses these data to refine its model of the driver’s preferences for the next task.
If we define the level of autonomy as the number of interactions with a user, with zero interactions implying total autonomy, the user adjusts the autonomy of the Adaptive Route Advisor by continuing to request routes until satisfied. Since the system bases its initial route suggestion on the user model, the user’s satisfaction with the initial route depends on the accuracy of the model. Ideally, after a reasonable number of interactions, the agent’s user model will be accurate enough that in most cases the driver will be satisfied with the proposed routes and no additional interactions will be required. This is particularly important in a driving environment where the demand on the driver’s attention must be limited.
The Adaptive Route Advisor is designed for in-car use. It is a Java application that functions as a resource-light network client, suitable for mobile environments with a wireless communication infrastructure. The remote servers provide resource-intensive functions such as routing and geolocation.² Although the current version does not yet take advantage of information available from mobile deployment (primarily current and past locations from the Global Positioning System), and the interface is not fully optimized for the limited input and output resources common in vehicles, we intend to integrate the Adaptive Route Advisor more fully into a mobile environment in future work.
¹ An HTML version of this paper with links is available on the web at http://pc19.rtna.daimlerchrysler.com/~rogers/ss-99.
² Geolocation is mapping a plain English street location to its place in a digital map structure.
[Figure 1 diagram. Labels: Interface Client; Route Server; User Model; Digital Map; GPS; Current Traffic Conditions; Initial Preferences; Start & End Location; Short-term preferences; Route options, tradeoffs; Route selection; Actual Driven Routes; Links, features.]
Figure 1: Architecture for the Adaptive Route Advisor. Elements with solid lines are already implemented, whereas elements
with dashed lines are under development.
The pages that follow describe our approach in more detail and present the results of an experiment in personalizing the user model from rankings of routes generated from
static preference models. First we present the overall system architecture, including the route generation component,
the adaptation method the Route Advisor uses to personalize the user preference model, and the user interface that
presents route options to the user and gathers preference
feedback. We then report on an experiment adapting a preference model to human subjects and its results. Next, we
present our approach to handling hidden attributes and outline planned improvements to the agent. Finally, we summarize the Adaptive Route Advisor and describe its relevance
to more general problems.
System Architecture
The Adaptive Route Advisor requires heavy memory resources to store the digital map, and processing resources to
compute an optimal route. However, we assume computational resources in vehicles will be limited in the near future.
The client/server architecture shown in Figure 1 resolves this
difficulty by offloading resource-intensive processes onto a
remote server. This architecture also lets the routing system
use dynamic information about the current traffic conditions,
which would be available as a centralized service, as in the
ITGS service in Tokyo (Hadfield 1997).
In the figure, portions in the current implementation are
drawn in solid lines, and planned extensions in dashed lines.
The interface client is a resource-light process suitable for
a vehicle’s limited computational power. It connects to the
servers via a wireless TCP/IP connection. The route server
receives route requests from the client, and uses the digital
map to compute an optimal route according to preferences in
the user model. Later route requests may include short-term
changes in the preference model to reflect unusual situations
or corrections in the model.
The system initializes the agent with a default user model,
and refines the user model with feedback from interaction
with the interface. Future versions will also allow feedback
from direct sensing of the driver’s preferred routes using the
Global Positioning System. The future work section discusses these extensions in more detail.
The Routing Algorithm
The generative component of the Adaptive Route Advisor is
a routing algorithm that plans a path through a digital map
from a starting point to a destination. The planner represents
the digital map as a graph, where the nodes are intersections
and the edges are parts of roads between intersections. Our
digital maps provide four attributes for each edge: length,
estimated driving time, turn angle to connected edges, and
road class (e.g., highway, freeway, arterial road, local road).
The planner refers to these digital maps to minimize the
weighted sum of the driving time, length, number of turns,
number of intersections, and driving time on each road class.
The routing algorithm finds a path from a designated
source node, usually the current position, to a designated
destination. The cost of an edge is computed as a weighted
sum of its attributes,
$$c = \sum_i w_i \, a_i .$$
The weight vector plays the role of a user preference model
that defines the relative importance of the attributes. The
system uses an optimized version of Dijkstra’s shortest path
algorithm (Dijkstra 1959) to find the path with the minimal
sum of the costs for the edges in the path.
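As a rough illustration of the cost function and search described above, the following Python sketch runs a textbook Dijkstra search over a toy graph. The graph layout, attribute names, and function names are assumptions made for this example and are not the Route Advisor's actual data structures. In particular, the sketch treats turns as a per-edge count, whereas the digital map stores turn angles between connected edges.

```python
import heapq


def edge_cost(weights, attributes):
    """Weighted sum c = sum_i w_i * a_i over one edge's attributes."""
    return sum(weights[name] * value for name, value in attributes.items())


def cheapest_route(graph, weights, source, destination):
    """Plain Dijkstra search; returns (total_cost, list_of_nodes).

    graph maps node -> list of (neighbor, attributes) pairs.
    Assumes every edge cost is non-negative.
    """
    frontier = [(0.0, source, [source])]
    settled = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == destination:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, attributes in graph.get(node, []):
            if neighbor not in settled:
                step = edge_cost(weights, attributes)
                heapq.heappush(frontier, (cost + step, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical toy map with two ways to get from A to D.
toy_graph = {
    "A": [("B", {"time": 60, "length": 500, "turns": 1}),
          ("C", {"time": 40, "length": 900, "turns": 0})],
    "B": [("D", {"time": 60, "length": 500, "turns": 1})],
    "C": [("D", {"time": 50, "length": 900, "turns": 0})],
}
time_only = {"time": 1.0, "length": 0.0, "turns": 0.0}
length_only = {"time": 0.0, "length": 1.0, "turns": 0.0}

fastest = cheapest_route(toy_graph, time_only, "A", "D")
shortest = cheapest_route(toy_graph, length_only, "A", "D")
```

Changing only the weight vector changes which route is returned, which is the sense in which the weights act as a preference model.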
Figure 2: The route request window.
Constructing the User Model
Although weighting each edge attribute creates a flexible
cost function for the planner, the space of possible models is
a continuum with as many dimensions as there are attributes.
It would be difficult and inconvenient for a user to specify
his or her relative preference for each attribute. Instead, our system
automatically induces driver preferences from driver route
choices. We have implemented a perceptron-style training
algorithm (Nilsson 1965), which we call differential perceptron, that processes a sequence of interactions with the planner and produces a weight vector that attempts to model the
preferences expressed. In this way, as the driver uses the
interface, it adapts itself to the user’s preferences. This is
reminiscent of the way in which Hermens and Schlimmer’s
system for filling out repetitive forms (Hermens & Schlimmer 1994) adapts itself to a particular usage pattern and predicts default values based on those observed in previous
interactions.
We define an interaction with the planner to be the presentation of a set of N generated routes and feedback from the
user indicating which route is preferable. This is completely
unobtrusive to the user, because he or she evaluates a set of
routes and selects one as part of the route advice process.
For training, we expand the interaction into N − 1 pairs,
representing the fact that the selected route is preferable to
each of the presented alternatives. These training pairs can
be used to improve the user model in a simple manner. If,
out of the two routes in a training pair, the route preferred
by the current user model is not the one the user selected,
the adaptation method adjusts the weights to make the selected route cheaper relative to the other one: it decreases
the weights of attributes on which the selected route has the larger value and increases the weights of attributes on which the other route has the larger value.
More precisely, the system represents each route with a vector $\vec{x}$ containing its measurable attributes. Given an initial
weight vector $\vec{w}$, it estimates the cost of a route to be the linear product $c = \vec{w} \cdot \vec{x}$. If route $\vec{x}_1$ is rated better than route
$\vec{x}_2$ and the cost of $\vec{x}_1$ is lower than that of $\vec{x}_2$, the weights
are consistent and do not need modification. If the cost of $\vec{x}_1$
is higher than that of $\vec{x}_2$, the system applies the differential
perceptron update rule to $\vec{w}$, which decreases the cost of $\vec{x}_1$
and increases the cost of $\vec{x}_2$ using
$$\Delta\vec{w} = \eta \, (\vec{x}_2 - \vec{x}_1),$$
where $\eta$ is the learning rate.
For each pass through all available training data, the learning
algorithm adds $\Delta\vec{w}$ to $\vec{w}$ and continues running through the
training data until the weights stop changing or it has performed a maximum number of iterations. Although the system can update the perceptron on-line after each new training example, the experiment described in the next section
trains on a fixed set of examples.
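The training procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, routes are plain tuples of attribute values, and the corrections from one pass are accumulated and applied at the end of that pass, which is only one possible reading of the description above.

```python
def route_cost(weights, features):
    """Linear cost c = w . x of a route described by its attribute vector."""
    return sum(w * x for w, x in zip(weights, features))


def expand_interaction(selected, alternatives):
    """One choice among N routes yields N - 1 (preferred, rejected) pairs."""
    return [(selected, other) for other in alternatives]


def train_differential_perceptron(pairs, weights, eta=0.001, max_epochs=1000):
    """Repeat passes over the pairs until no pair is misordered.

    For each pair whose preferred route currently costs more than the
    rejected one, accumulate eta * (rejected - preferred) into the update.
    """
    weights = list(weights)
    for _ in range(max_epochs):
        delta = [0.0] * len(weights)
        for preferred, rejected in pairs:
            if route_cost(weights, preferred) > route_cost(weights, rejected):
                for i in range(len(weights)):
                    delta[i] += eta * (rejected[i] - preferred[i])
        if not any(delta):
            break  # weights stopped changing
        weights = [w + d for w, d in zip(weights, delta)]
    return weights


# Hypothetical example with attributes (time in seconds, number of turns).
# The user picked the slower route with fewer turns.
chosen, other = (100.0, 2.0), (90.0, 6.0)
pairs = expand_interaction(chosen, [other])
learned = train_differential_perceptron(pairs, [1.0, 0.0], eta=0.01)
```

Starting from a time-only model, the learned weights put enough cost on turns that the chosen route becomes the cheaper of the two.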
Once the differential perceptron algorithm finds a weight
vector that best predicts preferable routes as a weighted sum
of attributes, the routing algorithm uses this weight vector in
its cost function. Since the routing algorithm is optimal on
the cost function, the resulting route is guaranteed to be the
lowest cost route for that user model among all routes between the same two nodes. In other words, the routes computed are always Pareto optimal, in that there can be routes
that are better along each of the dimensions (attributes) independently but none that can be better simultaneously on
all dimensions.
The Interaction Component
When started, the Route Advisor client locates the servers it
needs and displays a route request screen, like the one pictured in Figure 2. In the current implementation, the user
specifies origin and destination in a postal address style,
and identifies him/herself for the purpose of loading the
user model. An in-car implementation could simplify this
screen by providing the current location as a default starting
point, and the most frequent car driver as a default user identity. The driver could select a destination from a list of that
driver’s most common destinations.
After the user requests a route, the main interaction window appears, as displayed in Figure 3(a). It provides a list of current route options and two menus, “Route” and “Modify.”
The current routes are presented in terms of five attributes:
total time, total distance, number of turns, number of intersections, and total time on highways. Initially the agent
presents two routes to the user. The first route uses the current preference model as the weight vector for the routing
cost function. The second uses novel weights in an attempt
to explore new directions in the space of preference models.
Presenting at least two route options forces the user to
make a choice and provide some feedback to the agent. The
turn directions for the selected route are shown in the field
below the route list and the map displays the selected route,
as in Figure 3(b). Clicking “Select” indicates that the highlighted route is satisfactory and terminates the window. The
route advisor assumes that the highlighted route is preferable
to the alternative routes and updates the user model. Clicking “Cancel” terminates the window but does not update the
model.
Figure 3: Initially, the user is presented with two alternative routes. The best route according to the current user model is highlighted. (a) The route selection window. (b) The map window.
Figure 4: The user generated the third route by selecting the first route and choosing “Shorter” from the “Modify” menu. (a) The route selection window. (b) The map window.
Figure 5: Sample task for the subjects. The starting point is the box at the upper left and the ending point is at the lower right.
A is the route with fewest turns, B is the fastest route, C is the route with fewest intersections, and D is the shortest route.
The “Modify” menu lets the user generate a new route that
is faster, shorter, has fewer turns, has fewer intersections, or
has less highway time than the selected route. The implicit
assumption is that the driver is willing to accept routes that
are somewhat worse on other attributes if he or she can find
one that is better on the selected attribute. This approach to
navigating through the space of possible solutions is similar to “tweaking” in Burke et al.’s FindMe systems (Burke,
Hammond, & Young 1996). In one of those systems, the user can
navigate a database of apartments for rent by asking for an
apartment that is either cheaper, bigger, nicer, or safer than
the one currently displayed.
The Adaptive Route Advisor searches for new routes that
satisfy the improvement request by modifying the weights it
places on attributes, increasing the weight of the selected
attribute, and decreasing the other weights. Since slight
changes in the weight vector may result in the same route,
the system continues modifying the weights until the resulting route is significantly different. For example, Figure 4(a)
shows a shorter route added to the route list. If the user is unsatisfied with all listed routes, the “Route” menu lets the user
generate an entirely new route as different as possible from
all those displayed. The route advisor does this by adding a
“penalty” in the cost function to all segments used by one of
the displayed routes.
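The reweighting search behind the “Modify” menu might look like the following Python sketch. Everything named here is hypothetical: `plan` stands in for a call to the route server, the doubling and halving factor is arbitrary, and the test for a "significantly different" route is simplified to plain inequality. The sketch also assumes all weights start out positive.

```python
def tweak_weights(weights, attribute, factor=2.0):
    """Raise the weight of the requested attribute and lower the others."""
    return {name: w * factor if name == attribute else w / factor
            for name, w in weights.items()}


def request_improvement(plan, weights, attribute, current_route, max_steps=10):
    """Reweight until the planner returns a route other than current_route.

    plan(weights) -> route is a hypothetical planner callback.
    Returns (route, weights), or (None, weights) if no new route appears.
    """
    for _ in range(max_steps):
        weights = tweak_weights(weights, attribute)
        route = plan(weights)
        if route != current_route:
            return route, weights
    return None, weights


# Hypothetical planner that chooses between two precomputed routes.
candidates = {
    "fast": {"time": 90, "length": 1800},
    "short": {"time": 120, "length": 1000},
}


def toy_plan(weights):
    def total(route):
        return sum(weights[a] * v for a, v in candidates[route].items())
    return min(candidates, key=total)


start_weights = {"time": 40.0, "length": 1.0}
new_route, new_weights = request_improvement(
    toy_plan, start_weights, "length", toy_plan(start_weights))
```

Here the starting weights favor the fast route, and asking for a shorter one shifts weight onto length until the planner switches to the short route.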
The interface described above simultaneously and seamlessly fulfills two functions in the Adaptive Route Advisor.
First, it lets the users easily find routes they like by giving them choices and letting them interactively modify the
routes proposed. Second, it unobtrusively collects the information the learning algorithm needs to refine the user model
and adapt to a particular driver.
Experimental Results
In order to test the adaptation algorithm apart from the other
functionality of the Adaptive Route Advisor, we simulated
a series of interactions on paper with human subject evaluations of planner output. The test consisted of 20 tasks that
involved trips between intersections in the Palo Alto area. To
compensate for the lack of interactivity, we produced four
routes for each task instead of two. Since we had no opportunity to build user models, we used exploratory weight
vectors with a unit weight for one attribute and zero for the
rest, creating routes optimized for time, distance, number of
intersections, and number of turns, respectively. We plotted
the four routes, labeled randomly A through D, on a map
of Palo Alto. We presented the tasks in a different random
order for each subject. Figure 5 shows an example of one of
the tasks and its four route choices.
We asked the subjects to evaluate the routes for each task
and rank them in preference order, using 1 for best and 4
for worst. Since a ranking of four routes gives six independent binary preferences (A better/worse than B, C, D; B better/worse than C, D; C better/worse than D), each subject
provided 6 × 20 = 120 training instances.
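The expansion of a ranking into pairwise preferences can be sketched as follows. The function name and the dictionary representation of a ranking are assumptions for illustration, and the sketch assumes a ranking without ties.

```python
from itertools import combinations


def ranking_to_pairs(ranks):
    """ranks maps route label -> rank (1 = best).

    Returns one (better, worse) pair for every pair of routes.
    """
    pairs = []
    for a, b in combinations(sorted(ranks), 2):
        pairs.append((a, b) if ranks[a] < ranks[b] else (b, a))
    return pairs


# Hypothetical ranking of the four routes in one task.
example = ranking_to_pairs({"A": 2, "B": 1, "C": 4, "D": 3})
```

Four routes give six pairs per task, so 20 tasks give the 120 training instances per subject mentioned above.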
We trained the perceptron for 100,000 epochs (learning rate η = 0.001)
for each subject, then looked for some way to compare the
resulting user models. Since the cost of a route is a relative
measure, the relative values of the weights are more informative than the absolute values. We will refer to the ratio of
the weights of two attributes as their exchange rate,
because it defines how much of one attribute a driver is
willing to give up to improve the other. For example, if the exchange rate between time and turn weights is
30, the driver is willing to drive up to 30 seconds longer to
save one turn, but no more. Figure 6 shows the exchange
rates between distance and the other three attributes.
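A small Python sketch makes the exchange-rate computation concrete. The weight values below are invented for illustration, and the direction of the ratio (each attribute's weight divided by the base attribute's weight) is an assumption chosen to match the time and turn example above.

```python
def exchange_rates(weights, base="distance"):
    """Ratio of each attribute's weight to the base attribute's weight.

    With distance in feet as the base, a value of 500 for "turns" means
    one turn costs as much as 500 feet of extra distance.
    """
    base_weight = weights[base]
    return {name: w / base_weight
            for name, w in weights.items() if name != base}


# Hypothetical learned weights (cost per foot, per second, per turn).
weights = {"distance": 0.5, "time": 15.0, "turns": 250.0}
rates = exchange_rates(weights)
```

Under these made-up weights, the driver would accept up to 500 extra feet to avoid one turn and up to 30 extra feet to save one second.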
The results indicate that route preferences differ widely
across people. Some subjects, such as 11 and 16, are apparently willing to go to great distances to improve their
route on some other attribute. Other subjects, such as 9 and
17, would sacrifice other attributes to reduce the distance attribute. The most surprising result is that many subjects have
negative exchange rates. For example, the distance/turns exchange rate for Subject 10 is −1027. This means that, given
two routes A and B, if route A has one more turn than route
B, it will have a lower cost if it is more than 1027 feet longer
than B. Besides being counterintuitive, such weights are inconvenient
to use directly for planning because they mean that
some edges could have a negative cost. We believe these
negative weights come from the bias in the training data toward optimal routes on some attribute. For example, the fact
that drivers prefer shorter routes, other factors being equal,
is not explicitly represented in the training data. Our future work will include using such background knowledge to
eliminate negative exchange rates.
To evaluate the advantage of using personalized models
versus a single fixed model, we also created an aggregate
training set of all 120 × 24 = 2880 instances. Figure 7
compares the accuracy of the personalized model to the aggregate model. As expected, the accuracy of the aggregate
model is poor, hovering around chance (50%). The personalized model is uniformly better than chance and the aggregate model, but still far from perfect. Some possible sources
for this model failure are that people are inherently inconsistent or that our model space does not represent some important attributes in drivers’ route preferences. For example,
people may dislike a certain road or intersection, which affects the rankings for some tasks but not others. Future studies will include additional information about the routes and
measure the subjects’ consistency on redundant tasks.
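The accuracy measure can be illustrated with a short Python sketch that counts how many pairwise preferences a weight vector orders correctly. The data and both weight vectors are invented, the function names are hypothetical, and the sketch omits the ten-fold cross validation used for Figure 7.

```python
def route_cost(weights, features):
    """Linear cost of a route described by its attribute vector."""
    return sum(w * x for w, x in zip(weights, features))


def pairwise_accuracy(weights, pairs):
    """Fraction of (preferred, rejected) pairs the model orders correctly."""
    correct = sum(1 for preferred, rejected in pairs
                  if route_cost(weights, preferred) < route_cost(weights, rejected))
    return correct / len(pairs)


# Hypothetical preferences over routes described by (seconds, turns).
# The first two pairs contradict each other, mimicking an inconsistent subject.
pairs = [
    ((90.0, 6.0), (100.0, 2.0)),
    ((100.0, 2.0), (90.0, 6.0)),
    ((80.0, 1.0), (95.0, 1.0)),
    ((70.0, 3.0), (60.0, 9.0)),
]
generic_model = [1.0, 0.0]   # time only
personal_model = [0.1, 5.0]  # strong dislike of turns

generic_score = pairwise_accuracy(generic_model, pairs)
personal_score = pairwise_accuracy(personal_model, pairs)
```

Because two of the pairs contradict each other, no linear model can score above 75% on this toy data, which mirrors the point that inconsistent judgments limit the achievable accuracy.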
Directions for Future Work
The results of our initial experiment indicate that it is possible to learn a cost function that predicts driver preferences
with reasonable accuracy. More importantly, this cost function serves as a user model for generating routes that will be
satisfactory to the driver. The Adaptive Route Advisor can
be made more powerful and useful through additional work
in five key areas: use of personal attributes, better street descriptions, use of direct driving feedback, a more effective
interface, and better model induction.
One source of error in the experiment was the limited and
impersonal nature of the route descriptors. As Haigh and
Veloso (Haigh & Veloso 1995) note, the descriptor set may
not represent all factors relevant to a driver. In fact, the digital map cannot possibly represent all the attributes that are
important to all individuals. However, an in-car navigation
system is well situated to use personalized attributes, because it can constantly monitor the driver’s behavior using
traces from a Global Positioning System. In particular we
can assume that the routes a person drives are desirable by
that person’s true internal cost function,³ and use information about familiar routes when planning new ones.
The planner could represent familiarity as a binary visited/not visited value for each edge, or it could try to represent the degree of familiarity as a continuous value. However, with an additional assumption that sequences of familiar edges (subroutes) are more desirable than isolated familiar edges, we have developed a familiarity preprocessor (Rogers et al. 1997) that groups sequences of road
edges between commonly used intersections into higher-level links, similar to disjunctive macro-operators. A macro
link between two intersections represents all distinct routes
the driver has used between these intersections. These macro
links are hierarchically organized, with some links recursively incorporating smaller macro links. The largest macro
links represent entire trips, such as the drive from home to
work. Including these macro links affects the planner in
three ways: it uses sequences of familiar edges as primitives, it shortens the edge-by-edge description of the route
by summarizing familiar sequences, and it biases the route
description toward using familiar segments.
We can improve street descriptions by accessing existing
geographic databases and by generating new ones. Current
databases provide information about the location and types
of businesses, as well as demographic information. In future
work, we will generate new geographic databases by collecting and analyzing traces of trips from a Global Positioning System. Analyzing the trajectories of many cars along
the same edge provides average speed models for different
times of day, the location of traffic controls, and number of
lanes. An advantage to the client/server architecture is that
clients can serve as a distributed sensor network to sample
road conditions and provide dynamic updates to the digital
map for more accurate routing. Some possible dynamic attributes include transit time, congestion, and road closures.
Besides interacting with the interface, another form of
feedback comes from observing the routes actually driven.
If the driver does not take the route the user model predicted,
the new route is presumably better than the predicted route,
and this will generate a new instance for the personalization
module. This type of feedback may include more classification noise than direct feedback because there is no direct evidence that the driver liked the route or even that he
or she was not lost. However, if the driver usually follows
routes because of his or her own preferences, the noise should cancel out given sufficient training data. These indirect forms of
feedback are less intrusive than the approach in the experiment reported earlier or the approach used in the Automated
Travel Assistant (Linden, Hanks, & Lesh 1997), where the
user explicitly lists his or her preferences for airlines, airports, and
other plan components.
³ Situations in which this assumption does not hold include
cases where the driver is lost and where he or she is following directions.
[Figure 6 plot. Horizontal axis: Subject Number. Vertical axis: Distance Exchange Value in Feet (−1500 to 1500). Series: 1 Second, 1 Intersection, 1 Turn.]
Figure 6: Exchange rates for three of the attributes with respect to distance. High positive values for an attribute indicate that
shorter distance is less important than reducing that attribute, near zero values indicate that shorter distance is more important,
and high negative values indicate that longer distance is preferred.
[Figure 7 plot. Horizontal axis: Subject Number. Vertical axis: Percent Correct (0 to 100). Series: Individualized, Aggregate.]
Figure 7: Comparison between the accuracy of the personalized models and that of the aggregated model. The accuracy was
computed using a ten-fold cross validation. The error bars mark one standard deviation.
The current user interface is tuned to exhibit the functionality of the agent. To deploy the Adaptive Route Advisor in a
car, we will need to partly redesign the interface to take into
account the limited input and output facilities. For instance,
the menu for modifying the routes might be replaced by a
panel of buttons that the user can activate through a touch
screen. We will also need to evaluate the in-car user interface with drivers to ensure that the capabilities of the route
advisor are easily and intuitively available to drivers.
We are also exploring other inductive methods for adapting the user model, such as regression over the preference
rankings, multi-layer neural networks, and principal components analysis. A critical property of prospective methods is
that the model be able to generate a numeric cost for partial
and complete routes. We are also investigating more flexible
model representations, such as adjusting the weight vector
based on task characteristics. For example, the route advisor may always plan the fastest route to work but a more
leisurely drive home. Results from any method could improve with some background knowledge about the domain
and more relevant attributes for the street descriptions. We
can improve our evaluation by determining the fraction of
modeling errors that are due to driver inconsistency, which
we can measure by including some redundancy in the user
surveys. Our final goal is an agent with a flexible, usable
interface that accurately adapts itself to its user over time.
Conclusions
Route recommendation for automotive domains is a
knowledge-rich problem where the criteria for making decisions and the relative weight of these criteria can be personalized. The Adaptive Route Advisor serves as an intermediary between the driver and the complexity of the digital map.
The agent and the driver interact to generate multiple route
options, giving the driver a more satisfactory route than he
or she would receive from a single-option route planner, and
providing feedback from the driver that reflects his or her
route preferences. The agent encodes these preferences in a
user model that it uses to predict which route a driver will
find most appealing.
Although interaction is in the driver’s best interest if he or
she wants a satisfactory route, the Advisor does not require
it, and the amount of interaction is controlled by the driver.
Ideally, as the agent better approximates the driver’s cost
function, interaction becomes less necessary and the agent
becomes more autonomous. This low interaction requirement is crucial for in-car decision making where the driver’s
attention is necessarily focused elsewhere.
In general, our approach to developing advisory agents is
to automatically and unobtrusively acquire value judgments
by observing the user’s actions in a domain, and to utilize
interaction as an additional source of value judgments. The
advisor generates a solution using its current user model,
receives feedback from the user if its model is inaccurate,
and corrects its model in areas relevant to the problem being
solved.
Acknowledgments
The authors would like to thank Daniel Russakoff for preparing and running the experiment, and Renée Elio for many
helpful comments and discussions.
References
Burke, R. D.; Hammond, K. J.; and Young, B. C.
1996. Knowledge-based navigation of complex information spaces. In Proceedings of the Thirteenth National
Conference on Artificial Intelligence (AAAI’96), 462–468.
Portland, OR: (Cambridge, MA: AAAI Press/MIT Press).
Dijkstra, E. W. 1959. A note on two problems in connexion
with graphs. Numerische Mathematik 1:269–271.
Hadfield, P. 1997. Smart cars steer round traffic jams. New
Scientist.
Haigh, K. Z., and Veloso, M. M. 1995. Route planning
by analogy. In Veloso, M., and Aamodt, A., eds., Case-Based Reasoning Research and Development, Proceedings
of ICCBR-95, 169–180. Sesimbra, Portugal: (Berlin, Germany: Springer-Verlag).
Hermens, L. A., and Schlimmer, J. C. 1994. A machine-learning apprentice for the completion of repetitive forms.
IEEE Expert 9:28–33.
Langley, P. 1997. Machine learning for adaptive user interfaces. In Proceedings of the 21st German Annual Conference on Artificial Intelligence, 53–62. Freiburg, Germany: Springer.
Linden, G.; Hanks, S.; and Lesh, N. 1997. Interactive assessment of user preference models: The Automated Travel
Assistant. In Jameson, A.; Paris, C.; and Tasso, C., eds.,
User Modeling: Proceedings of the Sixth International
Conference, UM97, 67–78. Vienna, New York: Springer
Wien New York.
Nilsson, N. J. 1965. Learning Machines. New York:
McGraw-Hill.
Rogers, S.; Langley, P.; Johnson, B.; and Liu, A. 1997. Personalization of the automotive information environment. In
Engels, R.; Evans, B.; Herrmann, J.; and Verdenius, F.,
eds., Proceedings of the workshop on Machine Learning in
the real world; Methodological Aspects and Implications,
28–33.