
The Focussed D* Algorithm for Real-Time Replanning

Anthony Stentz
Robotics Institute
Carnegie Mellon University
Pittsburgh, Pennsylvania 15213
U. S. A.

In Proceedings of the International Joint Conference on Artificial Intelligence, August 1995.

Abstract

Finding the lowest-cost path through a graph is central to many problems, including route planning for a mobile robot. If arc costs change during the traverse, then the remainder of the path may need to be replanned. This is the case for a sensor-equipped mobile robot with imperfect information about its environment. As the robot acquires additional information via its sensors, it can revise its plan to reduce the total cost of the traverse. If the prior information is grossly incomplete, the robot may discover useful information in every piece of sensor data. During replanning, the robot must either wait for the new path to be computed or move in the wrong direction; therefore, rapid replanning is essential. The D* algorithm (Dynamic A*) plans optimal traverses in real-time by incrementally repairing paths to the robot's state as new information is discovered. This paper describes an extension to D* that focusses the repairs to significantly reduce the total time required for the initial path calculation and subsequent replanning operations. This extension completes the development of the D* algorithm as a full generalization of A* for dynamic environments, where arc costs can change during the traverse of the solution path.[1]

[1] This research was sponsored by ARPA, under contracts "Perception for Outdoor Navigation" (contract number DACA76-89-C-0014, monitored by the US Army TEC) and "Unmanned Ground Vehicle System" (contract number DAAE07-90-C-R059, monitored by TACOM).

1 Introduction

The problem of path planning can be stated as finding a sequence of state transitions through a graph from some initial state to a goal state, or determining that no such sequence exists. The path is optimal if the sum of the transition costs, also called arc costs, is minimal across all possible sequences through the graph. If during the "traverse" of the path, one or more arc costs in the graph is discovered to be incorrect, the remaining portion of the path may need to be replanned to preserve optimality. A traverse is optimal if every transition in the traverse is part of an optimal path to the goal assuming, at the time of each transition, all known information about the arc costs is correct.

An important application for this problem, and the one that will serve as the central example throughout the paper, is the task of path planning for a mobile robot equipped with a sensor, operating in a changing, unknown or partially-known environment. The states in the graph are robot locations, and the arc values are the costs of moving between locations, based on some metric such as distance, time, energy expended, risk, etc. The robot begins with an initial estimate of arc costs comprising its "map", but since the environment is only partially-known or changing, some of the arc costs are likely to be incorrect. As the robot acquires sensor data, it can update its map and replan the optimal path from its current state to the goal. It is important that this replanning be fast, since during this time the robot must either stop or continue to move along a suboptimal path.

A number of algorithms exist for producing optimal traverses given changing arc costs. One algorithm plans an initial path with A* [Nilsson, 1980] or the distance transform [Jarvis, 1985] using the prior map information, moves the robot along the path until either it reaches the goal or its sensor discovers a discrepancy between the map and the environment, updates the map, and then replans a new path from the robot's current state to the goal [Zelinsky, 1992]. Although this brute-force replanner is optimal, it can be grossly inefficient, particularly in expansive environments where the goal is far away and little map information exists.

Boult [1987] maintains an optimal cost map from the goal to all states in the environment, assuming the environment is bounded (finite). When discrepancies are discovered between the map and the environment, only the affected portion of the cost map is updated. The map representation is limited to polygonal obstacles and free space. Trovato [1990] and Ramalingam and Reps [1992] extend this approach to handle graphs with arc costs ranging over a continuum. The limitation of these algorithms is that the entire affected portion of the map must be repaired before the robot can resume moving and subsequently make additional corrections. Thus, the algorithms are inefficient when the robot is near the goal and the affected portions of the map have long "shadows". Stentz [1994] overcomes
these limitations with D*, an incremental algorithm which maintains a partial, optimal cost map limited to those locations likely to be of use to the robot. Likewise, repair of the cost map is generally partial and re-entrant, thus reducing computational costs and enabling real-time performance.

Other algorithms exist for addressing the problem of path planning in unknown or dynamic environments [Korf, 1987; Lumelsky and Stepanov, 1986; Pirzadeh and Snyder, 1990], but these algorithms emphasize fast operation and/or low memory usage at the expense of optimality.

This paper describes an extension to D* which focusses the cost updates to minimize state expansions and further reduce computational costs. The algorithm uses a heuristic function similar to A* to both propagate cost increases and focus cost reductions. A biasing function is used to compensate for robot motion between replanning operations. The net effect is a reduction in run-time by a factor of two to three. The paper begins with the intuition behind the algorithm, describes the extension, presents an example, evaluates empirical comparisons, and draws conclusions.

2 Intuition for Algorithm

Consider how A* solves the following robot path planning problem. Figure 1 shows an eight-connected graph representing a Cartesian space of robot locations. The states in the graph, depicted by arrows, are robot locations, and the arcs encode the cost of moving between states. The white regions are locations known to be in free space. The arc cost for moving between free states is a small value denoted by EMPTY. The grey regions are known obstacle locations, and arcs connected to these states are assigned a prohibitively high value of OBSTACLE. The small black square is a closed gate believed to be open (i.e., EMPTY value). Without loss of generality, the robot is assumed to be point-size and occupies only one location at a time. A* can be used to compute optimal path costs from the goal, G, to all states in the space given the initial set of arc costs, as shown in the figure. The arrows indicate the optimal state transitions; therefore, the optimal path for any state can be recovered by following the arrows to the goal. Because the closed gate is assumed to be open, A* plans a path through it.

The robot starts at some initial location and begins following the optimal path to the goal. At location R, the robot's sensor discovers the gate between the two large obstacles is closed. This corresponds to an incorrect arc value in the graph: rather than EMPTY, it has a much higher value of GATE, representing the cost of first opening the gate and then moving through it. All paths through this arc are (possibly) no longer optimal, as indicated by the labelled region. A* could be used to recompute the cost map, but this is inefficient if the environment is large and/or the goal is far away.

Several characteristics of the problem motivate a better approach. First, changes to the arc costs are likely to be in the vicinity of the robot, since it typically carries a sensor with a limited range. This means that most plans need only be patched "locally". Second, the robot generally makes near-monotonic progress toward the goal. Most obstructions are small and simple path deflections suffice, thus avoiding the high computational cost of backtracking. Third, only the remaining portion of the path must be replanned at a given location in the traverse, which tends to get progressively shorter due to the second characteristic.

Figure 1: Invalidated States in the Graph

As described in Stentz [1994], D* leverages these characteristics to reduce run-time by a factor of 200 or more for large environments. The paper proves that the algorithm produces correct results regardless--only the performance improvement is affected by the validity of the problem characteristics.

Like A*, D* maintains an OPEN list of states for expansion; however, these states consist of two types: RAISE and LOWER. RAISE states transmit path cost increases due to an increased arc value, and LOWER states reduce costs and re-direct arrows to compute new optimal paths. The RAISE states propagate the arc cost increase through the invalidated states, by starting at the gate and sweeping outward, adding the value of GATE to all states in the region. The RAISE states activate neighboring LOWER states which sweep in behind to reduce costs and re-direct pointers. LOWER states compute new, optimal paths to the states that were previously raised.

States are placed on the OPEN list by their key value, k(X), which for LOWER states is the current path cost, h(X) (i.e., cost from the state X to the goal), and for RAISE states the previous, unraised h(X) value. States on the list are processed in order of increasing key value. The intuition is that the previous optimal path costs of the RAISE states define a lower bound on the path costs of LOWER states they can discover. Thus, if the path costs of the LOWER states currently on the OPEN list exceed the previous path costs of the RAISE states, then it is worthwhile processing RAISE states to discover (possibly) a better LOWER state.

The process can terminate when the lowest value on the OPEN list equals or exceeds the robot's path cost, since additional expansions cannot possibly find a better path to the goal (see Figure 2). Once a new optimal path is computed or the old one is determined to be valid, the robot can continue to move toward the goal. Note in the figure that
only part of the cost map has been repaired. This is the efficiency of the D* algorithm.

The D* algorithm described in Stentz [1994] propagates cost changes through the invalidated states without considering which expansions will benefit the robot at its current location. Like A*, D* can use heuristics to focus the search in the direction of the robot and reduce the total number of state expansions. Let the focussing heuristic g(X, R) be the estimated path cost from the robot's location R to X. Define a new function, the estimated robot path cost, to be f(X, R) = h(X) + g(X, R), and sort all LOWER states on the OPEN list by increasing f(°) value. The function f(X, R) is the estimated path cost from the state R through X to G. Provided that g(°) satisfies the monotone restriction, then since h(X) is optimal when LOWER state X is removed from the OPEN list, an optimal path will be computed to R [Nilsson, 1980]. The notation g(°) is used to refer to a function independent of its domain.

Figure 2: LOWER States Reach the Robot

In the case of RAISE states, the previous h(°) value defines a lower bound on the h(°) values of LOWER states they can discover; therefore, if the same focussing heuristic g(°) is used for both types of states, the previous f(°) values of the RAISE states define lower bounds on the f(°) values of the LOWER states they can discover. Thus, if the f(°) values of the LOWER states on the OPEN list exceed the previous f(°) values of the RAISE states, then it is worthwhile processing RAISE states to discover better LOWER states. Based on this reasoning, the RAISE states should be sorted on the OPEN list by f(X, R) = k(X) + g(X, R). But since k(X) = h(X) for LOWER states, the RAISE state definition for f(°) suffices for both kinds of states. To avoid cycles in the backpointers, it should be noted that ties in f(°) are sorted by increasing k(°) on the OPEN list [Stentz, 1993].
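
As a concrete illustration of this ordering, the sketch below builds the sort key for a state on the OPEN list from its key value k(X) and the focussing heuristic value g(X, R); the Python formulation and the sample values are assumptions for illustration only, not part of the algorithm's specification.

def focussed_key(k_x, g_x_r):
    # Primary key f = k + g, valid for both RAISE and LOWER states (k = h for
    # LOWER states); ties in f are broken by increasing k to avoid creating
    # cycles in the backpointers.
    return (k_x + g_x_r, k_x)

# Python tuples compare lexicographically, so sorting by this key orders
# states by increasing f and breaks ties by increasing k.
open_states = [("A", 5.0, 3.0), ("B", 4.0, 4.0), ("C", 6.0, 2.0)]  # (name, k(X), g(X, R))
open_states.sort(key=lambda s: focussed_key(s[1], s[2]))
print([name for name, _, _ in open_states])  # all three have f = 8, so the order is ['B', 'A', 'C']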

The process can terminate when the lowest value on the OPEN list equals or exceeds the robot's path cost, since the subsequent expansions cannot possibly find a LOWER state that 1) has a low enough path cost, and 2) is "close" enough to the robot to be able to reduce the robot's path cost when it reaches it through subsequent expansions. Note that this is a more efficient cut-off than the previous one, which considers only the first criterion.

Figure 3 shows the same example, except that a focussed search is used. All states in the RAISE state wave front have roughly the same f(°) value. The wave front is more "narrow" in the focussed case since the inclusion of the cost to return to the robot penalizes the wide flanks. Furthermore, the LOWER states activated by the RAISE state wave front have swept in from the outer sides of the obstacles to compute a new, optimal path to the robot. Note that the two wave fronts are narrow and focussed on the robot's location. Compare Figure 3 to Figure 2. Note that both the RAISE and LOWER state wave fronts have covered less ground for the focussed search than the unfocussed search in order to compute a new, optimal path to R. Therein is the efficiency of the Focussed D* algorithm.

The problem with focussing the search is that once a new optimal path is computed to the robot's location, the robot then moves to a new location. If its sensor discovers another arc cost discrepancy, the search should be focussed on the robot's new location. But states already on the OPEN list are focussed on the old location and have incorrect g(°) and f(°) values. One solution is to recompute g(°) and f(°) for all states on the OPEN list every time the robot moves and new states are to be added. Based on empirical evidence, the cost of re-sorting the OPEN list more than offsets the savings gained by a focussed search.

Figure 3: Focussed LOWER States Reach Robot

The approach in this paper is to take advantage of the fact that the robot generally moves only a few states between replanning operations, so the g(°) and f(°) values have only a small amount of error. Assume that state X is placed on the OPEN list when the robot is at location R_0. Its f(°) value at that point is f(X, R_0). If the robot moves to location R_1, we could calculate f(X, R_1) and adjust its position on the OPEN list. To avoid this computational cost, we compute a lower bound on f(X, R_1) given by f_L(X, R_1) = f(X, R_0) - g(R_1, R_0) - ε. f_L(X, R_1) is a lower bound on f(X, R_1) since it assumes the robot moved in the "direction" of state X, thus subtracting the motion from g(X, R_0). The parameter ε is an arbitrarily small positive number. If X is repositioned on the OPEN list by f_L(X, R_1), then since f_L(X, R_1) is a lower bound on f(X, R_1), X will be selected for expansion before or when it is needed. At the time of expansion, the true f(X, R_1) value is computed, and X is placed back on the OPEN list by f(X, R_1).

At first this approach appears worse, since the OPEN list is first re-sorted by f_L(°) and then partially adjusted to replace the f_L(°) values with the correct f(°) values. But since g(R_1, R_0) + ε is subtracted from all states on the OPEN list, the ordering is preserved, and the list need not be re-sorted. Furthermore, the first step can be avoided altogether by adding g(R_1, R_0) + ε to the states to be inserted on the OPEN list rather than subtracting it from those already on the list, thus preserving the relative ordering between states already on the list and states about to be added. Therefore, the only remaining computation is the adjustment step. But this step is needed only for those states that show promise for reaching the robot's location. For typical problems, this amounts to fewer than 2% of the states on the OPEN list.
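
The bookkeeping this implies can be kept in a single running total. The sketch below is a minimal illustration of that idea, assuming a simple grid heuristic for g(°); it is not the full OPEN-list machinery, which is formalized in the next two sections.

EPSILON = 1e-6  # the arbitrarily small positive constant ε

class AccruedBias:
    def __init__(self, gval):
        self.gval = gval    # focussing heuristic g(X, Y): estimated cost from Y to X
        self.d_curr = 0.0   # accrued bias, d(R_0, R_0) = 0 initially

    def robot_moved(self, r_new, r_old):
        # Called when replanning starts from a new robot location: instead of
        # subtracting g(R_1, R_0) + ε from every state already queued, the same
        # amount is accumulated here and added to states inserted from now on.
        self.d_curr += self.gval(r_new, r_old) + EPSILON

    def biased_key(self, f_value):
        # Key used when inserting a state with the robot at its current location;
        # earlier entries keep their smaller bias, so relative order is preserved.
        return f_value + self.d_curr

# Example with an assumed 8-connected grid heuristic (one unit per transition).
bias = AccruedBias(lambda x, y: max(abs(x[0] - y[0]), abs(x[1] - y[1])))
bias.robot_moved((2, 1), (0, 0))   # robot advanced two cells before replanning
print(bias.biased_key(14.0))       # 16.000001: f = 14 plus the accrued bias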

3 Definitions and Formulation

To formalize this intuition, we begin with the notation and definitions used in Stentz [1994], and then extend them for the focussed algorithm. The problem space can be formulated as a set of states denoting robot locations connected by directional arcs, each of which has an associated cost. The robot starts at a particular state and moves across arcs (incurring the cost of traversal) to other states until it reaches the goal state, denoted by G. Every visited state X except G has a backpointer to a next state Y denoted by b(X) = Y. D* uses backpointers to represent paths to the goal. The cost of traversing an arc from state Y to state X is a positive number given by the arc cost function c(X, Y). If Y does not have an arc to X, then c(X, Y) is undefined. Two states X and Y are neighbors in the space if c(X, Y) or c(Y, X) is defined.

D* uses an OPEN list to propagate information about changes to the arc cost function and to calculate path costs to states in the space. Every state X has an associated tag t(X), such that t(X) = NEW if X has never been on the OPEN list, t(X) = OPEN if X is currently on the OPEN list, and t(X) = CLOSED if X is no longer on the OPEN list. For each visited state X, D* maintains an estimate of the sum of the arc costs from X to G given by the path cost function h(X). Given the proper conditions, this estimate is equivalent to the optimal (minimal) cost from state X to G. For each state X on the OPEN list (i.e., t(X) = OPEN), the key function, k(X), is defined to be equal to the minimum of h(X) before modification and all values assumed by h(X) since X was placed on the OPEN list. The key function classifies a state X on the OPEN list into one of two types: a RAISE state if k(X) < h(X), and a LOWER state if k(X) = h(X). D* uses RAISE states on the OPEN list to propagate information about path cost increases and LOWER states to propagate information about path cost reductions. The propagation takes place through the repeated removal of states from the OPEN list. Each time a state is removed from the list, it is expanded to pass cost changes to its neighbors. These neighbors are in turn placed on the OPEN list to continue the process.

States are sorted on the OPEN list by a biased f(°) value, given by f_B(X, R_i), where X is the state on the OPEN list and R_i is the robot's state at the time X was inserted or adjusted on the OPEN list. Let {R_0, R_1, ..., R_N} be the sequence of states occupied by the robot when states were added to the OPEN list. The value of f_B(°) is given by f_B(X, R_i) = f(X, R_i) + d(R_i, R_0), where f(°) is the estimated robot path cost given by f(X, R_i) = h(X) + g(X, R_i) and d(°) is the accrued bias function given by d(R_i, R_0) = g(R_1, R_0) + g(R_2, R_1) + ... + g(R_i, R_{i-1}) + iε if i > 0, and d(R_0, R_0) = 0 if i = 0. The function g(X, Y) is the focussing heuristic, representing the estimated path cost from Y to X. The OPEN list states are sorted by increasing f_B(°) value, with ties in f_B(°) ordered by increasing f(°), and ties in f(°) ordered by increasing k(°). Ties in k(°) are ordered arbitrarily. Thus, a vector of values 〈f_B(°), f(°), k(°)〉 is stored with each state on the list.

Whenever a state is removed from the OPEN list, its f(°) value is examined to see if it was computed using the most recent focal point. If not, its f(°) and f_B(°) values are recalculated using the new focal point and accrued bias, respectively, and the state is placed back on the list. Processing the f_B(°) values in ascending order ensures that the first encountered f(°) value using the current focal point is the minimum such value, denoted by f_min. Let k_val be its corresponding k(°) value. These parameters comprise an important threshold for D*. By processing properly-focussed f(°) values in ascending order (and k(°) values in ascending order for a constant f(°) value), the algorithm ensures that for all states X, if f(X) < f_min or (f(X) = f_min and h(X) ≤ k_val), then h(X) is optimal. The parameter val is used to store the vector 〈f_min, k_val〉 for the purpose of this test.
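
One convenient way to realize this ordering is to store the sort vector as a tuple in a binary heap; the paper does not prescribe a data structure, so the heapq-based sketch below, including a helper for the optimality test just stated, is an assumption for illustration.

import heapq
import itertools

_counter = itertools.count()  # realizes "ties in k(°) are ordered arbitrarily"
open_heap = []                # entries are (f_B, f, k, counter, state)

def put_state(f_b, f, k, state):
    # Queue a state under its sort vector 〈f_B, f, k〉; Python tuples compare
    # lexicographically, and the counter keeps states themselves from being compared.
    heapq.heappush(open_heap, (f_b, f, k, next(_counter), state))

def h_is_optimal(f_x, h_x, f_min, k_val):
    # The threshold test above: h(X) is optimal if f(X) < f_min, or
    # f(X) = f_min and h(X) <= k_val.
    return f_x < f_min or (f_x == f_min and h_x <= k_val)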

Let R_curr be the current state on which the search is focussed, initialized to the robot's start state. Define the robot state function r(X), which returns the robot's state when X was last inserted or adjusted on the OPEN list. The parameter d_curr is the accrued bias from the robot's start state to its current state; it is shorthand for d(R_curr, R_0) and is initialized to d_curr = d(R_0, R_0) = 0. The following shorthand notation is used for f_B(°) and f(°): f_B(X) ≡ f_B(X, r(X)) and f(X) ≡ f(X, r(X)).

4 Algorithm Description

The D* algorithm consists primarily of three functions: PROCESS-STATE, MODIFY-COST, and MOVE-ROBOT. PROCESS-STATE computes optimal path costs to the goal, MODIFY-COST changes the arc cost function c(°) and enters affected states on the OPEN list, and MOVE-ROBOT uses the two functions to move the robot optimally. The algorithms for PROCESS-STATE, MODIFY-COST, and MOVE-ROBOT are presented below along with three of the more detailed functions for managing the OPEN list: INSERT, MIN-STATE, and MIN-VAL. The user provides the function GVAL(X, Y), which computes and returns the focussing heuristic g(X, Y).

The embedded routines are: MIN(a, b) returns the minimum of the two scalar values a and b; LESS(a, b) takes a vector of values 〈a_1, a_2〉 for a and a vector 〈b_1, b_2〉 for b and returns TRUE if a_1 < b_1 or (a_1 = b_1 and a_2 < b_2); LESSEQ(a, b) takes two vectors a and b and returns TRUE if a_1 < b_1 or (a_1 = b_1 and a_2 ≤ b_2); COST(X) computes f(X, R_curr) = h(X) + GVAL(X, R_curr) and returns the vector of values 〈f(X, R_curr), h(X)〉 for a state X; DELETE(X) deletes state X from the OPEN list and sets t(X) = CLOSED; PUT-STATE(X) sets t(X) = OPEN and inserts X on the OPEN list according to the vector 〈f_B(X), f(X), k(X)〉; and GET-STATE returns the state on the OPEN list with minimum vector value (NULL if the list is empty).
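
Treating a "vector" here as an ordered pair, the comparison routines and COST can be written directly as below; this is only a sketch that mirrors the definitions above, with the current h(X) and GVAL(X, R_curr) passed in as plain numbers because the surrounding data structures are left open by the paper.

def less(a, b):
    # LESS(a, b) for pairs a = (a_1, a_2) and b = (b_1, b_2).
    return a[0] < b[0] or (a[0] == b[0] and a[1] < b[1])

def lesseq(a, b):
    # LESSEQ(a, b): as LESS, but also true when both components are equal.
    return a[0] < b[0] or (a[0] == b[0] and a[1] <= b[1])

def cost(h_x, gval_x_rcurr):
    # COST(X): returns the vector 〈f(X, R_curr), h(X)〉 with
    # f(X, R_curr) = h(X) + GVAL(X, R_curr).
    return (h_x + gval_x_rcurr, h_x)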

The INSERT function, given below, changes the value of h(X) to h_new and inserts or repositions X on the OPEN list. The value for k(X) is determined at lines L1 through L5. The remaining two values in the vector are computed at line L7, and the state is inserted at line L8.

Function: INSERT (X, h_new)
L1  if t(X) = NEW then k(X) = h_new
L2  else
L3    if t(X) = OPEN then
L4      k(X) = MIN(k(X), h_new); DELETE(X)
L5    else k(X) = MIN(h(X), h_new)
L6  h(X) = h_new; r(X) = R_curr
L7  f(X) = k(X) + GVAL(X, R_curr); f_B(X) = f(X) + d_curr
L8  PUT-STATE(X)

The function MIN-STATE, given below, returns the state on the OPEN list with minimum f(°) value. In order to do this, the function retrieves the state on the OPEN list with lowest f_B(°) value. If the state was placed on the OPEN list when the robot was at a previous location (line L2), then it is re-inserted on the OPEN list at lines L3 and L4. This operation has the effect of correcting the state's accrued bias using the robot's current state while leaving the state's h(°) and k(°) values unchanged. MIN-STATE continues to retrieve states from the OPEN list until it finds one that was placed on the OPEN list with the robot at its current state.

Function: MIN-STATE ()
L1  while X = GET-STATE() ≠ NULL
L2    if r(X) ≠ R_curr then
L3      h_new = h(X); h(X) = k(X)
L4      DELETE(X); INSERT(X, h_new)
L5    else return X
L6  return NULL

The MIN-VAL function, given below, returns the f(°) and k(°) values of the state on the OPEN list with minimum f(°) value, that is, 〈f_min, k_val〉.

Function: MIN-VAL ()
L1  X = MIN-STATE()
L2  if X = NULL then return NO-VAL
L3  else return 〈f(X), k(X)〉

In function PROCESS-STATE cost changes are propagated and new paths are computed. At lines L1 through L3, the state X with the lowest f(°) value is removed from the OPEN list. If X is a LOWER state (i.e., k(X) = h(X)), its path cost is optimal. At lines L9 through L14, each neighbor Y of X is examined to see if its path cost can be lowered. Additionally, neighbor states that are NEW receive an initial path cost value, and cost changes are propagated to each neighbor Y that has a backpointer to X, regardless of whether the new cost is greater than or less than the old. Since these states are descendants of X, any change to the path cost of X affects their path costs as well. The backpointer of Y is redirected, if needed. All neighbors that receive a new path cost are placed on the OPEN list, so that they will propagate the cost changes to their neighbors.

Function: PROCESS-STATE ()
L1  X = MIN-STATE()
L2  if X = NULL then return NO-VAL
L3  val = 〈f(X), k(X)〉; k_val = k(X); DELETE(X)
L4  if k_val < h(X) then
L5    for each neighbor Y of X:
L6      if t(Y) ≠ NEW and LESSEQ(COST(Y), val) and
L7         h(X) > h(Y) + c(Y, X) then
L8        b(X) = Y; h(X) = h(Y) + c(Y, X)
L9  if k_val = h(X) then
L10   for each neighbor Y of X:
L11     if t(Y) = NEW or
L12        (b(Y) = X and h(Y) ≠ h(X) + c(X, Y)) or
L13        (b(Y) ≠ X and h(Y) > h(X) + c(X, Y)) then
L14       b(Y) = X; INSERT(Y, h(X) + c(X, Y))
L15 else
L16   for each neighbor Y of X:
L17     if t(Y) = NEW or
L18        (b(Y) = X and h(Y) ≠ h(X) + c(X, Y)) then
L19       b(Y) = X; INSERT(Y, h(X) + c(X, Y))
L20     else
L21       if b(Y) ≠ X and h(Y) > h(X) + c(X, Y) and
L22          t(X) = CLOSED then
L23         INSERT(X, h(X))
L24       else
L25         if b(Y) ≠ X and h(X) > h(Y) + c(Y, X) and
L26            t(Y) = CLOSED and
L27            LESS(val, COST(Y)) then
L28           INSERT(Y, h(Y))
L29 return MIN-VAL()

If X is a RAISE state, its path cost may not be optimal. Before X propagates cost changes to its neighbors, its optimal neighbors are examined at lines L4 through L8 to see if h(X) can be reduced. At lines L16 through L19, cost changes are propagated to NEW states and immediate descendants in the same way as for LOWER states. If X is able to lower the path cost of a state that is not an immediate descendant (lines L21 through L23), X is placed back on the OPEN list for future expansion. This action is required to avoid creating a closed loop in the backpointers [Stentz, 1993]. If the path cost of X is able to be reduced by a suboptimal neighbor (lines L25 through L28), the neighbor is placed back on the OPEN list. Thus, the update is "postponed" until the neighbor has an optimal path cost.

In function MODIFY-COST, the arc cost function is updated with the changed value. Since the path cost for state Y will change, X is placed on the OPEN list. When X is expanded via PROCESS-STATE, it computes a new h(Y) = h(X) + c(X, Y) and places Y on the OPEN list. Additional state expansions propagate the cost to the descendants of Y.

Function: MODIFY-COST (X, Y, c_val)
L1  c(X, Y) = c_val
L2  if t(X) = CLOSED then INSERT(X, h(X))
L3  return MIN-VAL()

The function MOVE-ROBOT illustrates how to use PROCESS-STATE and MODIFY-COST to move the robot from state S through the environment to G along an optimal traverse. At lines L1 through L4 of MOVE-ROBOT, t(°) is set to NEW for all states, the accrued bias and focal point are initialized, h(G) is set to zero, and G is placed on the OPEN list. PROCESS-STATE is called repeatedly at lines L6 and L7 until either an initial path is computed to the robot's state (i.e., t(S) = CLOSED) or it is determined that no path exists (i.e., val = NO-VAL and t(S) = NEW). The robot then proceeds to follow the backpointers until it either reaches the goal or discovers a discrepancy (line L11) between the sensor measurement of an arc cost s(°) and the stored arc cost c(°) (e.g., due to a detected obstacle). Note that these discrepancies may occur anywhere, not just on the path to the goal. If the robot moved since the last time discrepancies were discovered, then its state R is saved as the new focal point, and the accrued bias, d_curr, is updated (lines L12 and L13). MODIFY-COST is called to correct c(°) and place affected states on the OPEN list at line L15. PROCESS-STATE is then called repeatedly at line L17 to propagate costs and compute a new path to the goal. The robot continues to follow the backpointers toward the goal. The function returns GOAL-REACHED if the goal is found and NO-PATH if it is unreachable.

Function: MOVE-ROBOT (S, G)
L1  for each state X in the graph:
L2    t(X) = NEW
L3  d_curr = 0; R_curr = S
L4  INSERT(G, 0)
L5  val = 〈0, 0〉
L6  while t(S) ≠ CLOSED and val ≠ NO-VAL
L7    val = PROCESS-STATE()
L8  if t(S) = NEW then return NO-PATH
L9  R = S
L10 while R ≠ G:
L11   if s(X, Y) ≠ c(X, Y) for some (X, Y) then
L12     if R_curr ≠ R then
L13       d_curr = d_curr + GVAL(R, R_curr) + ε; R_curr = R
L14     for each (X, Y) such that s(X, Y) ≠ c(X, Y):
L15       val = MODIFY-COST(X, Y, s(X, Y))
L16     while LESS(val, COST(R)) and val ≠ NO-VAL
L17       val = PROCESS-STATE()
L18   R = b(R)
L19 return GOAL-REACHED

It should be noted that line L8 in MOVE-ROBOT only detects the condition that no path exists from the robot's state to the goal if, for example, the graph is disconnected. It does not detect the condition that all paths to the goal are obstructed by obstacles. In order to provide for this capability, obstructed arcs can be assigned a large positive value of OBSTACLE and unobstructed arcs can be assigned a small positive value of EMPTY. OBSTACLE should be chosen such that it exceeds the longest possible path of EMPTY arcs in the graph. No unobstructed path exists to the goal from S if h(S) ≥ OBSTACLE after exiting the loop at line L6. Likewise, no unobstructed path exists to the goal from a state R during the traverse if h(R) ≥ OBSTACLE after exiting the loop at line L16. Since R = R_curr for a robot state R undergoing path recalculations, then g(R, R) = 0 and f(R, R) = h(R). Therefore, optimality is guaranteed for a state R, if f_min > h(R) or (f_min = h(R) and k_val ≥ h(R)).
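
A minimal sketch of that convention follows; the grid size and cost values are assumptions chosen only for illustration, and the two predicates simply restate the h(·) and 〈f_min, k_val〉 tests above.

EMPTY = 1.0
NUM_STATES = 100 * 100          # assumed size of the planning space
OBSTACLE = EMPTY * NUM_STATES   # exceeds any simple path made only of EMPTY arcs

def unobstructed_path_exists(h_r):
    # After the planning loop exits, h(R) >= OBSTACLE means every route from R
    # to the goal passes through at least one obstructed arc.
    return h_r < OBSTACLE

def path_is_optimal_at(h_r, f_min, k_val):
    # Optimality guarantee for the robot's state R (g(R, R) = 0, so f(R, R) = h(R)):
    # f_min > h(R), or f_min = h(R) and k_val >= h(R).
    return f_min > h_r or (f_min == h_r and k_val >= h_r)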

5 Example

Figure 4 shows a cluttered 100 x 100 state environment. The robot starts at state S and moves to state G. All of the obstacles, shown in black, are unknown before the robot starts its traverse, and the map contains only EMPTY arcs. The robot is point-size and is equipped with a 10-state radial field-of-view sensor. The figure shows the robot's traverse from S to G using the Basic D* algorithm. The traverse is shown as a black curve with white arrows. As the robot moves, its sensor detects the unknown obstacles. Detected obstacles are shown in grey with black arrows. Obstacles that remain unknown after the traverse are shown in solid black or black with white arrows. The arrows show the final cost field for all states examined during the traverse. Note that most of the states are examined at least once by the algorithm.

Figure 4: Basic D* Algorithm

Figure 5 shows the robot's traverse using the Focussed D* algorithm. The number of NEW states examined is fewer than for Basic D*, since the Focussed D* algorithm focuses the initial path calculation and subsequent cost updates on the robot's location. Note that even for those states examined by the algorithm, fewer of them end up with optimal paths to the goal. Finally, note that the two trajectories are not fully equivalent. This occurs because the lowest-cost traverse is not unique, and the two algorithms break ties in the path costs arbitrarily.

Figure 5: Focussed D* Algorithm

6 Experimental Results

Four algorithms were tested to verify optimality and to compare run-time results. The first algorithm, the Brute-Force Replanner (BFR), initially plans a single path from the goal to the start state. The robot proceeds to follow the path until its sensor detects an error in the map. The robot updates the map, plans a new path from the goal to its current location using a focussed A* search, and repeats until the goal is reached. The focussing heuristic, g(X, Y), was chosen to be the minimum possible number of state transitions between Y and X, assuming the lowest arc cost value for each.
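
On the eight-connected grids used in these experiments, the minimum number of transitions between two cells is the larger of the row and column offsets, so the heuristic can be written as in the sketch below; treating EMPTY as the lowest arc cost value is an assumption consistent with the environments described here.

EMPTY = 1.0  # assumed lowest arc cost value in the map

def gval(x, y):
    # g(X, Y): minimum possible number of state transitions between Y and X
    # on an 8-connected grid, each charged at the lowest possible arc cost.
    transitions = max(abs(x[0] - y[0]), abs(x[1] - y[1]))
    return EMPTY * transitions

# Example: at least 5 transitions separate cells (0, 0) and (3, 5).
print(gval((3, 5), (0, 0)))  # 5.0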

The second and third algorithms, Basic D* (BD*) and Focussed D* with Minimal Initialization (FD*M), are described in Stentz [1994] and Section 4, respectively. The fourth algorithm, Focussed D* with Full Initialization (FD*F), is the same as FD*M except that the path costs are propagated to all states in the planning space, which is assumed to be finite, during the initial path calculation, rather than terminating when the path reaches the robot's start state.

The four algorithms were compared on planning problems of varying size. Each environment was square, consisting of a start state in the center of the left wall and a goal state in the center of the right wall. Each environment consisted of a mix of map obstacles known to the robot before the traverse and unknown obstacles measurable by the robot's sensor. The sensor used was omnidirectional with a 10-state radial field of view. Figure 6 shows an environment model with approximately 100,000 states. The known obstacles are shown in grey and the unknown obstacles in black.

The results for environments of 10^4, 10^5, and 10^6 states are shown in Table 1. The reported times are CPU time for a Sun Microsystems SPARC-10 processor. For each environment size, the four algorithms were compared on five randomly-generated environments, and the results were averaged. The off-line time is the CPU time required to compute the initial path from the goal to the robot, or in the case of FD*F, from the goal to all states in the environment. This operation is "off-line" since it could be performed in advance of robot motion if the initial map were available. The on-line time is the total CPU time for all replanning operations needed to move the robot from the start to the goal.

Figure 6: Typical Environment for Comparison

                  Focussed D*      Focussed D*      Basic D*      Brute-Force
                  with Full Init   with Min Init                  Replanner
Off-line: 10^4    1.85 sec         0.16 sec         1.02 sec      0.09 sec
On-line:  10^4    1.09 sec         1.70 sec         1.31 sec      13.07 sec
Off-line: 10^5    19.75 sec        0.68 sec         12.55 sec     0.41 sec
On-line:  10^5    9.53 sec         18.20 sec        16.94 sec     11.86 min
Off-line: 10^6    224.62 sec       9.53 sec         129.08 sec    4.82 sec
On-line:  10^6    10.01 sec        41.72 sec        21.47 sec     50.63 min

Table 1: Results for Empirical Tests

The results for each algorithm are highly dependent on the complexity of the environment, including the number, size, and placement of the obstacles, and the ratio of known to unknown obstacles. For the test cases examined, all variations of D* outperformed BFR in on-line time, reaching a speedup factor of approximately 300 for large environments. Generally, the performance gap increased as the size of the environment increased. If the user wants to minimize on-line time at the expense of off-line time, then FD*F is the best algorithm. In this algorithm, path costs to all states are computed initially and only the cost propagations are focussed. Note that FD*F resulted in lower on-line times and higher off-line times than BD*. The FD*M algorithm resulted in lower off-line times and higher on-line times than BD*. Focussing the search enables a rapid start due to fewer state expansions, but many of the unexplored states must be examined anyway during the replanning process, resulting in a longer execution time. Thus, FD*M is the best algorithm if the user wants to minimize the total time, that is, if the off-line time is considered to be on-line time as well.
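
As a quick check of the speedup figure quoted above, the ratio can be recomputed from the on-line entries of Table 1 for the 10^6-state case; the snippet below is only that arithmetic, under the column assignment shown in the table.

bfr_online_seconds = 50.63 * 60     # Brute-Force Replanner on-line time, 50.63 min
fdstar_full_online_seconds = 10.01  # Focussed D* with Full Initialization on-line time
print(round(bfr_online_seconds / fdstar_full_online_seconds))  # ~303, roughly the cited factor of 300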

Thus, the Focussed D* algorithm can be configured to outperform Basic D* in either total time or the on-line portion of the operation, depending on the requirements of the task. As a general strategy, focussing the search is a good idea; the only issue is how the computational load should be distributed.

7 Conclusions

This paper presents the Focussed D* algorithm for real-time path replanning. The algorithm computes an initial path from the goal state to the start state and then efficiently modifies this path during the traverse as arc costs change. The algorithm produces an optimal traverse, meaning that an optimal path to the goal is followed at every state in the traverse, assuming all known information at each step is correct. The focussed version of D* outperforms the basic version, and it offers the user the option of distributing the computational load amongst the on- and off-line portions of the operation, depending on the task requirements. The addition of a heuristic focussing function to D* completes its development as a generalization of A* to dynamic environments--A* is the special case of D* where arc costs do not change during the traverse of the solution path.

Acknowledgments

The author thanks Barry Brumitt and Jay Gowdy for feedback on the use of the algorithm.

References

[Boult, 1987] T. Boult. Updating distance maps when objects move. In Proceedings of the SPIE Conference on Mobile Robots, 1987.

[Jarvis, 1985] R. A. Jarvis. Collision-free trajectory planning using the distance transforms. Mechanical Engineering Trans. of the Institution of Engineers, ME10(3), September 1985.

[Korf, 1987] R. E. Korf. Real-time heuristic search: first results. In Proceedings of the Sixth National Conference on Artificial Intelligence, July 1987.

[Lumelsky and Stepanov, 1986] V. J. Lumelsky and A. A. Stepanov. Dynamic path planning for a mobile automaton with limited information on the environment. IEEE Transactions on Automatic Control, AC-31(11), November 1986.

[Nilsson, 1980] N. J. Nilsson. Principles of Artificial Intelligence, Tioga Publishing Company, 1980, pp. 72-88.

[Pirzadeh and Snyder, 1990] A. Pirzadeh and W. Snyder. A unified solution to coverage and search in explored and unexplored terrains using indirect control. In Proceedings of the IEEE International Conference on Robotics and Automation, May 1990.

[Ramalingam and Reps, 1992] G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. University of Wisconsin Technical Report #1087, May 1992.

[Stentz, 1993] A. Stentz. Optimal and efficient path planning for unknown and dynamic environments. Carnegie Mellon Robotics Institute Technical Report CMU-RI-TR-93-20, August 1993.

[Stentz, 1994] A. Stentz. Optimal and efficient path planning for partially-known environments. In Proceedings of the IEEE International Conference on Robotics and Automation, May 1994.

[Trovato, 1990] K. I. Trovato. Differential A*: an adaptive search method illustrated with robot path planning for moving obstacles and goals, and an uncertain environment. Journal of Pattern Recognition and Artificial Intelligence, 4(2), 1990.

[Zelinsky, 1992] A. Zelinsky. A mobile robot exploration algorithm. IEEE Transactions on Robotics and Automation, 8(6), December 1992.