Artificial Intelligence Search Algorithms
Richard E. Korf Computer Science Department University of California, Los Angeles Los Angeles, Ca. 90095 July 5, 1996
1 Introduction
Search is a universal problem-solving mechanism in artificial intelligence (AI). In AI problems, the sequence of steps required for the solution of a problem is not known a priori, but must often be determined by a systematic trial-and-error exploration of alternatives. The problems that have been addressed by AI search algorithms fall into three general classes: single-agent path-finding problems, two-player games, and constraint-satisfaction problems.
The sliding-tile puzzles are common testbeds for research in AI search algorithms because they are very simple to represent and manipulate, yet finding optimal solutions to the N x N generalization of the sliding-tile puzzles is NP-complete [40]. Other well-known examples of single-agent path-finding problems include Rubik's Cube, the travelling salesman problem, and theorem proving. In each case, the task is to find a sequence of operations that map an initial state to a goal state.

A second class of search problems includes two-player perfect-information games, such as Chess, Checkers, and Othello. The third category is constraint-satisfaction problems, such as the Eight Queens Problem. The task is to place eight queens on an 8 x 8 chessboard, such that no two queens are on the same row, column, or diagonal.

We begin by describing the problem-space model on which search algorithms are based. Brute-force searches are then considered, including breadth-first, uniform-cost, depth-first, depth-first iterative-deepening, and bidirectional search. Next, various heuristic searches are examined, including pure heuristic search, the A* algorithm, iterative-deepening-A*, depth-first branch-and-bound, the heuristic path algorithm, and recursive best-first search. We then consider single-agent algorithms that interleave search and execution, including minimin lookahead search, real-time-A*, and learning-real-time-A*. Next, we consider two-player game searches, including minimax and alpha-beta pruning. Finally, we examine constraint-satisfaction algorithms, such as backtracking, constraint recording, and heuristic repair. The efficiency of these algorithms, in terms of the costs of the solutions they generate, the amount of time the algorithms take to execute, and the amount of computer memory they require, is of central concern throughout. Since search is a universal problem-solving method, what limits its applicability is the efficiency with which it can be performed.
Figure 1: Eight Puzzle Search Tree Fragment

10^40 nodes. Even a simple problem like the Twenty-Four Puzzle contains almost 10^25 nodes. As a result, the problem-space graphs of AI problems are never represented explicitly by listing each state, but rather are implicitly represented by specifying an initial state and a set of operators to generate new states from existing states. Furthermore, the size of an AI problem is rarely expressed as the number of nodes in its problem-space graph. Rather, the two parameters of a search tree that determine the efficiency of various search algorithms are its branching factor and its solution
depth. The branching factor is the average number of children of a given node. For example, in the Eight Puzzle the average branching factor is the square root of 3, or about 1.732. The solution depth of a problem instance is the length of a shortest path from the initial state to a goal state, or the length of a shortest sequence of operators that solves the problem. If the goal were in the bottom row of Figure 1, the depth of the problem instance represented by the initial state at the root would be three moves.
3 Brute-Force Search
The most general search algorithms are brute-force searches, since they do not require any domain-specific knowledge. All that is required for a brute-force search is a state description, a set of legal operators, an initial state, and a description of the goal state. The most important brute-force techniques are breadth-first, uniform-cost, depth-first, depth-first iterative-deepening, and bidirectional search. In the descriptions of the algorithms below, to generate a node means to create the data structure corresponding to that node, whereas to expand a node means to generate all the children of that node.
Breadth-first search expands nodes in order of their depth from the root, generating one level of the tree at a time until a solution is found (see Figure 2). It is most easily implemented by maintaining a queue of nodes, initially containing just the root, and always removing the node at the head of the queue, expanding it, and adding its children to the tail of the queue. Since it never generates a node in the tree until all the nodes at shallower levels have been generated, breadth-first search always finds a shortest path to a goal. The amount of time used by breadth-first search is proportional to the number of nodes generated, since each node can be generated in constant time, and is a function of the branching factor b and the solution depth d. Since the number of nodes at level d is b^d, the total number of nodes generated in the worst case is b + b^2 + b^3 + ... + b^d, which is O(b^d), the asymptotic time complexity of breadth-first search.
Figure 2: Order of Node Generation for Breadth-First Search

The main drawback of breadth-first search is its memory requirement. Since each level of the tree must be saved in order to generate the next level, and the amount of memory is proportional to the number of nodes stored, the space complexity of breadth-first search is also O(b^d). As a result, breadth-first search is severely space-bound in practice, and will exhaust the memory available on typical computers in a matter of minutes.
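The queue-based implementation described above can be sketched in Python as follows. The `successors` and `is_goal` functions and the small example graph are hypothetical stand-ins for a problem-space description, and a visited set is added to suppress duplicates on graphs with cycles.

```python
from collections import deque

def breadth_first_search(start, successors, is_goal):
    """Return a shortest path (as a list of states) from start to a goal,
    or None.  `successors(state)` yields the children of a state and
    `is_goal(state)` tests for a goal."""
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        state, path = queue.popleft()        # remove the head of the queue
        if is_goal(state):
            return path
        for child in successors(state):      # expand: generate all children
            if child not in visited:
                visited.add(child)
                queue.append((child, path + [child]))
    return None

# Toy example: shortest path from 'a' to 'e' in a small explicit graph.
graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': ['e'], 'e': []}
path = breadth_first_search('a', lambda s: graph[s], lambda s: s == 'e')
```

In a real AI application the graph would, of course, be implicit: `successors` would apply the legal operators to a state rather than index a dictionary.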
If all edges do not have the same cost, breadth-first search generalizes to uniform-cost search, which always expands next an Open node whose cost g(n) from the root is lowest. Whenever a node is chosen for expansion by uniform-cost search, a lowest-cost path to that node has been found. The worst-case time complexity of uniform-cost search is O(b^(c/m)), where c is the cost of an optimal solution, and m is the minimum edge cost. Unfortunately, it also suffers the same memory limitation as breadth-first search.
Depth-first search remedies the space limitation of breadth-first search by always generating next a child of the deepest unexpanded node (see Figure 3). Both algorithms can be implemented using a list of unexpanded nodes, with the difference that breadth-first search manages the list as a first-in first-out queue, whereas depth-first search treats the list as a last-in first-out stack. More commonly, depth-first search is implemented recursively, with the recursion stack taking the place of an explicit node stack.
Figure 3: Order of Node Generation for Depth-First Search

The advantage of depth-first search is that its space requirement is only linear in the search
depth, as opposed to exponential for breadth-first search. The reason is that the algorithm only needs to store a stack of nodes on the path from the root to the current node. The time complexity of a depth-first search to depth d is O(b^d), since it generates the same set of nodes as breadth-first search, but simply in a different order. Thus, as a practical matter, depth-first search is time-limited rather than space-limited. The disadvantage of depth-first search is that it may not terminate on an infinite tree, but simply go down the left-most path forever. Even a finite graph can generate an infinite tree. The usual solution to this problem is to impose a cutoff depth on the search. Although the ideal cutoff is the solution depth d, this value is rarely known in advance of actually solving the problem. If the chosen cutoff depth is less than d, the algorithm will fail to find a solution, whereas if the cutoff depth is greater than d, a large price is paid in execution time, and the first solution found may not be an optimal one.
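A depth-first search with a cutoff depth can be sketched as follows; the `successors` and `is_goal` functions and the example graph are illustrative, not part of the original presentation. Note that only the current path is stored, so space is linear in the cutoff.

```python
def depth_limited_search(state, successors, is_goal, limit, path=None):
    """Depth-first search to a fixed cutoff depth.  Returns a path to a
    goal, or None if no goal lies within `limit` moves of `state`."""
    if path is None:
        path = [state]
    if is_goal(state):
        return path
    if limit == 0:
        return None                  # cutoff reached: prune this branch
    for child in successors(state):
        result = depth_limited_search(child, successors, is_goal,
                                      limit - 1, path + [child])
        if result is not None:
            return result
    return None

# Toy example: the goal 'e' is three moves deep, so a cutoff of 3 finds it.
graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': ['e'], 'e': []}
found = depth_limited_search('a', lambda s: graph[s], lambda s: s == 'e', 3)
```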
Depth-first iterative-deepening (DFID) combines the best features of breadth-first and depth-first search [45, 18]. DFID first performs a depth-first search to depth one, then starts over, executing a complete depth-first search to depth two, and continues to run depth-first searches to successively greater depths, until a solution is found (see Figure 4). Since it never generates a node until all shallower nodes have been generated, the first solution found by DFID is guaranteed to be along a shortest path. Furthermore, since at any given point it is executing a depth-first search, saving only a stack of nodes, and the algorithm terminates when it finds a solution at depth d, the space complexity of DFID is only O(d). Although it appears that DFID wastes a great deal of time in the iterations prior to the one that finds a solution, this extra work is usually insignificant. To see this, note that the number of
Figure 4: Order of Node Generation for Depth-First Iterative-Deepening Search

nodes at depth d is b^d, and each of these nodes is generated once, during the final iteration. The number of nodes at depth d-1 is b^(d-1), and each of these is generated twice, once during the final iteration, and once during the penultimate iteration. In general, the number of nodes generated by DFID is b^d + 2b^(d-1) + 3b^(d-2) + ... + db. This is asymptotically O(b^d) if b is strictly greater than one, since for large values of d the lower-order terms become insignificant. In other words, most of the work goes into the final iteration, and the cost of the previous iterations is relatively small. The ratio of the number of nodes generated by DFID to those generated by breadth-first search on a tree is approximately b/(b-1). In fact, DFID is asymptotically optimal in terms of time and space among all brute-force shortest-path algorithms on a tree [18]. If the edge costs differ from one another, then one can run an iterative-deepening version of uniform-cost search, where the depth cutoff is replaced by a cutoff on the g(n) cost of a node. At the end of each iteration, the new threshold is the minimum cost of all nodes generated but pruned
on the previous iteration. On a graph with cycles, however, breadth-first search may be much more efficient than any depth-first search. The reason is that a breadth-first search can check for duplicate nodes whereas a depth-first search cannot. Thus, the complexity of breadth-first search grows only as the number of nodes at a given depth, while the complexity of depth-first search depends on the number of paths of a given length. For example, in a square grid, the number of nodes within a radius r of the origin is O(r^2), whereas the number of paths of length r is O(3^r), since there are three children of every node, not counting its parent. Thus, in a graph with a large number of very short cycles, breadth-first search is preferable to depth-first search, if sufficient memory is available. For one approach to the problem of duplicate nodes in depth-first search, see [46].
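The iterative-deepening scheme can be sketched by wrapping a depth-limited depth-first search in a loop over increasing cutoffs. The names and the toy graph below are illustrative, not from the original presentation.

```python
def depth_limited(state, successors, is_goal, limit, path):
    """Depth-first search to a fixed cutoff; returns a path or None."""
    if is_goal(state):
        return path
    if limit == 0:
        return None
    for child in successors(state):
        found = depth_limited(child, successors, is_goal, limit - 1,
                              path + [child])
        if found is not None:
            return found
    return None

def iterative_deepening(start, successors, is_goal, max_depth=50):
    """Run depth-limited searches to depths 0, 1, 2, ... until a goal is
    found; the first solution found is therefore a shallowest one."""
    for limit in range(max_depth + 1):
        result = depth_limited(start, successors, is_goal, limit, [start])
        if result is not None:
            return result
    return None

graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': ['e'], 'e': []}
solution = iterative_deepening('a', lambda s: graph[s], lambda s: s == 'e')
```

Each iteration discards everything but its stack, which is what keeps the space complexity at O(d).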
4 Heuristic Search
In order to solve larger problems, domain-specific knowledge must be added to improve search efficiency. In AI, heuristic search has a general meaning, and a more specialized technical meaning. In a general sense, the term heuristic is used for any advice that is often effective, but is not guaranteed to work in every case. Within the heuristic search literature, however, the term heuristic usually refers to the special case of a heuristic evaluation function.
The key properties of a heuristic evaluation function are that it estimate actual cost, and that it be inexpensive to compute. For example, the Euclidean distance between a pair of points can be computed in constant time. The Manhattan distance between a pair of states can be computed in time proportional to the number of tiles. In addition, most naturally occurring heuristic functions are lower bounds on actual cost, and this property is referred to as admissibility. For example, airline distance is a lower bound on road distance between two points, since the shortest path between a pair of points is a straight line. Similarly, Manhattan distance is a lower bound on the actual number of moves necessary to solve an instance of a sliding-tile puzzle, since every tile must move at least as many times as its distance in grid units from its goal position. A number of algorithms make use of heuristic functions, including pure heuristic search, the A* algorithm, iterative-deepening-A*, depth-first branch-and-bound, and the heuristic path algorithm. In addition, heuristic information can be employed in bidirectional search as well.
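For the sliding-tile puzzles, the Manhattan distance computation described above can be sketched as follows, under the assumption that a state is represented as a flat tuple of tiles in row-major order, with 0 marking the blank.

```python
def manhattan_distance(state, goal, width):
    """Sum over all tiles of each tile's grid distance from its goal
    position.  The blank (0) is not counted."""
    distance = 0
    for pos, tile in enumerate(state):
        if tile == 0:
            continue
        goal_pos = goal.index(tile)
        distance += (abs(pos // width - goal_pos // width) +
                     abs(pos % width - goal_pos % width))
    return distance

# Eight Puzzle goal as drawn in Figure 1 (1 2 3 / 8 _ 4 / 7 6 5), and a
# hypothetical state in which tiles 1 and 8 are each one move misplaced.
goal  = (1, 2, 3, 8, 0, 4, 7, 6, 5)
state = (0, 2, 3, 1, 8, 4, 7, 6, 5)
```

The running time is proportional to the number of tiles, as noted above (the `goal.index` lookup could be precomputed into a table for a constant factor speedup).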
The algorithm maintains a Closed list of nodes that have already been expanded, and an Open list of nodes that have been generated but not yet expanded, beginning with just the initial state on the Open list. At each cycle, a node on the Open list with the minimum h(n) value is expanded, generating all of its children, and is placed on the Closed list. The heuristic function is applied to the children, and they are placed on the Open list in order of their heuristic values. The algorithm continues until a goal state is chosen for expansion. In a graph with cycles, multiple paths will be found to the same node, and the first path found may not be the shortest. When a shorter path is found to an Open node, the shorter path is saved and the longer one discarded. When a shorter path to a Closed node is found, the node is moved to Open, and the shorter path is associated with it. The main drawback of pure heuristic search is that since it ignores the cost of the path so far to node n, it does not find optimal solutions. Breadth-first search, uniform-cost search, and pure heuristic search are all special cases of a more general algorithm called best-first search. They differ only in their cost functions: the depth of node n for breadth-first search, g(n) for uniform-cost search, and h(n) for pure heuristic search.
4.3 A* Algorithm
The A* algorithm [13] combines features of uniform-cost search and pure heuristic search to efficiently compute optimal solutions. A* is a best-first search in which the cost associated with a node is f(n) = g(n) + h(n), where g(n) is the cost of the path from the initial state to node n, and h(n) is the heuristic estimate of the cost of a path from node n to a goal. Thus, f(n) estimates the lowest total cost of any solution path going through node n. At each point a node with lowest f value is chosen for expansion. Ties among nodes of equal f value should be broken in favor of nodes with lower h values. The algorithm terminates when a goal node is chosen for expansion.

A* finds an optimal path to a goal if the heuristic function h(n) is admissible, meaning it never overestimates actual cost [13]. For example, since airline distance never overestimates actual highway distance, and Manhattan distance never overestimates actual moves in the sliding-tile puzzles, A* using these evaluation functions will find optimal solutions to these problems. In addition, A* makes the most efficient use of a given heuristic function in the following sense: among all shortest-path algorithms using a given heuristic function h(n), A* expands the fewest number of nodes [4].

The main drawback of A*, and indeed of any best-first search, is its memory requirement. Since at least the entire Open list must be saved, A* is severely space-limited in practice, and is no more practical than breadth-first search on current machines. For example, while it can be run successfully on the Eight Puzzle, it exhausts available memory in a matter of minutes on the Fifteen Puzzle.
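A minimal sketch of A* using a binary heap for the Open list is given below. The problem interface (`successors` yielding (child, edge-cost) pairs, a heuristic table, and the toy graph) is hypothetical; duplicate detection is handled with a dictionary of best known g values rather than an explicit Closed list.

```python
import heapq

def a_star(start, successors, is_goal, h):
    """A* search.  `h(state)` should be an admissible heuristic.
    Returns (cost, path) to a goal, or None."""
    open_list = [(h(start), 0, start, [start])]   # entries: (f, g, state, path)
    best_g = {start: 0}
    while open_list:
        f, g, state, path = heapq.heappop(open_list)
        if is_goal(state):
            return g, path
        if g > best_g.get(state, float('inf')):
            continue                  # stale entry: a cheaper path was found
        for child, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(child, float('inf')):
                best_g[child] = new_g
                heapq.heappush(open_list,
                               (new_g + h(child), new_g, child, path + [child]))
    return None

# Toy example: the optimal path a -> b -> c -> d has cost 3.
graph = {'a': [('b', 1), ('c', 4)], 'b': [('c', 1), ('d', 5)],
         'c': [('d', 1)], 'd': []}
h = {'a': 2, 'b': 2, 'c': 1, 'd': 0}    # admissible estimates to 'd'
cost, path = a_star('a', lambda s: graph[s], lambda s: s == 'd', h.get)
```

Placing g second in each heap entry gives the tie-breaking on lower h (equivalently, higher g) only approximately; a production implementation would encode the tie-breaking rule explicitly.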
4.4 Iterative-Deepening-A*
Just as depth-first iterative-deepening solved the space problem of breadth-first search, iterative-deepening-A* (IDA*) eliminates the memory constraint of A* without sacrificing solution optimality [18]. Each iteration of the algorithm is a complete depth-first search that keeps track of the cost, f(n) = g(n) + h(n), of each node generated. As soon as a node is generated whose cost exceeds a threshold for that iteration, its path is cut off, and the search backtracks before continuing. The cost threshold is initialized to the heuristic estimate of the initial state, and in each successive iteration is increased to the total cost of the lowest-cost node that was pruned during the previous iteration. The algorithm terminates when a goal state is reached whose total cost does not exceed the current threshold.

Since IDA* performs a series of depth-first searches, its memory requirement is linear in the maximum search depth. In addition, if the heuristic function is admissible, the first solution found by IDA* is an optimal one. Finally, by an argument similar to that presented for DFID, IDA* expands the same number of nodes, asymptotically, as A* on a tree, provided that the number of nodes grows exponentially with solution cost. These facts, together with the optimality of A*, imply that IDA* is asymptotically optimal in time and space over all heuristic search algorithms that find optimal solutions on a tree. Additional benefits of IDA* are that it is much easier to implement, and often runs faster than A*, since it does not incur the overhead of managing the Open and Closed lists.
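The threshold-update loop of IDA* can be sketched as follows, using the same hypothetical problem interface as above; a check against the current path is added to avoid cycles on graphs.

```python
def ida_star(start, successors, is_goal, h):
    """Iterative-deepening-A*.  `h` should be an admissible heuristic.
    Returns (cost, path) to a goal, or None if the space is exhausted."""
    threshold = h(start)
    while True:
        result = _bounded_dfs(start, 0, threshold, [start],
                              successors, is_goal, h)
        if isinstance(result, tuple):
            return result                 # (cost, path): goal found
        if result == float('inf'):
            return None                   # nothing was pruned: no solution
        threshold = result                # lowest cost that was pruned

def _bounded_dfs(state, g, threshold, path, successors, is_goal, h):
    f = g + h(state)
    if f > threshold:
        return f                          # prune, reporting the pruned cost
    if is_goal(state):
        return g, path
    minimum = float('inf')
    for child, cost in successors(state):
        if child in path:                 # avoid cycles on the current path
            continue
        t = _bounded_dfs(child, g + cost, threshold, path + [child],
                         successors, is_goal, h)
        if isinstance(t, tuple):
            return t
        minimum = min(minimum, t)
    return minimum

graph = {'a': [('b', 1), ('c', 4)], 'b': [('c', 1), ('d', 5)],
         'c': [('d', 1)], 'd': []}
h = {'a': 2, 'b': 2, 'c': 1, 'd': 0}
cost, path = ida_star('a', lambda s: graph[s], lambda s: s == 'd', h.get)
```

Only the recursion stack is stored, so memory use is linear in the search depth, as claimed above.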
While IDA* never expands any nodes whose cost exceeds the optimal cost, its overhead consists of expanding some nodes more than once. While DFBnB never expands any node more than once, its overhead consists of expanding some nodes whose costs exceed the optimal cost. For problems whose search trees are of bounded depth or for which it is easy to construct a good solution, such as the TSP, DFBnB is usually the algorithm of choice for finding an optimal solution. For problems with infinite search trees or for which it is difficult to construct a low-cost solution, such as the sliding-tile puzzles or Rubik's Cube, IDA* is usually the best choice.
Varying the weight w in the best-first cost function f(n) = (1-w)g(n) + wh(n) produces a range of algorithms from uniform-cost search (w = 0), through A* (w = 1/2), to pure heuristic search (w = 1). Increasing w beyond 1/2 generally decreases the amount of computation, while increasing the cost of the solution generated. This tradeoff is often quite favorable, with small increases in solution cost yielding huge savings in computation [22]. Furthermore, it can be shown that the solutions found by this algorithm are guaranteed to be no more than a factor of w/(1-w) greater than optimal [3], but often are significantly better.
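The weighted cost function drops into the best-first framework directly, as the following sketch shows; the problem interface and toy graph are illustrative.

```python
import heapq

def weighted_best_first(start, successors, is_goal, h, w):
    """Best-first search with cost f(n) = (1-w)*g(n) + w*h(n).
    w = 0 gives uniform-cost search, w = 1/2 a rescaling of A*, and
    w = 1 pure heuristic search."""
    open_list = [(w * h(start), 0, start, [start])]
    best_g = {start: 0}
    while open_list:
        f, g, state, path = heapq.heappop(open_list)
        if is_goal(state):
            return g, path
        for child, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(child, float('inf')):
                best_g[child] = new_g
                f2 = (1 - w) * new_g + w * h(child)
                heapq.heappush(open_list, (f2, new_g, child, path + [child]))
    return None

graph = {'a': [('b', 1), ('c', 4)], 'b': [('c', 1), ('d', 5)],
         'c': [('d', 1)], 'd': []}
h = {'a': 2, 'b': 2, 'c': 1, 'd': 0}
cost, path = weighted_best_first('a', lambda s: graph[s],
                                 lambda s: s == 'd', h.get, w=0.5)
```

At w = 1/2 every f value is half the A* value, so the expansion order, and hence the solution, is the same as A*'s.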
Recursive best-first search (RBFS) simulates a best-first search in space linear in the search depth. Whenever the cost of the current node exceeds that of some other node in the previously expanded portion of the tree, the algorithm backs up to their deepest common ancestor, and continues the search down the new path. In effect, the algorithm maintains a separate threshold for each subtree diverging from the current search path. See [22] for full details on RBFS.
In minimin lookahead search, the algorithm searches forward from the current state to a fixed depth determined by the informational or computational resources available. At the search horizon, the A* evaluation function f(n) = g(n) + h(n) is applied to the frontier nodes. Since all decisions are made by a single agent, the value of an interior node is the minimum of the frontier values in the subtree below the node. A single move is then made to the neighbor of the current state with the minimum value.

Most heuristic functions obey the triangle inequality characteristic of distance measures. As a result, f(n) = g(n) + h(n) is guaranteed to be monotonically nondecreasing along a path. Furthermore, since minimin search has a fixed depth limit, we can apply depth-first branch-and-bound to prune the search tree. The performance improvement due to branch-and-bound is quite dramatic, in some cases extending the achievable search horizon by a factor of five relative to brute-force minimin search on sliding-tile puzzles [20].

Minimin search with branch-and-bound is an algorithm for evaluating the immediate neighbors of the current node. As such, it is run until the best child is identified, at which point the chosen move is executed in the real world. We can view the static evaluation function combined with lookahead search as simply a more accurate, but computationally expensive, heuristic function. In fact, it provides an entire spectrum of heuristic functions trading off accuracy for cost, depending on the search horizon.
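Minimin lookahead (without the branch-and-bound refinement) can be sketched as follows; `successors` yields (neighbor, edge-cost) pairs and `h` is the static evaluation, both illustrative names.

```python
def minimin(state, successors, h, depth, g=0):
    """Lookahead value of `state`: the minimum frontier value f = g + h
    over the subtree of the given depth."""
    children = list(successors(state))
    if depth == 0 or not children:
        return g + h(state)          # frontier (or dead-end) node
    return min(minimin(child, successors, h, depth - 1, g + cost)
               for child, cost in children)

def best_move(state, successors, h, horizon):
    """Choose the neighbor of `state` with the minimum lookahead value."""
    return min(successors(state),
               key=lambda sc: minimin(sc[0], successors, h,
                                      horizon - 1, sc[1]))[0]

# Toy example: looking ahead two moves from 'a' prefers the 'b' branch.
graph = {'a': [('b', 1), ('c', 4)], 'b': [('c', 1), ('d', 5)],
         'c': [('d', 1)], 'd': []}
h = {'a': 2, 'b': 2, 'c': 1, 'd': 0}
best = best_move('a', lambda s: graph[s], h.get, horizon=2)
```

The branch-and-bound refinement described above would pass the best frontier value found so far into the recursion and cut off any path whose g + h already exceeds it.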
5.2 Real-Time-A*
Simply repeating minimin search for each move ignores information from previous searches and results in infinite loops. In addition, since actions are committed based on limited information, often the best move may be to undo the previous move. The principle of rationality is that backtracking should occur when the estimated cost of continuing the current path exceeds the cost of going back to a previous state, plus the estimated cost of reaching the goal from that state. Real-time-A* (RTA*) implements this policy in constant time per move [20].
For each move, the f(n) = g(n) + h(n) value of each neighbor of the current state is computed, where g(n) is now the cost of the edge from the current state to the neighbor, instead of from the initial state. The problem solver moves to the neighbor with the minimum f(n) value, and stores with the previous state the best f(n) value among the remaining neighbors. This represents the h(n) value of the previous state from the perspective of the new current state. This is repeated until a goal is reached. To determine the h(n) value of a previously visited state, the stored value is used, while for a new state the heuristic evaluator is called. Note that the heuristic evaluator may employ minimin lookahead search with branch-and-bound as well.

In a finite problem space in which there exists a path to a goal from every state, RTA* is guaranteed eventually to find a solution, regardless of the heuristic evaluation function [20]. Furthermore, on a tree, RTA* makes locally optimal decisions given the information it has seen so far.
5.3 Learning-Real-Time-A*
If a problem is to be solved repeatedly with the same goal state but different initial states, one would like an algorithm that improves its performance over time. Learning-real-time-A* (LRTA*) is such an algorithm. It behaves almost identically to RTA*, except that instead of storing the second-best f value of a node as its new heuristic value, it stores the best value instead. Once one problem instance is solved, the stored heuristic values are saved and become the initial values for the next problem instance. While LRTA* is less efficient than RTA* for solving a single problem instance, assuming it starts with admissible initial heuristic values, over repeated trials its heuristic values will eventually converge to their exact values, at which point the algorithm returns optimal solutions.
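The learning update of LRTA* can be sketched as follows. The interface is a hypothetical one: `h_table` holds the learned values and persists across trials, and `h0` supplies the initial estimates. Taking the max in the update, rather than overwriting, is a conservative choice that keeps the stored values nondecreasing.

```python
def lrta_star_trial(start, successors, is_goal, h_table, h0, max_steps=1000):
    """One trial of LRTA*.  Returns the path travelled, updating
    h_table in place.  `successors` yields (neighbor, edge_cost) pairs."""
    state, path = start, [start]
    for _ in range(max_steps):
        if is_goal(state):
            return path
        # f(neighbor) = edge cost + stored heuristic value of the neighbor
        scored = [(cost + h_table.setdefault(child, h0(child)), child)
                  for child, cost in successors(state)]
        best_f, best_child = min(scored)
        # Learning step: store the best neighbor value as the new h(state).
        h_table[state] = max(h_table.get(state, h0(state)), best_f)
        state = best_child
        path.append(state)
    return path

# Toy example: one trial already raises the stored estimates toward
# the true distances to the goal 'g'.
graph = {'a': [('b', 1)], 'b': [('a', 1), ('g', 1)], 'g': []}
learned = {}
route = lrta_star_trial('a', lambda s: graph[s], lambda s: s == 'g',
                        learned, lambda s: 0)
```

Running further trials from other start states would reuse and continue to sharpen the same `learned` table, which is the point of the algorithm.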
6 Two-Player Games
The second major application of heuristic search algorithms in AI is two-player games. One of the original challenges of AI, which in fact predates the term "artificial intelligence", was to build a program that could play chess at the level of the best human players [47].
With alpha-beta pruning, the minimax value of the root can often be determined without examining all the nodes at the search frontier. Figure 5 shows an example of alpha-beta pruning. Only the labelled nodes are generated by the algorithm, with the heavy black lines indicating pruning. At the square nodes MAX is to move, while at the circular nodes it is MIN's turn. The search proceeds depth-first to minimize the memory required, and only evaluates a node when necessary. First, nodes e and f are statically evaluated at 4 and 5, respectively, and their minimum value, 4, is backed up to their parent node d. Node h is then evaluated at 3, and hence the value of its parent node g must be less than or equal to 3, since it is the minimum of 3 and the unknown value of its right child. Thus, we label node g as <= 3. The value of node c must be 4 then, because it is the maximum of 4 and a value that is less than or equal to 3. Since we have determined the minimax value of node c, we do not need to evaluate or even generate the brother of node h.

Similarly, after statically evaluating nodes k and l at 6 and 7, respectively, the backed-up value of their parent node j is 6, the minimum of these values. This tells us that the minimax value of node i must be greater than or equal to 6, since it is the maximum of 6 and the unknown value of its right child. Since the value of node b is the minimum of 4 and a value that is greater than or equal to 6, it must be 4, and hence we achieve another cutoff.

The right half of the tree shows an example of deep pruning. After evaluating the left half of the tree, we know that the value of the root node a is greater than or equal to 4, the minimax value of node b. Once node q is evaluated at 1, the value of its parent node o must be less than or equal to 1. Since the value of the root is greater than or equal to 4, the value of node o cannot propagate to the root, and hence we need not generate the brother of node q. A similar situation exists after the evaluation of node r at 2. At that point, the value of node o is less than or equal to 1, and the value of node p is less than or equal to 2, hence the value of node n, which is the maximum of the values of nodes o and p, must be less than or equal to 2. Furthermore, since the value of node m
Figure 5: Alpha-Beta Pruning
is the minimum of the value of node n and its brother, and node n has a value less than or equal to 2, the value of node m must also be less than or equal to 2. This causes the brother of node n to be pruned, since the value of the root node a is greater than or equal to 4. Thus, we computed the minimax value of the root of the tree to be 4, by generating only seven of sixteen leaf nodes in this case.

Since alpha-beta pruning performs a minimax search while pruning much of the tree, its effect is to allow a deeper search with the same amount of computation. This raises the question of how much alpha-beta improves performance. The best way to characterize the efficiency of a pruning algorithm is in terms of its effective branching factor. The effective branching factor is the d-th root of the number of frontier nodes that must be evaluated in a search to depth d.

The efficiency of alpha-beta pruning depends upon the order in which nodes are encountered at the search frontier. For any set of frontier node values, there exists some ordering of the values such that alpha-beta will not perform any cutoffs at all. In that case, all frontier nodes must be evaluated and the effective branching factor is b, the brute-force branching factor. On the other hand, there is an optimal or perfect ordering in which every possible cutoff is realized. In that case, the effective branching factor is reduced from b to b^(1/2), the square root of the brute-force branching factor. Another way of viewing the perfect ordering case is that for the same amount of computation, one can search twice as deep with alpha-beta pruning as without. Since the search tree grows exponentially with depth, doubling the search horizon is a dramatic improvement. In between worst-possible ordering and perfect ordering is random ordering, which is the average case. Under random ordering of the frontier nodes, alpha-beta pruning reduces the effective branching factor to approximately b^(3/4) [32]. This means that one can search 4/3 as deep with alpha-beta, yielding a 33% improvement in search depth.
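The pruning rule illustrated above can be sketched as a recursive procedure. The tree representation (nested lists with static values at the leaves) and the parameter names are an illustrative toy, not the paper's.

```python
def alpha_beta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta pruning.  `children(node)` returns the
    successor positions (empty at the frontier) and `evaluate(node)` is
    the static evaluation.  alpha and beta bound the values that can
    still affect the minimax value above this node."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float('-inf')
        for child in kids:
            value = max(value, alpha_beta(child, depth - 1, alpha, beta,
                                          False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break        # cutoff: MIN will never allow this line
        return value
    else:
        value = float('inf')
        for child in kids:
            value = min(value, alpha_beta(child, depth - 1, alpha, beta,
                                          True, children, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break        # cutoff: MAX already has a better line
        return value

# Toy depth-2 tree: MAX moves at the root, MIN at the two inner nodes.
tree = [[4, 5, 3], [6, 1, 7]]
value = alpha_beta(tree, 4, float('-inf'), float('inf'), True,
                   lambda n: n if isinstance(n, list) else [],
                   lambda n: n)
```

In the example, once the second MIN node sees the leaf 1, which is below the 3 already guaranteed to MAX, its remaining leaf is never evaluated.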
The idea of quiescence is that the static evaluator should not be applied to positions whose values are unstable, such as those occurring in the middle of a piece trade. In those positions, a small secondary search is conducted until the static evaluation becomes more stable. In practice, this can be achieved by always exploring any capture moves one level deeper.

Iterative-deepening is used to solve the problem of where to set the search horizon [44], and in fact predated its use as a memory-saving device in single-agent search. In a tournament game, there is a limited amount of time allowed for moves. Unfortunately, it is very difficult to accurately predict how long it will take to perform an alpha-beta search to a given depth. The solution is to perform a series of searches to successively greater depths. When time runs out, the move recommended by the last completed search is made.

The search graphs of most games, such as chess, contain multiple paths to the same node, often reached by making the same moves in a different order, referred to as a transposition of the moves. Since alpha-beta is a depth-first search, it is important to detect when a node has already been searched, in order to avoid re-searching it. A transposition table is a table of previously encountered game states, together with their backed-up minimax values. Whenever a new state is generated, if it is stored in the transposition table, its value there is used instead of searching the tree below the node. Virtually all two-player game programs use full-width, fixed-depth, alpha-beta minimax search with node ordering, quiescence, iterative-deepening, and transposition tables, among other techniques.
While most game-playing programs run on general-purpose computers, the best chess machines today are based on special-purpose hardware designed and built only to play chess. For example, Deep Blue is a chess machine that can evaluate about 100 million chess positions per second [17]. It recently lost a six-game match to Garry Kasparov, the world champion, but did win one game and draw several others.
7 Constraint-Satisfaction Problems
In addition to single-agent path- nding problems and two-player games, the third major application of heuristic search is constraint-satisfaction problems. The Eight Queens Problem mentioned previously is a classic example. Other examples include graph coloring, boolean satis ability, and scheduling problems. Constraint-satisfaction problems are modelled as follows: There is a set of variables, a set of values for each variable, and a set of constraints on the values that the variables can be assigned. A unary constraint on a variable speci es a subset of all possible values that can be assigned to that variable. A binary constraint between two variables speci es which possible combinations of assignments to the pair of variables satisfy the constraint. For example, in a map or graph coloring problem, the variables would represent regions or nodes, and the values would represent colors. The constraints are binary constraints on each pair of adjacent regions or nodes that prohibit them from being assigned the same color.
Brute-force backtracking instantiates the variables one at a time, in some order. As soon as a partial assignment violates a constraint, the search backtracks, since no assignment to the remaining variables can possibly resatisfy that constraint. Once a variable is reached which has no remaining legal assignments, the last variable that was assigned is reassigned to its next legal value. The algorithm continues until either a complete, consistent assignment is found, resulting in success, or all possible assignments are shown to violate some constraint, resulting in failure. Figure 6 shows the tree generated by brute-force backtracking to find all solutions to the Four Queens problem. The tree is searched depth-first to minimize memory requirements.
Figure 6: Brute-Force Backtracking Tree for the Four Queens Problem
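Brute-force backtracking for the N-Queens problem can be sketched as follows; the representation (one queen per row, a list of column assignments) is a common illustrative choice, not mandated by the text.

```python
def queens(n):
    """All solutions to the n-queens problem by brute-force backtracking.
    A partial assignment is a list giving the column of the queen in each
    filled row; rows are filled one at a time, and a branch is abandoned
    as soon as the newest queen conflicts with an earlier one."""
    solutions = []

    def consistent(cols, col):
        row = len(cols)                  # row about to be filled
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def extend(cols):
        if len(cols) == n:
            solutions.append(list(cols))
            return
        for col in range(n):
            if consistent(cols, col):
                cols.append(col)
                extend(cols)             # search this subtree depth-first
                cols.pop()               # backtrack: undo the assignment

    extend([])
    return solutions
```

Placing one queen per row builds the row and column constraints into the representation, so only the diagonal constraints need explicit checking; the Four Queens instance of Figure 6 has exactly two solutions.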
The idea of variable ordering is to order the variables from most constrained to least constrained [9, 39]. For example, if a variable has only a single value remaining that is consistent with the previously instantiated variables, it should be assigned that value immediately. In general, the variables should be instantiated in increasing order of the size of their remaining domains. This can either be done statically at the beginning of the search, or dynamically, reordering the remaining variables each time a variable is assigned a new value.

The order in which the values of a given variable are chosen determines the order in which the tree is searched. Since it doesn't affect the size of the tree, it makes no difference if all solutions are to be found. If only a single solution is required, however, value ordering can decrease the time required to find a solution. In general, one should order the values from least constraining to most constraining, in order to minimize the time required to find a first solution [5, 11].

An important idea, originally called backjumping, is that when an impasse is reached, instead of simply undoing the last decision made, the decision that actually caused the failure should be modified [10]. For example, consider a three-variable problem where the variables are instantiated in the order x, y, z. Assume that values have been chosen for both x and y, but that all possible values for z conflict with the value chosen for x. In pure backtracking, the value chosen for y would be changed, and then all the possible values for z would be tested again, to no avail. A better strategy in this case is to go back to the source of the failure, and change the value of x, before trying different values for y.

When a variable is assigned a value, the idea of forward checking is to check each remaining uninstantiated variable to make sure that there is at least one assignment for each of them that is consistent with the previous assignments.
If not, the original variable is assigned its next value.
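As an illustration, the following sketch combines dynamic variable ordering and forward checking in a backtracking search for the n-queens problem. This is a hypothetical illustration, not code from the text: variables are columns, values are rows, and all names are invented for the example.

```python
def solve(n):
    """Backtracking for n-queens with forward checking and dynamic
    (most-constrained-first) variable ordering."""
    # domains[c] = set of rows still consistent for the queen in column c
    domains = {c: set(range(n)) for c in range(n)}
    assignment = {}

    def consistent(c1, r1, c2, r2):
        # distinct columns by construction; forbid shared rows and diagonals
        return r1 != r2 and abs(r1 - r2) != abs(c1 - c2)

    def search():
        if len(assignment) == n:
            return dict(assignment)
        # variable ordering: pick the unassigned column with the smallest domain
        col = min((c for c in domains if c not in assignment),
                  key=lambda c: len(domains[c]))
        for row in sorted(domains[col]):
            assignment[col] = row
            # forward checking: prune now-inconsistent rows from future domains
            pruned, ok = [], True
            for c in domains:
                if c in assignment:
                    continue
                for r in list(domains[c]):
                    if not consistent(col, row, c, r):
                        domains[c].remove(r)
                        pruned.append((c, r))
                if not domains[c]:   # some future variable has no value left
                    ok = False
                    break
            if ok:
                result = search()
                if result is not None:
                    return result
            # undo the pruning and the assignment before trying the next value
            for c, r in pruned:
                domains[c].add(r)
            del assignment[col]
        return None

    return search()

solution = solve(8)   # maps each column to the row of its queen
```

Forward checking fails a branch as soon as any future domain empties, which is exactly the impasse that pure backtracking would discover only much deeper in the tree.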
fewest other queens. What is surprising about this simple strategy is how well it performs relative to backtracking. While backtracking techniques can solve on the order of hundred-queen problems, heuristic repair can solve million-queen problems, often with only about 50 individual queen moves! This strategy has been extensively explored in the context of boolean satisfiability, where it is referred to as GSAT 42]. GSAT can satisfy difficult formulas with several thousand variables, whereas the best backtracking-based approach, the Davis-Putnam algorithm 2] with unit propagation, can only satisfy difficult formulas with several hundred variables.

The main drawback of this approach is that it is not complete, in that it is not guaranteed to find a solution in a finite amount of time, even if one exists. If there is no solution, these algorithms will run forever, whereas backtracking will eventually discover that a problem is not solvable.

While constraint-satisfaction problems appear somewhat different from single-agent path-finding problems and two-player games, there is a strong similarity among the algorithms employed. For example, backtracking can be viewed as a form of branch-and-bound, where a node is pruned when a constraint is violated. Similarly, heuristic repair can be viewed as a heuristic search, where the evaluation function is the total number of constraints that are violated, and the goal is to find a state with zero constraint violations.
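The heuristic-repair strategy can be sketched for the n-queens problem as follows. This is an illustrative min-conflicts sketch; the move limit, seeding, and tie-breaking scheme are arbitrary choices for the example, not taken from the text.

```python
import random

def min_conflicts(n, max_moves=10000, seed=0):
    """Heuristic repair for n-queens: start from a complete (conflicting)
    assignment, then repeatedly move some conflicted queen to the row in
    its column that conflicts with the fewest other queens."""
    rng = random.Random(seed)
    rows = [rng.randrange(n) for _ in range(n)]   # rows[c] = row of queen in column c

    def conflicts(col, row):
        # number of other queens attacking square (col, row)
        return sum(1 for c in range(n)
                   if c != col and (rows[c] == row or
                                    abs(rows[c] - row) == abs(c - col)))

    for _ in range(max_moves):
        conflicted = [c for c in range(n) if conflicts(c, rows[c]) > 0]
        if not conflicted:
            return rows                     # zero constraint violations: done
        col = rng.choice(conflicted)
        # move the queen to a minimum-conflict row, breaking ties randomly
        rows[col] = min(range(n),
                        key=lambda r: (conflicts(col, r), rng.random()))
    return None   # incomplete: may give up even though a solution exists
```

The final `return None` is precisely the incompleteness discussed above: unlike backtracking, the repair loop can neither prove unsolvability nor guarantee termination with a solution.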
Thus, there is a continual demand for faster algorithms. A related research area is the development of space-efficient algorithms 23]. While the exponential-space algorithms are clearly impractical, the linear-space algorithms use very little of the memory available on current machines. The primary issue here is, given a fixed amount of memory, how to make the best use of it to speed up a search as much as possible. In a two-player game search, the extra memory is used primarily in the transposition table. In single-agent path-finding problems, one of the most effective uses of additional memory is a form of bidirectional search known as perimeter search 7, 28, 15]. The idea is to search breadth-first backward from the goal state until memory is nearly exhausted. Then the forward search proceeds until it encounters a state on the perimeter of the backward search. The main advantage of this approach is that the nodes on the perimeter can be used to refine the heuristic estimates in the forward search.

Another research area is the development of parallel search algorithms. Most search algorithms have a tremendous amount of potential parallelism, since the basic step of node generation and evaluation is often performed billions of times. As a result, many such algorithms are readily parallelized with nearly linear speedups. The algorithms that are difficult to parallelize are branch-and-bound algorithms, such as alpha-beta pruning, because the results of searching one part of the tree determine whether another part of the tree needs to be examined at all.

Since the performance of a search algorithm depends critically on the quality of the heuristic evaluation function, another important research area is the automatic generation of such functions. This was pioneered in the area of two-player games by Arthur Samuel's landmark checkers program, which learned to improve its evaluation function through repeated play 41].
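The two phases of perimeter search described earlier can be sketched on an explicit graph. This is a deliberately simplified illustration: here the backward search is bounded by a fixed depth rather than by available memory, and the forward search is a plain breadth-first search rather than a heuristic one.

```python
from collections import deque

def perimeter(goal, neighbors, depth):
    """Phase 1: breadth-first search backward from the goal to a fixed
    depth, recording each visited state's exact distance to the goal."""
    dist = {goal: 0}
    frontier = deque([goal])
    while frontier:
        s = frontier.popleft()
        if dist[s] == depth:     # s lies on the perimeter; don't expand it
            continue
        for t in neighbors(s):
            if t not in dist:
                dist[t] = dist[s] + 1
                frontier.append(t)
    return dist

def forward_search(start, neighbors, table):
    """Phase 2: search forward from the start; on reaching any stored
    state, the total cost is forward moves plus the stored exact
    distance to the goal."""
    dist = {start: 0}
    frontier = deque([start])
    while frontier:
        s = frontier.popleft()
        if s in table:
            return dist[s] + table[s]
        for t in neighbors(s):
            if t not in dist:
                dist[t] = dist[s] + 1
                frontier.append(t)
    return None
```

In a real perimeter search the stored distances would instead refine the heuristic values used by a forward heuristic search such as IDA*, which is where the method gets its advantage.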
In the area of single-agent problems, a dominant theory is that the exact cost of a solution to a simplified version of a problem can be used as an admissible heuristic evaluation function for the original problem 33]. For example, in the sliding-tile puzzles, if we remove the constraint that a tile can only be slid into the blank position, then any tile can be moved to any adjacent position at any time. The optimal number of moves required to solve this simplified version of the problem is the Manhattan distance, which is an admissible heuristic for the original problem. Automating this approach, however, is still a research problem 38].

Another important research area is the development of selective search algorithms for two-player games. Despite the fact that it can evaluate 100 million chess positions per second, Deep Blue still lost to Garry Kasparov, a mere human. Obviously, humans are much more selective in their choices of what positions to examine. The development of alternatives to full-width, fixed-depth minimax search is an active area of research. See 24] for one example of a selective search algorithm, along with pointers to other work in this area.
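As a concrete illustration of the Manhattan-distance heuristic mentioned above, the computation for a sliding-tile state can be sketched as follows. The tuple representation of states is an assumption made for the example.

```python
def manhattan_distance(state, goal, width):
    """Sum over tiles of the grid distance from each tile's current
    position to its goal position. Admissible for the sliding-tile
    puzzle, since each move slides one tile one square. States are
    tuples listing the tile in each position, with 0 for the blank."""
    total = 0
    for pos, tile in enumerate(state):
        if tile == 0:
            continue                      # the blank is not counted
        goal_pos = goal.index(tile)
        total += (abs(pos % width - goal_pos % width) +      # columns
                  abs(pos // width - goal_pos // width))     # rows
    return total

# Eight Puzzle example: blank first, tiles 1-8 in order
goal = (0, 1, 2, 3, 4, 5, 6, 7, 8)
state = (1, 0, 2, 3, 4, 5, 6, 7, 8)   # one move from the goal
```

Here `manhattan_distance(state, goal, 3)` is 1, matching the single move needed, and it can never exceed the true solution length.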
8.2 Summary
We have described search algorithms for three different classes of problems. In the first, single-agent path-finding problems, the task is to find a sequence of operators that map an initial state to a desired goal state. Much of the work in this area has focused on finding optimal solutions to such problems, often making use of admissible heuristic functions to speed up the search without sacrificing optimality. In the second area, two-player games, finding optimal solutions is infeasible, and research has focused on algorithms for making the best move decision possible given a limited amount of computing time. This approach has been applied to single-agent problems as well. In the third class, constraint-satisfaction problems, the task is to find a state that satisfies a set of constraints. While all three of these types of problems are different, the same set of ideas, such as brute-force searches and heuristic evaluation functions, can be applied to all three.
9 Defining Terms
Admissible: A heuristic is said to be admissible if it never overestimates the actual distance from a given state to a goal. An algorithm is said to be admissible if it always finds an optimal solution to a problem if one exists.

Branching factor: The average number of children of a node in a problem-space graph.

Constraint-satisfaction problem: A problem where the task is to identify a state that satisfies a set of constraints.

Depth: The length of a shortest path from the initial state to a goal state.

Heuristic evaluation function: A function from a state to a number. In a single-agent problem, it estimates the distance from the state to a goal. In a two-player game, it estimates the merit of the position with respect to one player.

Node expansion: Generating all the children of a given state.

Node generation: Creating the data structure that corresponds to a problem state.

Operator: An action that maps one state into another state, such as a twist of Rubik's Cube.

Problem instance: A problem space together with an initial state of the problem and a desired set of goal states.

Problem space: A theoretical construct in which a search takes place, consisting of a set of states and a set of operators.

Problem-space graph: A graphical representation of a problem space, where states are represented by nodes, and operators are represented by edges.

Search: A trial-and-error exploration of alternative solutions to a problem, often systematic.

Search tree: A problem-space graph with no cycles.

Single-agent path-finding problem: A problem where the task is to find a sequence of operators that map an initial state to a goal state.

State: A configuration of a problem, such as the arrangement of the parts of a Rubik's Cube at a given point in time.
References
1] Bolc, L., and J. Cytowski, Search Methods for Artificial Intelligence, Academic Press, London, 1992.

2] Davis, M., and H. Putnam, A computing procedure for quantification theory, Journal of the Association for Computing Machinery, Vol. 7, 1960, pp. 201-215.
3] Davis, H.W., A. Bramanti-Gregor, and J. Wang, The advantages of using depth and breadth components in heuristic search, in Methodologies for Intelligent Systems 3, Z.W. Ras and L. Saitta (Eds.), North-Holland, Amsterdam, 1989, pp. 19-28.

4] Dechter, R., and J. Pearl, Generalized best-first search strategies and the optimality of A*, Journal of the Association for Computing Machinery, Vol. 32, No. 3, July 1985, pp. 505-536.

6] Dijkstra, E.W., A note on two problems in connexion with graphs, Numerische Mathematik, Vol. 1, 1959, pp. 269-271.

7] Dillenburg, J.F., and P.C. Nelson, Perimeter search, Artificial Intelligence, Vol. 65, No. 1, Jan. 1994, pp. 165-178.

8] Doran, J.E., and D. Michie, Experiments with the Graph Traverser program, Proceedings of the Royal Society A, Vol. 294, 1966, pp. 235-259.
9] Freuder, E.C., A sufficient condition for backtrack-free search, Journal of the Association for Computing Machinery, Vol. 29, No. 1, 1982, pp. 24-32.

10] Gaschnig, J., Performance measurement and analysis of certain search algorithms, Ph.D. thesis, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, Pa., 1979.

11] Haralick, R.M., and G.L. Elliott, Increasing tree search efficiency for constraint satisfaction problems, Artificial Intelligence, Vol. 14, 1980, pp. 263-313.

12] Hart, T.P., and D.J. Edwards, The alpha-beta heuristic, M.I.T. Artificial Intelligence Project Memo, Massachusetts Institute of Technology, Cambridge, Mass., October 1963.

13] Hart, P.E., N.J. Nilsson, and B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE Transactions on Systems Science and Cybernetics, Vol. 4, No. 2, 1968, pp. 100-107.

14] Harvey, W.D., and M.L. Ginsberg, Limited discrepancy search, in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, Canada, Aug. 1995, pp. 607-613.

15] Kaindl, H., G. Kainz, A. Leeb, and H. Smetana, How to use limited memory in heuristic search, in Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95), Montreal, Canada, Aug. 1995.
16] Kanal, L., and V. Kumar (Eds.), Search in Artificial Intelligence, Springer-Verlag, New York, 1988.

17] Keene, R., B. Jacobs, and T. Buzan, Man v Machine: The ACM Chess Challenge: Garry Kasparov v IBM's Deep Blue, B.B. Enterprises, Sussex, 1996.
18] Korf, R.E., Depth-first iterative-deepening: An optimal admissible tree search, Artificial Intelligence, Vol. 27, No. 1, 1985, pp. 97-109.
19] Korf, R.E., Search in AI: A survey of recent results, in Exploring Artificial Intelligence, H.E. Shrobe (Ed.), Morgan-Kaufmann, Los Altos, Ca., 1988.

20] Korf, R.E., Real-time heuristic search, Artificial Intelligence, Vol. 42, No. 2-3, March 1990, pp. 189-211.

21] Korf, R.E., Search, revised version in the Encyclopedia of Artificial Intelligence, Second Edition, John Wiley, New York, 1992, pp. 1460-1467.

22] Korf, R.E., Linear-space best-first search, Artificial Intelligence, Vol. 62, No. 1, July 1993, pp. 41-78.

23] Korf, R.E., Space-efficient search algorithms, Computing Surveys, Vol. 27, No. 3, Sept. 1995, pp. 337-339.

24] Korf, R.E., and D.M. Chickering, Best-first minimax search, to appear in Artificial Intelligence, 1996.

25] Korf, R.E., Improved limited discrepancy search, to appear in Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), Portland, Or., Aug. 1996.
26] Knuth, D.E., and R.E. Moore, An analysis of alpha-beta pruning, Artificial Intelligence, Vol. 6, No. 4, 1975, pp. 293-326.

27] Mackworth, A.K., Consistency in networks of relations, Artificial Intelligence, Vol. 8, No. 1, 1977, pp. 99-118.
28] Manzini, G., BIDA*: An improved perimeter search algorithm, Artificial Intelligence, Vol. 75, No. 2, June 1995, pp. 347-360.

29] Minton, S., M.D. Johnston, A.B. Philips, and P. Laird, Minimizing conflicts: A heuristic repair method for constraint satisfaction and scheduling problems, Artificial Intelligence, Vol. 58, No. 1-3, December 1992, pp. 161-205.

30] Montanari, U., Networks of constraints: Fundamental properties and applications to picture processing, Information Science, Vol. 7, 1974, pp. 95-132.

31] Newell, A., and H.A. Simon, Human Problem Solving, Prentice-Hall, Englewood Cliffs, N.J., 1972.

32] Pearl, J., The solution for the branching factor of the Alpha-Beta pruning algorithm and its optimality, Communications of the Association for Computing Machinery, Vol. 25, No. 8, 1982, pp. 559-564.

33] Pearl, J., Heuristics, Addison-Wesley, Reading, Ma., 1984.

34] Pearl, J., and R.E. Korf, Search techniques, in Annual Review of Computer Science, Vol. 2, Annual Reviews Inc., Palo Alto, Ca., 1987.

35] Pohl, I., First results on the effect of error in heuristic search, in Machine Intelligence 5, B. Meltzer and D. Michie (Eds.), American Elsevier, New York, 1970, pp. 219-236.

36] Pohl, I., Heuristic search viewed as path finding in a graph, Artificial Intelligence, Vol. 1, 1970, pp. 193-204.

37] Pohl, I., Bi-directional search, in Machine Intelligence 6, B. Meltzer and D. Michie (Eds.), American Elsevier, New York, 1971, pp. 127-140.
38] Prieditis, A.E., Machine discovery of effective admissible heuristics, Machine Learning, Vol. 12, 1993, pp. 117-141.

39] Purdom, P.W., Search rearrangement backtracking and polynomial average time, Artificial Intelligence, Vol. 21, No. 1-2, 1983, pp. 117-133.
40] Ratner, D., and M. Warmuth, Finding a shortest solution for the NxN extension of the 15-Puzzle is intractable, in Proceedings of the Fifth National Conference on Artificial Intelligence (AAAI-86), Philadelphia, Pa., 1986.
41] Samuel, A.L., Some studies in machine learning using the game of checkers, in Computers and Thought, E. Feigenbaum and J. Feldman (Eds.), McGraw-Hill, New York, 1963, pp. 71-105.
42] Selman, B., H. Levesque, and D. Mitchell, A new method for solving hard satisfiability problems, in Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), San Jose, Ca., July 1992, pp. 440-446.

43] Shannon, C.E., Programming a computer for playing chess, Philosophical Magazine, Vol. 41, 1950, pp. 256-275.

44] Slate, D.J., and L.R. Atkin, CHESS 4.5 - The Northwestern University chess program, in Chess Skill in Man and Machine, P.W. Frey (Ed.), Springer-Verlag, New York, 1977, pp. 82-118.

45] Stickel, M.E., and W.M. Tyson, An analysis of consecutively bounded depth-first search with applications in automated deduction, in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-85), Los Angeles, Ca., August 1985.
46] Taylor, L., and R.E. Korf, Pruning duplicate nodes in depth-first search, in Proceedings of the National Conference on Artificial Intelligence (AAAI-93), Washington, D.C., July 1993, pp. 756-761.
47] Turing, A.M., Computing machinery and intelligence, Mind, Vol. 59, October 1950, pp. 433-460. Also in Computers and Thought, E. Feigenbaum and J. Feldman (Eds.), McGraw-Hill, New York, 1963.
10 Further Information
The classic reference in this area is 33]. More recent survey articles include 34], 19], and 21]. Much of the material in this article was derived from these sources. A number of papers have been collected in an edited volume devoted to search 16]. The most recent book-length treatment of this area is 1]. Most new research in this area initially appears in the Proceedings of the National Conference on Artificial Intelligence (AAAI) or the Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). Prominent journals in this area include Artificial Intelligence and the IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI).