AIML UNIT I Notes
UNIT I
i. Defining AI
It is a branch of Computer Science that pursues creating computers or machines as intelligent
as human beings. It is the science and engineering of making intelligent machines, especially
intelligent computer programs.
It is related to the similar task of using computers to understand human intelligence, but AI does not
have to confine itself to methods that are biologically observable.
Definition: Artificial Intelligence is the study of how to make computers do things, which, at the
moment, people do better. According to the father of Artificial Intelligence, John McCarthy, it is
“The science and engineering of making intelligent machines, especially intelligent computer
programs”. Artificial Intelligence is a way of making a computer, a computer-controlled robot, or
software think intelligently, in a manner similar to how intelligent humans think. AI is accomplished
by studying how the human brain thinks and how humans learn, decide, and work while trying to solve
a problem, and then using the outcomes of this study as a basis for developing intelligent software
and systems. It has gained prominence recently due, in part, to big data, or the increase in speed, size
and variety of data businesses are now collecting. AI can perform tasks such as identifying patterns
in the data more efficiently than humans, enabling businesses to gain more insight out of their data.
From a business perspective AI is a set of very powerful tools, and methodologies for using those
tools to solve business problems.
From a programming perspective, AI includes the study of symbolic programming, problem
solving, and search.
AI Vocabulary
Intelligence relates to tasks involving higher mental processes, e.g. creativity, solving problems,
pattern recognition, classification, learning, induction, deduction, building analogies, optimization,
language processing, knowledge and many more. Intelligence is the computational part of the ability
to achieve goals.
Intelligent behaviour is depicted by perceiving one’s environment, acting in complex
environments, learning and understanding from experience, reasoning to solve problems and
discover hidden knowledge, applying knowledge successfully in new situations, thinking abstractly,
using analogies, communicating with others and more.
Science based goals of AI pertain to developing concepts, mechanisms and understanding
biological intelligent behaviour. The emphasis is on understanding intelligent behaviour.
Engineering based goals of AI relate to developing concepts, theory and practice of building
intelligent machines. The emphasis is on system building. AI Techniques depict how we represent,
manipulate and reason with knowledge in order to solve problems.
Knowledge is a collection of ‘facts’. To manipulate these facts by a program, a suitable
representation is required. A good representation facilitates problem solving.
Learning means that programs learn from the facts or behaviours they can represent. Learning denotes
changes in a system that are adaptive; in other words, it enables the system to do the same task(s)
more efficiently next time.
Applications of AI refer to problem solving, search and control strategies, speech recognition,
natural language understanding, computer vision, expert systems, etc.
AI is an old field; hence different views and definitions of it have existed over time.
ii. Types of AI
Artificial Intelligence is probably one of the most complex and astounding creations of humanity yet. And
that is disregarding the fact that the field remains largely unexplored, which means that
every amazing AI application that we see today represents merely the tip of the AI iceberg, as it
were. The revolutionary impact that AI is having on society, even at such a relatively early stage in
its evolution is remarkable. AI’s rapid growth and powerful capabilities have made people paranoid
about the inevitability and proximity of an AI takeover. Also, the transformation brought about by
AI in different industries has made business leaders and the mainstream public think that we are
close to achieving the peak of AI research and maxing out AI’s potential. However, understanding
the types of AI that are possible and the types that exist now will give a clearer picture of existing AI
capabilities and the long road ahead for AI research.
Understanding the types of AI classification
Since AI research aims to make machines emulate human-like functioning, the degree to which an AI
system can replicate human capabilities is used as the criterion for determining the types of AI. Thus,
depending on how a machine compares to humans in terms of versatility and performance, AI can
be classified under one, among the multiple types of AI. Under such a system, an AI that can
perform more human-like functions with equivalent levels of proficiency will be considered as a
more evolved type of AI, while an AI that has limited functionality and performance would be
considered a simpler and less evolved type.
Based on this criterion, there are two ways in which AI is generally classified. One type is based
on classifying AI and AI-enabled machines based on their likeness to the human mind, and their
ability to “think” and perhaps even “feel” like humans. According to this system of classification,
there are four types of AI or AI-based systems: reactive machines, limited memory machines,
theory of mind, and self-aware AI.
1. Reactive Machines
These are the oldest forms of AI systems that have extremely limited capability. They only emulate
the human mind’s ability to respond to different kinds of stimuli. These machines do not have
memory-based functionality. This means such machines cannot use previously gained experiences to
inform their present actions, i.e., these machines do not have the ability to “learn.” These machines
could only be used for automatically responding to a limited set or combination of inputs. They
cannot rely on memory to improve their operations. A popular
example of a reactive AI machine is IBM’s Deep Blue, a machine that beat chess Grandmaster
Garry Kasparov in 1997.
2. Limited Memory
Limited memory machines are machines that, in addition to having the capabilities of purely
reactive machines, are also capable of learning from historical data to make decisions. Nearly all
existing applications that we know of come under this category of AI. All present-day AI systems,
such as those using deep learning, are trained by large volumes of training data that they store in
their memory to form a reference model for solving future problems. For instance, an image
recognition AI is trained using thousands of pictures and their labels to teach it to name objects it
scans. When an image is scanned by such an AI, it uses the training images as references to
understand the contents of the image presented to it, and based on its “learning experience” it labels
new images with increasing accuracy.
Almost all present-day AI applications, from chatbots and virtual assistants to self-driving
vehicles, are driven by limited memory AI.
3. Theory of Mind
While the previous two types of AI have been and are found in abundance, the next two types of AI
exist, for now, either as a concept or a work in progress. Theory of mind AI is the next level of AI
systems that researchers are currently engaged in innovating. A theory of mind level AI will be able
to better understand the entities it is interacting with by discerning their needs, emotions, beliefs,
and thought processes. While artificial emotional intelligence is already a budding industry and an
area of interest for leading AI researchers, achieving Theory of mind level of AI will require
development in other branches of AI as well. This is because to truly understand human needs, AI
machines will have to perceive humans as individuals whose minds can be shaped by multiple
factors, essentially “understanding” humans.
4. Self-aware
This is the final stage of AI development, which currently exists only hypothetically. Self-aware AI
is, self-explanatorily, an AI that has evolved to be so akin to the human brain that it has
developed self-awareness. Creating this type of AI, which is decades, if not centuries, away from
materializing, is and will always be the ultimate objective of all AI research. This type of AI will not
only be able to understand and evoke emotions in those it interacts with, but also have emotions,
needs, beliefs, and potentially desires of its own.
And this is the type of AI that doomsayers of the technology are wary of. Although the development
of self-aware AI can potentially boost our progress as a civilization by leaps and bounds, it can also
potentially lead to catastrophe. This is because once self-aware, the AI would be capable of having
ideas like self-preservation which may directly or indirectly spell the end for humanity, as such an
entity could easily out maneuver the intellect of any human being and plot elaborate schemes to take
over humanity.
The alternate system of classification that is more generally used in tech parlance is the classification
of the technology into Artificial Narrow Intelligence (ANI), Artificial General Intelligence (AGI), and
Artificial Superintelligence (ASI).
5. Artificial Narrow Intelligence (ANI)/Weak AI
This type of artificial intelligence represents all the existing AI, including even the most complicated
and capable AI that has ever been created to date. Artificial narrow intelligence refers to AI systems
that can only perform a specific task autonomously using human-like capabilities. These machines
can do nothing more than what they are programmed to do, and thus have a very limited or narrow
range of competencies. According to the aforementioned system of classification, these systems
correspond to all the reactive and limited memory AI. Even the most complex AI that uses machine
learning and deep learning to teach itself falls under ANI.
6. Artificial General Intelligence (AGI)/Strong AI
Artificial General Intelligence is the ability of an AI agent to learn, perceive, understand, and
function completely like a human being. These systems will be able to independently build multiple
competencies and form connections and generalizations across domains, massively cutting down on
time needed for training. This will make AI systems just as capable as humans by replicating our
multi-functional capabilities.
7. Artificial Superintelligence (ASI)
The development of Artificial Superintelligence will probably mark the pinnacle of AI research, as
ASI will be by far the most capable form of intelligence on earth. ASI, in addition to
replicating the multi-faceted intelligence of human beings, will be exceedingly better at everything
they do because of overwhelmingly greater memory, faster data processing and analysis, and
decision-making capabilities. The development of AGI and ASI will lead to a scenario most
popularly referred to as the technological singularity. And while the potential of having such
powerful machines at our disposal seems appealing, these machines may also threaten our existence
or at the very least, our way of life.
At this point, it is hard to picture the state of our world when more advanced types of AI come into
being. However, it is clear that there is a long way to get there as the current state of AI development
compared to where it is projected to go is still in its rudimentary stage. For those holding a negative
outlook for the future of AI, this means that now is a little too soon to be worrying about the
singularity, and there's still time to ensure AI safety. And for those who are optimistic about the
future of AI, the fact that we've merely scratched the surface of AI development makes the future
even more exciting.
Alan Turing, in his 1950 paper "Computing Machinery and Intelligence", proposed a test called "The
Imitation Game" that might finally settle the issue of machine intelligence. The first version of the
game he explained involved no computer
intelligence whatsoever. Imagine three rooms, each connected via computer screen and keyboard to
the others. In one room sits a man, in the second a woman, and in the third sits a person - call him or
her the "judge". The judge's job is to decide which of the two people talking to him through the
computer is the man. The man will attempt to help the judge, offering whatever evidence he can (the
computer terminals are used so that physical clues cannot be used) to prove his man-hood. The
woman's job is to trick the judge, so she will attempt to deceive him, and counteract her opponent's
claims, in hopes that the judge will erroneously identify her as the male.
What does any of this have to do with machine intelligence? Turing then proposed a modification
of the game, in which instead of a man and a woman as contestants, there was a human, of either
gender, and a computer at the other terminal. Now the judge's job is to decide which of the
contestants is human, and which the machine. Turing proposed that if, under these conditions, a
judge were less than 50% accurate, that is, if a judge is as likely to pick either human or computer,
then the computer must be a passable simulation of a human being and hence, intelligent. The game
has been recently modified so that there is only one contestant, and the judge's job is not to choose
between two contestants, but simply to decide whether the single contestant is human or machine.
Partly out of an attempt to pass Turing's test, and partly just for the fun of it, there arose, largely in
the 1970s, a group of programs that tried to cross the first human-computer barrier: language. These
programs, often fairly simple in design, employed small databases of (usually English) language
combined with a series of rules for forming intelligent sentences. While most were woefully
inadequate, some grew to tremendous popularity. Perhaps the most famous such program was
Joseph Weizenbaum's ELIZA. Written in 1966 it was one of the first and remained for quite a while
one of the most convincing. ELIZA simulates a Rogerian psychotherapist (the Rogerian therapist is
empathic, but passive, asking leading questions, but doing very little talking. e.g. "Tell me more
about that," or "How does that make you feel?") and does so quite convincingly, for a while. There is
no hint of intelligence in ELIZA's code, it simply scans for keywords like "Mother" or "Depressed"
and then asks suitable questions from a large database. Failing that, it generates something generic
in an attempt to elicit further conversation. Most programs since have relied on similar principles of
keyword matching, paired with basic knowledge of sentence structure. There is however, no better
way to see what they are capable of than to try them yourself. We have compiled a set of links to
some of the more famous attempts at NLP. Students are encouraged to interact with these programs
in order to get a feeling for their strengths and weaknesses; but since many of the pages provided here
link to dozens of such programs, don't get lost among the artificial people.
Alan Turing's imitation game has fueled 40 years of controversy, with little sign of slowing. On one
side of the argument, human-like interaction is seen as absolutely essential to human-like
intelligence. A successful AI is worthless if its intelligence lies trapped in an unresponsive program.
Opponents of Turing's behavioural criterion of intelligence argue that it is either not sufficient, or
perhaps not even relevant at all. What is important, they argue, is that the computer demonstrates
cognitive ability, regardless of behaviour. It is not necessary that a program speak in order for it to
be intelligent. There are humans that would fail the Turing test, and unintelligent computers that
might pass. The test is neither necessary nor sufficient for intelligence, they argue.
AI, machine learning, and deep learning - these terms overlap and are easily confused, so let’s start
with some short definitions.
AI means getting a computer to mimic human behavior in some way.
Machine learning is a subset of AI, and it consists of the techniques that enable computers to figure
things out from the data and deliver AI applications.
Deep learning, meanwhile, is a subset of machine learning that enables computers to solve more
complex problems.
What Is AI?
Artificial intelligence as an academic discipline was founded in 1956. The goal then, as now, was to
get computers to perform tasks regarded as uniquely human: things that required intelligence.
Initially, researchers worked on problems like playing checkers and solving logic problems.
If you looked at the output of one of those checkers playing programs you could see some form of
“artificial intelligence” behind those moves, particularly when the computer beat you. Early
successes caused the first researchers to exhibit almost boundless enthusiasm for the possibilities of
AI, matched only by the extent to which they misjudged just how hard some problems were.
Artificial intelligence, then, refers to the output of a computer. The computer is doing something
intelligent, so it’s exhibiting intelligence that is artificial.
The term AI doesn’t say anything about how those problems are solved. There are many different
techniques including rule-based or expert systems. And one category of techniques started becoming
more widely used in the 1980s: machine learning.
The reason that those early researchers found some problems to be much harder is that those
problems simply weren't amenable to the early techniques used for AI. Hard-coded algorithms or
fixed, rule-based systems just didn’t work very well for things like image recognition or extracting
meaning from text.
The solution turned out to be not just mimicking human behavior (AI) but mimicking how
humans learn.
Think about how you learned to read. You didn’t sit down and learn spelling and grammar before
picking up your first book. You read simple books, graduating to more complex ones over time. You
actually learned the rules (and exceptions) of spelling and grammar from your reading. Put another
way, you processed a lot of data and learned from it.
That’s exactly the idea with machine learning. Feed an algorithm (as opposed to your brain) a lot of
data and let it figure things out. Feed an algorithm a lot of data on financial transactions, tell it which
ones are fraudulent, and let it work out what indicates fraud so it can predict fraud in the future. Or
feed it information about your customer base and let it figure out how best to segment them.
As these algorithms developed, they could tackle many problems. But some things that humans
found easy (like speech or handwriting recognition) were still hard for machines. However, if
machine learning is about mimicking how humans learn, why not go all the way and try to mimic
the human brain?
The idea of using artificial neurons (neurons, connected by synapses, are the major elements in your
brain) had been around for a while. And neural networks simulated in software started being used
for certain problems. They showed a lot of promise and could solve some complex problems that
other algorithms couldn’t tackle.
But machine learning still got stuck on many things that elementary school children tackled with
ease: how many dogs are in this picture or are they really wolves? Walk over there and bring me the
ripe banana. What made this character in the book cry so much?
It turned out that the problem was not with the concept of machine learning. Or even with the idea
of mimicking the human brain. It was just that simple neural networks with 100s or even 1000s of
neurons, connected in a relatively simple manner, just couldn’t duplicate what the human brain
could do. It shouldn't be a surprise if you think about it; human brains have around 86 billion
neurons and very complex interconnectivity.
A problem may have different aspects of representation and explanation. In order to choose the most
appropriate method for a particular problem, it is necessary to analyze the problem along several
key dimensions. Some of the main key features of a problem are given below.
→Is the problem decomposable into a set of independent smaller sub-problems?
→Can solution steps be ignored or undone?
→Is the problem's universe predictable?
→Is a good solution to the problem obvious without comparison to all the possible solutions?
→Is the desired solution a state of the world or a path to a state?
→Is a large amount of knowledge absolutely required to solve the problem?
→Will the solution of the problem require interaction between the computer and a person?
The above characteristics of a problem are called the 7 problem characteristics, under which the
solution must take place.
To solve the problem of building a system, take the following steps:
1. Define the problem accurately, including detailed specifications of what
constitutes a suitable solution.
2. Scrutinize the problem carefully, for some features may have a central
effect on the chosen method of solution.
3. Segregate and represent the background knowledge needed in the
solution of the problem.
4. Choose the best problem-solving technique and apply it to obtain a solution.
Problem solving has been the key area of concern for Artificial Intelligence.
General Problem Solver (GPS) was a computer program created in 1957 by Simon
and Newell to build a universal problem solver machine. GPS was based on Simon
and Newell’s theoretical work on logic machines. GPS in principle can solve any
formalized symbolic problem, such as theorem proving, geometric problems and
chess playing. GPS solved many simple problems, such as the Towers of Hanoi,
that could be sufficiently formalized, but GPS could not solve any real-world
problems.
Analyze the problem – find a few important features that may have an
impact on the appropriateness of various possible techniques for
solving the problem
Isolate and represent task knowledge necessary to solve the problem
Choose the best problem-solving technique(s) and apply to the particular problem
Problem definitions
A problem is defined by its ‘elements’ and their ‘relations’. To provide a formal
description of a problem, we need to do the following:
a. Define a state space that contains all the possible configurations of the
relevant objects (including some impossible ones).
b. Specify one or more states that describe possible situations from which
the problem-solving process may start. These states are called initial
states.
c. Specify one or more states that would be acceptable as solutions to the problem.
These states are called goal states. A minimal sketch of such a description is given below.
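To make this description concrete, here is a minimal Python sketch of a problem defined by its initial states, goal states and successor function. All names and the toy state space are illustrative, not taken from any particular AI library.

# A minimal sketch of a formal problem description: state space,
# initial states and goal states. Names are illustrative.
class Problem:
    def __init__(self, initial_states, goal_states, successors):
        self.initial_states = initial_states  # states the search may start from
        self.goal_states = goal_states        # acceptable solutions
        self.successors = successors          # maps a state to its neighbouring states

    def is_goal(self, state):
        return state in self.goal_states

# Toy state space over the integers 0..4, where an action moves from n to n+1 or n+2.
toy = Problem(
    initial_states={0},
    goal_states={4},
    successors=lambda n: {s for s in (n + 1, n + 2) if s <= 4},
)
print(toy.is_goal(4))  # True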
A problem space is represented by a directed graph, where nodes represent search states and edges
represent the operators applied to change the state.
A tree is a graph in which any two vertices are connected by exactly one path.
Alternatively, any connected graph with no cycles is a tree.
• Problem solving: The term, Problem Solving relates to analysis in AI. Problem
solving may be characterized as a systematic search through a range of possible actions
to reach some predefined goal or solution. Problem-solving methods are categorized as
special purpose and general purpose.
To solve the problem of playing a game, we require the rules of the game and the targets
for winning, as well as a means of representing positions in the game. The opening position can be
defined as the initial state and a winning position as a goal state. Legal moves take us from the
initial state to other states, eventually leading to the goal state. However, the possibilities are far
too abundant in most games— especially in chess, where the number of positions exceeds the number of
particles in the universe. Thus, the states cannot all be enumerated explicitly, and computer
programs cannot handle them easily. Storage also presents a problem, but
searching can be aided by hashing.
The number of rules that are used must be minimized, and the set can be created by
expressing each rule in as general a form as possible. The representation of games leads to a state
space representation, which is common for well-organized games with some structure.
This representation allows for the formal definition of a problem that needs the
movement from a set of initial positions to one of a set of target positions. It means that
the solution involves using known techniques and a systematic search. This is quite a
common method in Artificial Intelligence.
Search is the systematic examination of states to find a path from the start/root state to
the goal state.
• Search deals with finding nodes having certain properties in a graph that
represents search space.
• Search methods explore the search space ‘intelligently’, evaluating
possibilities without investigating every single possibility.
Examples:
• For a Robot this might consist of PICKUP, PUTDOWN, MOVEFORWARD,
MOVEBACK, MOVELEFT, and MOVERIGHT—until the goal is reached.
•Puzzles and Games have explicit rules: e.g., the ‘Tower of Hanoi’ puzzle.
This puzzle involves a set of rings of different sizes that can be placed on three different pegs.
•The puzzle starts with the rings arranged as shown in Figure 2.4(a).
•The goal of this puzzle is to move them all to the arrangement of Figure 2.4(b).
• Condition: Only the top ring on a peg can be moved, and it may only be placed
on a larger ring, or on an empty peg. A short recursive sketch of the solution follows.
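As a concrete illustration, below is a short recursive Python solution to the Tower of Hanoi; the peg names are arbitrary.

# Recursive Tower of Hanoi: move n rings from source to target using spare.
def hanoi(n, source, target, spare):
    if n == 0:
        return
    hanoi(n - 1, source, spare, target)                # move n-1 rings out of the way
    print(f"move ring {n} from {source} to {target}")  # move the largest ring
    hanoi(n - 1, spare, target, source)                # stack the n-1 rings on top

hanoi(3, "A", "C", "B")  # prints the 2**3 - 1 = 7 moves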
•State space
A state space is the set of all states reachable from the initial state.
A state space forms a graph (or map) in which the nodes are states and the arcs
between nodes are actions.
In a state space, a path is a sequence of states connected by a sequence of actions.
The solution of a problem is part of the map formed by the state space.
•Structure of a state space
The structures of a state space are trees and graphs.
A tree is a hierarchical structure in a graphical form.
A graph is a non-hierarchical structure.
•A tree has only one path to a given node;
i.e., a tree has one and only one path from any point to any other point.
• A graph consists of a set of nodes (vertices) and a set of edges (arcs). Arcs
establish relationships (connections) between the nodes; i.e., a graph has several
paths to a given node.
•The Operators are directed arcs between nodes.
A search process explores the state space. In the worst case, the search explores all possible
paths between the initial state and the goal state.
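A state space of this kind is naturally coded as a directed graph. The sketch below uses an adjacency dictionary with made-up states; it simply checks that a given sequence of states is a valid path.

# A state space as a directed graph: nodes are states, arcs are actions.
state_space = {
    "S": ["A", "B"],   # from S we can act to reach A or B
    "A": ["C"],
    "B": ["C", "G"],
    "C": ["G"],
    "G": [],           # goal state
}

# A path is a sequence of states connected by actions.
path = ["S", "B", "G"]
valid = all(b in state_space[a] for a, b in zip(path, path[1:]))
print(valid)  # True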
•Problem solution
In the state space, a solution is a path from the initial state to a goal state or, sometimes, just a
goal state.
A solution cost function assigns a numeric cost to each path; it also gives the
cost of applying the operators to the states.
Solution quality is measured by the path cost function; an optimal solution
has the lowest path cost among all solutions.
The solution required may be any solution, an optimal solution, or all solutions.
The importance of cost depends on the problem and the type of solution asked for.
•Problem description
A problem consists of the description of:
The current state of the world,
The actions that can transform one state of the world into another,
The desired state of the world.
The following actions are taken to describe the problem:
✓ State space is defined explicitly or implicitly
A state space should describe everything that is needed to solve a problem and
nothing that is not needed to solve the problem.
✓ Initial state is start state
✓ Goal state is the conditions it has to fulfill.
The description by a desired state may be complete or
partial.
✓ Operators are to change state
✓ Operators do actions that can transform one state into another;
✓ Operators consist of: Preconditions and Instructions;
Preconditions provide partial description of the state of the world that must be true
in order to perform the action, and
Instructions tell the user how to create the next state.
• Operators should be as general as possible, so as to reduce their number.
• Elements of the domain have relevance to the problem
✓ Knowledge of the starting point.
• Problem solving is finding a solution
✓ Find an ordered sequence of operators that transform the current
(start) state into a goal state.
• Restrictions on solution quality: any, optimal, or all
✓ Finding the shortest sequence, or
✓ finding the least expensive sequence defining cost, or
✓ finding any sequence as quickly as possible.
PROBLEM CHARACTERISTICS
To use the heuristic search for problem solving, we suggest analysis of the
problem for the following considerations:
• Decomposability of the problem into a set of independent smaller subproblems
• Possibility of undoing solution steps, if they are found to be unwise
• Predictability of the problem universe
• Possibility of obtaining an obvious solution to a problem without comparison of
all other possible solutions
• Type of the solution: whether it is a state or a path to the goal state
• Role of knowledge in problem solving
• Nature of solution process: with or without interacting with the user
Search Algorithms
Many traditional search algorithms are used in AI applications. For complex
problems, the traditional algorithms are unable to find the solutions within some
practical time and space limits. Consequently, many special techniques are
developed, using heuristic functions.
The algorithms that use heuristic functions are called heuristic algorithms.
In such problems, the search proceeds using current information about the problem
to predict which path is closer to the goal and follow it, although it does not always
guarantee to find the best possible solution. Such techniques help in finding a
solution within reasonable time and space (memory). Some prominent intelligent
search algorithms are stated below:
1. Generate and Test Search
2. Best-first Search
3. Greedy Search
4. A* Search
5. Constraint Search
6. Means-ends analysis
There are some more algorithms. They are either improvements or combinations of these.
• Hierarchical Representation of Search Algorithms: A Hierarchical representation
of most search algorithms is illustrated below. The representation begins with two
types of search:
• Uninformed Search: Also called blind, exhaustive or brute-force search, it
uses no information about the problem to guide the search and therefore may not
be very efficient.
• Informed Search: Also called heuristic or intelligent search, this uses information
about the problem to guide the search—usually guesses the distance to a goal state—
and is therefore more efficient, but such a search may not always be possible.
A good control strategy must satisfy two requirements. The first requirement is that it causes
motion: in a game-playing program the pieces move on the board, and in the water jug problem
water is poured to fill the jugs. Control strategies that do not cause motion will never lead to a
solution.
The second requirement is that it is systematic; that is, it corresponds to the need for global
motion as well as local motion. Clearly, it would neither be rational to fill a jug and empty it
repeatedly, nor worthwhile to move a piece round and round the board in a cyclic way in a game.
We shall initially consider two systematic approaches to searching. Searches can be classified by the
order in which operators are tried: depth-first, breadth-first, bounded depth-first.
Breadth-first search
A Search strategy, in which the highest layer of a decision tree is searched
completely before proceeding to the next layer is called Breadth-first search (BFS).
•In this strategy, no viable solutions are omitted and therefore it is guaranteed that
an optimal solution is found.
•This strategy is often not feasible when the search space is large.
Algorithm
1. Create a variable called LIST and set it to contain the starting state.
2. Loop until a goal state is found or LIST is empty:
a. If LIST is empty, quit. Otherwise, remove the first element from LIST and call it E.
b. For each rule that can match the state E:
(i) Apply the rule to generate a new state.
(ii) If the new state is a goal state, quit and return this state.
(iii) Otherwise, add the new state to the end of LIST.
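A runnable Python sketch of this procedure is given below, using a FIFO queue for LIST; the small example graph is invented for illustration. As in the pseudocode, the goal test is applied when a state is generated.

from collections import deque

def bfs(successors, start, is_goal):
    frontier = deque([start])              # LIST, seeded with the starting state
    visited = {start}
    while frontier:                        # loop until goal found or LIST empty
        state = frontier.popleft()         # remove the first element; call it E
        for new_state in successors(state):    # apply each applicable rule to E
            if new_state in visited:
                continue
            if is_goal(new_state):         # goal generated: quit and return it
                return new_state
            visited.add(new_state)
            frontier.append(new_state)     # otherwise add it to the end of LIST
    return None

graph = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": [], "G": []}
print(bfs(graph.get, "S", lambda s: s == "G"))  # G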
Advantages
1. Guaranteed to find an optimal solution (in terms of shortest number
of steps to reach the goal).
2. Can always find a goal node if one exists (complete).
Disadvantages
1. High storage requirement: exponential with tree depth.
Depth-first search
A search strategy that extends the current path as far as possible before backtracking
to the last choice point and trying the next alternative path is called Depth-first search
(DFS).
•This strategy does not guarantee that the optimal solution has been found.
•In this strategy, search reaches a satisfactory solution more rapidly than
breadth first, an advantage when the search space is large.
Algorithm
Depth-first search applies operators to each newly generated state, trying to drive
directly toward the goal.
1. If the starting state is a goal state, quit and return success.
2. Otherwise, do the following until success or failure is signalled:
a. Generate a successor E to the starting state. If there are no more successors, then signal failure.
b. Call Depth-first Search with E as the starting state.
c. If success is returned signal success; otherwise, continue in the loop.
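Below is a compact recursive Python sketch of the same procedure; the visited set guards against revisiting states in graphs, and the graph itself is illustrative.

def dfs(successors, state, is_goal, visited=None):
    visited = visited if visited is not None else set()
    if is_goal(state):                       # step 1: state is a goal: success
        return [state]
    visited.add(state)
    for succ in successors(state):           # step 2a: generate a successor E
        if succ in visited:
            continue
        path = dfs(successors, succ, is_goal, visited)  # step 2b: recurse with E
        if path:                             # step 2c: propagate success
            return [state] + path
    return None                              # no successors left: failure

graph = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": [], "G": []}
print(dfs(graph.get, "S", lambda s: s == "G"))  # ['S', 'B', 'G']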
Advantages
1. Low storage requirement: linear with tree depth.
2. Easily programmed: function call stack does most of the work of maintaining
state of the search.
Disadvantages
1. May find a sub-optimal solution (one that is deeper or more costly than the best solution).
2. Incomplete: without a depth bound, may not find a solution even if one exists.
Bounded depth-first search
Depth-first search can spend much time (perhaps infinite time) exploring a very deep
path that does not contain a solution, when a shallow solution exists. An easy way to
solve this problem is to put a maximum depth bound on the search. Beyond the depth
bound, a failure is generated automatically without exploring any deeper.
Problems:
1. It’s hard to guess how deep the solution lies.
2. If the estimated depth is too deep (even by 1), the computer time used is
dramatically increased, by a factor of b^extra (where b is the branching factor
and 'extra' is the number of extra levels).
3. If the estimated depth is too shallow, the search fails to find a solution; all that
computer time is wasted.
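A depth bound is a one-line change to the recursive search above. In this sketch, reaching the bound simply returns failure without exploring deeper; the graph and bound values are illustrative (no cycle check, so an acyclic graph is assumed for brevity).

def depth_limited_dfs(successors, state, is_goal, bound):
    if is_goal(state):
        return [state]
    if bound == 0:                    # depth bound reached: automatic failure
        return None
    for succ in successors(state):
        path = depth_limited_dfs(successors, succ, is_goal, bound - 1)
        if path:
            return [state] + path
    return None

graph = {"S": ["A"], "A": ["G"], "G": []}
print(depth_limited_dfs(graph.get, "S", lambda s: s == "G", 1))  # None: bound too shallow
print(depth_limited_dfs(graph.get, "S", lambda s: s == "G", 2))  # ['S', 'A', 'G']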
Heuristics
A heuristic is a method that improves the efficiency of the search process. Heuristics are
like tour guides: they are good to the extent that they point in generally interesting
directions, and bad to the extent that they may miss points of interest
to particular individuals. Some heuristics help the search process without sacrificing
any claims to completeness that the process might previously have had. Others may occasionally
cause an excellent path to be overlooked; by sacrificing completeness they increase efficiency.
Heuristics may not find the best solution every time, but they find a good solution in a
reasonable time. They are particularly useful in solving tough and complex problems whose
solutions would otherwise require infinite time, i.e. far longer than a lifetime, and which
cannot be solved in any other way.
Heuristic search
To find a solution in proper time rather than a complete solution in unlimited time
we use heuristics. ‘A heuristic function is a function that maps from problem state
descriptions to measures of desirability, usually represented as numbers’. Heuristic
search methods use knowledge about the problem domain and choose promising
operators first. These heuristic search methods use heuristic functions to evaluate the
next state towards the goal state. For finding a solution, by using the heuristic
technique, one should carry out the following steps:
1. Add domain-specific information to select the best path to continue searching along.
2. Define a heuristic function h(n) that estimates the ‘goodness’ of a node n.
Specifically, h(n) = estimated cost (or distance) of the minimal cost path from n to a goal state.
3. The term heuristic means ‘serving to aid discovery’. A heuristic is an estimate, based on
domain-specific information computable from the current state description, of how close we
are to a goal.
Finding a route from one city to another city is an example of a search problem
in which different search orders and the use of heuristic knowledge are easily
understood.
1. State: The current city in which the traveller is located.
2. Operators: Roads linking the current city to other cities.
3. Cost Metric: The cost of taking a given road between cities.
4. Heuristic information: The search could be guided by the direction of the goal city
from the current city, or we could use airline distance as an estimate of the distance
to the goal; a small sketch of such a heuristic follows.
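For instance, the airline-distance heuristic can be computed directly from city coordinates. The coordinates below are invented for illustration; the point is only that h(n) is a cheap, domain-specific estimate of the remaining distance.

from math import dist

coords = {"Arad": (91, 492), "Sibiu": (207, 457), "Bucharest": (400, 327)}

def h(city, goal="Bucharest"):
    # Straight-line ("airline") distance from city to the goal city;
    # it never overestimates the actual road distance.
    return dist(coords[city], coords[goal])

print(h("Sibiu") < h("Arad"))  # True: Sibiu is estimated closer to the goal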
Heuristic search techniques
For complex problems, the traditional algorithms presented above are unable to find
a solution within practical time and space limits. Consequently, many special
techniques have been developed, using heuristic functions.
•Blind search is not always possible, because it requires too much time or space (memory).
Comparison of Algorithms
Suppose there are N cities; then a naive solution would examine all N! possible orderings
to find the shortest route. This is not efficient: with N=10 there are 3,628,800 possible
routes. This is an example of combinatorial explosion.
There are better methods for the solution of such problems: one is called branch and
bound. First, generate a complete path and record its distance. Then, as each new path is
generated, abandon it as soon as its partial length exceeds the shortest complete path
length saved so far; if a path completes with a shorter distance, save it as the new best.
Although better than the previous method, this is still exponential in the worst case.
A small sketch follows.
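A minimal sketch of this idea in Python, with three invented cities: each partial route is abandoned as soon as its length reaches the best complete route found so far.

import itertools

distance = {("A", "B"): 2, ("A", "C"): 9, ("B", "C"): 3}
distance.update({(b, a): d for (a, b), d in list(distance.items())})  # symmetric

best, best_len = None, float("inf")
for perm in itertools.permutations(["B", "C"]):   # routes starting at A
    route = ("A",) + perm
    length = 0
    for a, b in zip(route, route[1:]):
        length += distance[a, b]
        if length >= best_len:                    # bound: prune this route early
            break
    else:                                         # route completed under the bound
        best, best_len = route, length
print(best, best_len)  # ('A', 'B', 'C') 5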
Hill Climbing
Hill Climbing is a heuristic search used for mathematical optimization problems in
the field of Artificial Intelligence.
Given a large set of inputs and a good heuristic function, it tries to find a sufficiently
good solution to the problem. This solution may not be the global optimal maximum.
▪ In the above definition, 'mathematical optimization problems' implies that hill
climbing solves problems in which we need to maximize or minimize a given
real function by choosing values from the given inputs. Example: the travelling
salesman problem, where we need to minimize the distance travelled by the
salesman.
▪ ‘Heuristic search’ means that this search algorithm may not find the optimal
solution to the problem. However, it will give a good solution in reasonable
time.
▪ A heuristic function is a function that will rank all the possible
alternatives at any branching step in search algorithm based on the
available information. It helps the algorithm to select the best route out of
possible routes.
Features of Hill Climbing
1. Variant of generate and test algorithm : It is a variant of the generate and test
algorithm, which is as follows:
i. Generate a possible solution.
ii. Test to see if this is the expected solution.
iii. If the solution has been found, quit; otherwise go to step i.
Types of Hill Climbing
1. Simple Hill climbing : It examines the neighboring nodes one by one and selects
the first neighboring node which optimizes the current cost as next node.
Algorithm for Simple Hill climbing :
Step 1 : Evaluate the initial state. If it is a goal state, then stop and return success. Otherwise,
make the initial state the current state.
Step 2 : Loop until a solution state is found or there are no new operators left that can be
applied to the current state.
a) Select an operator that has not yet been applied to the current state and apply it to produce a new
state.
b) Perform these steps to evaluate the new state:
i. If the new state is a goal state, then stop and return success.
ii. If it is better than the current state, then make it the current state and proceed further.
iii. If it is not better than the current state, then continue in the loop until a solution is found.
Step 3 : Exit.
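A minimal Python sketch of simple hill climbing follows. The objective function and neighbour set are invented; the loop moves to the first neighbour that improves the current state and stops when none does.

def objective(x):
    return -(x - 3) ** 2               # single peak at x = 3

def neighbours(x):
    return [x - 1, x + 1]

def simple_hill_climb(x):
    while True:
        for n in neighbours(x):                # examine neighbours one by one
            if objective(n) > objective(x):    # first improving neighbour wins
                x = n
                break
        else:
            return x                           # no operator improves: stop

print(simple_hill_climb(0))  # 3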
2. Steepest-Ascent Hill climbing : It first examines all the neighboring nodes
and then selects the node closest to the solution state as next node.
Step 1 : Evaluate the initial state. If it is a goal state then exit; otherwise make the initial
state the current state.
Step 2 : Repeat these steps until a solution is found or the current state does not change:
i. Let ‘target’ be a state such that any successor of the current state will be better than it;
ii. for each operator that applies to the current state:
a. apply the new operator and create a new state
b. evaluate the new state
c. if this state is a goal state then quit; else compare it with ‘target’
d. if this state is better than ‘target’, set this state as ‘target’
e. if ‘target’ is better than the current state, set the current state to ‘target’.
Step 3 : Exit
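For contrast, here is a sketch of the steepest-ascent variant, reusing the same invented objective: all neighbours are evaluated and the best one becomes 'target'.

def steepest_ascent(x, objective, neighbours):
    while True:
        target = max(neighbours(x), key=objective)   # best successor of x
        if objective(target) <= objective(x):        # no neighbour is better: exit
            return x
        x = target                                   # move current state to target

print(steepest_ascent(0, lambda x: -(x - 3) ** 2, lambda x: [x - 1, x + 1]))  # 3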
3. Stochastic hill climbing : It does not examine all the neighboring nodes before
deciding which node to select. It just selects a neighboring node at random, and
decides (based on the amount of improvement in that neighbor) whether to
move to that neighbor or to examine another.
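And a sketch of the stochastic variant, again on the same invented objective: a random neighbour is drawn and accepted only if it improves the state.

import random

def stochastic_hill_climb(x, objective, neighbours, steps=1000):
    for _ in range(steps):
        n = random.choice(neighbours(x))    # one neighbour chosen at random
        if objective(n) > objective(x):     # accept only if it improves
            x = n
    return x

print(stochastic_hill_climb(0, lambda x: -(x - 3) ** 2, lambda x: [x - 1, x + 1]))  # 3, with high probability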
State Space diagram for Hill Climbing
A state space diagram is a graphical representation of the set of states our search
algorithm can reach versus the value of our objective function (the function we
wish to maximize).
X-axis : denotes the state space, i.e. the states or configurations our algorithm may reach.
Y-axis : denotes the values of the objective function corresponding to a particular state.
The best solution will be the state where the objective function has its maximum
value (global maximum).
In BFS and DFS, when we are at a node, we can consider any adjacent node
as the next node. So both BFS and DFS blindly explore paths without considering any
cost function. The idea of Best First Search is to use an evaluation function to decide
which adjacent node is most promising and then explore it. Best First Search falls under the
category of Heuristic Search or Informed Search.
We use a priority queue to store the costs of nodes. So the implementation is a variation of
BFS: we just need to change the queue to a priority queue.
Algorithm:
Best-First-Search(Graph g, Node start)
1) Create an empty PriorityQueue pq
2) Insert "start" in pq: pq.insert(start)
3) Until the PriorityQueue is empty:
u = pq.deleteMin()
if u is the goal, exit
else, for each neighbor v of u:
if v is "Unvisited":
mark v "Visited"
pq.insert(v)
mark u "Examined"
End procedure
Let us consider the example below.
pq initially contains S.
We remove S from pq and add the unvisited neighbors of S to pq.
pq now contains {A, C, B} (C is put before B because C has lower cost).
We remove A from pq and add the unvisited neighbors of A to pq.
pq now contains {C, B, E, D}
We remove C from pq and add the unvisited neighbors of C to pq.
pq now contains {B, H, E, D}
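The same search is easy to run in Python using heapq as the priority queue. The edges and node costs below are chosen so the run reproduces the trace above, but they are otherwise illustrative.

import heapq

def best_first_search(graph, cost, start, goal):
    pq = [(cost.get(start, 0), start)]     # priority queue ordered by cost
    visited = {start}
    while pq:
        _, u = heapq.heappop(pq)           # DeleteMin
        if u == goal:
            return u
        for v in graph[u]:                 # insert unvisited neighbours
            if v not in visited:
                visited.add(v)
                heapq.heappush(pq, (cost[v], v))
    return None

graph = {"S": ["A", "B", "C"], "A": ["D", "E"], "B": [], "C": ["H"],
         "D": [], "E": [], "H": []}
cost = {"S": 0, "A": 3, "B": 6, "C": 5, "D": 9, "E": 8, "H": 7}
print(best_first_search(graph, cost, "S", "E"))  # E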
A* Search Algorithm
In this tutorial I will look at the use of state space search to find the shortest path
between two points (pathfinding), and also to solve a simple sliding tile puzzle (the
8-puzzle). Let's look at some of the terms used in Artificial Intelligence when
describing this state space search.
Some terminology
A node is a state that the problem's world can be in. In pathfinding a node would be
just a 2d coordinate of where we are at the present time. In the 8-puzzle it is the
positions of all the tiles. Next all the nodes are arranged in a graph where links
between nodes represent valid steps in solving the problem. These links are known as
edges. In the 8-puzzle diagram the edges are shown as blue lines. See figure 1 below.
State space search, then, is solving a problem by beginning with the start state, and then
for each node we expand all the nodes beneath it in the graph by applying all the
possible moves that can be made at each point.
Heuristics and Algorithms
At this point we introduce an important concept, the heuristic. This is like an algorithm,
but with a key difference. An algorithm is a set of steps which you can follow to solve a
problem, which always works for valid input. For example you could probably write
an algorithm yourself for
multiplying two numbers together on paper. A heuristic is not guaranteed to work but
is useful in that it may solve a problem for which there is no algorithm.
We need a heuristic to help us cut down on this huge search problem. What we need is
to use our heuristic at each node to make an estimate of how far we are from the goal.
In pathfinding we know exactly how far we are, because we know how far we can
move each step, and we can calculate the exact distance to the goal.
But the 8-puzzle is more difficult. There is no known algorithm for calculating from a
given position how many moves it will take to get to the goal state. So various
heuristics have been devised. The best one that I know of is known as the Nilsson
score which leads fairly directly to the goal most of the time, as we shall see.
Cost
When looking at each node in the graph, we now have an idea of a heuristic, which can
estimate how close the state is to the goal. Another important consideration is the cost
of getting to where we are. In the case of pathfinding we often assign a movement cost
to each square. If the cost is the same everywhere, then the cost of each square is one. If we wanted to
differentiate between terrain types we may give higher costs to grass and mud than to
newly made road. When looking at a node we want to add up the cost of what it took
to get here, and this is simply the sum of the cost of this node and all those that are
above it in the graph.
8 Puzzle
Let's look at the 8 puzzle in more detail. This is a simple sliding tile puzzle on a 3*3
grid where one tile is missing and you can move the other tiles into the gap until you
get the puzzle into the goal position. See figure 1.
The 8-puzzle game state is as simple as a list of the 9 squares and what's
in them. Here are two states for example; the last one is the GOAL state, at which
point we've found the solution. The first is a jumbled-up example that you may start
from.
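Since the figure itself is not reproduced here, the sketch below shows one plausible pair of states as flat Python lists (0 marks the gap; the actual layouts in figure 1 may differ), plus the successor function that slides a tile into the gap.

start = [7, 2, 4,
         5, 0, 6,
         8, 3, 1]   # a jumbled-up example state

goal = [1, 2, 3,
        4, 5, 6,
        7, 8, 0]    # one common goal layout

def successors(state):
    # Slide a tile into the gap: swap 0 with an orthogonal neighbour.
    i = state.index(0)
    r, c = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            j = nr * 3 + nc
            nxt = state[:]
            nxt[i], nxt[j] = nxt[j], nxt[i]
            yield nxt

print(sum(1 for _ in successors(start)))  # 4 moves available from this state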
Pathfinding
In a video game, or some other pathfinding scenario, you want to search a state space
and find out how to get from somewhere you are to somewhere you want to be,
without bumping into walls or going too far. For reasons we will see later, the A*
algorithm will not only find a path, if there is one, but it will find the shortest path. A
state in pathfinding is simply a position in the world. In the example of a maze game
like Pacman you can represent where everything is using a simple 2d grid. The start
state for a ghost say, would be the 2d coordinate of where the ghost is at the start of the
search. The goal state would be where pacman is so we can go and eat him.
There is also example code to do pathfinding on the github site.
Figure 2 : The first three steps of a pathfinding state space
Implementing A*
We are now ready to look at the operation of the A* algorithm. What we need to do is
start with the start state and then generate the graph downwards from there. Let's take
the 8-puzzle in figure 1. We ask how many moves can we make from the start state?
The answer is 2, there are two directions we can move the blank tile, and so our graph
expands.
If we were just to continue blindly generating successors to each node, we could
potentially fill the computer's memory before we found the goal node. Obviously we
need to remember the best nodes and search those first. We also need to remember the
nodes that we have expanded already, so that we don't expand the same state
repeatedly.
Let's start with the OPEN list. This is where we will remember which nodes we haven't
yet expanded. When the algorithm begins the start state is placed on the open list, it is
the only state we know about and we have not expanded it. So we will expand the
nodes from the start and put those on the OPEN list too. Now we are done with the
start node and we will put that on the CLOSED list. The CLOSED list is a list of nodes
that we have expanded.
f=g+h
Using the OPEN and CLOSED list lets us be more selective about what we look at next
in the search. We want to look at the best nodes first. We will give each node a score on
how good we think it is. This score should be thought of as the cost of getting from the
node to the goal plus the cost of getting to where we are. Traditionally this has been
represented by the letters f, g and
h. 'g' is the sum of all the costs it took to get here, 'h' is our heuristic function, the
estimate of what it will take to get to the goal. 'f' is the sum of these two. We will store
each of these in our nodes.
Using the f, g and h values the A* algorithm will be directed, subject to conditions we
will look at further on, towards the goal and will find it in the shortest route possible.
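The sketch below shows one straightforward way to drive the search by f = g + h with a priority queue; the tiny graph and heuristic values are invented for illustration.

import heapq

def a_star(successors, h, start, goal):
    open_list = [(h(start), 0, start, [start])]    # entries are (f, g, state, path)
    closed = set()                                 # states already expanded
    while open_list:
        f, g, state, path = heapq.heappop(open_list)   # expand lowest f first
        if state == goal:
            return path
        if state in closed:
            continue
        closed.add(state)
        for nxt, step_cost in successors(state):
            if nxt not in closed:
                g2 = g + step_cost                 # cost so far for the successor
                heapq.heappush(open_list, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 5)], "B": [("G", 1)], "G": []}
h = {"S": 4, "A": 5, "B": 1, "G": 0}.get
print(a_star(lambda s: graph[s], h, "S", "G"))  # ['S', 'B', 'G'], total cost 5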
To help make the operation of the algorithm clear we will look again at the 8-puzzle
problem in figure 1 above. Figure 3 below shows the f,g and h scores for each of the
tiles.
Figure 3 : 8-Puzzle state space showing f,g,h scores
First of all look at the g score for each node. This is the cost of what it took to get from
the start to that node. So in the picture the center number is g. As you can see it
increases by one at each level. In some problems the cost may vary for different state
changes. For example in pathfinding there is sometimes a type of terrain that costs
more than other types.
Next look at the last number in each triple. This is h, the heuristic score. As I mentioned
above I am using a heuristic known as Nilsson's sequence score, which converges quickly to a
correct solution in many cases. In one common formulation, the score for a given 8-puzzle
state is h(n) = P(n) + 3 S(n): P(n) is the sum of the Manhattan distances of each tile from its
goal position, and S(n) is a sequence score obtained by checking the non-central squares in
turn, allotting 2 for every tile not followed (clockwise) by its proper successor, and 1 if
there is a tile in the centre.
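As a sketch, here is the P(n) component — the total Manhattan distance — for states represented as flat lists of 9 squares with 0 as the gap (the full Nilsson score adds three times the sequence score to this).

def manhattan(state, goal):
    # Sum of each tile's distance from its goal position; the gap (0) is not counted.
    total = 0
    for tile in range(1, 9):
        i, j = state.index(tile), goal.index(tile)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total

goal = [1, 2, 3, 4, 5, 6, 7, 8, 0]
print(manhattan([1, 2, 3, 4, 5, 6, 7, 0, 8], goal))  # 1: tile 8 is one move away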
Advantages:
It is one of the best of these techniques and can be used to solve very complex problems.
Disadvantages:
The algorithm is complete only if the branching factor is finite and every action has a fixed cost.
The speed of execution of A* search is highly dependent on the accuracy of the heuristic
function that is used to compute h(n).
AO* Search: (And-Or) Graph
The depth-first search and breadth-first search given earlier for OR trees or graphs can
easily be adapted to AND-OR graphs. The main difference lies in the way termination
conditions are determined, since all goals following an AND node must be realized,
whereas a single goal node following an OR node will do. For this purpose we use the
AO* algorithm.
Like the A* algorithm, here we will use two arrays and one heuristic function.
OPEN:
It contains the nodes that have been traversed but not yet been marked solvable or unsolvable.
CLOSE:
It contains the nodes that have already been processed.
Algorithm:
Step 1: Place the starting node on OPEN.
Step 2: Using the search tree constructed so far, compute the most promising solution tree T0.
Step 3: Select a node n that is both on OPEN and a member of T0. Remove it from
OPEN and place it in CLOSE.
Step 4: If n is a terminal goal node, then label n as solved and label all the
ancestors of n as solved. If the starting node is marked as solved, then success and
exit.
Step 5: If n is not a solvable node, then label n as unsolvable. If the starting node is
labeled as unsolvable, then failure and exit.
Step 6: Expand n. Find all its successors, find their h(n) values, and push them into OPEN.
Step 7: Return to Step 2.
Step 8: Exit.
Implementation:
Step 1:
In the above graph, the solvable nodes are A, B, C, D, E, F and the unsolvable nodes are
G, H. Take A as the starting node. So place A into OPEN.
Advantages:
It is an optimal algorithm.
It traverses according to the ordering of nodes, and it can be used for both OR and AND graphs.
Disadvantages:
Sometimes, for unsolvable nodes, it cannot find the optimal path. Its complexity is higher than
that of other algorithms.
PROBLEM REDUCTION
When a problem can be divided into a set of sub-problems, where each sub-problem
can be solved separately and a combination of these will be a solution, AND-OR
graphs or AND-OR trees are used for representing the solution. The decomposition
of the problem, or problem reduction, generates AND arcs. One AND arc may point to
any number of successor nodes, all of which must be solved for the arc to point to a
solution. Since several arcs may also emerge from a single node, indicating a variety of
ways the problem might be solved, the structure is known as an AND-OR graph rather
than simply an AND graph. The figure shows an AND-OR graph.
In figure (a) the top node A has been expanded, producing two arcs: one leading to B
and one leading to C and D. The numbers at each node represent the value of f' at that node
(the cost of getting to the goal state from the current state). For simplicity, it is assumed that
every operation (i.e. applying a rule) has unit cost, so each arc with a single successor
has a cost of 1, and likewise for each of its components. With the information available so
far, C appears to be the most promising node to expand, since its f' = 3 is the
lowest; but going through B would be better, since to use C we must also use D, and the
cost would be 9 (3+4+1+1), whereas through B it would be 6 (5+1).
Thus the choice of the next node to expand depends not only on the f' value but also on
whether that node is part of the current best path from the initial node. Figure (b)
makes this clearer. In the figure, the node G appears to be the most promising node, with
the least f' value. But G is not on the current best path, since to use G we must also use H,
with a cost of 9, and this in turn demands that further arcs be used (with a cost of 27). The path
from A through B and E-F is better, with a total cost of 18 (17+1). Thus we can see that to
search an AND-OR graph, the following three things must be done.
1. Traverse the graph, starting at the initial node and following the current best
path, and accumulate the set of nodes that are on the path and have not yet
been expanded.
2. Pick one of these unexpanded nodes and expand it. Add its successors to the
graph and compute f' (the cost of the remaining distance) for each of them.
3. Change the f' estimate of the newly expanded node to reflect the new information
produced by its successors. Propagate this change backward through the graph.
This may cause the current best path to change.
The propagation of revised cost estimates backward through the tree is not necessary in the A*
algorithm; it is needed in AO* because expanded nodes must be re-examined so that
the current best path can be selected. The working of the AO* algorithm is illustrated in
the figure as follows:
Referring to the figure: the initial node is expanded and D is marked initially as the most
promising node. D is expanded, producing an AND arc E-F. The f' value of D is updated to
10. Going backwards, we can see that the AND arc B-C is better; it is now marked as the
current best path. B and C have to be expanded next. This process continues until a
solution is found or all paths have led to dead ends, indicating that there is no solution.
In the A* algorithm, the path from one node to another is always the one of lowest cost,
and it is independent of the paths through other nodes.
The algorithm for performing a heuristic search of an AND-OR graph is given below.
Unlike the A* algorithm, which uses the two lists OPEN and CLOSED, the AO* algorithm uses
a single structure G. G represents the part of the search graph generated so far. Each
node in G points down to its immediate successors and up to its immediate
predecessors, and also carries the value h', the cost of a path from itself to a set of
solution nodes. The cost of getting from the start node to the current node, g, is not
stored as in the A* algorithm, because it is not possible to compute a single such
value: there may be many paths to the same state. In the AO* algorithm, h' serves as the
estimate of the goodness of a node. Also, a threshold value called FUTILITY is used: if the
estimated cost of a solution becomes greater than FUTILITY, then the search is abandoned as
too expensive to be practical.
For the graphs above, the AO* algorithm is as follows.
AO* ALGORITHM:
1. Let G consist only of the node representing the initial state. Call this node INIT.
Compute h'(INIT).
2. Until INIT is labeled SOLVED or its h' value becomes greater than FUTILITY, pick an
unexpanded node NODE on the current best path and generate the successors of NODE. If
there are no successors, then assign FUTILITY as h'(NODE); this means that NODE is not
solvable. If there are successors, then for each one, called SUCCESSOR, that is not also an
ancestor of NODE, do the following:
(a) Add SUCCESSOR to G and compute its h' value.
(b) If SUCCESSOR is a terminal node, mark it SOLVED and assign zero to its h' value.
3. Propagate the newly discovered information up the graph by doing the following:
let S be a set of nodes that have been marked SOLVED or whose h' values have changed.
Initialize S to NODE. Until S is empty, repeat the following procedure:
(a) Select from S a node none of whose descendants in G occurs in S; call it CURRENT
and remove it from S.
(b) Compute the h' of each of the arcs emerging from CURRENT, and assign the
minimum h' to CURRENT.
(c) Mark the minimum-cost path as the best path out of CURRENT.
(d) Mark CURRENT SOLVED if all of the nodes connected to it through the newly
marked arc have been labeled SOLVED.
(e) If CURRENT has been marked SOLVED or its h' has just changed, its new status
must be propagated backwards up the graph; hence all the ancestors of CURRENT are
added to S.
(Referred from Artificial Intelligence.)
The same procedure can be summarized in terms of the OPEN list as follows.
1. Place the start node on OPEN.
2. Using the search tree, compute the most promising solution tree TP.
3. Select a node n that is both on OPEN and a part of TP; remove n from OPEN and place it on CLOSED.
4. If n is a goal node, label n as solved. If the start node is solved, exit with success,
where TP is the solution tree; remove from OPEN all nodes with a solved ancestor.
5. If n is not a solvable node, label n as unsolvable. If the start node is labeled as
unsolvable, exit with failure. Remove from OPEN all nodes with unsolvable ancestors.
6. Otherwise, expand node n, generating all of its successors; compute the cost for
each newly generated node and place all such nodes on OPEN.
7. Go back to step (2).
CONSTRAINT SATISFACTION:-
Until a complete solution is found or until all paths have led to dead ends, do:
1. Select an unexpanded node of the search graph.
2. Apply the constraint inference rules to the selected node to generate all
possible new constraints.
3. If the set of constraints contains a contradiction, then report that this path is a dead end.
4. If the set of constraints describes a complete solution, then report success.
5. If neither a contradiction nor a complete solution has been found, then apply the rules
to generate new partial solutions. Insert these partial solutions into the search graph.
Consider the cryptarithmetic problem:
SEND
+ MORE
-------
MONEY
Assign a decimal digit to each of the letters in such a way that the answer to the problem
is correct. If the same letter occurs more than once, it must be assigned the same digit each
time. No two different letters may be assigned the same digit.
CONSTRAINTS:-
1. No two different letters may be assigned the same digit.
2. Assumptions can be made at various levels such that they do not contradict each other.
Initial state of the problem:
D=? E=? Y=? N=? R=? O=? S=? M=?
C1=? C2=? C3=?
C1, C2, C3 stand for the carry variables of the respective columns.
Goal State: the digits must be assigned to the letters in such a manner that the sum is satisfied.
Solution Process:
1. Initial guess m=1, because the sum of two single-digit numbers can generate at most a carry of '1'.
2. When m=1, o=0 or 1, because the largest single-digit number added to m=1 can
generate a sum digit of either 0 or 1, depending on the carry received from the previous
column. From this we conclude that o=0, because m is already 1 and we cannot assign the
same digit to another letter (by constraint 1).
3. With m=1 and o=0, to get o=0 we must have s=8 or 9, again depending on the carry
received from the earlier sum.
The same process can be repeated further. The problem has to be decomposed into
various constraints, and each constraint is to be satisfied by guessing the possible
digits that the letters can take. Assuming the initial guess has already been made, the rest
of the process is shown in the form of a tree, using depth-first search, for clear
understanding of the solution process.
(In the search tree, the branch labelled D > 6 leads to a contradiction and is abandoned.)
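The same constraints can also be checked by brute force. The short Python sketch below enumerates digit assignments until the column sums hold; it is exhaustive rather than the constraint-propagating search sketched above, but it confirms the unique solution.

from itertools import permutations

letters = "SENDMORY"
for digits in permutations(range(10), len(letters)):
    a = dict(zip(letters, digits))
    if a["S"] == 0 or a["M"] == 0:          # leading digits cannot be zero
        continue
    send = int("".join(str(a[c]) for c in "SEND"))
    more = int("".join(str(a[c]) for c in "MORE"))
    money = int("".join(str(a[c]) for c in "MONEY"))
    if send + more == money:
        print(send, "+", more, "=", money)  # 9567 + 1085 = 10652
        break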
MEANS - ENDS ANALYSIS:-
Most of the search strategies either reason forward or backward; however, often a
mixture of the two directions is appropriate. Such a mixed strategy would make it
possible to solve the major parts of a problem first and then go back and solve the smaller
problems that arise when combining them together. Such a technique is called "Means - Ends
Analysis".
The means-ends analysis process centers around finding the difference between the current
state and the goal state. The problem space of means-ends analysis has an initial state and
one or more goal states, a set of operators with a set of preconditions for their application,
and a difference function that computes the difference between two states s(i) and s(j). A
problem is solved using means-ends analysis by:
1. Comparing the current state s1 to a goal state s2 and computing their difference D12.
2. Selecting an operator OP that is relevant to reducing the difference D12.
3. Applying the operator OP if possible. If not, the current state is saved, a subgoal is
created, and means-ends analysis is applied recursively to reduce the subgoal.
4. If the subgoal is solved, the saved state is restored and work is resumed on the original problem.
(The first AI program to use means-ends analysis was GPS, the General Problem Solver.)
Means-ends analysis is useful for many human planning activities. Consider the
example of planning for an office worker. Suppose we have a difference table of
three rules:
1. If in our current state we are hungry, and in our goal state we are not hungry, then
either the "visit hotel" or the "visit canteen" operator is recommended.
2. If in our current state we do not have money, and in our goal state we have
money, then the "visit our bank" operator or the "visit secretary" operator is
recommended.
3. If in our current state we do not know where something is, and in our goal state we
do know, then either the "visit office enquiry", "visit secretary" or "visit co-worker"
operator is recommended.
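A toy sketch of such a difference table in Python (all names invented, following the example above): each difference between the current and goal state maps to its recommended operators.

difference_table = {
    "hungry": ["visit hotel", "visit canteen"],
    "no money": ["visit bank", "visit secretary"],
    "location unknown": ["visit office enquiry", "visit secretary", "visit co-worker"],
}

def recommend(current, goal):
    # Differences are conditions true in the current state but not in the goal state.
    for d in current - goal:
        if d in difference_table:
            print(d, "->", difference_table[d])

recommend({"hungry", "no money"}, set())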