AI Mod-3

1.Illustrate non monotonic reasoning with an example?
Discuss the three key issues

with respect to non monotonic reasoning system.
Non-monotonic Reasoning
In Non-monotonic reasoning, some conclusions may be invalidated if we add some more information to our
knowledge base.
Logic will be said as non-monotonic if some conclusions can be invalidated by adding more knowledge into
our knowledge base.
Non-monotonic reasoning deals with incomplete and uncertain models.
"Human perceptions for various things in daily life, "is a general example of non-monotonic reasoning.
Example: Let suppose the knowledge base contains the following knowledge:
o Birds can fly

o Penguins cannot fly
o Pitty is a bird
So from the above sentences, we can conclude that Pitty can fly.
However, if we add one another sentence into knowledge base "Pitty is a penguin", which concludes "Pitty
cannot fly", so it invalidates the above conclusion.
Advantages of Non-monotonic reasoning:
o For real-world systems such as Robot navigation, we can use non-monotonic reasoning.
o In Non-monotonic reasoning, we can choose probabilistic facts or can make assumptions.
Disadvantages of Non-monotonic Reasoning:
o In non-monotonic reasoning, the old facts may be invalidated by adding new sentences.
o It cannot be used for theorem proving.
2. Elaborate on default reasoning in non-monotonic reasoning.
Default reasoning
This is a very common from of non-monotonic reasoning. Here We want to draw conclusions
based on what is most likely to be true.
We have already seen examples of this and possible ways to represent this knowledge.
We will discuss two approaches to do this:
 Non-Monotonic logic.
 Default logic.
DO NOT get confused about the label Non-Monotonic and Default being applied to reasoning
and a particular logic. Non-Monotonic reasoning is generic descriptions of a class of reasoning.
Non-Monotonic logic is a specific theory. The same goes for Default reasoning and Default
logic.
Non-Monotonic Logic
This is basically an extension of first-order predicate logic to include a modal operator, M. The
purpose of this is to allow for consistency.
For example: : plays_instrument(x) improvises(x) jazz_musician(x)
states that for all x is x plays an instrument and if the fact that x can improvise is consistent with
all other knowledge then we can conclude that x is a jazz musician.
How do we define consistency?
One common solution (consistent with PROLOG notation) is
to show that fact P is true attempt to prove . If we fail we may say that P is consistent
(since is false).
However consider the famous set of assertions relating to President Nixon.
: Republican(x) Pacifist(x) Pacifist(x)
: Quaker(x) Pacifist(x) Pacifist(x)
Now this states that Quakers tend to be pacifists and Republicans tend not to be.
BUT Nixon was both a Quaker and a Republican so we could assert:
Quaker(Nixon)
Republican(Nixon)
This now leads to our total knowledge becoming inconsistent.
Default Logic
Default logic introduces a new inference rule:
which states if A is deducible and it is consistent to assume B then conclude C.
Now this is similar to Non-monotonic logic but there are some distinctions:
 New inference rules are used for computing the set of plausible extensions. So in the
Nixon example above Default logic can support both assertions since is does not say
anything about how choose between them -- it will depend on the inference being made.
 In Default logic any nonmonotonic expressions are rules of inference rather than
expressions.
3.Compare Dependency-Directed Backtracking with Chronological Backtracking with

example.
4. Demonstrate the implementation of BFS in Nonmonotonic Reasoning with example.

https://www.javatpoint.com/breadth-first-search-algorithm
BFS algorithm
In this article, we will discuss the BFS algorithm in the data structure. Breadth-first search is a graph
traversal algorithm that starts traversing the graph from the root node and explores all the neighboring
nodes. Then, it selects the nearest node and explores all the unexplored nodes. While using BFS for
traversal, any node in the graph can be considered as the root node.
There are many ways to traverse the graph, but among them, BFS is the most commonly used approach. It is
a recursive algorithm to search all the vertices of a tree or graph data structure. BFS puts every vertex of the
graph into two categories - visited and non-visited. It selects a single node in a graph and, after that, visits all
the nodes adjacent to the selected node.
Applications of BFS algorithm

The applications of breadth-first-algorithm are given as follows -
o BFS can be used to find the neighboring locations from a given source location.
o In a peer-to-peer network, BFS algorithm can be used as a traversal method to find all the neighboring nodes.
Most torrent clients, such as BitTorrent, uTorrent, etc. employ this process to find "seeds" and "peers" in the
network.
o BFS can be used in web crawlers to create web page indexes. It is one of the main algorithms that can be used
to index web pages. It starts traversing from the source page and follows the links associated with the page.
Here, every web page is considered as a node in the graph.
o BFS is used to determine the shortest path and minimum spanning tree.
o BFS is also used in Cheney's technique to duplicate the garbage collection.
o It can be used in ford-Fulkerson method to compute the maximum flow in a flow network.
Algorithm
The steps involved in the BFS algorithm to explore a graph are given as follows -
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until QUEUE is empty

Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set
their STATUS = 2
(waiting state)
[END OF LOOP]
Step 6: EXIT
Example of BFS algorithm

Now, let's understand the working of BFS algorithm by using an example. In the example given below, there
is a directed graph having 7 vertices.
In the above graph, minimum path 'P' can be found by using the BFS that will start from Node A and end at
Node E. The algorithm uses two queues, namely QUEUE1 and QUEUE2. QUEUE1 holds all the nodes that
are to be processed, while QUEUE2 holds all the nodes that are processed and deleted from QUEUE1.
Now, let's start examining the graph starting from Node A.
Step 1 - First, add A to queue1 and NULL to queue2.
1. QUEUE1 = {A}
2. QUEUE2 = {NULL}
Step 2 - Now, delete node A from queue1 and add it into queue2. Insert all neighbors of node A to queue1.
1. QUEUE1 = {B, D}
2. QUEUE2 = {A}
Step 3 - Now, delete node B from queue1 and add it into queue2. Insert all neighbors of node B to queue1.
1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}
Step 4 - Now, delete node D from queue1 and add it into queue2. Insert all neighbors of node D to queue1.
The only neighbor of Node D is F since it is already inserted, so it will not be inserted again.
1. QUEUE1 = {C, F}
2. QUEUE2 = {A, B, D}
Step 5 - Delete node C from queue1 and add it into queue2. Insert all neighbors of node C to queue1.
1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
Step 5 - Delete node F from queue1 and add it into queue2. Insert all neighbors of node F to queue1. Since
all the neighbors of node F are already present, we will not insert them again.
1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}
Step 6 - Delete node E from queue1. Since all of its neighbors have already been added, so we will not
insert them again. Now, all the nodes are visited, and the target node E is encountered into queue2.
1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}
Complexity of BFS algorithm

Time complexity of BFS depends upon the data structure used to represent the graph. The time complexity
of BFS algorithm is O(V+E), since in the worst case, BFS algorithm explores every node and edge. In a
graph, the number of vertices is O(V), whereas the number of edges is O(E).
The space complexity of BFS can be expressed as O(V), where V is the number of vertices.
Implementation of BFS algorithm

Now, let's see the implementation of BFS algorithm in java.
In this code, we are using the adjacency list to represent our graph. Implementing the Breadth-First Search
algorithm in Java makes it much easier to deal with the adjacency list since we only have to travel through
the list of nodes attached to each node once the node is dequeued from the head (or start) of the queue.
In this example, the graph that we are using to demonstrate the code is given as follows -
5.Discuss Bayes’ Theorem with suitable example.
https://www.javatpoint.com/bayes-theorem-in-artifical-intelligence
What is the Bayes’ Theorem?
In statistics and probability theory, the Bayes’ theorem (also known as the Bayes’ rule) is a
mathematical formula used to determine the conditional probability of events. Essentially, the
Bayes’ theorem describes the probability of an event based on prior knowledge of the conditions
that might be relevant to the event.
The theorem is named after English statistician, Thomas Bayes, who discovered the formula in
1763. It is considered the foundation of the special statistical inference approach called the
Bayes’ inference.
Besides statistics, the Bayes’ theorem is also used in various disciplines, with medicine and
pharmacology as the most notable examples. In addition, the theorem is commonly employed in
different fields of finance. Some of the applications include but are not limited to, modeling the
risk of lending money to borrowers or forecasting the probability of the success of an
investment.
Formula for Bayes’ Theorem
The Bayes’ theorem is expressed in the following formula:

Where:
 P(A|B) – the probability of event A occurring, given event B has occurred

 P(B|A) – the probability of event B occurring, given event A has occurred
 P(A) – the probability of event A
 P(B) – the probability of event B
Note that events A and B are independent events (i.e., the probability of the outcome of event A
does not depend on the probability of the outcome of event B).
A special case of the Bayes’ theorem is when event A is a binary variable. In such a case, the
theorem is expressed in the following way:
Where:
 P(B|A–) – the probability of event B occurring given that event A– has occurred
 P(B|A+) – the probability of event B occurring given that event A+ has occurred
In the special case above, events A– and A+ are mutually exclusive outcomes of event A.
Example of Bayes’ Theorem
Imagine you are a financial analyst at an investment bank. According to your research
of publicly-traded companies, 60% of the companies that increased their share price by more
than 5% in the last three years replaced their CEOs during the period.
At the same time, only 35% of the companies that did not increase their share price by more
than 5% in the same period replaced their CEOs. Knowing that the probability that the stock
prices grow by more than 5% is 4%, find the probability that the shares of a company that fires
its CEO will increase by more than 5%.
Before finding the probabilities, you must first define the notation of the probabilities.
 P(A) – the probability that the stock price increases by 5%

 P(B) – the probability that the CEO is replaced
 P(A|B) – the probability of the stock price increases by 5% given that the CEO has been
replaced
 P(B|A) – the probability of the CEO replacement given the stock price has increased by
5%.
Using the Bayes’ theorem, we can find the required probability:
Thus, the probability that the shares of a company that replaces its CEO will grow by more than
5% is 6.67%.
6.Describe Bayesian Networks with illustration

Bayesian Belief Network in artificial intelligence
Bayesian belief network is key computer technology for dealing with probabilistic events and to solve a
problem which has uncertainty. We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and their
conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic, because these networks are built from a probability distribution, and
also use probability theory for prediction and anomaly detection.
Real world applications are probabilistic in nature, and to represent the relationship between multiple events,
we need a Bayesian network. It can also be used in various tasks including prediction, anomaly detection,
diagnostics, automated insight, reasoning, time series prediction, and decision making under
uncertainty.
Bayesian Network can be used for building models from data and experts opinions, and it consists of two
parts:
o Directed Acyclic Graph

o Table of conditional probabilities.
The generalized form of Bayesian network that represents and solve decision problems under uncertain
knowledge is known as an Influence diagram.
A Bayesian network graph is made up of nodes and Arcs (directed links), where:
o Each node corresponds to the random variables, and a variable can be continuous or discrete.
o Arc or directed arrows represent the causal relationship or conditional probabilities between random
variables. These directed links or arrows connect the pair of nodes in the graph.
These links represent that one node directly influence the other node, and if there is no directed link that
means that nodes are independent with each other
o In the above diagram, A, B, C, and D are random variables represented by the nodes of the
network graph.
o If we are considering node B, which is connected with node A by a directed arrow, then node A
is called the parent of Node B.
o Node C is independent of node A.
Note: The Bayesian network graph does not contain any cyclic graph. Hence, it is known as a directed acyclic
graph or DAG.
The Bayesian network has mainly two components:
o Causal Component
o Actual numbers
Each node in the Bayesian network has condition probability distribution P(Xi |Parent(Xi) ), which
determines the effect of the parent on that node.
Bayesian network is based on Joint probability distribution and conditional probability. So let's first
understand the joint probability distribution:
Joint probability distribution:
If we have variables x1, x2, x3,....., xn, then the probabilities of a different combination of x1, x2, x3.. xn,
are known as Joint probability distribution.
P[x1, x2, x3,....., xn], it can be written as the following way in terms of the joint probability distribution.
= P[x1| x2, x3,....., xn]P[x2, x3,....., xn]
= P[x1| x2, x3,....., xn]P[x2|x3,....., xn]....P[xn-1|xn]P[xn].
In general for each variable Xi, we can write the equation as:
P(Xi|Xi-1,........., X1) = P(Xi |Parents(Xi ))
Explanation of Bayesian network:
Let's understand the Bayesian network through an example by creating a directed acyclic graph:
Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably responds at
detecting a burglary but also responds for minor earthquakes. Harry has two neighbors David and Sophia,
who have taken a responsibility to inform Harry at work when they hear the alarm. David always calls Harry
when he hears the alarm, but sometimes he got confused with the phone ringing and calls at that time too.
On the other hand, Sophia likes to listen to high music, so sometimes she misses to hear the alarm. Here we
would like to compute the probability of Burglary Alarm.
Problem:
Calculate the probability that alarm has sounded, but there is neither a burglary, nor an earthquake
occurred, and David and Sophia both called the Harry.
Solution:
o The Bayesian network for the above problem is given below. The network structure is showing that burglary
and earthquake is the parent node of the alarm and directly affecting the probability of alarm's going off, but
David and Sophia's calls depend on alarm probability.
o The network is representing that our assumptions do not directly perceive the burglary and also do not notice
the minor earthquake, and they also not confer before calling.
o The conditional distributions for each node are given as conditional probabilities table or CPT.
o Each row in the CPT must be sum to 1 because all the entries in the table represent an exhaustive set of cases
for the variable.
o In CPT, a boolean variable with k boolean parents contains 2K probabilities. Hence, if there are two parents,
then CPT will contain 4 probability values
List of all events occurring in this network:
o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)
We can write the events of problem statement in the form of probability: P[D, S, A, B, E], can rewrite the
above probability statement using joint probability distribution:
P[D, S, A, B, E]= P[D | S, A, B, E]. P[S, A, B, E]
=P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]
= P [D| A]. P [ S| A, B, E]. P[ A, B, E]
= P[D | A]. P[ S | A]. P[A| B, E]. P[B, E]
= P[D | A ]. P[S | A]. P[A| B, E]. P[B |E]. P[E]
Let's take the observed probability for the Burglary and earthquake component:
P(B= True) = 0.002, which is the probability of burglary.
P(B= False)= 0.998, which is the probability of no burglary.
P(E= True)= 0.001, which is the probability of a minor earthquake
P(E= False)= 0.999, Which is the probability that an earthquake not occurred.
We can provide the conditional probabilities as per the below tables:
Conditional probability table for Alarm A:
The Conditional probability of Alarm A depends on Burglar and earthquake:

B E P(A= True) P(A= False)
True True 0.94 0.06
True False 0.95 0.04
False True 0.31 0.69
False False 0.001 0.999
Conditional probability table for David Calls:
The Conditional probability of David that he will call depends on the probability of Alarm.
A P(D= True) P(D= False)
True 0.91 0.09
False 0.05 0.95
Conditional probability table for Sophia Calls:
The Conditional probability of Sophia that she calls is depending on its Parent Node "Alarm."
A P(S= True) P(S= False)
True 0.75 0.25
False 0.02 0.98
From the formula of joint distribution, we can write the problem statement in the form of probability
distribution:
P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).
= 0.75* 0.91* 0.001* 0.998*0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using Joint distribution.
The semantics of Bayesian Network:

There are two ways to understand the semantics of the Bayesian network, which is given below:
1. To understand the network as the representation of the Joint probability distribution.
It is helpful to understand how to construct the network.
3. To understand the network as an encoding of a collection of conditional independence

statements.
7.Explain Dempster – Shafer Theory with example.

What Dempster Shafer Theory was given by Arthure P.Dempster in 1967 and his student Glenn
Shafer in 1976.
This theory was released because of following reason:-
 Bayesian theory is only concerned about single evidences.

 Bayesian probability cannot describe ignorance.
DST is an evidence theory, it combines all possible outcomes of the problem. Hence it is used to
solve problems where there may be a chance that a different evidence will lead to some different
result.
The uncertainty in this model is given by:-
1. Consider all possible outcomes.
2. Belief will lead to believe in some possibility by bringing out some evidence.(What is this
supposed to mean?)
3. Plausibility will make evidence compatible with possible outcomes.
For eg:-
Let us consider a room where four people are present, A, B, C and D. Suddenly the lights go out
and when the lights come back, B has been stabbed in the back by a knife, leading to his death. No
one came into the room and no one left the room. We know that B has not committed suicide.
Now we have to find out who the murderer is.
To solve these there are the following possibilities:
 Either {A} or {C} or {D} has killed him.
 Either {A, C} or {C, D} or {A, C} have killed him.
 Or the three of them have killed him i.e; {A, C, D}
 None of them have killed him {o} (let’s say).
There will be the possible evidence by which we can find the murderer by measure of
plausibility.
Using the above example we can say:
Set of possible conclusion (P): {p1, p2….pn}
where P is set of possible conclusions and cannot be exhaustive, i.e. at least one (p)i must be true.
(p)i must be mutually exclusive.
Power Set will contain 2 n elements where n is number of elements in the possible set.
For eg:-
If P = { a, b, c}, then Power set is given as
{o, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}}= 2 3 elements.
Mass function m(K): It is an interpretation of m({K or B}) i.e; it means there is evidence for {K
or B} which cannot be divided among more specific beliefs for K and B.
Belief in K: The belief in element K of Power Set is the sum of masses of element which are
subsets of K. This can be explained through an example
Lets say K = {a, b, c}
Bel(K) = m(a) + m(b) + m(c) + m(a, b) + m(a, c) + m(b, c) + m(a, b, c)
Plausibility in K: It is the sum of masses of set that intersects with K.
i.e; Pl(K) = m(a) + m(b) + m(c) + m(a, b) + m(b, c) + m(a, c) + m(a, b, c)
Characteristics of Dempster Shafer Theory:
 It will ignorance part such that probability of all events aggregate to 1.(What is this
supposed to mean?)
 Ignorance is reduced in this theory by adding more and more evidences.
 Combination rule is used to combine various types of possibilities.
Advantages:
 As we add more information, uncertainty interval reduces.
 DST has much lower level of ignorance.
 Diagnose hierarchies can be represented using this.
 Person dealing with such problems is free to think about evidences.
Disadvantages:
 In this, computation effort is high, as we have to deal with 2n of sets.
8.Illustrate Fuzzy system in detail.

What is Fuzzy Logic?
Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The approach of FL imitates
the way of decision making in humans that involves all intermediate possibilities between digital values
YES and NO.
The conventional logic block that a computer can understand takes precise input and produces a definite
output as TRUE or FALSE, which is equivalent to human’s YES or NO.
The inventor of fuzzy logic, Lotfi Zadeh, observed that unlike computers, the human decision making
includes a range of possibilities between YES and NO, such as −
CERTAINLY YES
POSSIBLY YES
CANNOT SAY
POSSIBLY NO
CERTAINLY NO
The fuzzy logic works on the levels of possibilities of input to achieve the definite output.
Implementation
 It can be implemented in systems with various sizes and capabilities ranging from small micro-
controllers to large, networked, workstation-based control systems.
 It can be implemented in hardware, software, or a combination of both.
Why Fuzzy Logic?

Fuzzy logic is useful for commercial and practical purposes.
 It can control machines and consumer products.

 It may not give accurate reasoning, but acceptable reasoning.
 Fuzzy logic helps to deal with the uncertainty in engineering.
Fuzzy Logic Systems Architecture

It has four main parts as shown −
 Fuzzification Module − It transforms the system inputs, which are crisp numbers, into fuzzy sets. It
splits the input signal into five steps such as −
LP x is Large Positive
MP x is Medium Positive
S x is Small
MN x is Medium Negative
LN x is Large Negative
 Knowledge Base − It stores IF-THEN rules provided by experts.

 Inference Engine − It simulates the human reasoning process by making fuzzy inference on the
inputs and IF-THEN rules.
 Defuzzification Module − It transforms the fuzzy set obtained by the inference engine into a crisp
value.
The membership functions work on fuzzy sets of variables.
Membership Function
Membership functions allow you to quantify linguistic term and represent a fuzzy set graphically.
A membership function for a fuzzy set A on the universe of discourse X is defined as μA:X → [0,1].
Here, each element of X is mapped to a value between 0 and 1. It is called membership value or degree of
membership. It quantifies the degree of membership of the element in X to the fuzzy set A.
 x axis represents the universe of discourse.

 y axis represents the degrees of membership in the [0, 1] interval.
There can be multiple membership functions applicable to fuzzify a numerical value. Simple membership
functions are used as use of complex functions does not add more precision in the output.
All membership functions for LP, MP, S, MN, and LN are shown as below −
The triangular membership function shapes are most common among various other membership function
shapes such as trapezoidal, singleton, and Gaussian.
Here, the input to 5-level fuzzifier varies from -10 volts to +10 volts. Hence the corresponding output also
changes.
Example of a Fuzzy Logic System

Let us consider an air conditioning system with 5-level fuzzy logic system. This system adjusts the
temperature of air conditioner by comparing the room temperature and the target temperature value.
Algorithm
 Define linguistic Variables and terms (start)
 Construct membership functions for them. (start)
 Construct knowledge base of rules (start)
 Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
 Evaluate rules in the rule base. (Inference Engine)
 Combine results from each rule. (Inference Engine)
 Convert output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are input and output variables in the form of simple words or sentences. For room
temperature, cold, warm, hot, etc., are linguistic terms.
Temperature (t) = {very-cold, cold, warm, very-warm, hot}
Every member of this set is a linguistic term and it can cover some portion of overall temperature values.
Step 2 − Construct membership functions for them
The membership functions of temperature variable are as shown −
Step3 − Construct knowledge base rules

Create a matrix of room temperature values versus target temperature values that an air conditioning system
is expected to provide.
RoomTemp.
Very_Cold Cold Warm Hot Very_Hot
/Target
Very_Cold No_Change Heat Heat Heat Heat
Cold Cool No_Change Heat Heat Heat
Warm Cool Cool No_Change Heat Heat
Hot Cool Cool Cool No_Change Heat
Very_Hot Cool Cool Cool Cool No_Change
Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.
Sr. No. Condition Action
1 IF temperature=(Cold OR Very_Cold) AND target=Warm THEN Heat

2 IF temperature=(Hot OR Very_Hot) AND target=Warm THEN Cool
3 IF (temperature=Warm) AND (target=Warm) THEN No_Change
Step 4 − Obtain fuzzy value

Fuzzy set operations perform evaluation of rules. The operations used for OR and AND are Max and Min
respectively. Combine all results of evaluation to form a final result. This result is a fuzzy value.
Step 5 − Perform defuzzification
Defuzzification is then performed according to membership function for output variable.
Application Areas of Fuzzy Logic

The key application areas of fuzzy logic are as given −
Automotive Systems
 Automatic Gearboxes
 Four-Wheel Steering
 Vehicle environment control
Consumer Electronic Goods
 Hi-Fi Systems
 Photocopiers
 Still and Video Cameras
 Television
Domestic Goods
 Microwave Ovens
 Refrigerators
 Toasters
 Vacuum Cleaners
 Washing Machines
Environment Control
 Air Conditioners/Dryers/Heaters
 Humidifiers
Advantages of FLSs
 Mathematical concepts within fuzzy reasoning are very simple.
 You can modify a FLS by just adding or deleting rules due to flexibility of fuzzy logic.
 Fuzzy logic Systems can take imprecise, distorted, noisy input information.
 FLSs are easy to construct and understand.
 Fuzzy logic is a solution to complex problems in all fields of life, including medicine, as it
resembles human reasoning and decision making.
Disadvantages of FLSs
 There is no systematic approach to fuzzy system designing.
 They are understandable only when simple.
 They are suitable for the problems which do not need high accuracy.

AI Mod-3

Uploaded by

Copyright:

Available Formats

AI Mod-3

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

AI Mod-3

Uploaded by

Copyright:

Available Formats

1.Illustrate non monotonic reasoning with an example?

Discuss the three key issues

Non-monotonic reasoning deals with incomplete and uncertain models.

o Birds can fly

Advantages of Non-monotonic reasoning:

Disadvantages of Non-monotonic Reasoning:

2. Elaborate on default reasoning in non-monotonic reasoning.

We will discuss two approaches to do this:

For example: : plays_instrument(x) improvises(x) jazz_musician(x)

How do we define consistency?

One common solution (consistent with PROLOG notation) is

However consider the famous set of assertions relating to President Nixon.

: Republican(x) Pacifist(x) Pacifist(x)

: Quaker(x) Pacifist(x) Pacifist(x)

BUT Nixon was both a Quaker and a Republican so we could assert:

This now leads to our total knowledge becoming inconsistent.

Default logic introduces a new inference rule:

which states if A is deducible and it is consistent to assume B then conclude C.

3.Compare Dependency-Directed Backtracking with Chronological Backtracking with

4. Demonstrate the implementation of BFS in Nonmonotonic Reasoning with example.

Applications of BFS algorithm

Step 1: SET STATUS = 1 (ready state) for each node in G

Step 3: Repeat Steps 4 and 5 until QUEUE is empty

Example of BFS algorithm

Now, let's start examining the graph starting from Node A.

Step 1 - First, add A to queue1 and NULL to queue2.

Complexity of BFS algorithm

Implementation of BFS algorithm

What is the Bayes’ Theorem?

Formula for Bayes’ Theorem

The Bayes’ theorem is expressed in the following formula:

 P(A|B) – the probability of event A occurring, given event B has occurred

Example of Bayes’ Theorem

 P(A) – the probability that the stock price increases by 5%

Using the Bayes’ theorem, we can find the required probability:

6.Describe Bayesian Networks with illustration

o Directed Acyclic Graph

The Bayesian network has mainly two components:

= P[x1| x2, x3,....., xn]P[x2, x3,....., xn]

= P[x1| x2, x3,....., xn]P[x2|x3,....., xn]....P[xn-1|xn]P[xn].

P(Xi|Xi-1,........., X1) = P(Xi |Parents(Xi ))

Explanation of Bayesian network:

List of all events occurring in this network:

P[D, S, A, B, E]= P[D | S, A, B, E]. P[S, A, B, E]

=P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]

= P [D| A]. P [ S| A, B, E]. P[ A, B, E]

= P[D | A]. P[ S | A]. P[A| B, E]. P[B, E]

= P[D | A ]. P[S | A]. P[A| B, E]. P[B |E]. P[E]

P(B= True) = 0.002, which is the probability of burglary.

P(B= False)= 0.998, which is the probability of no burglary.

P(E= True)= 0.001, which is the probability of a minor earthquake

We can provide the conditional probabilities as per the below tables:

Conditional probability table for Alarm A:

The Conditional probability of Alarm A depends on Burglar and earthquake:

True True 0.94 0.06

True False 0.95 0.04

False True 0.31 0.69

False False 0.001 0.999

Conditional probability table for David Calls:

P(S, D, A, ¬B, ¬E) = P (S|A) P (D|A)P (A|¬B ^ ¬E) P (¬B) P (¬E).