AI Mod-3
AI Mod-3
AI Mod-3
Non-monotonic Reasoning
In Non-monotonic reasoning, some conclusions may be invalidated if we add some more information to our
knowledge base.
Logic will be said as non-monotonic if some conclusions can be invalidated by adding more knowledge into
our knowledge base.
"Human perceptions for various things in daily life, "is a general example of non-monotonic reasoning.
Example: Let suppose the knowledge base contains the following knowledge:
So from the above sentences, we can conclude that Pitty can fly.
However, if we add one another sentence into knowledge base "Pitty is a penguin", which concludes "Pitty
cannot fly", so it invalidates the above conclusion.
o For real-world systems such as Robot navigation, we can use non-monotonic reasoning.
o In Non-monotonic reasoning, we can choose probabilistic facts or can make assumptions.
o In non-monotonic reasoning, the old facts may be invalidated by adding new sentences.
o It cannot be used for theorem proving.
Default reasoning
This is a very common from of non-monotonic reasoning. Here We want to draw conclusions
based on what is most likely to be true.
We have already seen examples of this and possible ways to represent this knowledge.
Non-Monotonic logic.
Default logic.
DO NOT get confused about the label Non-Monotonic and Default being applied to reasoning
and a particular logic. Non-Monotonic reasoning is generic descriptions of a class of reasoning.
Non-Monotonic logic is a specific theory. The same goes for Default reasoning and Default
logic.
Non-Monotonic Logic
This is basically an extension of first-order predicate logic to include a modal operator, M. The
purpose of this is to allow for consistency.
states that for all x is x plays an instrument and if the fact that x can improvise is consistent with
all other knowledge then we can conclude that x is a jazz musician.
to show that fact P is true attempt to prove . If we fail we may say that P is consistent
(since is false).
Now this states that Quakers tend to be pacifists and Republicans tend not to be.
Quaker(Nixon)
Republican(Nixon)
Default Logic
Now this is similar to Non-monotonic logic but there are some distinctions:
New inference rules are used for computing the set of plausible extensions. So in the
Nixon example above Default logic can support both assertions since is does not say
anything about how choose between them -- it will depend on the inference being made.
In Default logic any nonmonotonic expressions are rules of inference rather than
expressions.
BFS algorithm
In this article, we will discuss the BFS algorithm in the data structure. Breadth-first search is a graph
traversal algorithm that starts traversing the graph from the root node and explores all the neighboring
nodes. Then, it selects the nearest node and explores all the unexplored nodes. While using BFS for
traversal, any node in the graph can be considered as the root node.
There are many ways to traverse the graph, but among them, BFS is the most commonly used approach. It is
a recursive algorithm to search all the vertices of a tree or graph data structure. BFS puts every vertex of the
graph into two categories - visited and non-visited. It selects a single node in a graph and, after that, visits all
the nodes adjacent to the selected node.
o BFS can be used to find the neighboring locations from a given source location.
o In a peer-to-peer network, BFS algorithm can be used as a traversal method to find all the neighboring nodes.
Most torrent clients, such as BitTorrent, uTorrent, etc. employ this process to find "seeds" and "peers" in the
network.
o BFS can be used in web crawlers to create web page indexes. It is one of the main algorithms that can be used
to index web pages. It starts traversing from the source page and follows the links associated with the page.
Here, every web page is considered as a node in the graph.
o BFS is used to determine the shortest path and minimum spanning tree.
o BFS is also used in Cheney's technique to duplicate the garbage collection.
o It can be used in ford-Fulkerson method to compute the maximum flow in a flow network.
Algorithm
The steps involved in the BFS algorithm to explore a graph are given as follows -
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set
their STATUS = 2
(waiting state)
[END OF LOOP]
Step 6: EXIT
In the above graph, minimum path 'P' can be found by using the BFS that will start from Node A and end at
Node E. The algorithm uses two queues, namely QUEUE1 and QUEUE2. QUEUE1 holds all the nodes that
are to be processed, while QUEUE2 holds all the nodes that are processed and deleted from QUEUE1.
1. QUEUE1 = {A}
2. QUEUE2 = {NULL}
Step 2 - Now, delete node A from queue1 and add it into queue2. Insert all neighbors of node A to queue1.
1. QUEUE1 = {B, D}
2. QUEUE2 = {A}
Step 3 - Now, delete node B from queue1 and add it into queue2. Insert all neighbors of node B to queue1.
1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}
Step 4 - Now, delete node D from queue1 and add it into queue2. Insert all neighbors of node D to queue1.
The only neighbor of Node D is F since it is already inserted, so it will not be inserted again.
1. QUEUE1 = {C, F}
2. QUEUE2 = {A, B, D}
Step 5 - Delete node C from queue1 and add it into queue2. Insert all neighbors of node C to queue1.
1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
Step 5 - Delete node F from queue1 and add it into queue2. Insert all neighbors of node F to queue1. Since
all the neighbors of node F are already present, we will not insert them again.
1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}
Step 6 - Delete node E from queue1. Since all of its neighbors have already been added, so we will not
insert them again. Now, all the nodes are visited, and the target node E is encountered into queue2.
1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}
The space complexity of BFS can be expressed as O(V), where V is the number of vertices.
In this code, we are using the adjacency list to represent our graph. Implementing the Breadth-First Search
algorithm in Java makes it much easier to deal with the adjacency list since we only have to travel through
the list of nodes attached to each node once the node is dequeued from the head (or start) of the queue.
In this example, the graph that we are using to demonstrate the code is given as follows -
5.Discuss Bayes’ Theorem with suitable example.
https://www.javatpoint.com/bayes-theorem-in-artifical-intelligence
In statistics and probability theory, the Bayes’ theorem (also known as the Bayes’ rule) is a
mathematical formula used to determine the conditional probability of events. Essentially, the
Bayes’ theorem describes the probability of an event based on prior knowledge of the conditions
that might be relevant to the event.
The theorem is named after English statistician, Thomas Bayes, who discovered the formula in
1763. It is considered the foundation of the special statistical inference approach called the
Bayes’ inference.
Besides statistics, the Bayes’ theorem is also used in various disciplines, with medicine and
pharmacology as the most notable examples. In addition, the theorem is commonly employed in
different fields of finance. Some of the applications include but are not limited to, modeling the
risk of lending money to borrowers or forecasting the probability of the success of an
investment.
Note that events A and B are independent events (i.e., the probability of the outcome of event A
does not depend on the probability of the outcome of event B).
A special case of the Bayes’ theorem is when event A is a binary variable. In such a case, the
theorem is expressed in the following way:
Where:
P(B|A–) – the probability of event B occurring given that event A– has occurred
P(B|A+) – the probability of event B occurring given that event A+ has occurred
In the special case above, events A– and A+ are mutually exclusive outcomes of event A.
Imagine you are a financial analyst at an investment bank. According to your research
of publicly-traded companies, 60% of the companies that increased their share price by more
than 5% in the last three years replaced their CEOs during the period.
At the same time, only 35% of the companies that did not increase their share price by more
than 5% in the same period replaced their CEOs. Knowing that the probability that the stock
prices grow by more than 5% is 4%, find the probability that the shares of a company that fires
its CEO will increase by more than 5%.
Before finding the probabilities, you must first define the notation of the probabilities.
Thus, the probability that the shares of a company that replaces its CEO will grow by more than
5% is 6.67%.
Bayesian belief network is key computer technology for dealing with probabilistic events and to solve a
problem which has uncertainty. We can define a Bayesian network as:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and their
conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic, because these networks are built from a probability distribution, and
also use probability theory for prediction and anomaly detection.
Real world applications are probabilistic in nature, and to represent the relationship between multiple events,
we need a Bayesian network. It can also be used in various tasks including prediction, anomaly detection,
diagnostics, automated insight, reasoning, time series prediction, and decision making under
uncertainty.
Bayesian Network can be used for building models from data and experts opinions, and it consists of two
parts:
The generalized form of Bayesian network that represents and solve decision problems under uncertain
knowledge is known as an Influence diagram.
A Bayesian network graph is made up of nodes and Arcs (directed links), where:
o Each node corresponds to the random variables, and a variable can be continuous or discrete.
o Arc or directed arrows represent the causal relationship or conditional probabilities between random
variables. These directed links or arrows connect the pair of nodes in the graph.
These links represent that one node directly influence the other node, and if there is no directed link that
means that nodes are independent with each other
o In the above diagram, A, B, C, and D are random variables represented by the nodes of the
network graph.
o If we are considering node B, which is connected with node A by a directed arrow, then node A
is called the parent of Node B.
o Node C is independent of node A.
Note: The Bayesian network graph does not contain any cyclic graph. Hence, it is known as a directed acyclic
graph or DAG.
o Causal Component
o Actual numbers
Each node in the Bayesian network has condition probability distribution P(Xi |Parent(Xi) ), which
determines the effect of the parent on that node.
Bayesian network is based on Joint probability distribution and conditional probability. So let's first
understand the joint probability distribution:
Joint probability distribution:
If we have variables x1, x2, x3,....., xn, then the probabilities of a different combination of x1, x2, x3.. xn,
are known as Joint probability distribution.
P[x1, x2, x3,....., xn], it can be written as the following way in terms of the joint probability distribution.
In general for each variable Xi, we can write the equation as:
Let's understand the Bayesian network through an example by creating a directed acyclic graph:
Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably responds at
detecting a burglary but also responds for minor earthquakes. Harry has two neighbors David and Sophia,
who have taken a responsibility to inform Harry at work when they hear the alarm. David always calls Harry
when he hears the alarm, but sometimes he got confused with the phone ringing and calls at that time too.
On the other hand, Sophia likes to listen to high music, so sometimes she misses to hear the alarm. Here we
would like to compute the probability of Burglary Alarm.
Problem:
Calculate the probability that alarm has sounded, but there is neither a burglary, nor an earthquake
occurred, and David and Sophia both called the Harry.
Solution:
o The Bayesian network for the above problem is given below. The network structure is showing that burglary
and earthquake is the parent node of the alarm and directly affecting the probability of alarm's going off, but
David and Sophia's calls depend on alarm probability.
o The network is representing that our assumptions do not directly perceive the burglary and also do not notice
the minor earthquake, and they also not confer before calling.
o The conditional distributions for each node are given as conditional probabilities table or CPT.
o Each row in the CPT must be sum to 1 because all the entries in the table represent an exhaustive set of cases
for the variable.
o In CPT, a boolean variable with k boolean parents contains 2K probabilities. Hence, if there are two parents,
then CPT will contain 4 probability values
o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)
We can write the events of problem statement in the form of probability: P[D, S, A, B, E], can rewrite the
above probability statement using joint probability distribution:
Let's take the observed probability for the Burglary and earthquake component:
P(E= False)= 0.999, Which is the probability that an earthquake not occurred.
The Conditional probability of David that he will call depends on the probability of Alarm.
The Conditional probability of Sophia that she calls is depending on its Parent Node "Alarm."
From the formula of joint distribution, we can write the problem statement in the form of probability
distribution:
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using Joint distribution.
POSSIBLY YES
CANNOT SAY
POSSIBLY NO
CERTAINLY NO
The fuzzy logic works on the levels of possibilities of input to achieve the definite output.
Implementation
It can be implemented in systems with various sizes and capabilities ranging from small micro-
controllers to large, networked, workstation-based control systems.
It can be implemented in hardware, software, or a combination of both.
LP x is Large Positive
MP x is Medium Positive
S x is Small
MN x is Medium Negative
LN x is Large Negative
Membership Function
Membership functions allow you to quantify linguistic term and represent a fuzzy set graphically.
A membership function for a fuzzy set A on the universe of discourse X is defined as μA:X → [0,1].
Here, each element of X is mapped to a value between 0 and 1. It is called membership value or degree of
membership. It quantifies the degree of membership of the element in X to the fuzzy set A.
The triangular membership function shapes are most common among various other membership function
shapes such as trapezoidal, singleton, and Gaussian.
Here, the input to 5-level fuzzifier varies from -10 volts to +10 volts. Hence the corresponding output also
changes.
Algorithm
Define linguistic Variables and terms (start)
Construct membership functions for them. (start)
Construct knowledge base of rules (start)
Convert crisp data into fuzzy data sets using membership functions. (fuzzification)
Evaluate rules in the rule base. (Inference Engine)
Combine results from each rule. (Inference Engine)
Convert output data into non-fuzzy values. (defuzzification)
Development
Step 1 − Define linguistic variables and terms
Linguistic variables are input and output variables in the form of simple words or sentences. For room
temperature, cold, warm, hot, etc., are linguistic terms.
Temperature (t) = {very-cold, cold, warm, very-warm, hot}
Every member of this set is a linguistic term and it can cover some portion of overall temperature values.
Step 2 − Construct membership functions for them
The membership functions of temperature variable are as shown −
RoomTemp.
Very_Cold Cold Warm Hot Very_Hot
/Target
Build a set of rules into the knowledge base in the form of IF-THEN-ELSE structures.
Automatic Gearboxes
Four-Wheel Steering
Vehicle environment control
Consumer Electronic Goods
Hi-Fi Systems
Photocopiers
Still and Video Cameras
Television
Domestic Goods
Microwave Ovens
Refrigerators
Toasters
Vacuum Cleaners
Washing Machines
Environment Control
Air Conditioners/Dryers/Heaters
Humidifiers
Advantages of FLSs
Mathematical concepts within fuzzy reasoning are very simple.
You can modify a FLS by just adding or deleting rules due to flexibility of fuzzy logic.
Fuzzy logic Systems can take imprecise, distorted, noisy input information.
FLSs are easy to construct and understand.
Fuzzy logic is a solution to complex problems in all fields of life, including medicine, as it
resembles human reasoning and decision making.
Disadvantages of FLSs
There is no systematic approach to fuzzy system designing.
They are understandable only when simple.
They are suitable for the problems which do not need high accuracy.