AI Unit-3


Artificial Intelligence

RCA-403
Syllabus
UNIT-I INTRODUCTION: - Introduction to Artificial Intelligence, Foundations and
History of Artificial Intelligence, Applications of Artificial Intelligence, Intelligent
Agents, Structure of Intelligent Agents. Computer vision, Natural Language
Processing.
UNIT-II INTRODUCTION TO SEARCH: - Searching for solutions, uniformed
search strategies, informed search strategies, Local search algorithms and optimization
problems, Adversarial Search, Search for Games, Alpha - Beta pruning.
UNIT-III KNOWLEDGE REPRESENTATION & REASONING: -
Propositional logic, Theory of first order logic, Inference in First order logic,
Forward & Backward chaining, Resolution, Probabilistic reasoning, Utility
theory, Hidden Markov Models (HMM), Bayesian Networks.
Syllabus
UNIT-IV MACHINE LEARNING: - Supervised and unsupervised learning,
Decision trees, Statistical learning models, learning with complete data - Naive
Bayes models, Learning with hidden data – EM algorithm, Reinforcement
learning.

UNIT-V PATTERN RECOGNITION: - Introduction, Design principles of


pattern recognition system, Statistical Pattern recognition, Parameter estimation
methods - Principal Component Analysis (PCA) and Linear Discriminant Analysis
(LDA), Classification Techniques – Nearest Neighbor (NN) Rule, Bayes Classifier,
Support Vector Machine (SVM), K – means clustering.
UNIT-III
What is Knowledge Representation?

Knowledge Representation (KR) in AI is concerned with how knowledge about the world can be represented so that a computer system can use it.


It is responsible for representing information about the real world so that a computer
can understand it and can utilize this knowledge to solve complex real-world
problems, such as diagnosing a medical condition or communicating with humans in
natural language.
Knowledge Representation and Reasoning (KRR) represents information from
the real world for a computer to understand and then utilize this knowledge to solve
complex real-life problems like communicating with human beings in natural
language.
Types of Knowledge

There are 5 types of Knowledge such as:


• Declarative Knowledge
• Structural Knowledge
• Procedural Knowledge
• Meta Knowledge
• Heuristic Knowledge
Declarative Knowledge
• Declarative knowledge is knowledge about something, i.e. knowing that something is the case.
• It includes concepts, facts, and objects.
• It is also called descriptive knowledge and expressed in declarative sentences.
• It is simpler than procedural knowledge.

Procedural Knowledge
• It is also known as imperative knowledge.
• Procedural knowledge is a type of knowledge which is responsible for
knowing how to do something.
• It can be directly applied to any task.
• It includes rules, strategies, procedures, agendas, etc.
• Procedural knowledge depends on the task on which it can be applied.
Meta-knowledge
• Knowledge about the other types of knowledge is called Meta-knowledge.
Heuristic knowledge
• Heuristic knowledge represents the knowledge of experts in a field or
subject.
• Heuristic knowledge consists of rules of thumb based on previous experience and
awareness of approaches that tend to work well but are not guaranteed.
Structural knowledge
• Structural knowledge is knowledge that is basic to problem-solving.
• It describes relationships between various concepts such as kind of, part of,
and grouping of something.
• It describes the relationship that exists between concepts or objects.
Process Cycle of Knowledge Representation in AI:
Techniques of Knowledge Representation

There are mainly four ways of knowledge representation which are given as
follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
Logical Representation
Logical representation is a language with some concrete rules which deals with
propositions and has no ambiguity in representation. Logical representation means
drawing a conclusion based on various conditions. Each sentence can be translated
into logics using syntax and semantics.
Syntax: The syntax consists of the rules which decide how we can construct legal
sentences in the logic. It determines which symbols we can use in knowledge
representation and how to write them.
Semantics: Semantics are the rules by which we can interpret the sentence
in the logic. Semantic also involves assigning a meaning to each sentence.
Logical representation can be categorized into mainly two logics:
• Propositional Logics
• Predicate logics
Propositional Logics (PL) or Propositional Calculus
A proposition is a declarative sentence whose truth value is either
‘TRUE’ or ‘FALSE’.
For example:
• New Delhi is the capital of India
• Square root of 4 is 2
• 1 + 2 = 4, this is also a proposition whose truth value is false.
Examples that are not propositions:
• “No, thank you” → we cannot assign TRUE or FALSE to it.
• “Read this carefully” is not a proposition because it is not a declarative sentence.
• x + 1 = 2 is also not a proposition; here x is a variable, and different values of x
make the statement true or false.
Propositional Logics (PL) or Propositional Calculus
Syntax of Propositions:
Uppercase letters are used to denote the atomic symbols like P, Q, R and so on.
Logical connectives: ¬ (negation), ∧ (conjunction), ∨ (disjunction), →
(implication), ↔ (bi-conditional)
Semantics of propositions: The semantics of the propositional calculus defines the
meanings of its sentences. The truth assignments for sentences involving connectives
are defined by a truth table.
Atomic Propositions: In logic, a statement which cannot be broken down into
smaller statements, also simply called an “atom”. Small letters like p, q, r, s etc are
used to represent atomic propositions. The examples of atomic propositions are-
p : Sun rises in the east.
q : Sun sets in the west.
r : Apples are red.
s : Grapes are green.
Compound propositions are those propositions that are formed by combining one
or more atomic propositions using connectives. Capital letters like P, Q, R, S etc are
used to represent compound propositions. It is a proposition formed using the
logical operators (Negation(~), Conjunction(^), Disjunction(v) etc.) with the
existing propositions.
Examples-
P : Sun rises in the east and Sun sets in the west.
Q : Apples are red and Grapes are green.
Propositional Logics (PL) or Propositional Calculus
Semantics of propositions - Example: Consider the following propositions:
• P: It is cloudy
• Q: It is raining
Now,
• ¬P: It is not cloudy
• ¬Q: It is not raining
• P ∧ Q: It is cloudy and it is raining
• P ∨ Q: It is cloudy or it is raining
• P → Q: It is cloudy implies it is raining
• P ↔ Q: It is cloudy implies it is raining, and it is raining implies it is cloudy
Well-formed formulas (wff) in Propositional Logic
A well-formed formula consists of atomic symbols joined with connectives. Thus:
• If P is a propositional variable, then P is a wff.
• If P is a wff, then ¬P is a wff.
• If P and Q are wffs, then
P ∧ Q, P ∨ Q, P → Q, P ↔ Q are wffs.

Laws of Propositional Logic:


Properties of Propositional Logics (PL)
• Valid
• Satisfiable
• Unsatisfiable
• Logical Equivalence
• Logical Consequence
Propositional Logic: Properties
• A wff is valid if all interpretations satisfy it, i.e. it is TRUE for all values of its inputs.
Otherwise (i.e. if at least one interpretation does not satisfy it), it is invalid. A wff that is
TRUE under every interpretation is called a TAUTOLOGY. Example: P ∨ ¬P
• A wff is satisfiable if there is at least one interpretation that satisfies it, i.e. at least one
interpretation for which it is TRUE. Example: P ∧ Q
• A wff is unsatisfiable, or a CONTRADICTION, if there is no interpretation for which it
is TRUE. Example: P ∧ ¬P
• Two proposition forms are logically equivalent if, and only if, they have
identical truth values for each possible substitution of propositions for their
proposition variables
• A wff W (the conclusion) is a logical consequence of a set of wffs P (the premises) if
and only if for every interpretation in which all the premises are true, the conclusion
is also true
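These properties can be checked mechanically by enumerating truth assignments. A minimal Python sketch (illustrative only, not part of the syllabus material):

    from itertools import product

    def truth_values(formula, num_vars):
        # Evaluate the formula under every possible truth assignment.
        return [formula(*vals) for vals in product([True, False], repeat=num_vars)]

    # P ∨ ¬P is valid (a tautology): TRUE under every interpretation.
    print(all(truth_values(lambda p: p or not p, 1)))         # True

    # P ∧ Q is satisfiable: TRUE under at least one interpretation.
    print(any(truth_values(lambda p, q: p and q, 2)))         # True

    # P ∧ ¬P is unsatisfiable (a contradiction): TRUE under no interpretation.
    print(any(truth_values(lambda p: p and not p, 1)))        # False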
Propositional Logic: Inference Rules
• Addition Rule: It states that If P is true, then P∨Q will be true.
• Simplification: It states that if P∧Q is true, then Q or P will also be true
• Modus Ponens: It states that if P and P→Q is true, then we can infer that Q will be
true
• Modus Tollens: It states that if P→Q is true and ¬Q is true, then ¬P will also be true
• Disjunctive Syllogism: It states that if P∨Q is true, and ¬P is true, then Q will be
true
• Hypothetical Syllogism: It states that if P→Q is true and Q→R is true, then P→R is
true
• Conjunction: It states that if P is true and Q is true then P∧Q is true
Propositional Logic: Inference Rules
Example 1: Derive S from the following premises using a valid argument.
P→Q
Q → R
P ∨ S
¬R
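One way to complete the derivation, reading the premises above as P → Q, Q → R, P ∨ S and ¬R:
1. From Q → R and ¬R, infer ¬Q (Modus Tollens)
2. From P → Q and ¬Q, infer ¬P (Modus Tollens)
3. From P ∨ S and ¬P, infer S (Disjunctive Syllogism)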
Example 2: The crop is good, but there is not enough water. If there is a lot of rain or
not a lot of sun, then there is enough water. Therefore, the crop is good and there is a
lot of sun. (Use letters C, W, R, S.) Write the above as a propositional logic argument.
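A possible translation (C = the crop is good, W = there is enough water, R = there is a lot of rain, S = there is a lot of sun):
Premises: C ∧ ¬W, (R ∨ ¬S) → W; Conclusion: C ∧ S.
Sketch of the derivation: from C ∧ ¬W infer C and ¬W (Simplification); from (R ∨ ¬S) → W and ¬W infer ¬(R ∨ ¬S) (Modus Tollens), i.e. ¬R ∧ S (De Morgan); infer S (Simplification); infer C ∧ S (Conjunction).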
Example 3: Derive P from the following premises using a valid argument.
¬(P ∨ Q) → (R ∧ S)
¬(R ∧ S)
(P ∨ Q) → P
Example 4: Consider the argument:
• If you send me an email message, then I will finish the program
• If you do not send me an email message, then I will go to sleep early
• If I go to sleep early, then I will wake up feeling refreshed
• Therefore, if I do not finish writing the program, I will wake up feeling refreshed.
Step 1: Translate the argument into logic
Step 2: Prepare premises & conclusion
Step 3: Apply rules of inferences
Step 4: Derive the conclusion
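A possible solution for Example 4 (p = you send me an email message, q = I finish writing the program, r = I go to sleep early, s = I wake up feeling refreshed):
Premises: p → q, ¬p → r, r → s; Conclusion: ¬q → s.
1. From p → q, infer ¬q → ¬p (contrapositive)
2. From ¬q → ¬p and ¬p → r, infer ¬q → r (Hypothetical Syllogism)
3. From ¬q → r and r → s, infer ¬q → s (Hypothetical Syllogism)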
Conjunctive Normal Form (CNF)
A formula is in conjunctive normal form (CNF) or clausal normal form if it is a
conjunction of one or more clauses, where a clause is a disjunction of literals;
otherwise put, it is an AND of ORs. As a normal form, it is useful in automated
theorem proving. Also called Clausal Normal Form.
Steps to convert PL into CNF:
1. Remove ↔ using the rule:
α ↔ β ≡ (α → β) ∧ (β → α)
2. Remove → using the rule:
α → β ≡ ¬α ∨ β
3. Move ¬ (negation) inwards:
¬(α ∨ β) ≡ ¬α ∧ ¬β
¬(α ∧ β) ≡ ¬α ∨ ¬β
¬(¬α) ≡ α
4. Apply the distributive and/or commutative laws:
α ∨ (β ∧ γ) ≡ (α ∨ β) ∧ (α ∨ γ)
α ∧ (β ∨ γ) ≡ (α ∧ β) ∨ (α ∧ γ)
α ∨ β ≡ β ∨ α
α ∧ β ≡ β ∧ α
Example: Convert P ↔ (Q ∨ R) into CNF.
Propositional Logic: Limitations
Limitations of PL:
1. PL applies only to atomic propositions. There is no way to talk about properties
that apply to categories of objects, or about relationships between properties.
2. PL is too weak and has very limited expressiveness.
3. It is compositional, i.e. the meaning of P ∧ Q ∧ R is derived from the meanings of P, Q and R.
PL assumes the world contains only facts.
4. It is context-independent and cannot express general statements concisely. For example,
expressing “All humans are mortal” would require roughly 6.7 billion separate statements in PL,
one per person.
Forward Chaining & Backward Chaining
Forward chaining is also known as forward deduction or forward reasoning when
using an inference engine. Forward chaining is a form of reasoning which starts with
the atomic sentences in the knowledge base and applies inference rules
(Modus Ponens) in the forward direction to extract more data until a goal is reached.
Example: We have P, P → Q and Q → R. Infer R.
FC: P → Q → R
Backward chaining is also known as backward deduction or backward reasoning
when using an inference engine. A backward chaining algorithm is a form of
reasoning which starts with the goal and works backward, chaining through rules to
find known facts that support the goal.
BC: R ← Q ← P
Forward Chaining & Backward Chaining
Example: Consider the given rules as:
Rule 1: If A and C Then F
Rule 2: If A and E Then G
Rule 3: If B Then E
Rule 4: If G Then D
• Prove that if A and B are TRUE then D is TRUE by using forward chaining?
• Prove that if A and B are TRUE then D is TRUE by using backward chaining with
the same inference rules?
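A sketch of the two proofs (facts: A, B; goal: D):
Forward chaining: B and Rule 3 give E; A, E and Rule 2 give G; G and Rule 4 give D.
Backward chaining: to prove D, Rule 4 needs G; to prove G, Rule 2 needs A (a fact) and E; to prove E, Rule 3 needs B (a fact); so E, G and finally D hold.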
Forward Chaining vs. Backward Chaining
Predicate Logic or First Order Logic
First-order logic (FOL) does not only assume that the world contains facts like
propositional logic but also assumes the following things in the world:
• Objects: A, B, people, numbers, colours, wars, theories, squares and so on
• Relations: It can be a unary relation such as: red, round, is adjacent, or an n-ary
relation such as: the sister of, brother of, has colour, comes between and so on
• Function: Father of, best friend, third inning of, end of, and so on
Syntax of First-Order logic
Predicate Logic or First Order Logic
First-order logic statements can be divided into two parts:
• Subject: Subject is the main part of the statement.
• Predicate: A predicate can be defined as a relation, which binds two atoms
together in a statement.
Atomic sentences: Atomic sentences are the most basic sentences of first-order logic.
These sentences are formed from a predicate symbol followed by a parenthesis with a
sequence of terms. We can represent atomic sentences as
Predicate (term1, term2, ......, term n)
Example: Ravi and Ajay are brothers can be written in FOL as brothers (Ravi, Ajay).
Complex Sentences: Complex sentences are made by combining atomic sentences
using connectives.
Predicate Logic or First Order Logic
Quantifiers in First-order logic
A quantifier is a language element which generates quantification, and quantification
specifies the quantity of specimen in the universe of discourse. These are the symbols
that permit to determine or identify the range and scope of the variable in the logical
expression. There are two types of quantifier:
• Universal Quantifier ∀ (for all, everyone, everything)
• Existential Quantifier ∃ (for some, at least one)
Properties of Quantifiers:
• In universal quantifier, ∀x∀y is similar to ∀y∀x.
• In Existential quantifier, ∃x∃y is similar to ∃y∃x.
• ∃x∀y is not similar to ∀y∃x.
Predicate Logic or First Order Logic
Some Examples of converting the following into FOL using quantifier:
1. All birds fly.
2. Every man respects his parent.
3. Some boys play cricket.
4. Not all students like both Mathematics and Science
5. Only one student failed in Mathematics.
6. Every student loves some student
7. Every student loves some other student
Points to remember:
The main connective for universal quantifier ∀ is implication →.
The main connective for existential quantifier ∃ is conjunction ∧.
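Possible translations of the sentences above (the predicate names are illustrative):
1. ∀x bird(x) → fly(x)
2. ∀x man(x) → respects(x, parent(x))
3. ∃x boy(x) ∧ plays(x, cricket)
4. ¬∀x [student(x) → likes(x, Mathematics) ∧ likes(x, Science)]
5. ∃x [student(x) ∧ failed(x, Mathematics) ∧ ∀y (student(y) ∧ failed(y, Mathematics) → y = x)]
6. ∀x student(x) → ∃y (student(y) ∧ loves(x, y))
7. ∀x student(x) → ∃y (student(y) ∧ ¬(x = y) ∧ loves(x, y))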
Convert the following into FOL:
1. Marcus was a man.
2. Marcus was a Roman.
3. All men are people.
4. Caesar was a ruler.
5. All Romans were either loyal to Caesar or hated him (or both).
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they are not loyal to.
8. Marcus tried to assassinate Caesar.
1. man(marcus)
2. roman(marcus)
3. ∀ x man(x) → person(x)
4. ruler(caesar)
5. ∀ x roman(x) → loyal(x,caesar) ∨ hate(x,caesar)
6. ∀x ∃y loyal(x,y)
7. ∀x ∀y person(x) ∧ ruler(y) ∧ tryassasin(x,y) → ¬ loyal(x,y)
8. tryassasin(marcus,caesar)
Predicate Logic or First Order Logic
Inference Rules of FOL
1. Modus Ponens
2. Modus Tollens
3. Hypothetical Syllogism
4. Disjunctive Syllogism
5. Addition
6. Simplification
7. Conjunction
8. Resolution
Predicate Logic or First Order Logic
Inference Rules of FOL with quantifiers:
1. Universal Instantiation (∀x)P(x) then P(c) where c is some constant
2. Universal Generalization P(c) for an arbitrary c, then ∀ x P(x)
3. Existential Introduction P(c) then (∃x)P(x)
4. Existential Instantiation (∃x)P(x) then P(c)
5. Unification: Unification takes two similar sentences and computes the substitution
that makes them look the same, if it exists.
Example: Knows(John, x) and Knows(John, Jane) unify with the substitution {x/Jane}
Predicate Logic or First Order Logic
Conjunctive Normal Formula (CNF): First Order Logic conversion to CNF
1. Eliminate bi-conditionals and implications:
• Eliminate ↔, replacing α ↔ β with (α → β) ∧ (β → α).
• Eliminate →, replacing α → β with ¬α ∨ β.
2. Move ¬ inwards:
• ¬(∀x p) ≡ ∃x ¬p
• ¬(∃x p) ≡ ∀x ¬p
• ¬(α ∨ β) ≡ ¬α ∧ ¬β
• ¬(α ∧ β) ≡ ¬α ∨ ¬β
• ¬(¬α) ≡ α.
3. Standardize variables apart by renaming them: each quantifier should use a
different variable.
CONTINUED…
Predicate Logic or First Order Logic
4. Skolemize: each existential variable is replaced by a Skolem constant or Skolem
function of the enclosing universally quantified variables.
For instance, ∃x Rich(x) becomes Rich(G1) where G1 is a new Skolem
constant.
“Everyone has a heart” ∀ x Person(x) → ∃ y Heart(y) ∧ Has(x, y) becomes ∀ x
Person(x) → Heart(H(x)) ∧ Has(x, H(x)), where H is a new symbol (Skolem
function).
5. Drop universal quantifiers
For instance, ∀ x Person(x) becomes Person(x).
6. Distribute ∨ over ∧:
(α ∧ β) ∨ γ ≡ (α ∨ γ) ∧ (β ∨ γ).
Predicate Logic or First Order Logic
Conjunctive Normal Formula (CNF) - Example: Convert “Everybody who loves
animals is loved by someone” to CNF
FOL: ∀x [(∀y animal(y) → love(x, y)) → (∃y love(y, x))]
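A sketch of the conversion, following the steps above (F and G are Skolem functions):
1. Eliminate implications: ∀x ¬(∀y ¬animal(y) ∨ love(x, y)) ∨ (∃y love(y, x))
2. Move ¬ inwards: ∀x (∃y animal(y) ∧ ¬love(x, y)) ∨ (∃y love(y, x))
3. Standardize variables: ∀x (∃y animal(y) ∧ ¬love(x, y)) ∨ (∃z love(z, x))
4. Skolemize: ∀x (animal(F(x)) ∧ ¬love(x, F(x))) ∨ love(G(x), x)
5. Drop universal quantifiers: (animal(F(x)) ∧ ¬love(x, F(x))) ∨ love(G(x), x)
6. Distribute ∨ over ∧: (animal(F(x)) ∨ love(G(x), x)) ∧ (¬love(x, F(x)) ∨ love(G(x), x))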
Predicate Logic or First Order Logic

Example:
1. Convert the following sentence into predicate logic and then prove "Is someone
smiling?” using resolution:
• All people who are graduating are happy
• All happy people smile
• Someone is graduating
2. Anyone passing his history exams and winning the lottery is happy. But anyone
who studies or is lucky can pass all his exams. John did not study but John is
lucky. Anyone who is lucky wins the lottery. Is John happy?
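A sketch of the resolution proof for the first example (G1 is a Skolem constant):
KB in FOL: ∀x graduating(x) → happy(x); ∀x happy(x) → smile(x); ∃x graduating(x). Query: ∃x smile(x).
Clauses: ¬graduating(x) ∨ happy(x); ¬happy(y) ∨ smile(y); graduating(G1); negated goal ¬smile(z).
Resolution: graduating(G1) with the first clause {x/G1} gives happy(G1); with the second clause {y/G1} gives smile(G1); with ¬smile(z) {z/G1} gives the empty clause, so someone is smiling.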
Probabilistic reasoning in Artificial intelligence

Probabilistic reasoning is a way of knowledge representation where we apply the


concept of probability to indicate the uncertainty in knowledge. In probabilistic
reasoning, we combine probability theory with logic to handle the uncertainty.
Basic idea of probabilistic reasoning in AI: use probability theory to represent and
process uncertainty

In probabilistic reasoning, there are two ways to solve problems with uncertain
knowledge:
1. Bayes' rule
2. Bayesian Statistics
Probabilistic reasoning in Artificial intelligence
Conditional probability:
Conditional probability is the probability of an event occurring given that another event
has already happened. Considering event A when event B has already occurred, "the
probability of A under the condition B" can be written as:
P(A|B) = P(A ∧ B) / P(B)
where P(A ∧ B) = joint probability of A and B, and P(B) = marginal probability of B.

If the probability of A is given and we need to find the probability of B, then it will
be given as:
P(B|A) = P(A ∧ B) / P(A)
Probabilistic reasoning in Artificial intelligence
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning,
which determines the probability of an event with uncertain knowledge. In
probability theory, it relates the conditional probability and marginal probabilities of
two random events. It is a way to calculate the value of P(B|A) with the knowledge
of P(A|B).
From the product rule we can write: P(A ∧ B) = P(A|B) P(B)
Similarly, the probability of event B with known event A: P(A ∧ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get: P(A|B) P(B) = P(B|A) P(A), i.e.
P(A|B) = P(B|A) P(A) / P(B)
The above equation is called Bayes' rule or Bayes' theorem. This equation is the
basis of most modern AI systems for probabilistic inference.
Q.1: At a certain university, 4% of men are over 6 feet tall and 1% of women are
over 6 feet tall. The total student population is divided in the ratio 3:2 in favour of
women. If a student is selected at random from among all those over six feet tall,
what is the probability that the student is a woman? (Use Bayes' Theorem.)

Q.2: A factory production line is manufacturing bolts using three machines, A, B


and C. Of the total output, machine A is responsible for 25%, machine B for 35%
and machine C for the rest. It is known from previous experience with the
machines that 5% of the output from machine A is defective, 4% from machine B
and 2% from machine C. A bolt is chosen at random from the production line and
found to be defective. What is the probability that it came from (a) machine A (b)
machine B (c) machine C? [Hint: Use Bayes’ Theorem]
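A small Python check of both questions (a sketch; the numbers come from the problem statements above):

    # Q.1: P(woman | over 6 feet) by Bayes' theorem
    p_w, p_m = 0.6, 0.4                 # population split 3:2 in favour of women
    p_tall_w, p_tall_m = 0.01, 0.04     # P(over 6 ft | woman), P(over 6 ft | man)
    p_tall = p_tall_w * p_w + p_tall_m * p_m          # total probability = 0.022
    print(p_tall_w * p_w / p_tall)                    # ≈ 0.273

    # Q.2: P(machine | defective) for machines A, B, C
    prior = {"A": 0.25, "B": 0.35, "C": 0.40}
    p_def = {"A": 0.05, "B": 0.04, "C": 0.02}
    evidence = sum(prior[m] * p_def[m] for m in prior)   # P(defective) = 0.0345
    for m in prior:
        print(m, prior[m] * p_def[m] / evidence)         # A ≈ 0.362, B ≈ 0.406, C ≈ 0.232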
Probabilistic reasoning in Artificial intelligence
Utility theory
• The main idea of Utility Theory is: an agent's preferences over possible outcomes
can be captured by a function that maps these outcomes to a real number; the
higher the number the more that agent likes that outcome. The function is called a
utility function.
• Utility Theory uses the notion of Expected Utility (EU) as a value that represents
the average utility of all possible outcomes of a state, weighted by the probability
that the outcome occurs.
• The agent can use probability theory to reason about uncertainty. The agent can
use utility theory for rational selection of actions based on preferences. Decision
theory is a general theory for combining probability with rational decisions
Decision theory = Probability theory + Utility theory
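Formally, for an action a with possible outcome states s, the expected utility can be written as EU(a) = Σs P(s | a) · U(s), and a rational agent chooses the action with the maximum expected utility (the MEU principle).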
Hidden Markov Model (HMM)
HMM is a statistical Markov Model in which the system being modelled is assumed
to be a markov process with unobserved (hidden) states.
In probability theory, a Markov model or Markov chain is a stochastic model used to
model randomly changing systems. It is assumed that future states depend only on the
current state, not on the events that occurred before it; this is called the Markov
property.
• If system state is fully observable → Markov Chain or Markov Model
• If system state is partially observable → Hidden Markov Model (HMM)
Hidden Markov Model (HMM)
Definition of Markov Model: Let S = (s1, s2, s3, …, sn) be a set of states, and let the
process move from one state to another generating a sequence of states si1, si2, si3,
…, sik. The Markov chain property states that the probability of each subsequent
state depends only on the previous state, i.e.,
P(sik | si1, si2, si3, …, sik-1) = P(sik | sik-1)
For example: If a person is feeling healthy today, then the probability is high
that he will also be healthy tomorrow. The probability of an individual's health on
day four can be represented as:
P(H4) = P(H4|H3).P(H3) + P(H4|I3).P(I3)
=P(H4|H3).(P(H3|H2).P(H2)+P(H3|I2).P(I2))+P(H4|I3).(P(I3|H2).P(H2)+P(I3|I2).P(
I2))
= … The expansion continues in the same manner; such an expression represents a
chain of dependencies, which is called a Markov chain.
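A short Python sketch of this kind of chain calculation; the initial and transition probabilities below are assumed for illustration only (they are not given in the notes):

    # Two-state (Healthy / Ill) Markov chain with assumed numbers.
    p = {"H": 0.8, "I": 0.2}                      # day-1 distribution (assumed)
    trans = {"H": {"H": 0.7, "I": 0.3},           # P(tomorrow | today), assumed
             "I": {"H": 0.4, "I": 0.6}}

    for day in range(2, 5):                       # propagate the distribution to day 4
        p = {s: sum(p[prev] * trans[prev][s] for prev in p) for s in p}
    print(p["H"])                                 # P(H4) under the assumed numbers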
Ria is an imaginary friend and during the day she does either of these four things:
• Painting
• Cleaning the house
• Biking
• Shopping for groceries

This is a list of what are called observables. From this observation sequence we
want to know whether the day has been sunny or rainy. These two are going to be
our hidden states.

Graph of our potential HMM on next slide


To start off, a Hidden Markov Model consists of the following properties:
• Hidden States S: in the example above the hidden states are Sunny and Rainy, and
they get grouped into a set S.
• Observables O: Paint, Clean, Shop and Bike. They get grouped into a set O.
• Initial Probabilities 𝜋: a matrix of the initial likelihood of the state at time t = 0. In
this case the likelihood that it is Sunny on the first day is 0.6, while the likelihood
that it is Rainy is 0.4.
π = [0.6, 0.4]
Note: every row of the following matrices must add up to 1 since they represent a
probability.
• Transition Probabilities A: a matrix that represents the probability of transitioning
to another state given the current state. For example, if the current state is Sunny the
probability that the day after is Sunny as well is 0.8, whereas the probability that the
day after is Rainy is 0.2. Similarly, if today is Rainy, the probability that tomorrow
is Rainy as well is 0.6, while the probability that tomorrow is Sunny is 0.4.
Transition probabilities between states
Emission Probabilities B: a matrix that represents the probability of seeing a
specific observable given a hidden state. For example, the probability of Clean
on a Sunny day is 0.1, whereas the probability of Clean on a Rainy day is 0.45.
In more mathematical terms, the model is fully specified by the initial distribution π, the transition matrix A and the emission matrix B, often written λ = (A, B, π).
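For instance, the model above could be written down in Python as follows; only P(Clean | Sunny) = 0.1 and P(Clean | Rainy) = 0.45 are stated above, so the remaining emission entries are assumed for illustration:

    import numpy as np

    states = ["Sunny", "Rainy"]
    observables = ["Paint", "Clean", "Shop", "Bike"]

    pi = np.array([0.6, 0.4])              # initial probabilities (given above)
    A = np.array([[0.8, 0.2],              # transitions: Sunny -> Sunny/Rainy
                  [0.4, 0.6]])             #              Rainy -> Sunny/Rainy
    B = np.array([[0.4, 0.1, 0.2, 0.3],    # emissions from Sunny (only Clean = 0.1 given)
                  [0.3, 0.45, 0.2, 0.05]]) # emissions from Rainy (only Clean = 0.45 given)

    # Probability of observing "Clean" on day 1, summed over the hidden states.
    clean = observables.index("Clean")
    print(pi @ B[:, clean])                # 0.6*0.1 + 0.4*0.45 = 0.24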
Hidden Markov Model (HMM)
Example 1: Consider the following initial probabilities:
• P(R) = 0.7
• P(S) = 0.3
Transition Probability:
• R → R = 0.4
• S → S = 0.7
• R → S = 0.6
• S → R = 0.3
What is the probability that it rains on Day 2?
What is the probability that it rains on Day 4?
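A worked check of Example 1 (a sketch):

    import numpy as np

    p1 = np.array([0.7, 0.3])       # [P(Rain), P(Sunny)] on day 1
    T = np.array([[0.4, 0.6],       # Rain  -> Rain/Sunny
                  [0.3, 0.7]])      # Sunny -> Rain/Sunny

    print((p1 @ T)[0])                                   # P(rain on day 2) = 0.37
    print((p1 @ np.linalg.matrix_power(T, 3))[0])        # P(rain on day 4) ≈ 0.334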
Hidden Markov Model (HMM)
A hidden Markov model (HMM) is a Markov chain for which the state is only
partially observable. In other words, observations are related to the state of the
system, but they are typically insufficient to precisely determine the state.
An HMM is specified by the following components:
• Transition Probability: probability of transition of one state to another
• Emission Probability: probability of observed variable at a particular time
given the state of the hidden variable at that time
• Initial Probability: probability of starting the observation sequence.
Hidden Markov Model (HMM)
Let’s assume that we observe only three different kinds of weather, namely sunny,
rainy or foggy weather. We will now use a Markov model to model the weather. The
Markov model can be built using three states, given by
{S = sunny, R = rainy, F = foggy}
Given that today is sunny, what's the probability that tomorrow is sunny and the day
after is rainy?
Transition probability is given as:

Assume the weather yesterday was rainy and today it is foggy. What is the probability that
it will rain tomorrow?
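The transition table itself was given as a figure; with illustrative values such as P(sunny → sunny) = 0.8, P(sunny → rainy) = 0.05 and P(foggy → rainy) = 0.3, the answers would be: P(tomorrow sunny, day after rainy | today sunny) = 0.8 × 0.05 = 0.04, and, because of the Markov property, yesterday's rain is irrelevant, so P(rain tomorrow | foggy today) = P(foggy → rainy) = 0.3.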
Bayesian Network
Bayesian networks are a type of Probabilistic Graphical Model that can be used to
build models from data and/or expert opinion. They can be used for a wide range of
tasks including prediction, anomaly detection, diagnostics, reasoning, time series
prediction and decision making under uncertainty. They are also commonly referred
to as Bayes nets, Belief networks and sometimes Causal networks. Let {X1, X2, …
Xn} be some events. In Bayesian Network, they can be represented as nodes. Now
if a node has some dependency on another node then an arrow/arc is drawn from
one node to another as shown below:
It is interpreted as the child node’s
occurrence is influenced by the occurrence
of its parent node. So Bayesian Network
represents a directed acyclic graph (DAG).
Bayesian Network
A Bayesian network is a DAG:
• Each node corresponds to a random variable
• Directed edges link nodes
• Edge goes from parent to child
• Each node has a conditional probability dist. P(Xi | Parents(Xi) )
• P(node|parent(node))
• Topology of network represents causality
Example: The Alarm Problem (Pearl, 1990): You have a new burglar alarm
installed. It is fairly reliable at detecting a burglary, but it also responds to minor
earthquakes. Two neighbours (John, Mary) promise to call you at work when they
hear the alarm. John always calls when he hears the alarm, but sometimes confuses
the alarm with the phone ringing (and calls then also). Mary likes loud music and
sometimes misses the alarm. Given evidence about who has and has not called,
estimate the probability of a burglary.
Bayesian Network
Solution: Represent problem using 5 binary variables:
• B = a burglary occurs at your house
• E = an earthquake occurs at your house
• A = the alarm goes off
• J = John calls to report the alarm
• M = Mary calls to report the alarm
Constructing the Bayesian Network:
1. Order the variables in terms of causality (may be a partial order) e.g., {E, B} →
{A} → {J, M}
2. Use these assumptions to create the graph structure of the Bayesian network
(Above):
3. Fill in the Conditional Probability Tables (CPTs):
a. One for each node
b. 2^p entries, where p is the number of parents
Note: these probabilities come from expert knowledge or from data (relative frequency estimates).
Bayesian Network

A Bayesian network is a directed acyclic graph (DAG) with the following


components:
• A set of nodes, one for each random variable of the “world” represented
• A set of directed arcs connecting nodes
• If there is an arc connecting X with Y, we say that X is a parent of Y
(parents(X) denotes the set of parent variables of X)
• From the concept of parent variable, we define also the concepts of ancestors
and descendants of a random variable.
• Each node Xi has an associated conditional probability distribution (CPT) P(Xi |
parents(Xi))
• If Xi is Boolean, we usually omit the probability of the false value.
Bayesian Network - Example
Q1: What is the probability that the alarm has sounded, but neither a burglary nor an
earthquake has occurred, and both John and Mary call the police?

Q2: Formulate the probability that John calls.


Q3: Formulate the probability of a burglary knowing that both John and Mary call the
police.
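A sketch of Q1 in Python; the CPT values below are the commonly used textbook figures (the actual table in these notes was given as a figure), so treat them as assumptions:

    # Assumed CPT values (standard textbook numbers for the alarm network).
    P_B, P_E = 0.001, 0.002
    P_A = {(True, True): 0.95, (True, False): 0.94,
           (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
    P_J = {True: 0.90, False: 0.05}                      # P(J=true | A)
    P_M = {True: 0.70, False: 0.01}                      # P(M=true | A)

    # Q1: P(J, M, A, ¬B, ¬E) = P(J|A) P(M|A) P(A|¬B,¬E) P(¬B) P(¬E)
    q1 = P_J[True] * P_M[True] * P_A[(False, False)] * (1 - P_B) * (1 - P_E)
    print(q1)                                            # ≈ 0.00063

    # Q2 and Q3 follow by summing such joint terms over the unobserved variables.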
Bayesian Network

Advantages:
• A Bayesian network is a complete and non-redundant representation of the
domain
• Each subcomponent interacts directly with only a bounded number of other
components, regardless of the total number of components
• Local structure is usually associated with linear rather than exponential growth
in complexity
• In the domain of a Bayesian network, if each of the n random variables is
influenced by at most k others, then specifying each conditional probability
table will require at most 2^k numbers
Bayesian Network

Disadvantages:
• Bayesian networks tend to perform poorly on high-dimensional data
• It may be wiser to leave out very weak dependencies from the network in
order to restrict the complexity, but this yields a lower accuracy
• Only useful when prior knowledge is reliable
• Adding nodes in the false order makes the network unnecessarily complex and
unintuitive
End of UNIT-III
