Bayesian Learning Introduction Bayes08 PDF
Outline Motivation: Information Processing Introduction Bayesian Network Classifiers k-Dependence Bayesian Classifiers Links a
Introduction
Given:
a set of micro-array experiments, each done with mRNA from a different patient (same cell type from every patient).
Each patient's expression values for each gene constitute the features, and
the patient's disease constitutes the class.
Person    A28202 ac   AB00014 at   AB00015 at   ...
Person1   1144.0      321.0        2567.2       ...
Person2   105.2       586.1        759.2        ...
Person3   586.3       559.0        3210.2       ...
Person4   42.8        692.0        812.2        ...
Learning Problem:
Find groups of genes particularly active for groups of persons.
Person    A28202 ac   AB00014 at   AB00015 at   ...   Class
Person1   1144.0      321.0        2567.2       ...   normal
Person2   105.2       586.1        759.2        ...   cancer
Person3   586.3       559.0        3210.2       ...   normal
Person4   42.8        692.0        812.2        ...   cancer
Learning Problems:
Find a function: Class = f(A28202 ac, AB00014 at, AB00015 at, ...)
Given the expression levels of a person's genes, predict whether that person has cancer.
DNA:  T G C A G C T C C G G A C T C C A T
mRNA: A C G U C G A G G C C U G A G G U A
Exons: sequences of nucleotides that are expressed (translated to proteins).
Introns: intercalated non-coding sequences, removed by splicing before translation.
Splice-junctions: boundaries between an exon and an intron.
Donors: exon-intron boundary.
Acceptors: intron-exon boundary.
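The DNA/mRNA pair above follows the base-pairing rule T→A, A→U, G→C, C→G. A minimal sketch of that mapping, assuming the DNA string is the template strand read in the same direction as the mRNA (the function name `transcribe` is illustrative, not from the slides):

```python
# Base pairing between a DNA template strand and the resulting mRNA.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(dna: str) -> str:
    """Return the mRNA complementary to a DNA template strand."""
    bases = dna.replace(" ", "").upper()
    return "".join(DNA_TO_MRNA[base] for base in bases)


dna = "T G C A G C T C C G G A C T C C A T"
mrna = transcribe(dna)
print(" ".join(mrna))  # A C G U C G A G G C C U G A G G U A
```

The output reproduces the mRNA line shown on the slide.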
Figure: a Bayesian network with the following nodes:
PAP, SHUNT, MINVOLSET, KINKEDTUBE, INTUBATION, VENTMACH, VENTLUNG, DISCONNECT,
VENTTUBE, PRESS, MINVOL, ANAPHYLAXIS, SAO2, TPR, HYPOVOLEMIA, LVEDVOLUME, CVP,
PCWP, LVFAILURE, STROKEVOLUME, FIO2, VENTALV, PVSAT, ARTCO2, EXPCO2,
INSUFFANESTH, CATECHOL, HISTORY, ERRLOWOUTPUT, CO, HR, HREKG, ERRCAUTER,
HRSAT, HRBP, BP.
A Bayesian net:
Diagnosis: observing a positive X-ray and propagating this evidence.
Bayes Theorem
What is the most probable hypothesis h, given training data D?
A method to calculate the probability of a hypothesis based on
its prior probability,
the probability of observing the data given the hypothesis,
the data itself.
\[ P(h \mid D) = \frac{P(h)\,P(D \mid h)}{P(D)} \]
Illustrative Example
Prior Probabilities:
P(Win) = 1/2
P(Lose) = 1/2
P(red) = 3/7
P(black) = 4/7
P(black|Win) = 1/2
P(red|Win) = 1/2
After seeing the bead:
If bead = black:
\[ P(Win \mid black) = \frac{P(Win)\,P(black \mid Win)}{P(black)} = \frac{1/2 \cdot 1/2}{4/7} = 0.4375 \]
If bead = red:
\[ P(Win \mid red) = \frac{P(Win)\,P(red \mid Win)}{P(red)} = \frac{1/2 \cdot 1/2}{3/7} = 0.583 \]
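The two posteriors can be checked with a few lines of exact arithmetic. This is a small sketch that applies Bayes' theorem to the probabilities listed for the bead example; the helper name `posterior` is an illustrative choice.

```python
from fractions import Fraction


def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(h|D) = P(h) * P(D|h) / P(D)."""
    return prior * likelihood / evidence


p_win = Fraction(1, 2)
p_black = Fraction(4, 7)
p_red = Fraction(3, 7)
p_black_given_win = Fraction(1, 2)
p_red_given_win = Fraction(1, 2)

p_win_given_black = posterior(p_win, p_black_given_win, p_black)
p_win_given_red = posterior(p_win, p_red_given_win, p_red)

print(p_win_given_black, float(p_win_given_black))  # 7/16 0.4375
print(p_win_given_red, round(float(p_win_given_red), 3))  # 7/12 0.583
```

Seeing a black bead lowers the probability of winning below the prior of 1/2, and seeing a red bead raises it.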
Bayesians ...
An Illustrative Problem:
A patient takes a lab test and the result comes back positive. The
test returns a correct positive result in only 75% of the cases in
which the disease is actually present, and a correct negative result
in only 96% of the cases in which the disease is not present.
Furthermore, 8% of the entire population have this disease.
How can this information be represented?
Representation:
It is useful to represent this information in a graph.
The graphical information is qualitative.
The nodes represent variables.
Arcs specify the (in)dependence between variables.
Directed arcs represent influence between variables.
The direction of the arc tells us that the value of the variable disease influences the value of the variable test.
Inference:
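For the lab-test problem, inference means computing P(disease | test = positive) from the quantities attached to the two-node network disease → test. The following is a sketch under the figures stated in the problem (8% prevalence, 75% correct positives, 96% correct negatives); the variable names are illustrative.

```python
# Quantitative part of the network disease -> test.
p_disease = 0.08             # P(disease)
p_pos_given_disease = 0.75   # P(test = + | disease)
p_neg_given_healthy = 0.96   # P(test = - | no disease)
p_pos_given_healthy = 1 - p_neg_given_healthy

# Total probability of a positive test result.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: posterior probability of disease given a positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(round(p_pos, 4))                # 0.0968
print(round(p_disease_given_pos, 4))  # 0.6198
```

With these numbers a positive result raises the probability of disease from 8% to about 62%.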
\[ y_{MAP} = \arg\max_{y_j \in Y} P(y_j \mid \vec{x}) \]
Naive Bayes
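In log form, the score of class y_i is \( \ln P(y_i) + \sum_j \ln P(x_j \mid y_i) \). For two classes:
\[ \ln\frac{P(y_+ \mid \vec{x})}{P(y_- \mid \vec{x})} = \ln\frac{P(y_+)}{P(y_-)} + \sum_j \ln\frac{P(x_j \mid y_+)}{P(x_j \mid y_-)} \]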
\[ \arg\max_{y_j \in Y} P(y_j) \prod_{x_i \in X} P(x_i \mid y_j) \]
Weather    Temperature   Humidity   Wind   Play
Rainy      71            91         Yes    No
Sunny      69            70         No     Yes
Sunny      80            90         Yes    No
Overcast   83            86         No     Yes
Rainy      70            96         No     Yes
Rainy      65            70         Yes    No
Overcast   64            65         Yes    Yes
Overcast   72            90         Yes    Yes
Sunny      75            70         Yes    Yes
Rainy      68            80         No     Yes
Overcast   81            75         No     Yes
Sunny      85            85         No     No
Sunny      72            95         No     No
Rainy      75            80         No     Yes
Two representations:
Nr. examples: 14
Play = 'Yes': 9
Play = 'No': 5

Weather       Yes    No
Sunny         2      3
Overcast      4      0
Rainy         3      2

Temperature   Yes    No
Mean          73     74.6
SD            6.2    7.9

Humidity      Yes    No
Mean          79.1   86.2
SD            10.2   9.7

Wind          Yes    No
False         6      2
True          3      3
Nr. examples: 14
P(Play = 'Yes') = 9/14
P(Play = 'No') = 5/14

Weather       Yes    No
Sunny         2/9    3/5
Overcast      4/9    0/5
Rainy         3/9    2/5

Temperature   Yes    No
Mean          73     74.6
SD            6.2    7.9

Humidity      Yes    No
Mean          79.1   86.2
SD            10.2   9.7

Wind          Yes    No
False         6/9    2/5
True          3/9    3/5
Weather    Temperature   Humidity   Wind   Play
Sunny      66            90         Yes    ?
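The query above can be answered with a naive Bayes classifier built from the 14 training examples listed earlier. The sketch below rests on two assumptions not stated on the slides: nominal attributes use relative frequencies without smoothing, and numeric attributes use a Gaussian density with the per-class mean and sample standard deviation. The function names are illustrative.

```python
from math import exp, pi, sqrt
from statistics import mean, stdev

# (Weather, Temperature, Humidity, Wind, Play) for the 14 training examples.
DATA = [
    ("Rainy", 71, 91, "Yes", "No"), ("Sunny", 69, 70, "No", "Yes"),
    ("Sunny", 80, 90, "Yes", "No"), ("Overcast", 83, 86, "No", "Yes"),
    ("Rainy", 70, 96, "No", "Yes"), ("Rainy", 65, 70, "Yes", "No"),
    ("Overcast", 64, 65, "Yes", "Yes"), ("Overcast", 72, 90, "Yes", "Yes"),
    ("Sunny", 75, 70, "Yes", "Yes"), ("Rainy", 68, 80, "No", "Yes"),
    ("Overcast", 81, 75, "No", "Yes"), ("Sunny", 85, 85, "No", "No"),
    ("Sunny", 72, 95, "No", "No"), ("Rainy", 75, 80, "No", "Yes"),
]


def gaussian(x, mu, sigma):
    """Normal density, used as P(x | class) for numeric attributes."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)


def class_scores(weather, temperature, humidity, wind):
    """Unnormalised P(y) * prod_i P(x_i | y) for each class y."""
    scores = {}
    for label in ("Yes", "No"):
        rows = [r for r in DATA if r[4] == label]
        n = len(rows)
        score = n / len(DATA)                              # prior P(y)
        score *= sum(r[0] == weather for r in rows) / n    # P(Weather | y)
        score *= sum(r[3] == wind for r in rows) / n       # P(Wind | y)
        temps = [r[1] for r in rows]
        hums = [r[2] for r in rows]
        score *= gaussian(temperature, mean(temps), stdev(temps))
        score *= gaussian(humidity, mean(hums), stdev(hums))
        scores[label] = score
    return scores


scores = class_scores("Sunny", 66, 90, "Yes")
total = sum(scores.values())
posterior = {label: s / total for label, s in scores.items()}
prediction = max(posterior, key=posterior.get)
print(prediction, {k: round(v, 3) for k, v in posterior.items()})
```

Under these assumptions the prediction is Play = No, with a posterior of roughly 0.79. Note that without smoothing the zero count for Overcast among the No examples would force P(No | Overcast, ...) to zero.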
Discretization
\[ \arg\max_{y_j \in Y} P(y_j) \prod_i \ldots \]
Successful Stories
Mutual Information
Uncertainty of a random variable (entropy):
\[ H(X) = -\sum_{x} P(x) \log_2 P(x) \]
Uncertainty about X after knowing Y (conditional entropy):
\[ H(X \mid Y) = -\sum_{y} P(y) \sum_{x} P(x \mid y) \log_2 P(x \mid y) \]
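Entropy and conditional entropy combine into the mutual information I(X; Y) = H(X) - H(X|Y), the reduction in uncertainty about X after observing Y. A small sketch that estimates these quantities from relative frequencies, using the Weather and Play columns of the 14 training examples (the function names are illustrative):

```python
from collections import Counter
from math import log2


def entropy(xs):
    """H(X) = -sum_x P(x) log2 P(x), with P estimated by relative frequency."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def conditional_entropy(xs, ys):
    """H(X|Y) = sum_y P(y) * H(X | Y = y)."""
    n = len(xs)
    total = 0.0
    for y, count in Counter(ys).items():
        subset = [x for x, yy in zip(xs, ys) if yy == y]
        total += (count / n) * entropy(subset)
    return total


def mutual_information(xs, ys):
    """I(X;Y) = H(X) - H(X|Y)."""
    return entropy(xs) - conditional_entropy(xs, ys)


weather = ["Rainy", "Sunny", "Sunny", "Overcast", "Rainy", "Rainy", "Overcast",
           "Overcast", "Sunny", "Rainy", "Overcast", "Sunny", "Sunny", "Rainy"]
play = ["No", "Yes", "No", "Yes", "Yes", "No", "Yes",
        "Yes", "Yes", "Yes", "Yes", "No", "No", "Yes"]

print(round(entropy(play), 3))                      # 0.94
print(round(conditional_entropy(play, weather), 3)) # 0.694
print(round(mutual_information(play, weather), 3))  # 0.247
```

Knowing Weather removes about 0.247 of the 0.940 bits of uncertainty about Play.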
Example: TAN
Tree augmented naive Bayes
Compute the mutual information between all pairs of variables given the class:
\[ I(X, Y \mid C) = \sum_{c} P(c)\, I(X, Y \mid C = c) \]
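The class-conditional mutual information can be estimated from counts by computing I(X, Y) within each class and weighting by P(c). The sketch below does this for the Weather and Wind attributes of the 14 training examples given Play. It covers only this first step. In TAN these values typically serve as edge weights for a maximum-weight spanning tree over the attributes, which is not shown here. The function names are illustrative.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """I(X;Y) = sum_{x,y} P(x,y) log2( P(x,y) / (P(x) P(y)) )."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def conditional_mutual_information(xs, ys, cs):
    """I(X;Y|C) = sum_c P(c) * I(X;Y | C = c)."""
    n = len(cs)
    total = 0.0
    for c, count in Counter(cs).items():
        sub_x = [x for x, cc in zip(xs, cs) if cc == c]
        sub_y = [y for y, cc in zip(ys, cs) if cc == c]
        total += (count / n) * mutual_information(sub_x, sub_y)
    return total


weather = ["Rainy", "Sunny", "Sunny", "Overcast", "Rainy", "Rainy", "Overcast",
           "Overcast", "Sunny", "Rainy", "Overcast", "Sunny", "Sunny", "Rainy"]
wind = ["Yes", "No", "Yes", "No", "No", "Yes", "Yes",
        "Yes", "Yes", "No", "No", "No", "No", "No"]
play = ["No", "Yes", "No", "Yes", "Yes", "No", "Yes",
        "Yes", "Yes", "Yes", "Yes", "No", "No", "Yes"]

cmi = conditional_mutual_information(weather, wind, play)
print(round(cmi, 3))
```

The measure is symmetric in X and Y, so each pair of attributes needs to be evaluated only once.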
Weka: TAN
Software Available
R (package e1071)
Weka (naive Bayes, TAN models, k-dependence Bayesian classifiers)
Genie: Bayesian Networks, Influence Diagrams, Dynamic Bayesian Networks
(http://genie.sis.pitt.edu/)
Hugin (http://www.hugin.com/)
Elvira (http://www.ia.uned.es/~elvira)
Kevin Murphy's MATLAB toolbox: supports dynamic BNs, decision networks,
many exact and approximate inference algorithms, parameter estimation, and
structure learning
(http://www.ai.mit.edu/~murphyk/Software/BNT/bnt.html)
Free Windows software for creation, assessment and evaluation of belief
networks
(http://www.research.microsoft.com/dtas/msbn/default.htm)
Open source package for the technical computing language R, developed by
Aalborg University
(http://www.math.auc.dk/novo/deal)
http://www.snn.ru.nl/nijmegen