Bayesian Networks
Philipp Koehn
6 April 2017
Outline 1
● Bayesian Networks
● Parameterized distributions
● Exact inference
● Approximate inference
bayesian networks
Bayesian Networks 3
● A simple, graphical notation for conditional independence assertions
and hence for compact specification of full joint distributions
● Syntax
– a set of nodes, one per variable
– a directed, acyclic graph (link ≈ “directly influences”)
– a conditional distribution for each node given its parents:
P(Xi ∣ Parents(Xi))
● In the simplest case, conditional distribution represented as
a conditional probability table (CPT) giving the
distribution over Xi for each combination of parent values
Example 4
● Topology of network encodes conditional independence assertions:
● Weather is independent of the other variables
● Toothache and Catch are conditionally independent given Cavity
Example 5
● I’m at work, neighbor John calls to say my alarm is ringing, but neighbor Mary
doesn’t call. Sometimes it’s set off by minor earthquakes.
Is there a burglar?
● Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
● Network topology reflects “causal” knowledge
– A burglar can set the alarm off
– An earthquake can set the alarm off
– The alarm can cause Mary to call
– The alarm can cause John to call
Example 6
[Figure: the burglary network topology with its conditional probability tables]
Compactness 7
● A conditional probability table for Boolean Xi with k Boolean parents has 2^k
rows for the combinations of parent values
● Each row requires one number p for Xi = true
(the number for Xi = false is just 1 − p)
● If each variable has no more than k parents,
the complete network requires O(n ⋅ 2^k) numbers
● I.e., grows linearly with n, vs. O(2^n) for the full joint distribution
● For burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 − 1 = 31)
Global Semantics 8
● Global semantics defines the full joint distribution as the product of the local
conditional distributions:
P(x1, . . . , xn) = ∏_{i=1}^{n} P(xi ∣ parents(Xi))
● E.g., P (j ∧ m ∧ a ∧ ¬b ∧ ¬e)
= P (j∣a)P (m∣a)P (a∣¬b, ¬e)P (¬b)P (¬e)
= 0.9 × 0.7 × 0.001 × 0.999 × 0.998
≈ 0.00063
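● A minimal Python sketch of this product (an illustration only; the CPT values are the standard textbook numbers for this network, consistent with the 0.9, 0.7, 0.001, 0.999, 0.998 used above, since the figure itself is not reproduced here):

# Joint probability of one full assignment via the global semantics:
# P(x1,...,xn) = product over i of P(xi | parents(Xi)).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}   # P(JohnCalls = True | Alarm)
P_M = {True: 0.70, False: 0.01}   # P(MaryCalls = True | Alarm)

def p(prob_true, value):
    """Probability of a Boolean value, given P(var = True)."""
    return prob_true if value else 1.0 - prob_true

def joint(b, e, a, j, m):
    return (p(P_B, b) * p(P_E, e) * p(P_A[(b, e)], a)
            * p(P_J[a], j) * p(P_M[a], m))

print(joint(b=False, e=False, a=True, j=True, m=True))  # ≈ 0.00063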
Local Semantics 9
● Local semantics: each node is conditionally independent
of its nondescendants given its parents
● Theorem: Local semantics ⇔ global semantics
Markov Blanket 10
● Each node is conditionally independent of all others given its
Markov blanket: parents + children + children’s parents
Constructing Bayesian Networks 11
● Need a method such that a series of locally testable assertions of
conditional independence guarantees the required global semantics
1. Choose an ordering of variables X1, . . . , Xn
2. For i = 1 to n
add Xi to the network
select parents from X1, . . . , Xi−1 such that
P(Xi ∣ Parents(Xi)) = P(Xi ∣ X1, . . . , Xi−1)
● This choice of parents guarantees the global semantics:
P(X1, . . . , Xn) = ∏_{i=1}^{n} P(Xi ∣ X1, . . . , Xi−1) (chain rule)
= ∏_{i=1}^{n} P(Xi ∣ Parents(Xi)) (by construction)
Example 12–16
● Suppose we choose the ordering M, J, A, B, E
● P (J∣M ) = P (J)? No
● P (A∣J, M ) = P (A∣J)? P (A∣J, M ) = P (A)? No
● P (B∣A, J, M ) = P (B∣A)? Yes
● P (B∣A, J, M ) = P (B)? No
● P (E∣B, A, J, M ) = P (E∣A)? No
● P (E∣B, A, J, M ) = P (E∣A, B)? Yes
Example 17
● Deciding conditional independence is hard in noncausal directions
● (Causal models and conditional independence seem hardwired for humans!)
● Assessing conditional probabilities is hard in noncausal directions
● Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed
Example: Car Diagnosis 18
● Initial evidence: car won’t start
● Testable variables (green), “broken, so fix it” variables (orange)
● Hidden variables (gray) ensure sparse structure, reduce parameters
Example: Car Insurance 19
Compact Conditional Distributions 20
● CPT grows exponentially with number of parents
CPT becomes infinite with continuous-valued parent or child
● Solution: canonical distributions that are defined compactly
● Deterministic nodes are the simplest case:
X = f (P arents(X)) for some function f
● E.g., Boolean functions
NorthAmerican ⇔ Canadian ∨ US ∨ Mexican
● E.g., numerical relationships among continuous variables
∂Level/∂t = inflow + precipitation − outflow − evaporation
Compact Conditional Distributions 21
● Noisy-OR distributions model multiple noninteracting causes
– parents U1 . . . Uk include all causes (can add leak node)
– independent failure probability qi for each cause alone
⟹ P(X ∣ U1 . . . Uj, ¬Uj+1 . . . ¬Uk) = 1 − ∏_{i=1}^{j} qi
Cold  Flu  Malaria   P(Fever)   P(¬Fever)
F     F    F         0.0        1.0
F     F    T         0.9        0.1
F     T    F         0.8        0.2
F     T    T         0.98       0.02 = 0.2 × 0.1
T     F    F         0.4        0.6
T     F    T         0.94       0.06 = 0.6 × 0.1
T     T    F         0.88       0.12 = 0.6 × 0.2
T     T    T         0.988      0.012 = 0.6 × 0.2 × 0.1
● Number of parameters linear in number of parents
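● A small Python sketch of the noisy-OR rule, using the per-cause inhibition probabilities q from the table above:

# Noisy-OR: P(¬Fever | active causes) is the product of the active causes'
# inhibition probabilities q_i, so P(Fever) = 1 − that product.
Q = {'Cold': 0.6, 'Flu': 0.2, 'Malaria': 0.1}

def noisy_or(active_causes):
    p_no_effect = 1.0
    for cause in active_causes:
        p_no_effect *= Q[cause]
    return 1.0 - p_no_effect

print(noisy_or([]))                          # 0.0
print(noisy_or(['Cold', 'Flu']))             # 0.88
print(noisy_or(['Cold', 'Flu', 'Malaria']))  # 0.988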
Hybrid (Discrete+Continuous) Networks 22
● Discrete (Subsidy? and Buys?); continuous (Harvest and Cost)
● Option 1: discretization—possibly large errors, large CPTs
Option 2: finitely parameterized canonical families
● 1) Continuous variable, discrete+continuous parents (e.g., Cost)
2) Discrete variable, continuous parents (e.g., Buys?)
Continuous Child Variables 23
● Need one conditional density function for child variable given continuous
parents, for each possible assignment to discrete parents
● Most common is the linear Gaussian model, e.g.,:
P(Cost = c ∣ Harvest = h, Subsidy? = true)
= N(a_t h + b_t, σ_t)(c)
= (1 / (σ_t √(2π))) exp( −(1/2) ((c − (a_t h + b_t)) / σ_t)² )
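● A Python sketch of this density; the parameters a_t, b_t, σ_t are placeholders (no specific values appear on the slide), and the ones used in the call are purely illustrative:

import math

def linear_gaussian(c, h, a_t, b_t, sigma_t):
    """P(Cost = c | Harvest = h, Subsidy? = true): a Gaussian in c whose
    mean a_t*h + b_t varies linearly with the parent value h."""
    mean = a_t * h + b_t
    return (1.0 / (sigma_t * math.sqrt(2 * math.pi))
            * math.exp(-0.5 * ((c - mean) / sigma_t) ** 2))

# e.g., with illustrative parameters a_t = -1, b_t = 10, sigma_t = 2:
print(linear_gaussian(c=6.0, h=4.0, a_t=-1.0, b_t=10.0, sigma_t=2.0))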
Continuous Child Variables 24
● All-continuous network with LG distributions
⟹ full joint distribution is a multivariate Gaussian
● A discrete+continuous LG network is a conditional Gaussian network, i.e., a
multivariate Gaussian over all continuous variables for each combination of
discrete variable values
Discrete Variable w/ Continuous Parents 25
● Probability of Buys? given Cost should be a “soft” threshold:
● Probit distribution uses integral of Gaussian:
Φ(x) = ∫_{−∞}^{x} N(0, 1)(x) dx
P (Buys? = true ∣ Cost = c) = Φ((−c + µ)/σ)
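● The standard normal CDF Φ can be computed from the error function, so a minimal Python sketch of this model is (µ and σ below are illustrative values, not from the slide):

import math

def phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_buys_probit(c, mu, sigma):
    """P(Buys? = true | Cost = c) = Phi((-c + mu) / sigma)."""
    return phi((-c + mu) / sigma)

print(p_buys_probit(c=5.0, mu=6.0, sigma=1.0))  # ≈ 0.84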
Why the Probit? 26
● It’s sort of the right shape
● Can view as hard threshold whose location is subject to noise
Discrete Variable 27
● Sigmoid (or logit) distribution also used in neural networks:
P(Buys? = true ∣ Cost = c) = 1 / (1 + exp(−2(−c + µ)/σ))
● Sigmoid has similar shape to probit but much longer tails:
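● The corresponding Python sketch (same illustrative µ and σ as in the probit example above):

import math

def p_buys_logit(c, mu, sigma):
    """P(Buys? = true | Cost = c) = 1 / (1 + exp(-2(-c + mu)/sigma))."""
    return 1.0 / (1.0 + math.exp(-2.0 * (-c + mu) / sigma))

print(p_buys_logit(c=5.0, mu=6.0, sigma=1.0))  # ≈ 0.88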
inference
Inference Tasks 29
● Simple queries: compute posterior marginal P(Xi∣E = e)
e.g., P(NoGas ∣ Gauge = empty, Lights = on, Starts = false)
● Conjunctive queries: P(Xi, Xj ∣E = e) = P(Xi∣E = e)P(Xj ∣Xi, E = e)
● Optimal decisions: decision networks include utility information;
probabilistic inference required for P (outcome∣action, evidence)
● Value of information: which evidence to seek next?
● Sensitivity analysis: which probability values are most critical?
● Explanation: why do I need a new starter motor?
Inference by Enumeration 30
● Slightly intelligent way to sum out variables from the joint without actually
constructing its explicit representation
● Simple query on the burglary network
P(B∣j, m)
= P(B, j, m)/P (j, m)
= αP(B, j, m)
= α ∑e ∑a P(B, e, a, j, m)
● Rewrite full joint entries using product of CPT entries:
P(B∣j, m)
= α ∑e ∑a P(B)P (e)P(a∣B, e)P (j∣a)P (m∣a)
= αP(B) ∑e P (e) ∑a P(a∣B, e)P (j∣a)P (m∣a)
● Recursive depth-first enumeration: O(n) space, O(d^n) time
Enumeration Algorithm 31
function ENUMERATION-ASK(X, e, bn) returns a distribution over X
  inputs: X, the query variable
          e, observed values for variables E
          bn, a Bayesian network with variables {X} ∪ E ∪ Y
  Q(X) ← a distribution over X, initially empty
  for each value xi of X do
      extend e with value xi for X
      Q(xi) ← ENUMERATE-ALL(VARS[bn], e)
  return NORMALIZE(Q(X))

function ENUMERATE-ALL(vars, e) returns a real number
  if EMPTY?(vars) then return 1.0
  Y ← FIRST(vars)
  if Y has value y in e
      then return P(y ∣ Pa(Y)) × ENUMERATE-ALL(REST(vars), e)
      else return ∑y P(y ∣ Pa(Y)) × ENUMERATE-ALL(REST(vars), e_y)
           where e_y is e extended with Y = y
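● A compact Python sketch of this procedure for the burglary network (CPT values as in the joint-probability sketch earlier; the dictionary encoding and variable ordering are just one possible choice):

# Exact inference by enumeration on the burglary network.
VARS = ['B', 'E', 'A', 'J', 'M']          # topological order
PARENTS = {'B': [], 'E': [], 'A': ['B', 'E'], 'J': ['A'], 'M': ['A']}
CPT = {'B': {(): 0.001}, 'E': {(): 0.002},
       'A': {(True, True): 0.95, (True, False): 0.94,
             (False, True): 0.29, (False, False): 0.001},
       'J': {(True,): 0.90, (False,): 0.05},
       'M': {(True,): 0.70, (False,): 0.01}}

def prob(var, value, e):
    """P(var = value | parents), with the parents' values taken from e."""
    p_true = CPT[var][tuple(e[p] for p in PARENTS[var])]
    return p_true if value else 1.0 - p_true

def enumerate_all(vars, e):
    if not vars:
        return 1.0
    Y, rest = vars[0], vars[1:]
    if Y in e:
        return prob(Y, e[Y], e) * enumerate_all(rest, e)
    return sum(prob(Y, y, e) * enumerate_all(rest, {**e, Y: y})
               for y in (True, False))

def enumeration_ask(X, e):
    q = {x: enumerate_all(VARS, {**e, X: x}) for x in (True, False)}
    total = sum(q.values())
    return {x: v / total for x, v in q.items()}

print(enumeration_ask('B', {'J': True, 'M': True}))  # ≈ {True: 0.284, False: 0.716}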
Evaluation Tree 32
● Enumeration is inefficient: repeated computation
e.g., computes P (j∣a)P (m∣a) for each value of e
Inference by Variable Elimination 33
● Variable elimination: carry out summations right-to-left,
storing intermediate results (factors) to avoid recomputation
P(B ∣ j, m)
= α P(B) ∑e P(e) ∑a P(a ∣ B, e) P(j ∣ a) P(m ∣ a)
  (the five factors correspond to B, E, A, J, and M respectively)
= α P(B) ∑e P(e) ∑a P(a ∣ B, e) P(j ∣ a) fM(a)
= α P(B) ∑e P(e) ∑a P(a ∣ B, e) fJ(a) fM(a)
= α P(B) ∑e P(e) ∑a fA(a, b, e) fJ(a) fM(a)
= α P(B) ∑e P(e) fĀJM(b, e) (sum out A)
= α P(B) fĒĀJM(b) (sum out E)
= α fB(b) × fĒĀJM(b)
Variable Elimination Algorithm 34
function ELIMINATION-ASK(X, e, bn) returns a distribution over X
  inputs: X, the query variable
          e, evidence specified as an event
          bn, a belief network specifying joint distribution P(X1, . . . , Xn)
  factors ← [ ]; vars ← REVERSE(VARS[bn])
  for each var in vars do
      factors ← [MAKE-FACTOR(var, e) ∣ factors]
      if var is a hidden variable then factors ← SUM-OUT(var, factors)
  return NORMALIZE(POINTWISE-PRODUCT(factors))
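● A minimal Python sketch of the two factor operations the algorithm relies on, with a factor stored as (variable list, table); this is a simplified illustration over Boolean variables, not the full algorithm:

from itertools import product

def pointwise_product(f1, f2):
    """Multiply two factors; the result ranges over the union of their variables."""
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    table = {}
    for vals in product([True, False], repeat=len(out_vars)):
        a = dict(zip(out_vars, vals))
        table[vals] = (t1[tuple(a[v] for v in vars1)]
                       * t2[tuple(a[v] for v in vars2)])
    return out_vars, table

def sum_out(var, f):
    """Sum a variable out of a factor."""
    vars_, t = f
    out_vars = [v for v in vars_ if v != var]
    table = {}
    for vals, p in t.items():
        key = tuple(v for v, name in zip(vals, vars_) if name != var)
        table[key] = table.get(key, 0.0) + p
    return out_vars, table

# e.g., f_J(a) * f_M(a) from the burglary derivation, then summing out A:
f_J = (['A'], {(True,): 0.90, (False,): 0.05})
f_M = (['A'], {(True,): 0.70, (False,): 0.01})
print(sum_out('A', pointwise_product(f_J, f_M)))  # ([], {(): 0.6305}), up to rounding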
Irrelevant Variables 35
● Consider the query P (JohnCalls∣Burglary = true)
P(J ∣ b) = α P(b) ∑e P(e) ∑a P(a ∣ b, e) P(J ∣ a) ∑m P(m ∣ a)
Sum over m is identically 1; M is irrelevant to the query
● Theorem 1: Y is irrelevant unless Y ∈ Ancestors({X} ∪ E)
● Here
– X = JohnCalls, E = {Burglary}
– Ancestors({X} ∪ E) = {Alarm, Earthquake}
⇒ M aryCalls is irrelevant
● Compare this to backward chaining from the query in Horn clause KBs
Irrelevant Variables 36
● Definition: moral graph of Bayes net: marry all parents and drop arrows
● Definition: A is m-separated from B by C iff separated by C in the moral graph
● Theorem 2: Y is irrelevant if m-separated from X by E
● For P (JohnCalls∣Alarm = true), both
Burglary and Earthquake are irrelevant
Complexity of Exact Inference 37
● Singly connected networks (or polytrees)
– any two nodes are connected by at most one (undirected) path
– time and space cost of variable elimination are O(d^k n)
● Multiply connected networks
– can reduce 3SAT to exact inference ⟹ NP-hard
– equivalent to counting 3SAT models ⟹ #P-complete
approximate inference
Inference by Stochastic Simulation 39
● Basic idea
– Draw N samples from a sampling distribution S
– Compute an approximate posterior probability P̂
– Show this converges to the true probability P
● Outline
– Sampling from an empty network
– Rejection sampling: reject samples disagreeing with evidence
– Likelihood weighting: use evidence to weight samples
– Markov chain Monte Carlo (MCMC): sample from a stochastic process
whose stationary distribution is the true posterior
Sampling from an Empty Network 40
function PRIOR-SAMPLE(bn) returns an event sampled from bn
  inputs: bn, a belief network specifying joint distribution P(X1, . . . , Xn)
  x ← an event with n elements
  for i = 1 to n do
      xi ← a random sample from P(Xi ∣ parents(Xi))
           given the values of Parents(Xi) in x
  return x
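● A Python sketch of PRIOR-SAMPLE for the sprinkler network used in the following examples; the CPT values below are the usual textbook ones, consistent with the 0.5, 0.9, 0.8, 0.9 and 0.1, 0.99 numbers that appear on the surrounding slides:

import random

# Sprinkler network: Cloudy -> Sprinkler, Cloudy -> Rain, {Sprinkler, Rain} -> WetGrass.
VARS = ['Cloudy', 'Sprinkler', 'Rain', 'WetGrass']      # topological order
PARENTS = {'Cloudy': [], 'Sprinkler': ['Cloudy'], 'Rain': ['Cloudy'],
           'WetGrass': ['Sprinkler', 'Rain']}
CPT = {'Cloudy': {(): 0.5},
       'Sprinkler': {(True,): 0.1, (False,): 0.5},
       'Rain': {(True,): 0.8, (False,): 0.2},
       'WetGrass': {(True, True): 0.99, (True, False): 0.90,
                    (False, True): 0.90, (False, False): 0.0}}

def p_true(var, event):
    """P(var = True | parents), with the parents' values taken from event."""
    return CPT[var][tuple(event[p] for p in PARENTS[var])]

def prior_sample():
    x = {}
    for var in VARS:                   # parents are sampled before their children
        x[var] = random.random() < p_true(var, x)
    return x

print(prior_sample())   # e.g. {'Cloudy': True, 'Sprinkler': False, ...}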
Examples 41–47
[Figures: a step-by-step run of PRIOR-SAMPLE on the sprinkler network, sampling each variable in topological order given its parents]
Sampling from an Empty Network 48
● Probability that PRIOR-SAMPLE generates a particular event
S_PS(x1 . . . xn) = ∏_{i=1}^{n} P(xi ∣ parents(Xi)) = P(x1 . . . xn)
i.e., the true prior probability
● E.g., S_PS(t, f, t, t) = 0.5 × 0.9 × 0.8 × 0.9 = 0.324 = P(t, f, t, t)
● Let N_PS(x1 . . . xn) be the number of samples generated for event x1, . . . , xn
● Then we have lim_{N→∞} P̂(x1, . . . , xn) = lim_{N→∞} N_PS(x1, . . . , xn)/N
= S_PS(x1, . . . , xn)
= P(x1 . . . xn)
● That is, estimates derived from PRIOR-SAMPLE are consistent
● Shorthand: P̂(x1, . . . , xn) ≈ P(x1 . . . xn)
Rejection Sampling 49
● P̂(X∣e) estimated from samples agreeing with e
function REJECTION-SAMPLING(X, e, bn, N) returns an estimate of P(X ∣ e)
  local variables: N, a vector of counts over X, initially zero
  for j = 1 to N do
      x ← PRIOR-SAMPLE(bn)
      if x is consistent with e then
          N[x] ← N[x] + 1 where x is the value of X in x
  return NORMALIZE(N[X])
● E.g., estimate P(Rain∣Sprinkler = true) using 100 samples
27 samples have Sprinkler = true
Of these, 8 have Rain = true and 19 have Rain = f alse
● P̂(Rain ∣ Sprinkler = true) = NORMALIZE(⟨8, 19⟩) = ⟨0.296, 0.704⟩
● Similar to a basic real-world empirical estimation procedure
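● A Python sketch of rejection sampling, reusing the prior_sample definition from the PRIOR-SAMPLE sketch above:

def rejection_sampling(X, evidence, n):
    """Estimate P(X | evidence) from prior samples consistent with the evidence."""
    counts = {True: 0, False: 0}
    for _ in range(n):
        x = prior_sample()
        if all(x[var] == val for var, val in evidence.items()):
            counts[x[X]] += 1
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()} if total else None

print(rejection_sampling('Rain', {'Sprinkler': True}, 10000))  # ≈ {True: 0.3, False: 0.7}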
Analysis of Rejection Sampling 50
● P̂(X ∣ e) = α N_PS(X, e) (algorithm defn.)
= N_PS(X, e)/N_PS(e) (normalized by N_PS(e))
≈ P(X, e)/P(e) (property of PRIOR-SAMPLE)
= P(X ∣ e) (defn. of conditional probability)
● Hence rejection sampling returns consistent posterior estimates
● Problem: hopelessly expensive if P (e) is small
● P (e) drops off exponentially with number of evidence variables!
Likelihood Weighting 51
● Idea: fix evidence variables, sample only nonevidence variables,
and weight each sample by the likelihood it accords the evidence
function LIKELIHOOD-WEIGHTING(X, e, bn, N) returns an estimate of P(X ∣ e)
  local variables: W, a vector of weighted counts over X, initially zero
  for j = 1 to N do
      x, w ← WEIGHTED-SAMPLE(bn, e)
      W[x] ← W[x] + w where x is the value of X in x
  return NORMALIZE(W[X])

function WEIGHTED-SAMPLE(bn, e) returns an event and a weight
  x ← an event with n elements; w ← 1
  for i = 1 to n do
      if Xi has a value xi in e
          then w ← w × P(Xi = xi ∣ parents(Xi))
          else xi ← a random sample from P(Xi ∣ parents(Xi))
  return x, w
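● A Python sketch of likelihood weighting, again reusing VARS, p_true, and the import of random from the sprinkler-network sketch above:

def weighted_sample(evidence):
    w, x = 1.0, {}
    for var in VARS:
        p = p_true(var, x)
        if var in evidence:
            x[var] = evidence[var]
            w *= p if evidence[var] else 1.0 - p   # weight by likelihood of the evidence
        else:
            x[var] = random.random() < p
    return x, w

def likelihood_weighting(X, evidence, n):
    totals = {True: 0.0, False: 0.0}
    for _ in range(n):
        x, w = weighted_sample(evidence)
        totals[x[X]] += w
    z = sum(totals.values())
    return {v: t / z for v, t in totals.items()}

print(likelihood_weighting('Rain', {'Sprinkler': True, 'WetGrass': True}, 10000))
# ≈ {True: 0.32, False: 0.68}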
Likelihood Weighting Example 52–58
[Figures: a step-by-step run of WEIGHTED-SAMPLE on the sprinkler network with the evidence Sprinkler = true and WetGrass = true held fixed]
w = 1.0
w = 1.0 × 0.1
w = 1.0 × 0.1 × 0.99 = 0.099
Likelihood Weighting Analysis 59
● Sampling probability for WEIGHTED-SAMPLE is
S_WS(z, e) = ∏_{i=1}^{l} P(zi ∣ parents(Zi))
● Note: pays attention to evidence in ancestors only
⟹ somewhere “in between” prior and posterior distribution
● Weight for a given sample z, e is
w(z, e) = ∏_{i=1}^{m} P(ei ∣ parents(Ei))
● Weighted sampling probability is
S_WS(z, e) w(z, e) = ∏_{i=1}^{l} P(zi ∣ parents(Zi)) ∏_{i=1}^{m} P(ei ∣ parents(Ei))
= P(z, e) (by standard global semantics of network)
● Hence likelihood weighting returns consistent estimates
but performance still degrades with many evidence variables
because a few samples have nearly all the total weight
Approximate Inference using MCMC 60
● “State” of network = current assignment to all variables
● Generate next state by sampling one variable given Markov blanket
Sample each variable in turn, keeping evidence fixed
function MCMC-ASK(X, e, bn, N) returns an estimate of P(X ∣ e)
  local variables: N[X], a vector of counts over X, initially zero
                   Z, the nonevidence variables in bn
                   x, the current state of the network, initially copied from e
  initialize x with random values for the variables in Z
  for j = 1 to N do
      for each Zi in Z do
          sample the value of Zi in x from P(Zi ∣ mb(Zi))
              given the values of MB(Zi) in x
          N[x] ← N[x] + 1 where x is the value of X in x
  return NORMALIZE(N[X])
● Can also choose a variable to sample at random each time
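● A Python sketch of Gibbs sampling for this kind of query, computing P(Zi ∣ mb(Zi)) from the sprinkler-network definitions (VARS, p_true, random) in the PRIOR-SAMPLE sketch above; the CHILDREN table is the obvious one for that network:

CHILDREN = {'Cloudy': ['Sprinkler', 'Rain'], 'Sprinkler': ['WetGrass'],
            'Rain': ['WetGrass'], 'WetGrass': []}

def p_value(var, value, event):
    p = p_true(var, event)
    return p if value else 1.0 - p

def p_given_markov_blanket(var, value, event):
    """Unnormalized P(var = value | mb(var)): own CPT entry times the children's."""
    e = dict(event, **{var: value})
    score = p_value(var, value, e)
    for child in CHILDREN[var]:
        score *= p_value(child, e[child], e)
    return score

def mcmc_ask(X, evidence, n):
    nonevidence = [v for v in VARS if v not in evidence]
    x = dict(evidence, **{v: random.choice([True, False]) for v in nonevidence})
    counts = {True: 0, False: 0}
    for _ in range(n):
        for z in nonevidence:
            pt = p_given_markov_blanket(z, True, x)
            pf = p_given_markov_blanket(z, False, x)
            x[z] = random.random() < pt / (pt + pf)
            counts[x[X]] += 1          # count one visited state per resampled variable
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

print(mcmc_ask('Rain', {'Sprinkler': True, 'WetGrass': True}, 5000))
# ≈ {True: 0.32, False: 0.68}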
The Markov Chain 61
● With Sprinkler = true, WetGrass = true, there are four states:
● Wander about for a while, average what you see
MCMC Example 62
● Estimate P(Rain∣Sprinkler = true, W etGrass = true)
● Sample Cloudy or Rain given its Markov blanket, repeat.
Count number of times Rain is true and false in the samples.
● E.g., visit 100 states
31 have Rain = true, 69 have Rain = f alse
● P̂(Rain∣Sprinkler = true, W etGrass = true)
= N ORMALIZE(⟨31, 69⟩) = ⟨0.31, 0.69⟩
● Theorem: chain approaches stationary distribution:
long-run fraction of time spent in each state is exactly
proportional to its posterior probability
Markov Blanket Sampling 63
● Markov blanket of Cloudy is Sprinkler and Rain
● Markov blanket of Rain is
Cloudy, Sprinkler, and WetGrass
● Probability given the Markov blanket is calculated as follows:
P(x′i ∣ mb(Xi)) = P(x′i ∣ parents(Xi)) × ∏_{Zj ∈ Children(Xi)} P(zj ∣ parents(Zj))
● Easily implemented in message-passing parallel systems, brains
● Main computational problems
– difficult to tell if convergence has been achieved
– can be wasteful if Markov blanket is large:
P (Xi∣mb(Xi)) won’t change much (law of large numbers)
Summary 64
● Bayes nets provide a natural representation for (causally induced)
conditional independence
● Topology + CPTs = compact representation of joint distribution
● Generally easy for (non)experts to construct
● Canonical distributions (e.g., noisy-OR) = compact representation of CPTs
● Continuous variables Ô⇒ parameterized distributions (e.g., linear Gaussian)
● Exact inference by variable elimination
– polytime on polytrees, NP-hard on general graphs
– space = time, very sensitive to topology
● Approximate inference by LW, MCMC
– LW does poorly when there is lots of (downstream) evidence
– LW, MCMC generally insensitive to topology
– Convergence can be very slow with probabilities close to 1 or 0
– Can handle arbitrary combinations of discrete and continuous variables