
Generating Random Bayesian Networks with Constraints on Induced Width

Jaime S. Ide, Fabio G. Cozman, and Fabio T. Ramos¹

Abstract. We present algorithms for the generation of uniformly distributed Bayesian networks with constraints on induced width. The algorithms use ergodic Markov chains to generate samples. The introduction of constraints on induced width leads to realistic networks but requires new techniques. A tool that generates random networks is presented and applications are discussed.

1 INTRODUCTION

It is often the case that theoretical questions involving artificial intelligence techniques are hard to answer exactly. Many such questions appear in the theory of Bayesian networks; for example, how do quasi-random sampling algorithms compare to pseudo-random sampling? Significant insight into such questions could be obtained by analyzing large samples of Bayesian networks. However, it may be difficult to collect hundreds of "real" Bayesian networks for an experiment, or an experiment may have to be conducted for a specific type of Bayesian network for which few "real" examples are available. One must then randomly generate Bayesian networks that are somehow close to "real" networks. In fact, many researchers have used random processes to generate networks in the past, but without guarantees that every allowed graph is produced with the same uniform probability (for example, [14, 15]).

We would like to have a method that generates Bayesian networks uniformly; that is, we would like to guarantee that averages taken over generated networks produce unbiased estimates. We would also like generation methods that are flexible, in the sense that constraints on generated networks can be added with relative ease. For example, it should be possible to add a constraint on the maximum number of parents for nodes, the average number of children, or the maximum number of loops. Ad hoc methods are usually concocted for a particular set of constraints, and it is hard to imagine ways to add constraints to them.

Finally, we would like to generate "realistic" networks, however hard it may be to define what a "real" Bayesian network is. A reasonable strategy is to look for properties that are commonly used to characterize Bayesian networks, and to allow some control over them. This is the strategy followed by Ide and Cozman [5]: they allow control over the degree of a node, thus allowing some control over the "density" of the connections in the generated Bayesian networks. We have found that such a strategy is reasonable but not perfect. Restrictions solely on node degree and number of edges lead to "overly random" edges; real networks often have their variables distributed in groups, with few edges between groups.² Another strategy, suggested by T. Kocka (personal communication), would be to produce Bayesian networks with a large number of equivalent graphs, as this is a property observed in real networks. However, we would like to use properties with clear intuitive meaning, so that users of our algorithms can quickly grasp the properties of generated networks.

A quantity that characterizes the algorithmic complexity of Bayesian networks, and is easy to explain and to understand, is the induced width. Indirectly, the induced width captures how dense a network is. Besides, it makes sense to control induced width, as we are usually interested in comparing algorithms or parameterizing results with respect to the complexity of the underlying network.³ Unfortunately, the generation of random graphs with constraints on induced width is significantly more involved than the generation of graphs with constraints on node degree and number of edges. In this paper we report on new algorithms that accomplish generation of graphs with simultaneous constraints on all these quantities: induced width, node degree, and number of edges.

Following the work of Ide and Cozman [5], we divide the generation of random Bayesian networks into two steps. First we generate a random directed acyclic graph that satisfies constraints on induced width, node degree, and number of edges; then we generate probability distributions for the graph. To generate the random graph, we construct ergodic Markov chains with appropriate stationary distributions, so that successive sampling from the chains leads to the generation of properly distributed networks. The necessary theory and algorithms are presented in Sections 2 and 3.

The methods presented in this paper focus on Bayesian networks, but they convey a general method for the generation of testing examples in artificial intelligence. The idea is to generate uniformly distributed examples using Markov chains. This strategy allows one to easily add and modify constraints on the generated examples, provided that a few steps are taken. The theory in Section 3 can serve as a guide for exactly what steps must be taken to guarantee appropriate results.

A freely distributed program for Bayesian network generation is presented in Section 4. In Section 4 we also discuss applications of random networks.

2 BASIC CONCEPTS

This section summarizes material from [5] and [3].

A directed graph is composed of a set of nodes and a set of edges. An edge (u, v) goes from a node u (the parent) to a node v (the child). A path is a sequence of nodes such that each pair of consecutive nodes is adjacent. A path is a cycle if it contains more than two nodes and the first and last nodes are the same. A cycle is directed if we can reach the same nodes while following arcs that are in the same direction.

¹ Escola Politécnica, Univ. de São Paulo, São Paulo, Brazil. Email: [email protected]
² Tomas Kocka brought this fact to our attention.
³ Carlos Brito suggested this strategy.
A directed graph is acyclic (a DAG) if it contains no directed cycles. A graph is connected if there exists a path between every pair of nodes. A graph is singly-connected, also called a polytree, if there exists exactly one path between every pair of nodes; otherwise, the graph is multiply-connected (or multi-connected for short). An extreme sub-graph of a polytree is a sub-graph that is connected to the remainder of the polytree by a single path. In an undirected graph, the direction of the edges is ignored. An ordered graph is a pair containing an undirected graph and an ordering of nodes. The width of a node in an ordered graph is the number of its neighbors that precede it in the ordering. The width of an ordering is the maximum width over all nodes. The induced width of an ordered graph is the width of the ordered graph obtained as follows: nodes are processed from last to first; when node X is processed, all of its neighbors that precede it are connected (call these connections induced connections, and the resulting graph the induced graph). An example is presented in Figure 1. The induced width of a graph is the minimal induced width over all orderings; the computation of induced width is an NP-hard problem [3], and computations are usually based on heuristics [6].

A Bayesian network represents a joint probability density over a set of variables X [10]. The density is specified through a directed acyclic graph; every node in the graph is associated with a variable Xi in X, and with a conditional probability density p(Xi | pa(Xi)), where pa(Xi) denotes the parents of Xi in the graph. A Bayesian network represents a unique joint probability density [10]: p(X) = ∏_i p(Xi | pa(Xi)) (a consequence of a Markov condition). The moral graph of a Bayesian network is obtained by connecting the parents of every variable and ignoring the direction of edges. The induced width of a Bayesian network is the induced width of its moral graph. An inference is the computation of a posterior probability density for a query variable given observed variables; the complexity of inferences is directly related to the induced width of the underlying Bayesian network [3].
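The width and induced-width computations defined above are mechanical enough to state in code. The sketch below is our illustration, not the authors' implementation; `adj` is a hypothetical representation of the undirected (moral) graph as a dict of neighbor sets.

```python
from itertools import combinations

def width_of_ordering(adj, order):
    """Induced width of an undirected graph under an elimination ordering.

    adj: dict mapping node -> set of neighbors (undirected moral graph).
    order: list of nodes; nodes are processed from last to first, and the
    preceding neighbors of each processed node are pairwise connected
    (the induced connections of Section 2).
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    pos = {v: i for i, v in enumerate(order)}
    width = 0
    for v in reversed(order):
        # neighbors of v that precede it in the ordering
        preceding = {u for u in adj[v] if pos[u] < pos[v]}
        width = max(width, len(preceding))
        # add the induced connections
        for a, b in combinations(preceding, 2):
            adj[a].add(b)
            adj[b].add(a)
    return width
```

For example, a 4-cycle under any ordering has induced width 2, while a path graph has induced width 1; the exact induced width minimizes this value over all n! orderings, which is where the NP-hardness comes from.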
We use Markov chains to generate random graphs, following [8]. Consider a Markov chain {X_t; t ≥ 0} over a finite domain S, and let P = (p_ij) be an M × M matrix representing transition probabilities, where M is the number of states and p_ij = Pr(X_{t+1} = j | X_t = i) for all t [11, 13]. The s-step transition probability is given by P^s = (p_ij^(s)), with p_ij^(s) = Pr(X_{t+s} = j | X_t = i), independent of t. A Markov chain is irreducible if for all i, j there exists s that satisfies p_ij^(s) > 0; equivalently, a Markov chain is irreducible if and only if all pairs of states intercommunicate. A Markov chain is positive recurrent if every state i ∈ S can be returned to in a finite number of steps; it follows that a finite irreducible chain is positive recurrent. A Markov chain is aperiodic if the greatest common divisor of all those s for which p_ii^(s) > 0 is equal to one (that is, gcd{s : p_ii^(s) > 0} = 1); aperiodicity is ensured if p_ii > 0 (p_ii is a self-loop probability). A Markov chain is ergodic if there exists a vector π (the stationary distribution) satisfying lim_{s→∞} p_ij^(s) = π_j for all i and j; a finite aperiodic, irreducible and positive recurrent chain is ergodic. A transition matrix is called doubly stochastic if its rows and columns all sum to one (that is, Σ_{j=1}^{M} p_ij = 1 and Σ_{i=1}^{M} p_ij = 1). A Markov chain with a doubly stochastic transition matrix has a uniform stationary distribution [11].

Figure 1. a) Network, b) moral graph, c) induced graph for ordering F, L, D, H, B, and d) induced graph for ordering L, H, D, B, F. Dashed lines represent induced connections.

3 GENERATING RANDOM DAGS

In this section we show how to generate random DAGs with constraints on induced width, node degree, and number of edges. After such a random DAG is generated, it is easy to construct a complete Bayesian network by randomly generating the associated probability distributions: if all variables in the Bayesian network are categorical, probability distributions are produced by sampling Dirichlet distributions. More general methods can be contemplated (for example, it may be interesting to generate logical nodes together with probabilistic nodes) and are left for future work.

To generate random DAGs with specific constraints, we construct an ergodic Markov chain with uniform limiting distribution, such that every state of the chain is a DAG satisfying the constraints. By running the chain for many iterations, eventually we obtain a satisfactory DAG.

Algorithm PMMixed produces an ergodic Markov chain with the required properties (Figure 2). The algorithm is significantly more complex than the algorithms presented by Ide and Cozman [5]. The added complexity comes from the constraints on induced width. Such a price is worth paying, as induced width is a property that characterizes a Bayesian network much more accurately than node degree.

The algorithm works as follows. We create a set of n nodes (from 0 to n−1) and a simple network to start. The loop between lines 03 and 08 constructs the next state (the next DAG) from the current state. Lines 05 and 08 verify whether the induced width of the current DAG satisfies the maximum value allowed; constraints on maximum node degree and maximum number of edges must also be checked there. If the current DAG is a polytree, the next DAG is constructed in lines 04 and 05; if the current DAG is multi-connected, the next DAG is constructed in lines 07 and 08. Depending on the current graph, different operations are performed (procedures AorR and AR correspond to the valid operations). Note that both the particular procedure to be performed and the acceptance (or not) of the resulting DAG are probabilistic, parameterized by p.

Algorithm PMMixed is essentially a mixture of procedures AorR and AR. These procedures are used by Ide and Cozman [5] to produce, respectively, multi-connected graphs and polytrees with constraints on node degree. We need both to guarantee irreducibility of the Markov chains when constraints on induced width are present; procedure AR creates a needed "path" in the space of polytrees that is used in Theorem 3. The mixture of procedures has two other benefits: first, it creates more complex transitions, hopefully improving the convergence of the chain; second, it eliminates a restriction on node degree that was needed by Ide and Cozman [5].

The PMMixed algorithm can be understood as a sequence of probabilistic transitions that follow the scheme in Figure 3.

We now establish the ergodicity of Algorithm PMMixed.

Theorem 1 The Markov chain generated by Algorithm PMMixed is aperiodic.
Algorithm PMMixed: Generating DAGs with induced width control
Input: Number of nodes (n), number of iterations (N), maximum induced width, and possibly constraints on node degree and number of edges.
Output: A connected DAG with n nodes.
01. Create a network with n nodes, where every node has exactly one parent, except the first node, which has no parent;
02. Repeat N times:
03.   If the current graph is a polytree:
04.     With probability p, call Procedure AorR; with probability (1 − p), call Procedure AR.
05.     If the resulting graph satisfies the imposed constraints, accept the graph; otherwise, keep the previous graph;
06.   else (the graph is multi-connected):
07.     Call Procedure AorR.
08.     If the resulting graph is a polytree and satisfies the imposed constraints, accept it with probability p; else accept it if it satisfies the imposed constraints; otherwise keep the previous graph.
09. Return the current graph after N iterations.

Procedure AR: Add and Remove
01. Generate uniformly a pair of distinct nodes i, j;
02. If the arc (i, j) exists in the current graph, keep the same state; else
03.   Invert the arc with probability 1/2 to (j, i), and then
04.   Find the predecessor node k in the path between i and j, remove the arc between k and j, and add an arc (i, j) or (j, i) depending on the result of line 03.

Procedure AorR: Add or Remove
01. Generate uniformly a pair of distinct nodes i, j;
02. If the arc (i, j) exists in the current graph, delete the arc, provided that the underlying graph remains connected; else
03.   Add the arc if the underlying graph remains acyclic; otherwise keep the same state.

Figure 2. Algorithm for generating DAGs, mixing operations AR and AorR.

Proof. It is always possible to stay in the same state in procedures AR and AorR; therefore, all states have a self-loop probability greater than zero. QED
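For concreteness, the outer loop of PMMixed can be sketched as below. This is a simplified illustration, not the BNGenerator implementation: a DAG is a set of arc pairs, the `satisfies` predicate stands for the induced width, node degree, and edge-count checks of lines 05 and 08, and procedure AR is passed in as a callable because its path-relocation details are given only in Figure 2.

```python
import random

def is_connected(n, arcs):
    """True if the graph on nodes 0..n-1 is connected, ignoring arc directions."""
    adj = {v: set() for v in range(n)}
    for u, v in arcs:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def is_acyclic(n, arcs):
    """True if the directed graph has no directed cycle (DFS with colors)."""
    children = {v: [] for v in range(n)}
    for u, v in arcs:
        children[u].append(v)
    state = {}  # node -> 1 (on recursion stack) or 2 (finished)
    def dfs(u):
        state[u] = 1
        for w in children[u]:
            if state.get(w) == 1:
                return False  # back edge: a directed cycle
            if w not in state and not dfs(w):
                return False
        state[u] = 2
        return True
    return all(dfs(v) for v in range(n) if v not in state)

def aorr(n, arcs):
    """Procedure AorR: delete arc (i, j) if present (the graph must stay
    connected), otherwise add it (the graph must stay acyclic)."""
    i, j = random.sample(range(n), 2)
    if (i, j) in arcs:
        smaller = arcs - {(i, j)}
        return smaller if is_connected(n, smaller) else arcs
    bigger = arcs | {(i, j)}
    return bigger if is_acyclic(n, bigger) else arcs

def pmmixed(n, iterations, satisfies, ar, p=0.5, seed=None):
    """Outer loop of Algorithm PMMixed (simplified sketch).

    Starts from the chain 0 -> 1 -> ... -> n-1; `satisfies` stands for the
    induced width / node degree / edge count checks of lines 05 and 08."""
    random.seed(seed)
    arcs = {(k, k + 1) for k in range(n - 1)}
    for _ in range(iterations):
        # a connected DAG with exactly n-1 arcs is a polytree
        if len(arcs) == n - 1:
            proposal = aorr(n, arcs) if random.random() < p else ar(n, arcs)
            if satisfies(proposal):
                arcs = proposal
        else:
            proposal = aorr(n, arcs)
            to_polytree = len(proposal) == n - 1
            if satisfies(proposal) and (not to_polytree or random.random() < p):
                arcs = proposal
    return arcs
```

A trivial stand-in such as `ar=lambda n, arcs: arcs` keeps the sketch runnable; a faithful AR would invert and relocate arcs as specified in Figure 2.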
Theorem 2 The transition matrix defined by Algorithm PMMixed is doubly stochastic.

Proof. If we have symmetric transition probabilities between any two neighboring states, the rows and columns of the transition matrix sum to one, because the self-loop probabilities are complementary to all other probabilities. Procedure AorR is clearly symmetric; procedure AR is also symmetric [5]. We just have to check that transitions between polytrees and multi-connected graphs are symmetric; this is true because transitions from polytree to multi-connected are accepted with probability p, and multi-connected to polytree transitions are also accepted with the same probability. QED

We need the following lemma to prove Theorem 3.

Lemma 1 After removal of an arc from a multi-connected DAG, its induced width does not increase.

Proof. When we remove an arc, the moral graph stays the same or contains fewer arcs; by just keeping the same ordering, the induced width cannot increase. QED

Theorem 3 The Markov chain generated by Algorithm PMMixed is irreducible.

Proof. Suppose that we have a multi-connected DAG with n nodes. If we prove that from this graph we can reach a simple sorted tree (Figure 4(c)), then the opposite transformation is also possible, because of the symmetry of our transition matrix, and therefore we could reach any state from any other (during these transitions, graphs must remain acyclic and connected, and must satisfy the imposed constraints). So we start by finding a loop cutset and removing enough arcs to obtain a polytree from the multi-connected DAG [10]. The induced width does not increase during removal operations, by Lemma 1. From a polytree we can move to a simple polytree (Figure 4(b)) in a recursive way: for each pair of extreme sub-graphs of our polytree (call them branches), it is possible to "cut" one branch and add it to the other branch, by procedure AR, without ever increasing the induced width. Doing this we obtain a unique branch. If we have more than two branches connected to a node, we repeat this process by pairs; we do this recursively until we get a simple polytree. Now that we have a simple polytree, we get a simple tree (Figure 4(a)) just by inverting arcs to the same direction, without ever getting an induced width greater than two. The last step is to obtain a simple sorted tree (Figure 4(c)) from the simple tree. The idea here is illustrated in Figure 5; we want to sort the labelled nodes from 1 to n. Start by removing arc (n, k) and adding arc (l, i) (step 1 to 2). Remove arc (j, n) and add arc (n−1, n) (step 2 to 3). Note that in this configuration, the induced width is one. Now, remove arc (n−1, o) and add arc (j, k) (step 3 to 4). Repeat steps 2 to 4 for all nodes. So, from any multi-connected DAG it is possible to reach a simple sorted tree. The opposite path is clearly analogous, so we can go from any DAG to any other DAG, and the chain is irreducible. Note that constraints on node degree and maximum number of edges can be dealt with within the same processes. QED

By the previous theorems we obtain:

Theorem 4 The Markov chain generated by Algorithm PMMixed is ergodic, and its unique stationary distribution is the uniform distribution.

Figure 3. Structure of PMMixed.

Figure 4. Simple trees used in our proofs: (a) simple tree, (b) simple polytree, (c) simple sorted tree.
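Theorems 2 and 4 rest on the standard fact that a finite ergodic chain with a doubly stochastic transition matrix has the uniform distribution as its stationary distribution. This is easy to check numerically; the matrix below is an arbitrary doubly stochastic example for illustration, not one produced by PMMixed.

```python
def step(dist, P):
    """One step of the chain: new_j = sum_i dist[i] * P[i][j]."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Doubly stochastic (rows and columns each sum to one), aperiodic
# (positive self-loops) and irreducible: the stationary distribution
# must therefore be uniform.
P = [[0.5, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]

dist = [1.0, 0.0, 0.0]  # start concentrated on one state
for _ in range(100):
    dist = step(dist, P)
# dist is now numerically indistinguishable from (1/3, 1/3, 1/3)
```

The same check explains why PMMixed only needs symmetric proposal/acceptance probabilities: symmetry makes the transition matrix doubly stochastic, and ergodicity then forces uniformity over all DAGs satisfying the constraints.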
Figure 5. Basic moves to obtain a simple sorted tree.

Figure 7. Structure of PMMixed with procedure J.

The algorithm PMMixed can be implemented quite efficiently, except for the computation of induced width: finding this value is an NP-hard problem with no easy solution. There are heuristics for computing induced width, some of which have been found to be of high quality [6]. Consequently, we must change our goal: instead of adopting constraints on the exact induced width, we assume that the user specifies a maximum width given a particular heuristic. We call this width the heuristic width. Our goal then is to produce random DAGs on the space of DAGs that satisfy constraints on heuristic width.
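A heuristic width is cheap to compute with a greedy elimination scheme. The sketch below scores nodes by degree; the minimum weight heuristic used later in the paper instead scores a node by the product of its neighbors' state-space sizes. This is our illustration of the idea, not BNGenerator code.

```python
from itertools import combinations

def heuristic_width(adj):
    """Greedy elimination on an undirected (moral) graph.

    Repeatedly eliminates the node with the fewest remaining neighbors,
    pairwise-connecting its neighbors (induced connections); the largest
    eliminated neighborhood is the heuristic width, an upper bound on
    the exact induced width.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    width = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))  # greedy choice
        width = max(width, len(adj[v]))
        for a, b in combinations(adj[v], 2):     # induced connections
            adj[a].add(b)
            adj[b].add(a)
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return width
```

On a tree the result is 1, and on a 4-cycle it is 2; in both cases the greedy bound coincides with the exact induced width, although in general it is only an upper bound.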
Apparently we could still use the PMMixed algorithm here, with the obvious change that lines 05 and 08 must check heuristic width instead of induced width. However, such a simple modification is not sufficient: because heuristic width is usually computed with local operations, we cannot predict the effect of adding and removing edges on it. That is, we cannot adapt Lemma 1 to heuristic width in general, and so we cannot predict whether a "path" between DAGs can in fact be followed by the chain without violating heuristic width constraints. We must create a mechanism that allows the chain to transit between arbitrary DAGs regardless of the adopted heuristic. Our solution is to add a new type of operation, specified by procedure J (Figure 6); this procedure allows "jumps" from arbitrary multi-connected DAGs to polytrees. We also assume that any adopted heuristic is such that, if the DAG is a polytree, then the heuristic width is equal to the induced width. Even if a given heuristic does not satisfy this property, the heuristic can be easily modified to do so: test whether the DAG is a polytree and, if so, return the induced width of the polytree (the maximum number of parents amongst all nodes).

Procedure J must be called with probability (1 − p − q) both after line 04 and after line 07 in the algorithm PMMixed.

Procedure J: Sequence of AorR
01. If the current graph is a polytree:
02.   Generate uniformly a pair of distinct nodes i, j;
03.   If the arc (i, j) does not exist in the current graph, add the arc; otherwise, keep the same state.
04. If the current graph is multi-connected:
05.   Generate uniformly a pair of distinct nodes i, j;
06.   If the arc (i, j) exists in the current graph, remove the arc; otherwise, keep the same state.
07. If the new graph satisfies the imposed constraints, accept the graph; otherwise, keep the previous graph.

Figure 6. Procedure J.

The complete algorithm can be understood as a sequence of probabilistic transitions that follow the scheme in Figure 7. All the previous theorems can be easily extended to this new situation; the only one that must be substantially modified is Theorem 3. Transitions from polytree to multi-connected DAGs are performed with probability (1 − q); transitions from multi-connected DAGs to polytrees are performed with probability 1 − (p + q) · q/(p + q) = 1 − q. The values of p and q control the mixing rate of the chain; we have observed remarkable insensitivity to these values.

4 THE BNGenerator AND APPLICATIONS

The algorithm PMMixed (with the modifications indicated in Figure 7) can be efficiently implemented with existing ordering heuristics, and the resulting DAGs are quite similar to existing Bayesian networks. We have implemented the algorithm using an O(n log n) implementation of the minimum weight heuristic. The result is the BNGenerator package, freely distributed under the GNU license (at http://www.pmr.poli.usp.br/ltd/Software/BNGenerator). The software uses the facilities in the JavaBayes system, including the efficient implementation of ordering heuristics (http://www.cs.cmu.edu/~javabayes). The BNGenerator accepts specification of the number of nodes, maximum node degree, maximum number of edges, and maximum heuristic width (for the minimum weight heuristic, but other heuristics can be added).

Figure 8. Bayesian network generated with BNGenerator: 30 nodes, maximum degree 20, maximum induced width 2.

The software also performs uniformity tests using a χ² test. Such tests can be performed only for small numbers of nodes (as the number of possible DAGs grows extremely quickly [12]), but they allowed us to test the
algorithm and its procedures. We have observed relatively fast mixing of the chain with the transitions we have designed.

To show how to use our previous results, we discuss the evaluation of a particular inference algorithm that has received attention in the literature but has no conclusive analysis yet. Due to lack of space, we present a brief summary of rather extensive tests; more details can be found in a longer technical report [4]. Two other applications can be found in that technical report: a study of the relation between heuristic width and d-connectivity, and a study of convergence for loopy propagation in networks with non-zero probabilities.

Consider the behavior of Monte Carlo methods associated with quasi-random numbers; that is, numbers that form low-discrepancy sequences, progressively covering the space in the "most uniform" manner [7, 9]. There have been quite successful applications of quasi-Monte Carlo methods for integration in low-dimensional problems; in high-dimensional problems, there has been conflicting evidence regarding the performance of quasi-Monte Carlo methods. As a positive example, Cheng and Druzdzel obtained good results in Bayesian network inference with importance sampling using quasi-random numbers [1]. We have investigated the following question: how do quasi-random numbers affect standard importance sampling and Gibbs sampling algorithms in Bayesian networks? We have used the importance sampling scheme derived by Dagum and Luby [2], and have investigated the behavior of Halton sequences in random networks. The summary of our investigation is as follows. First, pseudo-random numbers are clearly better than quasi-random numbers in medium-sized networks for Gibbs sampling. Second, pseudo-random numbers have a small edge over quasi-random numbers for importance sampling; however, the differences are so small that both can be used. In fact, it is not hard to find networks that behave better under quasi-random importance sampling than under pseudo-random importance sampling.⁴

The methodology indicated in this example can be applied to other inference algorithms and theoretical questions related to directed acyclic graphs and Bayesian networks.
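Halton sequences such as those used in these experiments have a simple radical-inverse construction (see [9]); the sketch below is a textbook version for illustration, not the code used in the experiments.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of `index` in the given base:
    the base-`base` digits of `index` are mirrored across the radix point."""
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

def halton_point(index, bases=(2, 3)):
    """One d-dimensional Halton point; coordinate d uses the d-th prime base."""
    return tuple(radical_inverse(index, b) for b in bases)

# Successive points fill the unit square far more evenly than typical
# pseudo-random draws; e.g. halton_point(1) == (0.5, 1/3).
points = [halton_point(i) for i in range(1, 17)]
```

In an importance sampler, each d-dimensional Halton point would replace one vector of d pseudo-random uniforms used to sample the network variables in topological order.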
5 CONCLUSION

In this paper we have presented a solution for the generation of uniformly distributed random Bayesian networks with control over key quantities. The main idea is to generate DAGs with control on induced width, and then to generate distributions associated with the generated DAG. Given the NP-hardness of induced width, we have resorted to "heuristic width": the width produced by one of the many high-quality heuristics available. We generate DAGs using Markov chains, and the need to guarantee heuristic width constraints leads to a reasonably complex transition scheme encoded by algorithm PMMixed and procedure J. The algorithm can be modified to accommodate a number of other constraints (say, constraints on the maximum number of parents). The methodology used to derive these algorithms and to prove their convergence can be employed to generate testing examples in other fields of artificial intelligence. The reliance on Markov chains demands convergence proofs and mixing times, but it allows the manipulation of constraints and guarantees of uniformity that do not seem to be handled by other methods.

We have observed that this strategy does produce "realistic-looking" Bayesian networks. Using such networks, we have confirmed comments in the literature suggesting that standard Gibbs sampling cannot profit from quasi-random samples, while straightforward importance sampling presents essentially the same behavior under pseudo- and quasi-random sampling for medium-sized networks. We have also investigated the relationship between heuristic width and d-connectivity and the performance of loopy propagation, and reported on those issues elsewhere.

⁴ As a notable (not randomly generated) example of this phenomenon, the Alarm network does behave slightly better with quasi-random than with pseudo-random importance sampling (corroborating results by Cheng and Druzdzel [1]).

ACKNOWLEDGEMENTS

We thank Carlos Brito for suggesting the use of induced width, Robert Castelo for pointing us to Melançon et al.'s work, Guy Melançon for confirming some initial thoughts, Nir Friedman for indicating how to generate distributions, and Haipeng Guo for testing the BNGenerator. We also thank Jaap Suermondt, Tomas Kocka, Alessandra Potrich, and Márcia D'Elia Branco for providing important ideas, and Y. Xiang, P. Smets, D. Dash, M. Horsh, E. Santos, and B. D'Ambrosio for suggesting valuable procedures. The first author was supported by FAPESP grant 00/11067-9. This work was (partially) developed in collaboration with HP Brazil R&D; the third author was supported by HP Labs and was responsible for investigating loopy propagation. The second author was partially supported by CNPq through grant 300183/98-4.

REFERENCES

[1] J. Cheng and M. Druzdzel, 'Computational investigation of low-discrepancy sequences in simulation algorithms for Bayesian networks', in Conf. on Uncertainty in Artificial Intelligence, pp. 72–81, San Francisco, CA. Morgan Kaufmann.
[2] P. Dagum and M. Luby, 'An optimal approximation algorithm for Bayesian inference', Artificial Intelligence, 93(1–2), 1–27, (1997).
[3] R. Dechter, 'Bucket elimination: a unifying framework for probabilistic inference', in Conf. on Uncertainty in Artificial Intelligence, pp. 211–219, San Francisco, CA. Morgan Kaufmann.
[4] J. S. Ide, F. G. Cozman, and F. T. Ramos, 'Generation of random Bayesian networks with constraints on induced width, with applications to the average analysis of d-connectivity, quasi-random sampling, and loopy propagation', Technical Report BT/PMR, University of São Paulo, Brazil, (2004).
[5] J. S. Ide and F. G. Cozman, 'Random generation of Bayesian networks', in Brazilian Symp. on Artificial Intelligence, Springer-Verlag, (2002).
[6] U. Kjaerulff, 'Triangulation of graphs — algorithms giving small total state space', Technical Report R-90-09, Department of Mathematics and Computer Science, Aalborg University, Denmark, (March 1990).
[7] J. G. Liao, 'Variance reduction in Gibbs sampler using quasi random numbers', Journal of Computational and Graphical Statistics, 7(3), 253–266, (September 1998).
[8] G. Melançon and M. Bousquet-Mélou, 'Random generation of DAGs for graph drawing', Technical Report INS-R0005, Dutch Research Center for Mathematical and Computer Science (CWI), (2000).
[9] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, volume 63 of CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM, Philadelphia, 1992.
[10] J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, 1988.
[11] S. I. Resnick, Adventures in Stochastic Processes, Birkhäuser, Cambridge, MA, USA; Berlin, Germany; Basel, Switzerland, 1992.
[12] R. W. Robinson, 'Counting labeled acyclic digraphs', in New Directions in the Theory of Graphs, ed. F. Harary, pp. 28–43, Academic Press, Michigan, (1973).
[13] S. M. Ross, Stochastic Processes, John Wiley & Sons, New York, 1983.
[14] P. Spirtes, C. Glymour, and R. Scheines, Causation, Prediction, and Search (second edition), MIT Press, 2000.
[15] Y. Xiang and T. Miller, 'A well-behaved algorithm for simulating dependence structure of Bayesian networks', International Journal of Applied Mathematics, volume 1, pp. 923–932, (1999).
