ramBLe - A Parallel Framework for Bayesian Learning

ramBLe (A Parallel Framework for Bayesian Learning) supports multiple constraint-based algorithms for learning the structure of Bayesian networks (BNs) from data in parallel.

Requirements

- gcc (with C++14 support) is used for compiling the project.
  This project has been tested only on Linux, using gcc version 9.2.0.
- Boost libraries are used for parsing the command line options, logging, and a few other purposes.
  Tested with version 1.70.0.
- MPI is used for execution in parallel.
  Tested with MVAPICH2 version 2.3.3.
- CMake is required for building the project.
  Tested with version 3.29.
- The following repositories are used as submodules:
  - BN Utils contains common utilities for BN learning in parallel and scripts for post-processing.
  - mxx is used as a C++ wrapper for MPI.
  - Graph API is used as a lightweight wrapper around Boost.Graph.
  - C++ Utils are used for logging and timing.
- Google Test (optional) is used for unit testing in this project.
  If this dependency is not satisfied, the unit tests are not built. See the Unit Tests section under Building for more information.
  Tested with version 1.10.0.

Building

After the dependencies have been installed, the project can be built in a build directory as follows:

mkdir build
cd build
cmake ..
make

The cmake command searches the default paths for the dependencies and configures the build. The make command builds the executable named ramble, which can be used for constraint-based structure learning.
By default, all the paths in the CPATH and LIBRARY_PATH environment variables are used as include paths and library paths, respectively.

Unit Tests

The unit tests are not built by default. cmake can be configured as follows for building the tests:

cmake -DENABLE_TESTING=ON ..

Debug

For building the debug version of the executable, cmake can be run as follows:

cmake -DCMAKE_BUILD_TYPE=Debug ..

Logging

By default, logging is disabled in the release build and enabled in the debug build. In order to change the default behavior, cmake can be configured as follows:

cmake -DENABLE_LOGGING=ON ..

Please be aware that enabling logging will affect the performance.

Timing

Timing of high-level operations can be enabled by passing the -DENABLE_TIMER=ON argument to cmake.

Execution

Once the project has been built, the executable can be used for learning a BN as follows:

./ramble -f test/coronary.csv -n 6 -m 1841 -d -o test/coronary.dot

For running in parallel, the following can be executed:

mpirun -np 8 ./ramble -f test/coronary.csv -n 6 -m 1841 -d -o test/coronary.dot

Please execute the following for more information on all the options that the executable accepts:

./ramble --help

Algorithms

The algorithm for learning BNs can be chosen with the -a option of the executable. The currently supported algorithms are listed below.

Local-to-Global Learning

The algorithms in this category first learn the local neighborhood of each variable separately and then combine these neighborhoods to get the complete network.
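As a rough illustration of the combining step, the Python sketch below merges per-variable neighborhoods into an undirected skeleton. It assumes a symmetry rule (an edge is kept only if both endpoints list each other), which is one common choice and not necessarily the rule ramBLe applies. The function name combine_neighborhoods and the toy PC sets are made up for this example.

```python
from itertools import combinations


def combine_neighborhoods(pc_sets):
    """Merge local PC sets into a set of undirected edges.

    Assumption for this sketch: an edge x-y is kept only when
    y is in PC(x) and x is in PC(y).
    """
    edges = set()
    for x, y in combinations(sorted(pc_sets), 2):
        if y in pc_sets[x] and x in pc_sets[y]:
            edges.add((x, y))
    return edges


# Hypothetical neighborhoods: A lists C, but C does not list A.
pc_sets = {"A": {"B", "C"}, "B": {"A"}, "C": set()}
skeleton = combine_neighborhoods(pc_sets)
print(skeleton)
```

In this toy input the one-sided A-C relation is dropped and only the A-B edge remains.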

Blanket Learning

This class of algorithms first finds the Markov blanket (MB) of each variable and then uses it to obtain the parents and the children (PC) of that variable.

- gs corresponds to the Grow-Shrink (GS) algorithm by Margaritis & Thrun.
- iamb corresponds to the Incremental Association MB (IAMB) algorithm by Tsamardinos et al.
- inter.iamb corresponds to the Interleaved Incremental Association MB (InterIAMB) algorithm by Tsamardinos et al.
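To make the blanket-first idea concrete, here is a minimal Python sketch of the grow and shrink phases of GS. It is not ramBLe's implementation: the conditional independence test is replaced by a toy oracle for an assumed chain network A -> B -> C -> D, and the names chain_independent and grow_shrink are hypothetical. With real data, the oracle would be a statistical test.

```python
def chain_independent(x, y, given, order=("A", "B", "C", "D")):
    """Toy oracle for the assumed chain A -> B -> C -> D.

    x and y are independent given `given` exactly when some variable
    lying strictly between them in the chain is conditioned on.
    """
    i, j = sorted((order.index(x), order.index(y)))
    return any(v in given for v in order[i + 1:j])


def grow_shrink(target, candidates, independent):
    """Return the Markov blanket of `target` found by a grow and a shrink phase."""
    blanket = []
    # Grow phase: add any variable still dependent on the target
    # given the current blanket, until nothing more can be added.
    changed = True
    while changed:
        changed = False
        for y in candidates:
            if y not in blanket and not independent(target, y, blanket):
                blanket.append(y)
                changed = True
    # Shrink phase: remove variables that become independent of the
    # target once the rest of the blanket is conditioned on.
    for y in list(blanket):
        rest = [v for v in blanket if v != y]
        if independent(target, y, rest):
            blanket.remove(y)
    return set(blanket)


blanket_of_b = grow_shrink("B", ["D", "A", "C"], chain_independent)
print(blanket_of_b)
```

The candidate order is chosen so that D enters the blanket during the grow phase and is removed again during the shrink phase, leaving A and C as the blanket of B.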

Direct Learning

This class of algorithms directly finds the PC sets of nodes.

- mmpc corresponds to the Max-Min PC (MMPC) algorithm by Tsamardinos et al. and corrected by Pena et al.
- si.hiton.pc corresponds to the Semi-interleaved HITON-PC algorithm by Aliferis et al.
- hiton (sequential only) corresponds to the HITON-PC algorithm by Aliferis et al. and corrected by Pena et al.
- getpc (sequential only) corresponds to the Get PC algorithm by Pena et al.

Global Learning

This class of algorithms learns the network directly by iteratively eliminating edges between variables that are found to be independent.

- pc.stable corresponds to the PC-stable algorithm by Colombo et al.
- pc.stable.2 is an alternate parallel algorithm for PC-stable that learns the same network as pc.stable.
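The edge elimination idea can be sketched in Python as follows. This is a simplified illustration of the skeleton phase in the style of PC-stable, not ramBLe's parallel implementation: it starts from a complete graph, freezes the adjacency sets at the start of each level, and removes an edge as soon as some conditioning set of the current size makes its endpoints independent. Separating sets and edge orientation are left out. The oracle for an assumed chain A -> B -> C -> D and the names chain_independent and pc_stable_skeleton are hypothetical.

```python
from itertools import combinations


def chain_independent(x, y, given, order=("A", "B", "C", "D")):
    """Toy oracle for the assumed chain A -> B -> C -> D."""
    i, j = sorted((order.index(x), order.index(y)))
    return any(v in given for v in order[i + 1:j])


def pc_stable_skeleton(variables, independent):
    """Return the undirected edges left after level-wise edge elimination."""
    # Start from the complete undirected graph.
    adj = {v: set(variables) - {v} for v in variables}
    level = 0
    # Continue while some variable has enough neighbors to form a
    # conditioning set of the current size.
    while any(len(adj[x]) - 1 >= level for x in variables):
        # Freeze the adjacency sets so that removals made at this level
        # do not change which conditioning sets are tried.
        frozen = {v: set(adj[v]) for v in variables}
        for x in variables:
            for y in sorted(frozen[x]):
                if y not in adj[x]:
                    continue  # edge already removed at this level
                for subset in combinations(sorted(frozen[x] - {y}), level):
                    if independent(x, y, subset):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        level += 1
    return {tuple(sorted((x, y))) for x in variables for y in adj[x]}


edges = pc_stable_skeleton(["A", "B", "C", "D"], chain_independent)
print(sorted(edges))
```

For the assumed chain, only the edges A-B, B-C, and C-D survive.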

Publication

Ankit Srivastava, Sriram Chockalingam, and Srinivas Aluru. "A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery." In 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), IEEE Computer Society, 2020.

The experiments reported in the publication can be reproduced using EXPERIMENTS.md.

Licensing

Our code is licensed under the Apache License 2.0 (see LICENSE).
