ramBLe - A Parallel Framework for Bayesian Learning

ramBLe (A Parallel Framework for Bayesian Learning) supports multiple constraint-based algorithms for structure learning from data in parallel.

Requirements

  • gcc (with C++14 support) is used for compiling the project.
    The project has been tested only on Linux, using gcc version 9.2.0.
  • Boost libraries are used for parsing the command line options, logging, and a few other purposes.
    Tested with version 1.70.0.
  • MPI is used for execution in parallel.
    Tested with MVAPICH2 version 2.3.3.
  • CMake is required for building the project.
    Tested with version 3.29.
  • The following repositories are used as submodules:
    • BN Utils contains common utilities for BN learning in parallel and scripts for post-processing.
    • mxx is used as a C++ wrapper for MPI.
    • Graph API is used as a lightweight wrapper around Boost.Graph.
    • C++ Utils are used for logging and timing.
  • The Google Test framework (optional) is used for unit testing in this project.
    If this dependency is not satisfied, the unit tests are not built. See the relevant section in Building for more information.
    Tested with version 1.10.0.

Building

After the dependencies have been installed, the project can be built in a build directory as follows:

mkdir build
cd build
cmake ..
make

The cmake command searches the default paths for the dependencies and configures the build. The make command builds the executable named ramble, which can be used for constraint-based structure learning.
By default, all paths listed in the CPATH and LIBRARY_PATH environment variables are used as include paths and library paths, respectively.

Unit Tests

The unit tests are not built by default. cmake can be configured as follows for building the tests:

cmake -DENABLE_TESTING=ON ..

Debug

For building the debug version of the executable, cmake can be run as follows:

cmake -DCMAKE_BUILD_TYPE=Debug ..

Logging

By default, logging is disabled in the release build and enabled in the debug build. In order to change the default behavior, cmake can be configured as follows:

cmake -DENABLE_LOGGING=ON ..

Please be aware that enabling logging degrades performance.

Timing

Timing of high-level operations can be enabled by passing the -DENABLE_TIMER=ON argument to cmake.

Execution

Once the project has been built, the executable can be used for learning a Bayesian network (BN) as follows:

./ramble -f test/coronary.csv -n 6 -m 1841 -d -o test/coronary.dot

For running in parallel, the following can be executed:

mpirun -np 8 ./ramble -f test/coronary.csv -n 6 -m 1841 -d -o test/coronary.dot
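
When comparing runs at several process counts, it can help to generate the command line programmatically. The following is a minimal sketch, not part of ramBLe: the helper name parallel_command is hypothetical, and the ramble arguments are copied verbatim from the commands above without interpreting them.

```python
import shlex

# Arguments copied verbatim from the example commands above.
# Their meaning is documented by ./ramble --help.
RAMBLE_ARGS = [
    "./ramble", "-f", "test/coronary.csv",
    "-n", "6", "-m", "1841", "-d",
    "-o", "test/coronary.dot",
]


def parallel_command(num_procs):
    """Return the mpirun command (as an argument list) for num_procs processes."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    return ["mpirun", "-np", str(num_procs)] + RAMBLE_ARGS


# Print one shell-ready command line per process count.
for procs in (1, 2, 4, 8):
    print(shlex.join(parallel_command(procs)))
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the build directory to launch the run.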

Please execute the following for more information on all the options that the executable accepts:

./ramble --help

Algorithms

The algorithm for learning BNs can be chosen by passing its name to the executable with the -a option. The currently supported algorithms are listed below.

Local-to-Global Learning

The algorithms in this category first learn the local neighborhood of each variable separately and then combine these neighborhoods to get the complete network.
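
The combination step can be illustrated with a small sketch. It assumes that each variable's neighborhood is available as a set and that an edge is kept only when both endpoints list each other (a symmetry check often used in the literature); the function name combine_neighborhoods is hypothetical, and this is not necessarily how ramBLe implements the step.

```python
def combine_neighborhoods(pc_sets):
    """Build an undirected skeleton from per-variable neighborhoods.

    pc_sets maps each variable to the set of its learned neighbors.
    Assumption: an edge x - y is kept only if y is in the set of x
    AND x is in the set of y.
    """
    edges = set()
    for x, neighbors in pc_sets.items():
        for y in neighbors:
            if x != y and x in pc_sets.get(y, set()):
                edges.add(frozenset((x, y)))
    return edges


# Toy input: A lists C as a neighbor, but C does not list A,
# so only the edge A - B survives the symmetry check.
pc_sets = {"A": {"B", "C"}, "B": {"A"}, "C": set()}
skeleton = combine_neighborhoods(pc_sets)
print(sorted(tuple(sorted(edge)) for edge in skeleton))  # [('A', 'B')]
```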

Blanket Learning

This class of algorithms first finds the Markov blanket (MB) of each variable and then derives the variable's parents and children (PC) from it.
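
As a rough illustration of the second step, the sketch below filters a Markov blanket down to a PC set. It uses one common rule from the literature: a blanket member y is kept as a neighbor of x only if x and y remain dependent given every subset of the smaller of MB(x) \ {y} and MB(y) \ {x}. The function names and the independence oracle are hypothetical stand-ins for a statistical test on data, and the rule is an assumption rather than a description of ramBLe's code.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items, smallest first."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def pc_from_blanket(x, blankets, independent):
    """Return the members of blankets[x] that pass the neighbor check.

    independent(x, y, s) is a caller-supplied conditional independence
    test (hypothetical here; in practice a statistical test on data).
    """
    pc = set()
    for y in blankets[x]:
        bx = blankets[x] - {y}
        by = blankets[y] - {x}
        smaller = bx if len(bx) <= len(by) else by
        if not any(independent(x, y, s) for s in subsets(smaller)):
            pc.add(y)
    return pc


# Toy network A -> C <- B: A and B are spouses, so each is in the
# other's blanket, but they are independent given the empty set.
blankets = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}


def oracle(x, y, s):
    return {x, y} == {"A", "B"} and "C" not in s


print(sorted(pc_from_blanket("A", blankets, oracle)))  # ['C']
```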

Direct Learning

This class of algorithms directly finds the PC sets of nodes.

  • mmpc corresponds to the Max-Min PC (MMPC) algorithm by Tsamardinos et al. and corrected by Pena et al.
  • si.hiton.pc corresponds to the Semi-interleaved HITON-PC algorithm by Aliferis et al.
  • hiton (sequential only) corresponds to the HITON-PC algorithm by Aliferis et al. and corrected by Pena et al.
  • getpc (sequential only) corresponds to the Get PC algorithm by Pena et al.

Global Learning

This class of algorithms learns the network directly by iteratively eliminating edges between variables that are found to be independent.

  • pc.stable corresponds to the PC-stable algorithm by Colombo et al.
  • pc.stable.2 is an alternate parallel algorithm for PC-stable that learns the same network as pc.stable.
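
The edge-elimination idea can be sketched sequentially as follows. This is a simplified illustration of the skeleton phase of PC-stable as described in the literature, not ramBLe's parallel implementation: it starts from a complete graph, freezes the adjacency sets at the start of each level, and removes an edge once a separating set of the current size is found. The independence oracle and the function name are hypothetical, and edge orientation is omitted.

```python
from itertools import combinations


def pc_stable_skeleton(variables, independent):
    """Return the undirected skeleton as a set of frozenset edges.

    independent(x, y, s) is a caller-supplied conditional independence
    test (hypothetical here; in practice a statistical test on data).
    """
    adj = {v: set(variables) - {v} for v in variables}
    level = 0
    # Continue while some variable has enough neighbors to form a
    # conditioning set of the current size.
    while any(len(adj[x]) > level for x in variables):
        # Snapshot taken once per level, so removals made during the
        # level do not change which conditioning sets are tried.
        frozen = {v: set(neighbors) for v, neighbors in adj.items()}
        for x in sorted(variables):
            for y in sorted(frozen[x]):
                if y not in adj[x]:
                    continue  # edge already removed at this level
                candidates = sorted(frozen[x] - {y})
                for s in combinations(candidates, level):
                    if independent(x, y, set(s)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        level += 1
    return {frozenset((x, y)) for x in adj for y in adj[x]}


# Toy chain A - B - C: A and C are independent only given B.
def oracle(x, y, s):
    return {x, y} == {"A", "C"} and "B" in s


edges = pc_stable_skeleton(["A", "B", "C"], oracle)
print(sorted(tuple(sorted(e)) for e in edges))  # [('A', 'B'), ('B', 'C')]
```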

Publication

Ankit Srivastava, Sriram Chockalingam, and Srinivas Aluru. "A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery." In 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), IEEE Computer Society, 2020.

The experiments reported in the publication can be reproduced using EXPERIMENTS.md.

Licensing

Our code is licensed under the Apache License 2.0 (see LICENSE).
