ramBLe (A Parallel Framework for Bayesian Learning) supports multiple constraint-based algorithms for structure learning from data in parallel.
- gcc (with C++14 support) is used for compiling the project. This project has been tested only on the Linux platform, using gcc version 9.2.0.
- Boost libraries are used for parsing the command line options, logging, and a few other purposes. Tested with version 1.70.0.
- MPI is used for execution in parallel. Tested with MVAPICH2 version 2.3.3.
- CMake is required for building the project. Tested with version 3.29.
- The following repositories are used as submodules:
- Google Test (optional) framework is used for unit testing in this project. If this dependency is not satisfied, then the unit tests are not built. See the relevant section in Building for more information. Tested with version 1.10.0.
After the dependencies have been installed, the project can be built in a build directory as follows:
mkdir build
cd build
cmake ..
make
The cmake command searches the default paths for the dependencies and configures the build.
The make command builds the executable named ramble, which can be used for constraint-based structure learning.
By default, all the paths listed in the CPATH and LIBRARY_PATH environment variables are used as include paths and library paths, respectively.
The unit tests are not built by default. cmake can be configured as follows for building the tests:
cmake -DENABLE_TESTING=ON ..
For building the debug version of the executable, cmake can be run as follows:
cmake -DCMAKE_BUILD_TYPE=Debug ..
By default, logging is disabled in the release build and enabled in the debug build. In order to change the default behavior, cmake can be configured as follows:
cmake -DENABLE_LOGGING=ON ..
Please be aware that enabling logging will reduce performance.
Timing of high-level operations can be enabled by passing the -DENABLE_TIMER=ON argument to cmake.
Once the project has been built, the executable can be used for learning a Bayesian network (BN) as follows:
./ramble -f test/coronary.csv -n 6 -m 1841 -d -o test/coronary.dot
For running in parallel, the following can be executed:
mpirun -np 8 ./ramble -f test/coronary.csv -n 6 -m 1841 -d -o test/coronary.dot
Please execute the following for more information on all the options that the executable accepts:
./ramble --help
The algorithm for learning BNs can be chosen by passing its name to the executable with the -a option. The currently supported algorithms are listed below.
The algorithms in this category first learn the local neighborhood of each variable separately and then combine these neighborhoods to get the complete network.
This class of algorithms first finds the Markov blanket (MB) of each variable and then uses it to obtain the variable's parents and children (PC).
- gs corresponds to the Grow-Shrink (GS) algorithm by Margaritis & Thrun.
- iamb corresponds to the Incremental Association MB (IAMB) algorithm by Tsamardinos et al.
- inter.iamb corresponds to the Interleaved Incremental Association MB (InterIAMB) by Tsamardinos et al.
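To make the Markov-blanket-first idea concrete, here is a minimal Python sketch of the grow and shrink phases of GS. It is an illustration only and not ramBLe's C++ implementation: the names `grow_shrink_mb` and `is_independent` are hypothetical, the conditional independence test is assumed to be supplied by the caller, and the candidates are scanned in a fixed order rather than with the ordering heuristic of the original algorithm.

```python
def grow_shrink_mb(target, variables, is_independent):
    """Sketch of Grow-Shrink for one variable.

    is_independent(x, y, z) is a hypothetical callback that returns True
    when x and y are judged independent given the set z.
    """
    mb = []

    # Grow phase: keep adding variables that are dependent on the target
    # given the current blanket, until a full pass adds nothing.
    changed = True
    while changed:
        changed = False
        for v in variables:
            if v == target or v in mb:
                continue
            if not is_independent(target, v, set(mb)):
                mb.append(v)
                changed = True

    # Shrink phase: drop false positives, i.e. variables that become
    # independent of the target given the rest of the blanket.
    for v in list(mb):
        rest = set(mb) - {v}
        if is_independent(target, v, rest):
            mb.remove(v)

    return set(mb)


if __name__ == "__main__":
    # Toy oracle for the chain A - B - C: A and C are independent given B.
    def chain_oracle(x, y, z):
        return {x, y} == {"A", "C"} and "B" in z

    print(sorted(grow_shrink_mb("A", ["A", "C", "B"], chain_oracle)))  # ['B']
```

In the toy run, C is first added to the blanket of A because B has not been seen yet, and the shrink phase then removes it.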
This class of algorithms directly finds the PC sets of nodes.
- mmpc corresponds to the Max-Min PC (MMPC) algorithm by Tsamardinos et al. and corrected by Pena et al.
- si.hiton.pc corresponds to the Semi-interleaved HITON-PC algorithm by Aliferis et al.
- hiton (sequential only) corresponds to the HITON-PC algorithm by Aliferis et al. and corrected by Pena et al.
- getpc (sequential only) corresponds to the Get PC algorithm by Pena et al.
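Once a PC set is available for every variable, the local neighborhoods have to be combined into one network. A common way to do this, sketched below in Python, is a symmetry check that keeps the edge X - Y only if Y is in PC(X) and X is in PC(Y). This is an assumption made for illustration and may differ from what ramble does internally; the function name `skeleton_from_pc` is hypothetical.

```python
def skeleton_from_pc(pc_sets):
    """Combine per-variable PC sets into an undirected skeleton.

    pc_sets maps each variable to the set of its learned parents and
    children. An edge is kept only if both endpoints list each other.
    """
    edges = set()
    for x, neighbors in pc_sets.items():
        for y in neighbors:
            if x in pc_sets.get(y, set()):
                edges.add(frozenset((x, y)))
    return edges


if __name__ == "__main__":
    # C was wrongly placed in PC(A); PC(C) does not contain A, so the
    # asymmetric edge A - C is dropped.
    pc = {"A": {"B", "C"}, "B": {"A"}, "C": set()}
    print(skeleton_from_pc(pc) == {frozenset(("A", "B"))})  # True
```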
This class of algorithms learns the network directly by iteratively eliminating edges between variables which are found to be independent.
- pc.stable corresponds to the PC-stable algorithm by Colombo et al.
- pc.stable.2 is an alternate parallel algorithm for PC-stable that learns the same network as pc.stable.
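The edge-elimination idea can be sketched in a few lines of Python. The sketch below covers only the skeleton phase of a PC-stable style search (no edge orientation), runs sequentially, and is not the ramble implementation. The names `pc_stable_skeleton` and `is_independent` are hypothetical, and the conditional independence test is assumed to be provided by the caller.

```python
from itertools import combinations


def pc_stable_skeleton(variables, is_independent):
    """Skeleton phase of a PC-stable style search.

    Starts from the complete graph and removes the edge x - y as soon as
    some subset z of x's neighbors makes x and y independent. Conditioning
    sets are drawn from a snapshot of the adjacency sets taken at the start
    of each level, which makes the result independent of variable order.
    """
    adj = {v: set(variables) - {v} for v in variables}
    size = 0
    # Continue while some variable still has enough neighbors to form a
    # conditioning set of the current size.
    while any(len(adj[x]) - 1 >= size for x in adj):
        snapshot = {v: set(n) for v, n in adj.items()}
        for x in variables:
            for y in sorted(snapshot[x]):
                if y not in adj[x]:
                    continue  # edge already removed at this level
                for z in combinations(sorted(snapshot[x] - {y}), size):
                    if is_independent(x, y, set(z)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        size += 1
    return {frozenset((x, y)) for x in adj for y in adj[x]}


if __name__ == "__main__":
    # Toy oracle for the chain A - B - C: A and C are independent given B.
    def chain_oracle(x, y, z):
        return {x, y} == {"A", "C"} and "B" in z

    edges = pc_stable_skeleton(["A", "B", "C"], chain_oracle)
    print(sorted(sorted(e) for e in edges))  # [['A', 'B'], ['B', 'C']]
```

The independence tests within one level do not depend on each other, since all conditioning sets come from the snapshot. That independence is what leaves room for running the tests in parallel.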
The experiments reported in the publication can be reproduced using EXPERIMENTS.md.
Our code is licensed under the Apache License 2.0 (see LICENSE).