# multiVIB: A Unified Probabilistic Contrastive Learning Framework for Atlas-Scale Integration of Single-Cell Multi-Omics Data
multiVIB is a unified framework for integrating single-cell multi-omics datasets across diverse scenarios. Its model backbone consists of three parts: (1) a modality-specific linear translator, (2) a shared encoder, and (3) a shared projector.
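To make the three-part design concrete, here is a minimal sketch in PyTorch. Every class, attribute, and layer size below is a hypothetical illustration of the description above, not multiVIB's actual implementation:

```python
import torch
import torch.nn as nn

class BackboneSketch(nn.Module):
    """Hypothetical sketch of the three-part backbone described above.
    Names and layer sizes are illustrative, not multiVIB's actual code."""

    def __init__(self, input_dims: dict, hidden_dim: int = 256, latent_dim: int = 32):
        super().__init__()
        # (1) Modality-specific linear translators map each modality's raw
        # feature space (e.g. genes, ATAC peaks) into a common hidden space.
        self.translators = nn.ModuleDict(
            {mod: nn.Linear(dim, hidden_dim) for mod, dim in input_dims.items()}
        )
        # (2) A single encoder is shared by all modalities.
        self.encoder = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # (3) A shared projector maps latent codes into the space where the
        # contrastive objective is computed.
        self.projector = nn.Linear(latent_dim, latent_dim)

    def forward(self, x: torch.Tensor, modality: str):
        z = self.encoder(self.translators[modality](x))
        return z, self.projector(z)
```

Under this design, swapping in a new dataset or modality only changes which translator is used; the encoder and projector stay shared, which is what allows the training strategy, rather than the architecture, to vary across integration scenarios.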
Comprehensive brain cell atlases are essential for understanding neural functions and enabling translational insights. As single-cell technologies proliferate across experimental platforms, species, and modalities, these atlases must scale accordingly, calling for integration frameworks capable of aligning heterogeneous datasets without erasing biologically meaningful variations.
Existing tools typically focus on narrow integration scenarios, forcing researchers to assemble ad hoc workflows that often introduce artifacts. multiVIB addresses this limitation by providing a unified probabilistic contrastive learning framework that supports diverse single-cell integration tasks.
With the model backbone fixed, multiVIB adapts to different integration scenarios by altering only the training strategy, not the architecture. For horizontal integration, where no jointly profiled cells are available, datasets are anchored through shared features: multiVIB aligns cells by enforcing consistency across shared genomic signals while ensuring that technical covariates do not drive the alignment. For vertical integration, jointly profiled multi-omics data provide multiple modality views of each individual cell. These cells serve as direct biological anchors, allowing multiVIB to learn cross-modality correspondence without relying on engineered feature mappings. Finally, mosaic integration combines horizontal and vertical steps tailored to the pattern of modality overlap, as sketched below.
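As an illustration of how these scenarios differ only in how positive pairs are constructed, here is a generic InfoNCE-style contrastive objective. This is a sketch under our own assumptions, not multiVIB's actual loss: for vertical integration the paired rows would be two modality views of the same jointly profiled cell, while for horizontal integration they would come from shared-feature anchors across datasets.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(h_a: torch.Tensor, h_b: torch.Tensor, temperature: float = 0.1):
    """InfoNCE-style loss over paired projections h_a[i] <-> h_b[i].
    Illustrative only: how the pairs are formed is what distinguishes
    horizontal (shared-feature anchors) from vertical (jointly profiled
    cells) integration; mosaic integration mixes both pairings."""
    h_a = F.normalize(h_a, dim=-1)
    h_b = F.normalize(h_b, dim=-1)
    # Similarity of every cell in view A against every cell in view B.
    logits = h_a @ h_b.t() / temperature
    # Row i should match column i (its paired view); all others are negatives.
    targets = torch.arange(h_a.size(0), device=h_a.device)
    return F.cross_entropy(logits, targets)
```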
The multiVIB repository is organized as follows:

```
<repo_root>/
├─ multiVIB/            # multiVIB python package
└─ doc/                 # Package documentation
   └─ tutorial/
      └─ notebooks/     # Example jupyter notebooks
```
We suggest creating a new conda environment to run multiVIB:

```bash
conda create -n multiVIB python=3.10
conda activate multiVIB
git clone https://github.com/broadinstitute/multiVIB.git
cd multiVIB
pip install .
```
We provide end-to-end Jupyter notebooks demonstrating how to use multiVIB across common integration tasks:

- `01_vertical_integration_test.ipynb`: applies multiVIB to the conceptual experiment set up in Figure 2 of our manuscript.
- `02_multimodal_integration.ipynb`: integration of multi-omics mouse cortex datasets (RNA + ATAC).
- `03_cross_species_integration.ipynb`: cross-species integration of mammalian basal ganglia datasets, demonstrating preservation of species-specific variation.
If you use multiVIB in your research, please cite our preprint:
Yang Xu, Stephen Jordan Fleming, Brice Wang, Erin G Schoenbeck, Mehrtash Babadi, Bing-Xing Huo. multiVIB: A Unified Probabilistic Contrastive Learning Framework for Atlas-Scale Integration of Single-Cell Multi-Omics Data. bioRxiv, 2025.

