SpaTrio is a computational tool based on optimal transport that aligns single-cell multi-omics data in space while preserving the spatial topology of the tissue section and the local geometry of each modality.
This toolkit is written in both R and Python. The core optimal transport algorithm is implemented in Python, while the initial data preparation and downstream multimodal analysis are written in R.
# We recommend using Anaconda, and then you can create a new environment.
# Create and activate Python environment
conda create -n spatrio python=3.8
conda activate spatrio
# Install requirements
cd SpaTrio-main
pip install -r requirements.txt
# Install spatrio
python setup.py build
python setup.py install
# Install R dependencies (run in an R session)
install.packages("doParallel")
BiocManager::install("ConsensusClusterPlus")
# Install SpaTrio package from local file
install.packages("SpaTrio_1.0.0.tar.gz", repos = NULL, type = "source")
SpaTrio takes formatted .csv files as input (they are read in with pandas).
- multi_rna.csv/spatial_rna.csv (The gene expression matrix of cells/spots)
| | Cell1 | ··· | Celln |
|---|---|---|---|
| Gene1 | 0 | ··· | 1 |
| ··· | ··· | ··· | ··· |
| Genem | 2 | ··· | 1 |
- multi_meta.csv/spatial_meta.csv (The meta information matrix of cells/spots)
| | id | type |
|---|---|---|
| Cell1 | Cell1 | A |
| ··· | ··· | ··· |
| Celln | Celln | B |
- emb.csv (The low-dimensional embedding matrix of cells)
| | emb1 | ··· | embk |
|---|---|---|---|
| Cell1 | 1.997 | ··· | -0.307 |
| ··· | ··· | ··· | ··· |
| Celln | 2.307 | ··· | 2.119 |
- pos.csv (The spatial location matrix of spots)
| | x | y |
|---|---|---|
| Cell1 | 0.28 | 10.65 |
| ··· | ··· | ··· |
| Celln | 5.98 | 2.16 |
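
As a minimal sketch of how the expression, meta, and embedding tables relate, the snippet below builds tiny in-memory stand-ins for these layouts and checks that their cell labels agree. The helper `check_inputs` and all values are illustrative assumptions and not part of SpaTrio; whether SpaTrio itself requires identical label ordering is not stated here, so the check is deliberately conservative. Real files would be loaded with `pd.read_csv(path, index_col=0)`.

```python
import io
import pandas as pd

# Toy stand-ins for the CSV layouts above (values are made up for illustration).
MULTI_RNA = """,Cell1,Cell2,Cell3
Gene1,0,3,1
Gene2,2,0,1
"""
MULTI_META = """,id,type
Cell1,Cell1,A
Cell2,Cell2,B
Cell3,Cell3,B
"""
EMB = """,emb1,emb2
Cell1,1.997,-0.307
Cell2,0.512,1.204
Cell3,2.307,2.119
"""

def read_table(text):
    # The first (unnamed) column holds the row labels, as in the tables above.
    return pd.read_csv(io.StringIO(text), index_col=0)

def check_inputs(rna, meta, emb):
    """Hypothetical helper: return a list of label mismatches (empty if consistent)."""
    problems = []
    cells = list(rna.columns)  # expression matrix is genes x cells
    if list(meta.index) != cells:
        problems.append("meta rows do not match expression columns")
    if list(emb.index) != cells:
        problems.append("embedding rows do not match expression columns")
    if not {"id", "type"}.issubset(meta.columns):
        problems.append("meta is missing the 'id' or 'type' column")
    return problems

rna = read_table(MULTI_RNA)
meta = read_table(MULTI_META)
emb = read_table(EMB)
print(check_inputs(rna, meta, emb))
```

Running a check like this before alignment can surface a transposed matrix or a dropped cell early, when it is still cheap to fix.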
Optionally, the number of cells in each spot can also be specified.
- expected_num.csv (The number of cells contained in each spot)
| | cell_num |
|---|---|
| Spot1 | 5 |
| ··· | ··· |
| Spotj | 2 |
In some simulated datasets, the number of cells of each cell type in each spot is given (ref_counts.csv). These data are converted to expected_num before use.
- ref_counts.csv (The number of cells of each cell type contained in each spot)
| | Celltype 1 | ··· | Celltype i |
|---|---|---|---|
| Spot1 | 0 | ··· | 2 |
| ··· | ··· | ··· | ··· |
| Spotj | 1 | ··· | 0 |
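
The conversion from ref_counts to expected_num can be sketched as follows. This assumes the conversion is a simple row sum (the total number of cells in a spot equals the sum of its per-cell-type counts); the function name `ref_counts_to_expected_num` and the toy values are hypothetical, and SpaTrio's own conversion may differ in detail.

```python
import pandas as pd

# Toy ref_counts table: rows are spots, columns are cell types (made-up values).
ref_counts = pd.DataFrame(
    {"Celltype 1": [0, 3, 1], "Celltype 2": [2, 0, 0], "Celltype 3": [2, 1, 0]},
    index=["Spot1", "Spot2", "Spot3"],
)

def ref_counts_to_expected_num(counts):
    """Assumed conversion: cells per spot = sum of the per-cell-type counts."""
    return counts.sum(axis=1).to_frame(name="cell_num")

expected_num = ref_counts_to_expected_num(ref_counts)
print(expected_num)
```

The result has one row per spot and a single `cell_num` column, matching the expected_num.csv layout.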
We have included two test datasets (demo1 and demo2) in the tutorial/data/ directory of this repository as examples of how to use SpaTrio to align cells to space.
Simulated data in the stripe pattern:
Simulated data in the ring pattern:
More importantly, we support directly calling the core functions written in Python from the R language to facilitate downstream analysis.
DBiT-seq mouse embryo datasets (Google Drive):
10x Visium+ADT mouse liver datasets (Google Drive):
We have applied SpaTrio to different tissues from multiple species. Here we give step-by-step tutorials for all application scenarios. The preprocessed datasets used in these tutorials can be downloaded from Google Drive.
- Using SpaTrio to reconstruct and analyze single-cell multi-modal data of mouse cerebral cortex
- Using SpaTrio to reconstruct and analyze single-cell multi-modal data of human steatosis liver
- Using SpaTrio to reconstruct and analyze single-cell multi-modal data of human breast cancer
Should you have any questions, please feel free to contact the author of the manuscript, Mr. Penghui Yang ([email protected]).
Penghui Yang, et al. Revealing spatial multimodal heterogeneity in tissues with SpaTrio, Cell Genomics, 2023, https://doi.org/10.1016/j.xgen.2023.100446.