SuShiE🍣

SuShiE (Sum of Shared Single Effect) is a Python package for multiancestry SNP fine-mapping, estimating effect size correlations across ancestries, and computing ancestry-specific prediction weights using either individual-level or summary-level data for molecular or complex traits.

- We detest any use of our software or scientific outcomes to promote racial discrimination.

SuShiE is described in

Improved multiancestry fine-mapping identifies cis-regulatory variants underlying molecular traits and disease risk.

Zeyun Lu, Xinran Wang, Matthew Carr, Artem Kim, Steven Gazal, Pejman Mohammadi, Lang Wu, James Pirruccello, Linda Kachuri, Alexander Gusev, Nicholas Mancuso.

Nature Genetics. July, 2025.

Check here for full documentation.

Installation

Before installation, we highly recommend creating a new environment using conda so that it does not affect the software versions of your other projects. For example, use the following command:
```
conda create -n env-sushie python=3.8
```
We currently support only Python 3.8+.
If you are using a Mac with an Apple M1 or newer chip, you should first install the cbgen package and any other required packages from conda-forge to ensure compatibility (see this link for a previous issue). One easy workaround is to initialize your conda using miniforge. On most HPC systems, this is usually not necessary.
```
conda install -c conda-forge cbgen
```

Finally, download the latest repository and install it with pip:

git clone https://github.com/mancusolab/sushie.git
cd sushie
pip install .
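
After installing, a quick check of the environment can save some debugging time. The snippet below is a minimal sketch: `check_environment` is a hypothetical helper (not part of SuShiE) that only tests whether the interpreter meets the Python 3.8+ requirement stated above and whether a package named `sushie` can be found, without importing it.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in this README


def check_environment(package: str = "sushie") -> dict:
    """Report whether the interpreter and the package look usable.

    Hypothetical helper for illustration; it does not import the package,
    it only asks the import system whether the name can be located.
    """
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "package_found": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    print(check_environment())
```

If `package_found` is `False`, the most likely cause is that pip installed into a different environment than the one currently active.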

Get Started with Example

SuShiE software is very easy to use:

For fine-mapping using individual-level data:

cd ./data/
sushie finemap --pheno EUR.pheno AFR.pheno --vcf vcf/EUR.vcf vcf/AFR.vcf --covar EUR.covar AFR.covar --output ./test_result

For fine-mapping using summary-level data:

cd ./data/
sushie finemap --summary --gwas EUR.gwas AFR.gwas --vcf vcf/EUR.vcf vcf/AFR.vcf --sample-size 489 639 --gwas-header chrom snp pos a1 a0 z --output ./test_result
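
If you drive SuShiE from a Python pipeline, the two commands above can be assembled programmatically. The following is a sketch only: `build_finemap_command` is a hypothetical helper, it uses just the flags that appear in the two examples, and its "one VCF per ancestry" check mirrors those examples rather than every input layout SuShiE accepts.

```python
import shlex
from typing import List, Sequence


def build_finemap_command(
    vcfs: Sequence[str],
    output: str,
    phenos: Sequence[str] = (),
    covars: Sequence[str] = (),
    gwas: Sequence[str] = (),
    sample_sizes: Sequence[int] = (),
    gwas_header: Sequence[str] = (),
) -> List[str]:
    """Assemble a `sushie finemap` argument list (hypothetical helper)."""
    use_summary = len(gwas) > 0
    inputs = gwas if use_summary else phenos
    if not inputs:
        raise ValueError("provide phenotype files or GWAS files")
    if len(vcfs) != len(inputs):
        raise ValueError("this sketch expects one VCF per ancestry")

    cmd = ["sushie", "finemap"]
    if use_summary:
        if len(sample_sizes) != len(gwas):
            raise ValueError("need one sample size per GWAS file")
        cmd += ["--summary", "--gwas", *gwas, "--vcf", *vcfs]
        cmd += ["--sample-size", *(str(n) for n in sample_sizes)]
        if gwas_header:
            cmd += ["--gwas-header", *gwas_header]
    else:
        cmd += ["--pheno", *phenos, "--vcf", *vcfs]
        if covars:
            cmd += ["--covar", *covars]
    return cmd + ["--output", output]


individual_cmd = build_finemap_command(
    vcfs=["vcf/EUR.vcf", "vcf/AFR.vcf"],
    output="./test_result",
    phenos=["EUR.pheno", "AFR.pheno"],
    covars=["EUR.covar", "AFR.covar"],
)
summary_cmd = build_finemap_command(
    vcfs=["vcf/EUR.vcf", "vcf/AFR.vcf"],
    output="./test_result",
    gwas=["EUR.gwas", "AFR.gwas"],
    sample_sizes=[489, 639],
    gwas_header=["chrom", "snp", "pos", "a1", "a0", "z"],
)
print(shlex.join(individual_cmd))
print(shlex.join(summary_cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True, cwd="data")`, which avoids shell quoting issues with file names.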

It can perform:

SuShiE: multi-ancestry fine-mapping accounting for ancestral correlation
Single-ancestry SuSiE (Sum of Single Effect)
Independent SuShiE: multi-ancestry SuShiE without accounting for correlation
Meta-SuSiE: single-ancestry SuSiE followed by meta-analysis
Mega-SuSiE: single-ancestry SuSiE on row-wise stacked data across ancestries (individual-level data only)
cis-molQTL effect size correlation estimation
cis-SNP heritability estimation (individual-level data only)
Cross-validation for SuShiE prediction weights (individual-level data only)
Convert prediction results to FUSION format so that they can be used in TWAS
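
To make the "row-wise stacked data" behind Mega-SuSiE in the list above concrete, here is a minimal NumPy sketch of the stacking step alone. It is not SuShiE's implementation: `stack_for_mega` is a hypothetical helper, the data are simulated, and centering within each ancestry before stacking is an assumption made for this illustration.

```python
import numpy as np


def stack_for_mega(Xs, ys):
    """Center each ancestry's data, then stack rows across ancestries.

    Hypothetical helper; the per-ancestry centering is an assumption.
    """
    n_snps = {X.shape[1] for X in Xs}
    if len(n_snps) != 1:
        raise ValueError("all ancestries must share the same SNP columns")
    X_centered = [X - X.mean(axis=0) for X in Xs]
    y_centered = [y - y.mean() for y in ys]
    return np.vstack(X_centered), np.concatenate(y_centered)


# Simulated genotype dosages (0/1/2) and phenotypes for two ancestries.
rng = np.random.default_rng(0)
Xs = [rng.binomial(2, 0.3, size=(n, 5)).astype(float) for n in (40, 60)]
ys = [rng.normal(size=n) for n in (40, 60)]

X_mega, y_mega = stack_for_mega(Xs, ys)
print(X_mega.shape, y_mega.shape)  # (100, 5) (100,)
```

The stacked matrices can then be treated as a single data set, which is why this mode needs individual-level data.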

See here for more details on how to use SuShiE.

If you want to call the SuShiE inference functions directly from Python, you can use the following code as an example:

import numpy as np
from sushie.infer import infer_sushie
# Xs holds the genotype data: a list of NumPy arrays, one per ancestry.
# ys holds the phenotype data: a list of NumPy arrays, one per ancestry.
infer_sushie(Xs=X, ys=y)
# Or use summary-level data.
# NOTE: the module path below is an assumption; check the documentation for where infer_sushie_ss lives.
from sushie.infer_ss import infer_sushie_ss
# lds holds the LD data: a list of p-by-p NumPy arrays, one per ancestry.
# zs holds the GWAS data: a list of NumPy arrays, one per ancestry.
infer_sushie_ss(lds=LD, zs=GWAS, ns=np.array([100, 100]))

You can customize this function with your own ideas!
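
To illustrate the input shapes described in the comments above, the sketch below simulates two ancestries and derives summary-level inputs from the individual-level data. Everything in it is an assumption for illustration: the sample sizes, SNP count, and effect size are made up, and `to_summary` is a hypothetical helper that uses ordinary marginal regression t-statistics as z-scores. It does not call SuShiE itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_ancestry = (200, 150)  # made-up sample sizes
p = 20                       # made-up number of SNPs
causal = 3                   # index of the simulated causal SNP

# Individual-level inputs: one (n x p) genotype array and one (n,) phenotype
# array per ancestry, collected in lists.
Xs, ys = [], []
for n in n_per_ancestry:
    X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
    y = 0.5 * X[:, causal] + rng.normal(size=n)
    Xs.append(X)
    ys.append(y)


def to_summary(X, y):
    """Return (LD, z) computed from individual-level data (hypothetical helper)."""
    n = X.shape[0]
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no monomorphic SNPs
    yc = (y - y.mean()) / y.std()
    ld = Xc.T @ Xc / n                         # p x p correlation matrix
    r = Xc.T @ yc / n                          # marginal SNP-trait correlations
    z = r * np.sqrt((n - 2) / (1 - r**2))      # regression t-statistics
    return ld, z


# Summary-level inputs: one (p x p) LD matrix and one (p,) z vector per ancestry.
summaries = [to_summary(X, y) for X, y in zip(Xs, ys)]
lds = [ld for ld, _ in summaries]
zs = [z for _, z in summaries]
ns = np.array(n_per_ancestry)

# With SuShiE installed, these lists match the call pattern shown above:
# infer_sushie(Xs=Xs, ys=ys)
# infer_sushie_ss(lds=lds, zs=zs, ns=ns)
print([X.shape for X in Xs], [ld.shape for ld in lds], ns)
```

Keeping the lists in the same ancestry order across `Xs`, `ys`, `lds`, `zs`, and `ns` is the main thing to get right when preparing real data.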

Notes

SuShiE currently only supports continuous phenotype fine-mapping for individual-level data.

Version History

Version	Description
0.1	Initial Release
0.11	Fix a bug in the OLS computation of adjusted r squared.
0.12	Update the io.corr function so that it reports all correlation results regardless of whether the CS is pruned.
0.13	Add `--keep` option so that users can specify a file listing the subject IDs on which SuShiE will run. Add `--ancestry_index` option so that users can specify a file containing the ancestry index for fine-mapping; with this, users can supply a single phenotype, genotype, and covariate file that contains all subjects across ancestries. Implement padding to speed up inference. Record the ELBO at each iteration; it can be accessed in the `infer.SuShiEResult` object. The alphas table now outputs the average purity and KL divergence for each `L`. Change `--kl_threshold` to `--divergence`. Add `--maf` option to remove SNPs whose minor allele frequency is below the threshold within each ancestry. Add `--max_select` option to randomly select at most the given number of SNPs when computing purity, avoiding unnecessary memory use. Add a QC function to remove duplicated SNPs.
0.14	Remove KL-divergence pruning. Enhance the command-line appearance and improve the contents of the output files. Fix small bugs in the multivariate KL.
0.15	Fix several typos; add a sanity check when reading VCF genotype data by assigning gt_types==Unknown as NA; add preprint information.
0.16	Implement summary-level data inference. Add an option to remove ambiguous SNPs; fix several bugs and improve code quality.
0.17	Fix several bugs, add debug checkpoints, add chrom, start, and end filtering to individual-level fine-mapping, improve code quality, and update the readme for official publication.

Support

For any questions, comments, bug reports, and feature requests, please contact Zeyun Lu and Nicholas Mancuso, and open a new thread in the Issue Tracker.

Other Software

Feel free to use other software developed by the Mancuso Lab:

jaxQTL: a single-cell eQTL mapping tool using a highly efficient count-based model (i.e., negative binomial or Poisson).
MA-FOCUS: a Bayesian fine-mapping framework using TWAS statistics across multiple ancestries to identify the causal genes for complex traits.
SuSiE-PCA: a scalable Bayesian variable selection technique for sparse principal component analysis.
twas_sim: a Python package to simulate TWAS statistics.
FactorGo: a scalable variational factor analysis model that learns pleiotropic factors from GWAS summary statistics.
HAMSTA: a Python package to estimate heritability explained by local ancestry data from admixture mapping summary statistics.
Traceax: a Python library to perform stochastic trace estimation for linear operators.

This project has been set up using PyScaffold 4.1.1. For details and usage information on PyScaffold see https://pyscaffold.org/.
