peakMerge is a lightweight Python tool to merge multiple peak calling experiments into:
- a set of consensus peaks
- a consensus-by-experiment sparse matrix
It supports narrowPeak files and a simple bed-like format, and can run in stranded or unstranded mode.
peakMerge is now integrated into MUFFIN, a broader suite of tools for functional sequencing data analysis:
- MUFFIN paper (PMC): https://pmc.ncbi.nlm.nih.gov/articles/PMC11091926/
- MUFFIN GitHub: https://github.com/pdelangen/Muffin
If you use MUFFIN or peakMerge, please cite: de Langen P, Ballester B. MUFFIN: a suite of tools for the analysis of functional sequencing data. NAR Genom Bioinform. 2024;6(2):lqae051. doi:10.1093/nargab/lqae051.
python peakMerge.py <genome.tsv> <peaks_folder_or_list> <bed|narrowPeak> <output_folder/>

Note: the CLI expects <output_folder/> to end with a trailing /.
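The trailing slash is easy to forget when the command is assembled by a script. The sketch below builds the argument list in the order shown above; `build_peakmerge_command` is a hypothetical helper, not part of peakMerge, and it assumes `peakMerge.py` sits in the working directory.

```python
import sys


def build_peakmerge_command(genome, peaks, file_format, output_folder):
    """Return an argv list for peakMerge.py (hypothetical helper, not part of the tool).

    `peaks` is either a folder path (str) or a list of peak file paths,
    which is joined into the comma-separated form the CLI accepts.
    """
    if file_format not in ("bed", "narrowPeak"):
        raise ValueError("file_format must be 'bed' or 'narrowPeak'")
    if not isinstance(peaks, str):
        peaks = ",".join(peaks)
    # The CLI expects the output folder to end with a trailing slash.
    if not output_folder.endswith("/"):
        output_folder += "/"
    return [sys.executable, "peakMerge.py", genome, peaks, file_format, output_folder]


cmd = build_peakmerge_command(
    "hg38.chrom.sizes.tsv", ["a.narrowPeak", "b.narrowPeak"], "narrowPeak", "out"
)
print(cmd[2:])
# ['hg38.chrom.sizes.tsv', 'a.narrowPeak,b.narrowPeak', 'narrowPeak', 'out/']
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to launch the merge.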
Example:
python peakMerge.py hg38.chrom.sizes.tsv peaks/ narrowPeak out/

peakMerge is a single script. Requirements:
- Python 3
- numpy
- scipy
- pandas
pip install numpy scipy pandas

- Genome sizes file (TSV, 2 columns): chrom and length.
- Peak files:
  - a folder containing peak files, or
  - a comma-separated list of peak file paths
- Formats:
  - narrowPeak (ENCODE, 10 columns; strand in column 6, summit offset in column 10)
  - bed-like (TSV with at least: chrom, start, end, strand, and summit coordinate)
If summit information is missing or unreliable, use --inferCenter.
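To make the narrowPeak column layout concrete, here is a minimal sketch that extracts the strand and an absolute summit position from one line. In narrowPeak, column 10 is a 0-based offset from the start coordinate, and -1 means no summit was called. The midpoint fallback used here is an assumption about what inferring the center means, not code taken from peakMerge, and `narrowpeak_summit` is a hypothetical name.

```python
def narrowpeak_summit(line):
    """Return (chrom, strand, summit) for one narrowPeak line.

    Column 10 holds the summit as a 0-based offset from the start (column 2);
    -1 means no summit was called, so the interval midpoint is used instead.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("narrowPeak lines have 10 tab-separated columns")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    strand = fields[5]
    offset = int(fields[9])
    if offset < 0:
        # Assumed fallback: use the centre of the interval.
        summit = (start + end) // 2
    else:
        summit = start + offset
    return chrom, strand, summit


with_summit = "chr1\t100\t300\tpeak1\t500\t.\t8.2\t12.1\t9.8\t40"
no_summit = "chr1\t100\t300\tpeak2\t500\t+\t8.2\t12.1\t9.8\t-1"
print(narrowpeak_summit(with_summit))  # ('chr1', '.', 140)
print(narrowpeak_summit(no_summit))    # ('chr1', '+', 200)
```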
Written to <output_folder/>:
- consensuses.bed: consensus peak intervals
- matrix.mtx: Matrix Market sparse matrix (consensuses x experiments)
- datasets.txt: experiment names (matrix column order)
- dataset_stats.txt: basic statistics
- command.sh: command line used
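As an illustration of how these files fit together, the sketch below reads a toy matrix.mtx and datasets.txt and counts how many consensus peaks each experiment supports. It assumes the matrix is stored in Matrix Market coordinate format with a general (non-symmetric) layout and that datasets.txt holds one name per line; the toy file contents are invented for the demo. The reader is a dependency-free stand-in: `scipy.io.mmread` loads the same kind of file into a SciPy sparse matrix.

```python
import os
import tempfile


def read_mtx_coordinate(path):
    """Read a Matrix Market coordinate file into (n_rows, n_cols, entries).

    `entries` maps 0-based (row, col) to the stored value. 'pattern' files
    carry no value column, so every stored entry is read as 1.
    Symmetric layouts are not expanded (assumed 'general').
    """
    entries = {}
    size = None
    with open(path) as handle:
        header = handle.readline().split()
        if len(header) < 5 or header[2] != "coordinate":
            raise ValueError("expected a Matrix Market coordinate header")
        is_pattern = header[3] == "pattern"
        for line in handle:
            if line.startswith("%") or not line.strip():
                continue  # skip comments and blank lines
            parts = line.split()
            if size is None:
                size = (int(parts[0]), int(parts[1]))  # rows, cols
                continue
            value = 1 if is_pattern else float(parts[2])
            entries[(int(parts[0]) - 1, int(parts[1]) - 1)] = value
    if size is None:
        raise ValueError("missing size line")
    return size[0], size[1], entries


# Toy output folder standing in for <output_folder/> (invented data).
with tempfile.TemporaryDirectory() as out:
    with open(os.path.join(out, "matrix.mtx"), "w") as f:
        f.write("%%MatrixMarket matrix coordinate integer general\n")
        f.write("3 2 3\n1 1 1\n2 2 1\n3 1 1\n")
    with open(os.path.join(out, "datasets.txt"), "w") as f:
        f.write("expA\nexpB\n")

    n_rows, n_cols, entries = read_mtx_coordinate(os.path.join(out, "matrix.mtx"))
    with open(os.path.join(out, "datasets.txt")) as f:
        names = [line.strip() for line in f if line.strip()]

# Rows are consensus peaks, columns follow the order of datasets.txt.
support = {name: 0 for name in names}
for (_, col), value in entries.items():
    if value:
        support[names[col]] += 1
print(support)  # {'expA': 2, 'expB': 1}
```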
--forceUnstranded
--inferCenter
--sigma <float|auto_1|auto_2>
--scoreMethod <binary|int>
--minOverlap <int>
from peakMerge import peakMerger
merger = peakMerger("hg38.chrom.sizes.tsv", outputPath="out/", scoreMethod="binary")
merger.mergePeaks(folderPath="peaks/", fileFormat="narrowPeak", sigma="auto_1")
merger.writePeaks()

Paper in preparation for the standalone tool.
The ReMap catalogues (2022, 2020, 2018, 2015) are released under the CC BY-NC 4.0 International license, while ReMapEnrich, remap-pipeline, and peakMerge are released under the GNU GPLv3 license.