Code related to the publication:
Bianchi D, Borza R, De Zan E, Huelsz-Prince G, Gregoricchio S, Dekker M, Fish A, Mazouzi A, Kroese LJ, Linder S, Hernandez-Quiles M, Vermeulen M, Celie PHN, Krimpenfort P, Song JY, Zwart W, Wessels L, Nijman SMB, Perrakis A, Brummelkamp TR. Zincore, an atypical coregulator, binds zinc finger transcription factors to control gene expression. Science. 2025 Jul 3;389(6755):eadv2861. doi: 10.1126/science.adv2861. Epub 2025 Jul 3. PMID: 40608935.
This repository consists of two main scripts, refined-peaks.py and motif-discovery.py, which cover the following analysis sections:
- Refined peaks: Standard peak callers for ChIP-seq datasets do not provide enough resolution to identify the small individual peaks found in Zincore ChIP-seq data. refined-peaks.py processes BAM files from WT samples to produce refined peaks consisting of 100 bp intervals centered on summits (i.e., local maxima) within MACS3-called peaks. Differential binding analysis between WT and KO samples is then performed based on the refined peaks.
- Motif discovery: motif-discovery.py takes the refined Zincore ChIP-seq peak regions (produced by refined-peaks.py) as input. It analyzes the DNA sequences in these regions to identify k-mers that are overrepresented relative to a control set of sequences. The control set is constructed by randomly sampling regions from the promoter regions of canonical protein-coding genes (spanning from 800 bp upstream to 200 bp downstream of annotated transcription start sites). These control regions are selected to explicitly avoid overlap with the filtered Zincore peak regions, and they are matched to the Zincore peak set in both the number of sequences and their length distribution. For each identified k-mer, an enrichment score is computed as the ratio of its frequency in the Zincore peak sequences to its frequency in the aggregate of Zincore and control sequences.
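
As a rough illustration of the two steps described above, here is a minimal Python sketch. It is not the code from refined-peaks.py or motif-discovery.py. The function names, the simple local-maximum rule, the forward-strand-only k-mer counting, and the reading of "frequency" as a raw occurrence count are all assumptions made for the example.

```python
from collections import Counter


def refined_intervals(coverage, peak_start, width=100):
    """Return width-bp intervals centered on local maxima of a coverage track.

    coverage: per-base read depth across one called peak (hypothetical input).
    peak_start: genomic coordinate of coverage[0].
    Assumption: a summit is any position higher than its left neighbour and
    at least as high as its right neighbour; the real script may differ.
    """
    half = width // 2
    intervals = []
    for i in range(1, len(coverage) - 1):
        if coverage[i] > coverage[i - 1] and coverage[i] >= coverage[i + 1]:
            summit = peak_start + i
            intervals.append((max(0, summit - half), summit + half))
    return intervals


def kmer_counts(seqs, k):
    """Count every overlapping k-mer on the forward strand of each sequence."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enrichment_scores(peak_seqs, control_seqs, k):
    """Score = count in peaks / (count in peaks + count in controls).

    Assumption: with a control set matched in number and length, a score
    near 0.5 means no enrichment and a score near 1.0 means the k-mer is
    found almost exclusively in the peak sequences.
    """
    peak = kmer_counts(peak_seqs, k)
    ctrl = kmer_counts(control_seqs, k)
    return {kmer: n / (n + ctrl[kmer]) for kmer, n in peak.items()}


if __name__ == "__main__":
    # Toy inputs, for illustration only.
    print(refined_intervals([0, 1, 3, 1, 0, 2, 5, 5, 2, 0], peak_start=1000, width=4))
    print(enrichment_scores(["ACGTACGT"], ["ACGAAAAA"], k=3))
```

Only k-mers that occur at least once in the peak sequences receive a score in this sketch, so the denominator is never zero.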
- Python 3.9.6
- Python libraries listed in the python_requirements.txt file
- R 4.1.2
- DiffBind 3.4.11
- ChIPseeker 1.30.3
- ensembldb 2.18.4
- AnnotationHub 3.2.2
- org.Hs.eg.db 3.14.0
- bedtools 2.29.2
- MACS3
Guizela Huelsz Prince
Danielle Bianchi
Lodewyk Wessels
Thijn Brummelkamp
The Netherlands Cancer Institute (NKI)