Hello,
I am trying to run Wakhan in tumor-only mode after using Severus. This is Nanopore data with a mean coverage of ~30X; Clair3 and WhatsHap were used to create the ${tumor_phased_vcf} file.
I checked that the reference and all associated files use the same contig nomenclature, e.g. "1" and not "chr1".
I replaced --breakpoints ${SV_VCF} with --change-point-detection-for-cna, but I got the same error.
Would you have an explanation and a fix for this error?
Here is the command I'm running:
singularity exec --contain \
-B /mnt/ ${WAKHAN} \
python /Wakhan/wakhan.py --threads 12 \
--reference ${REF} \
--target-bam ${haplotaggued_tumor_bam} \
--tumor-phased-vcf ${tumor_phased_vcf} \
--genome-name ${SAMPLE_NAME} \
--out-dir-plots ${OUTDIR2} \
--contigs 1 \
--breakpoints ${SV_VCF} \
--loh-enable
Here is the log with the error I'm encountering:
logs
[2025-09-23 09:06:29] INFO: Starting Wakhan 0.2.0
[2025-09-23 09:06:29] INFO: Cmd: /Wakhan/wakhan.py --threads 12 --reference /mnt/beegfs02/database/bioinfo/Index_DB/Fasta/Ensembl/GRCh38.109/homo_sapiens.GRCh38.109.fasta --target-bam /mnt/beegfs02/scratch/t_gutman/AML_merge_output_haplotagged.bam --tumor-phased-vcf /mnt/beegfs02/scratch/t_gutman/AML_merge_output_phased.vcf.gz --genome-name AML --out-dir-plots /mnt/beegfs02/scratch/t_gutman/wakhan/AML --contigs 1-22,X,Y --breakpoints /mnt/beegfs02/scratch/t_gutman/severus/AML/somatic_SVs/severus_somatic.vcf --loh-enable
[2025-09-23 09:06:29] INFO: Python version: 3.8.0 | packaged by conda-forge | (default, Nov 22 2019, 19:11:38)
[GCC 7.3.0]
[2025-09-23 09:06:29] INFO: Starting hapcorrect() module...
[2025-09-23 09:06:30] INFO: Parsing reads from /mnt/beegfs02/scratch/t_gutman/AML_merge_output_haplotagged.bam
[2025-09-23 09:20:32] INFO: Parsed 22801547 segments
[2025-09-23 09:20:32] INFO: Computing coverage histogram
[2025-09-23 09:23:17] INFO: Writing tumor coverage for bins
[2025-09-23 09:23:17] INFO: Parsing phaseblocks information
[2025-09-23 09:23:17] INFO: bcftools -> Query for phasesets and GT, DP, VAF feilds by creating a CSV file
/Wakhan/src/hapcorrect/src/utils.py:62: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
dataframe = pd.read_csv(path, sep=sept, names=names)
[2025-09-23 09:23:44] INFO: Computing coverage for bins
[2025-09-23 09:23:44] INFO: bcftools -> Query for het SNPs and creating a /mnt/beegfs02/scratch/t_gutman/wakhan/AML/data_phasing/AML_merge_output_phased.vcf_het_snps.csv CSV file
[2025-09-23 09:23:54] INFO: SNPs frequency -> CSV to dataframe conversion for heterozygous SNPs
/Wakhan/src/hapcorrect/src/utils.py:62: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
dataframe = pd.read_csv(path, sep=sept, names=names)
[2025-09-23 09:24:07] INFO: SNPs frequency -> Computing SNPs frequency from tumor BAM
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[mpileup] 1 samples in 1 input files
[2025-09-23 10:10:13] INFO: SNPs frequency -> Computing ACGTs frequencies for heterozygous SNPs
/Wakhan/src/hapcorrect/src/utils.py:62: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
dataframe = pd.read_csv(path, sep=sept, names=names)
/Wakhan/src/hapcorrect/src/utils.py:62: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
dataframe = pd.read_csv(path, sep=sept, names=names)
[2025-09-23 10:12:18] INFO: bcftools -> Query for phasesets and GT, DP, VAF feilds by creating a CSV file
/Wakhan/src/hapcorrect/src/utils.py:62: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
dataframe = pd.read_csv(path, sep=sept, names=names)
[2025-09-23 10:12:51] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 1
[2025-09-23 10:14:49] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 2
[2025-09-23 10:17:06] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 3
[2025-09-23 10:18:41] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 4
[2025-09-23 10:20:13] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 5
[2025-09-23 10:21:30] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 6
[2025-09-23 10:22:41] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 7
[2025-09-23 10:23:48] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 8
[2025-09-23 10:24:37] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 9
[2025-09-23 10:25:27] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 10
[2025-09-23 10:26:21] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 11
[2025-09-23 10:27:14] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 12
[2025-09-23 10:28:05] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 13
[2025-09-23 10:28:37] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 14
[2025-09-23 10:29:06] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 15
[2025-09-23 10:29:35] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 16
[2025-09-23 10:29:59] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 17
[2025-09-23 10:30:21] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 18
[2025-09-23 10:30:45] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 19
[2025-09-23 10:30:59] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 20
[2025-09-23 10:31:16] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 21
[2025-09-23 10:31:25] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for 22
[2025-09-23 10:31:36] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for X
[2025-09-23 10:31:59] INFO: Loading coverage (bins) and coverage (phaseblocks) datasets for Y
[2025-09-23 10:33:07] INFO: Total phased length: 1961077905
[2025-09-23 10:33:07] INFO: Phase blocks N50: 318259
[2025-09-23 10:33:08] INFO: VCF edit for phase change segments
[2025-09-23 10:35:38] INFO: Total phased length: 3335716963
[2025-09-23 10:35:38] INFO: Phase blocks N50: 1783457
[2025-09-23 10:35:40] INFO: SNPs frequencies plots generation for 1
[2025-09-23 10:36:47] INFO: SNPs frequencies plots generation for 2
[2025-09-23 10:37:58] INFO: SNPs frequencies plots generation for 3
[2025-09-23 10:38:49] INFO: SNPs frequencies plots generation for 4
[2025-09-23 10:39:42] INFO: SNPs frequencies plots generation for 5
[2025-09-23 10:40:25] INFO: SNPs frequencies plots generation for 6
[2025-09-23 10:41:05] INFO: SNPs frequencies plots generation for 7
[2025-09-23 10:42:00] INFO: SNPs frequencies plots generation for 8
[2025-09-23 10:42:31] INFO: SNPs frequencies plots generation for 9
[2025-09-23 10:42:59] INFO: SNPs frequencies plots generation for 10
[2025-09-23 10:43:30] INFO: SNPs frequencies plots generation for 11
[2025-09-23 10:43:59] INFO: SNPs frequencies plots generation for 12
[2025-09-23 10:44:29] INFO: SNPs frequencies plots generation for 13
[2025-09-23 10:44:48] INFO: SNPs frequencies plots generation for 14
[2025-09-23 10:45:04] INFO: SNPs frequencies plots generation for 15
[2025-09-23 10:45:20] INFO: SNPs frequencies plots generation for 16
[2025-09-23 10:45:35] INFO: SNPs frequencies plots generation for 17
[2025-09-23 10:45:48] INFO: SNPs frequencies plots generation for 18
[2025-09-23 10:46:04] INFO: SNPs frequencies plots generation for 19
[2025-09-23 10:46:13] INFO: SNPs frequencies plots generation for 20
[2025-09-23 10:46:23] INFO: SNPs frequencies plots generation for 21
[2025-09-23 10:46:29] INFO: SNPs frequencies plots generation for 22
[2025-09-23 10:46:35] INFO: SNPs frequencies plots generation for X
[2025-09-23 10:46:53] INFO: SNPs frequencies plots generation for Y
[2025-09-23 10:47:07] INFO: hapcorrect() module finished successfully.
[2025-09-23 10:47:07] INFO: Starting cna() module...
[2025-09-23 10:47:08] INFO: Split variants = True
[2025-09-23 10:47:08] INFO: check info = True
[2025-09-23 10:47:08] INFO: Allele symbol = 0
[2025-09-23 10:47:08] INFO: Initializing HeaderParser
[2025-09-23 10:47:08] INFO: Reading vcf form file /mnt/beegfs02/scratch/t_gutman/severus/AML/somatic_SVs/severus_somatic.vcf
[2025-09-23 10:47:08] INFO: Setting self.individuals to ['AML_merge_output_haplotagged']
[2025-09-23 10:47:09] INFO: Using existing phase corrected coverage data
[2025-09-23 10:47:09] INFO: Generating coverage plots chromosomes-wise
[2025-09-23 10:47:09] INFO: Split variants = True
[2025-09-23 10:47:09] INFO: check info = True
[2025-09-23 10:47:09] INFO: Allele symbol = 0
[2025-09-23 10:47:09] INFO: Initializing HeaderParser
[2025-09-23 10:47:09] INFO: Reading vcf form file /mnt/beegfs02/scratch/t_gutman/severus/AML/somatic_SVs/severus_somatic.vcf
[2025-09-23 10:47:09] INFO: Setting self.individuals to ['AML_merge_output_haplotagged']
[2025-09-23 10:47:09] INFO: Split variants = True
[2025-09-23 10:47:09] INFO: check info = True
[2025-09-23 10:47:09] INFO: Allele symbol = 0
[2025-09-23 10:47:09] INFO: Initializing HeaderParser
[2025-09-23 10:47:09] INFO: Reading vcf form file /mnt/beegfs02/scratch/t_gutman/severus/AML/somatic_SVs/severus_somatic.vcf
[2025-09-23 10:47:09] INFO: Setting self.individuals to ['AML_merge_output_haplotagged']
[2025-09-23 10:47:10] INFO: Plots generation for 1
Traceback (most recent call last):
  File "/Wakhan/wakhan.py", line 28, in <module>
    main()
  File "/Wakhan/wakhan.py", line 24, in main
    sys.exit(main())
  File "/Wakhan/src/main.py", line 360, in main
    wakhan_all(args) #hapcorrect + cna
  File "/Wakhan/src/main.py", line 376, in wakhan_all
    cna_process(args) #cna
  File "/Wakhan/src/main.py", line 495, in cna_process
    coverage_plots_chromosomes(csv_df_coverage, csv_df_phasesets, args, thread_pool)
  File "/Wakhan/src/plots.py", line 380, in coverage_plots_chromosomes
    snps_cpd_means, snps_cpd_lens, df_means_chr = change_point_detection_means(args, chrom, breakpoints_segemnts, df_snps_freqs_chr, ref_start_values, ref_start_values_1, df_centm_chrom, df_loh_chrom)
  File "/Wakhan/src/utils.py", line 1181, in change_point_detection_means
    snps_haplotype1_mean = remove_indices(snps_haplotype1_mean, list(set(indices_cent_hp1[0] + indices_loh_hp1)))
IndexError: index 0 is out of bounds for axis 0 with size 0
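One possible reading of the log, offered as a guess rather than a diagnosis: the wording of the IndexError is NumPy's, which suggests that an array indexed on that line (perhaps indices_cent_hp1) is empty for contig 1. The DtypeWarning on column 0 would be consistent with bare contig names such as "1" being read as integers in some places and as strings in others, so that a lookup by name matches nothing. The sketch below only illustrates that mechanism on a made-up table; the column names, the values and the helpers load_bins and first_index are hypothetical and are not taken from the Wakhan source.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical per-bin table; the column names and values are illustrative
# and do not reflect Wakhan's real file layout.
CSV_TEXT = "1\t0\t31\n1\t50000\t29\n2\t0\t30\n"
NAMES = ["chr", "start", "coverage"]


def load_bins(text, force_str_contigs):
    """Read the table, optionally forcing the contig column to str."""
    dtype = {"chr": str} if force_str_contigs else None
    return pd.read_csv(io.StringIO(text), sep="\t", names=NAMES, dtype=dtype)


def first_index(indices):
    """Return the first element of an index array, or None when it is empty."""
    indices = np.asarray(indices)
    return None if indices.size == 0 else indices[0]


inferred = load_bins(CSV_TEXT, force_str_contigs=False)  # "chr" inferred as integer
forced = load_bins(CSV_TEXT, force_str_contigs=True)     # "chr" kept as string

# Looking up contig "1" by its string name only works when the column is str.
hits_inferred = np.flatnonzero((inferred["chr"] == "1").to_numpy())
hits_forced = np.flatnonzero((forced["chr"] == "1").to_numpy())

try:
    hits_inferred[0]  # empty array: raises the same kind of IndexError as in the log
except IndexError as err:
    print("unguarded access failed:", err)

# A guarded access returns None instead of raising on the empty case.
print(first_index(hits_inferred), first_index(hits_forced))
```

If this is indeed the mechanism, reading the contig column as a string, or guarding the empty case before indexing, would avoid the crash. Whether either applies to utils.py line 1181 is an assumption that only the maintainers can confirm.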
Thank you for your help!