MAF liftover resulting in same reference and alternate allele #68

@skanwal

Hello,

Thanks for this useful utility.
I have data in MAF format (NCBI build 37). I am trying to lift it over to hg38 using the following CrossMap (v0.6.6) command:

CrossMap maf b37ToHg38.over.chain \
PAAD_atlas.tmp.maf \
/work/genomes/Hsapiens/hg38/seq/hg38.fa hg38 \
/explore/liftover_maf/PAAD_atlas.liftover.maf \
--chromid l

The chain file was downloaded from https://github.com/broadinstitute/gatk/blob/083aac832cb64515fd0456008bf847dd22f6c234/scripts/funcotator/data_sources/gnomAD/b37ToHg38.over.chain

The command runs successfully with the following output:

2024-02-26 10:29:50 [INFO]  Read the chain file "/g/data3/gx8/extras/liftover_chains/b37ToHg38.over.chain"
2024-02-26 10:29:51 [INFO]  Lifting over ...
2024-02-26 10:33:58 [INFO]  Total entries: 6630811
2024-02-26 10:33:58 [INFO]  Failed to map: 1372

However, after inspecting the output I realised that Reference_Allele and Tumor_Seq_Allele2 are identical in the lifted-over MAF file. For example, the head of the output looks like:

$ head PAAD_atlas.liftover.maf
#liftOver: Program=CrossMapv0.6.6, Time=February26,2024, ChainFile=/g/data3/gx8/extras/liftover_chains/b37ToHg38.over.chain, NewRefGenome=/g/data3/gx8/local/development/bcbio/genomes/Hsapiens/hg38/seq/hg38.fa
Hugo_Symbol	sample_id	Hugo_Symbol	NCBI_Build	Chromosome	Start_Position	End_Position	Variant_Classification	Variant_Type	Reference_Allele	Tumor_Seq_Allele2	Tumor_Sample_Barcode	HGVSp_Short	aa_mutation
1	Avner-primary_tissue_subset	FAM231B	hg38	chr1	16539492	16539492	Missense_Mutation	SNP	C	C	p010_tumor-52fccd-somatic.pcgr.vcf	p.R143C	NA
2	Avner-primary_tissue_subset	ZMYM4	hg38	chr1	35389029	35389029	Nonsense_Mutation	SNP	G	G	p010_tumor-52fccd-somatic.pcgr.vcf	p.E795*	NA
3	Avner-primary_tissue_subset	COL8A2	hg38	chr1	36099236	36099236	Nonsense_Mutation	SNP	G	G	p010_tumor-52fccd-somatic.pcgr.vcf	p.R149*	NA
4	Avner-primary_tissue_subset	PTGER3	hg38	chr1	70953763	70953763	Missense_Mutation	SNP	T	T	p010_tumor-52fccd-somatic.pcgr.vcf	p.Q368H	NA
5	Avner-primary_tissue_subset	C1orf52	hg38	chr1	85259561	85259561	Missense_Mutation	SNP	C	C	p010_tumor-52fccd-somatic.pcgr.vcf	p.E25K	NA
6	Avner-primary_tissue_subset	AMY2A	hg38	chr1	103617550	103617550	Missense_Mutation	SNP	T	T	p010_tumor-52fccd-somatic.pcgr.vcf	p.V37D	NA
7	Avner-primary_tissue_subset	TNR	hg38	chr1	175391305	175391305	Missense_Mutation	SNP	G	G	p010_tumor-52fccd-somatic.pcgr.vcf	p.S497L	NA
8	Avner-primary_tissue_subset	LAMC2	hg38	chr1	183218424	183218424	Missense_Mutation	SNP	G	G	p010_tumor-52fccd-somatic.pcgr.vcf	p.A147T	NA
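
To check whether this affects every row or only some, the lifted-over file can be scanned for rows where the two allele columns match. The following is a minimal sketch, not part of CrossMap: the function name `count_identical_alleles` is hypothetical, and it assumes a tab-delimited MAF with `#` comment lines and the column names shown in the header above. Columns are looked up by position because `Hugo_Symbol` appears twice in this header.

```python
import csv
import io


def count_identical_alleles(handle):
    """Return (total_rows, rows_where_ref_equals_alt) for a tab-delimited MAF.

    Assumes '#' comment lines precede a single header row that contains the
    Reference_Allele and Tumor_Seq_Allele2 columns.
    """
    reader = csv.reader(
        (line for line in handle if not line.startswith("#")),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    header = next(reader)
    ref_idx = header.index("Reference_Allele")
    alt_idx = header.index("Tumor_Seq_Allele2")

    total = identical = 0
    for row in reader:
        if len(row) <= max(ref_idx, alt_idx):
            continue  # skip truncated or empty lines
        total += 1
        if row[ref_idx] == row[alt_idx]:
            identical += 1
    return total, identical


# Two rows copied from the lifted-over output above (columns abbreviated).
header = ["Hugo_Symbol", "Chromosome", "Start_Position",
          "Reference_Allele", "Tumor_Seq_Allele2"]
rows = [
    ["FAM231B", "chr1", "16539492", "C", "C"],
    ["ZMYM4", "chr1", "35389029", "G", "G"],
]
sample = "#liftOver: Program=CrossMapv0.6.6\n" + "\n".join(
    "\t".join(fields) for fields in [header] + rows
) + "\n"

print(count_identical_alleles(io.StringIO(sample)))  # (2, 2)
```

On the real file, pass `open("PAAD_atlas.liftover.maf")` instead of the in-memory sample.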

In comparison, the head of the original (genome build 37) file is:

$ head PAAD_atlas.tmp.maf
Hugo_Symbol	sample_id	Hugo_Symbol	NCBI_Build	Chromosome	Start_Position	End_Position	Variant_Classification	Variant_Type	Reference_Allele	Tumor_Seq_Allele2	Tumor_Sample_Barcode	HGVSp_Short	aa_mutation
1	Avner-primary_tissue_subset	FAM231B	37	1	16865987	16865987	Missense_Mutation	SNP	C	T	p010_tumor-52fccd-somatic.pcgr.vcf	p.R143C	NA
2	Avner-primary_tissue_subset	ZMYM4	37	1	35854630	35854630	Nonsense_Mutation	SNP	G	T	p010_tumor-52fccd-somatic.pcgr.vcf	p.E795*	NA
3	Avner-primary_tissue_subset	COL8A2	37	1	36564837	36564837	Nonsense_Mutation	SNP	G	A	p010_tumor-52fccd-somatic.pcgr.vcf	p.R149*	NA
4	Avner-primary_tissue_subset	PTGER3	37	1	71419446	71419446	Missense_Mutation	SNP	T	G	p010_tumor-52fccd-somatic.pcgr.vcf	p.Q368H	NA
5	Avner-primary_tissue_subset	C1orf52	37	1	85725244	85725244	Missense_Mutation	SNP	C	T	p010_tumor-52fccd-somatic.pcgr.vcf	p.E25K	NA
6	Avner-primary_tissue_subset	AMY2A	37	1	104160172	104160172	Missense_Mutation	SNP	T	A	p010_tumor-52fccd-somatic.pcgr.vcf	p.V37D	NA
7	Avner-primary_tissue_subset	TNR	37	1	175360441	175360441	Missense_Mutation	SNP	G	A	p010_tumor-52fccd-somatic.pcgr.vcf	p.S497L	NA
8	Avner-primary_tissue_subset	LAMC2	37	1	183187559	183187559	Missense_Mutation	SNP	G	A	p010_tumor-52fccd-somatic.pcgr.vcf	p.A147T	NA
9	Avner-primary_tissue_subset	OBSCN	37	1	228434396	228434396	Missense_Mutation	SNP	G	A	p010_tumor-52fccd-somatic.pcgr.vcf	p.A1401T	NA
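
The two files can also be compared row by row to list exactly which alternate alleles changed during liftover. This is a rough sketch under stated assumptions: the helper names `load_alt_alleles` and `changed_alt_alleles` are hypothetical, and rows are matched on their first three columns (row index, sample_id, gene symbol), which the excerpts above suggest are carried through unchanged. That matching key is an assumption about this particular file, not a general MAF property.

```python
import csv


def load_alt_alleles(lines):
    """Map (first three columns) -> Tumor_Seq_Allele2 for a tab-delimited MAF."""
    reader = csv.reader(
        (line for line in lines if not line.startswith("#")),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    header = next(reader)
    alt_idx = header.index("Tumor_Seq_Allele2")
    # Assumption: the first three columns identify a row in both files.
    return {tuple(row[:3]): row[alt_idx] for row in reader if len(row) > alt_idx}


def changed_alt_alleles(before_lines, after_lines):
    """Return {row_key: (alt_before, alt_after)} for rows whose alt allele differs."""
    before = load_alt_alleles(before_lines)
    after = load_alt_alleles(after_lines)
    return {
        key: (before[key], alt)
        for key, alt in after.items()
        if key in before and before[key] != alt
    }


# First data row from each excerpt above (columns abbreviated).
header = ["Hugo_Symbol", "sample_id", "Hugo_Symbol", "NCBI_Build", "Chromosome",
          "Start_Position", "Reference_Allele", "Tumor_Seq_Allele2"]
b37_row = ["1", "Avner-primary_tissue_subset", "FAM231B", "37", "1",
           "16865987", "C", "T"]
hg38_row = ["1", "Avner-primary_tissue_subset", "FAM231B", "hg38", "chr1",
            "16539492", "C", "C"]

before = ["\t".join(header), "\t".join(b37_row)]
after = ["#liftOver: Program=CrossMapv0.6.6", "\t".join(header), "\t".join(hg38_row)]

for key, (old, new) in changed_alt_alleles(before, after).items():
    print(key, old, "->", new)
```

For the real data, open file handles for `PAAD_atlas.tmp.maf` and `PAAD_atlas.liftover.maf` can be passed in place of the two lists.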

It seems the program is updating both the reference and alternate alleles, so the original alternate allele is lost. Can you please help me debug the issue?
Thanks.
