The CrossMap.py vcf command seems to replace the "*" allele (which indicates that the allele is missing in one or more samples because of an overlapping deletion) with a nucleotide sequence. Here are the #CHROM POS ID REF ALT columns of the problematic position before liftover:
ch01 331 . AATATATATAT AAT,AATAT,*,A,AATATAT,AATATATATATAT
Here is the same position after liftover:
ch01 16355 . AATATATATAT AAT,AATAT,A,A,AATATAT,AATATATATATAT
You can see that the "*" allele has been replaced by "A", which duplicates the "A" already present in the ALT column. When running gatk ValidateVariants to validate the VCF, this results in the following error:
htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately line number 63: Duplicate allele added to VariantContext: A
I realize this is not a lot to go on, but the data is proprietary, so I can't share the VCFs to make this bug reproducible. I'm curious whether this is a known issue, or whether anyone has a suggestion for working around it.
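In case it helps anyone check their own lifted-over file for the same symptom, here is a minimal sketch that flags records whose ALT column lists the same allele twice, which is the condition the htsjdk error above complains about. It is not part of CrossMap or GATK. The function name `find_duplicate_alts` and the file name in the usage comment are made up for illustration, and it assumes an uncompressed, tab-delimited VCF. It only reports the affected records; it does not repair them.

```python
def find_duplicate_alts(lines):
    """Yield (line_number, chrom, pos, duplicated_alleles) for each VCF
    record whose ALT column contains the same allele more than once.

    `lines` is any iterable of text lines from an uncompressed VCF.
    Hypothetical helper, not a CrossMap or GATK function.
    """
    for line_number, line in enumerate(lines, start=1):
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A data record needs at least CHROM, POS, ID, REF, ALT.
        if len(fields) < 5:
            continue
        chrom, pos, alts = fields[0], fields[1], fields[4].split(",")
        seen = set()
        duplicated = []
        for allele in alts:
            if allele in seen and allele not in duplicated:
                duplicated.append(allele)
            seen.add(allele)
        if duplicated:
            yield line_number, chrom, pos, duplicated


# Usage sketch (the file name is a placeholder):
# with open("lifted.vcf") as handle:
#     for line_number, chrom, pos, duplicated in find_duplicate_alts(handle):
#         print(line_number, chrom, pos, ",".join(duplicated))
```

Comparing the records this reports against the same positions in the pre-liftover file should show whether a "*" allele was rewritten in each case.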