Description
I really love how easy CrossMap is to use. The flexibility of working on multiple file formats and correcting the reference is incredibly useful.
I am working on incorporating CrossMap into my MAF re-annotation pipeline (liftover -> maf2vcf -> funcotator), and noticed the odd behavior of Insertion events being turned into two Deletion events upon completion. On investigation, I saw that for insertion events, CrossMap changes Reference_Allele from - to the reference genome sequence at the given coordinates. With Tumor_Seq_Allele1 or Tumor_Seq_Allele2 also set to -, the MAF then appears to describe a deletion. Though it's not well explained in the original MAF Spec, when it was developed back in the days of TCGA, Somatic MAFs were required to have Tumor_Seq_Allele1 always equal to Reference_Allele, and most MAF-producing tools that I'm familiar with follow that convention.
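To illustrate why the rewritten record reads as a deletion, here is a minimal sketch of how a downstream tool might infer the event type from one reference/allele pair. The function `infer_variant_type` and its labels are hypothetical and are not taken from CrossMap, maf2vcf, or funcotator; the base "A" stands in for whatever the reference genome holds at the lifted coordinates.

```python
def infer_variant_type(ref: str, allele: str) -> str:
    """Hypothetical helper: classify one (Reference_Allele, tumor allele) pair.

    Assumes the MAF convention that "-" marks the empty side of an indel.
    """
    if ref == allele:
        return "REF"   # allele matches the reference, no event
    if ref == "-":
        return "INS"   # nothing in the reference, bases in the tumor allele
    if allele == "-":
        return "DEL"   # bases in the reference, nothing in the tumor allele
    return "OTHER"     # substitutions and complex events, not modelled here


# Before liftover: Reference_Allele and Tumor_Seq_Allele1 are both "-",
# and Tumor_Seq_Allele2 carries the inserted base.
print(infer_variant_type("-", "-"))  # REF
print(infer_variant_type("-", "T"))  # INS

# After liftover: Reference_Allele was overwritten with a genome base
# (assumed "A" here), so the untouched "-" allele now looks like a deletion.
print(infer_variant_type("A", "-"))  # DEL
```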
In both the original and current MAF Spec, Reference_Allele is supposed to be - for an insertion. However, after a liftover using the latest version of CrossMap (installed from GitHub), Reference_Allele is always set to the reference genome sequence at the given coordinates. This is mentioned in the discussion in #68, but not resolved. It would be good to have an update that checks the current value of Reference_Allele and changes it only if the value isn't -, because this behavior is incorrect only for a pure Insertion event, not for an Insertion-Deletion event.
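The requested check could look something like the sketch below. This is not CrossMap's actual code: `lifted_reference_allele` and its parameters are hypothetical names, and `genome_ref` stands for whatever sequence the liftover step fetched from the target reference genome at the new coordinates.

```python
def lifted_reference_allele(original_ref: str, genome_ref: str) -> str:
    """Hypothetical guard for the Reference_Allele column during MAF liftover.

    original_ref: Reference_Allele value from the input MAF row.
    genome_ref:   sequence read from the target genome at the lifted coordinates.
    """
    # A pure insertion has no reference bases, so keep the "-" placeholder.
    if original_ref.strip() == "-":
        return "-"
    # Otherwise (SNPs, deletions, insertion-deletions) use the target genome.
    return genome_ref


print(lifted_reference_allele("-", "A"))    # "-" is preserved for an insertion
print(lifted_reference_allele("AC", "AC"))  # other events take the genome sequence
```

With a guard like this, only rows whose Reference_Allele is already - are left alone; every other row is still corrected against the target reference.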