Dear all,
we ran into an issue when working with VCF files from plant genomes, which can have very large chromosomes. Briefly, tabix cannot extract positions above roughly 5e8, and other software that uses htslib has the same problem. I've uploaded a gist with a sample VCF. Here are the instructions to reproduce the issue:
Download the latest htslib from GitHub and compile it, then download moved.vcf from the gist. Inspect the last lines of the VCF with tail:
$ tail -2 moved.vcf | cut -f 1-7
12 1060240521 . G T 203.196 .
12 1070240615 . C A 378.781 .
Now pack the file with bgzip and index it with tabix:
$ ./bgzip moved.vcf
$ ./tabix moved.vcf.gz
Finally, query chromosome 12 of the VCF with tabix:
$ ./tabix moved.vcf.gz 12 | tail -2 | cut -f 1-7
12 520164811 . G A 73.6504 .
12 530164817 . T C 83.1434 .
All records after line 109 (POS 530164817) are missing from the query output.
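One possible explanation, offered as an assumption rather than a confirmed diagnosis: the default .tbi index format uses a 14-bit minimum shift and 5 binning levels, which would cap addressable coordinates at 2^29 (536870912). The sketch below only does that arithmetic and compares it with the positions quoted in this report; the variable name `tbi_limit` is illustrative.

```shell
#!/bin/sh
# Assumption: a .tbi index covers coordinates up to 2^(min_shift + 3*depth)
# with min_shift = 14 and depth = 5, i.e. 2^29.
tbi_limit=$(( 1 << (14 + 3 * 5) ))
echo "TBI limit: $tbi_limit"

# Positions taken from the tail output and the tabix query in this report.
for pos in 530164817 1060240521 1070240615; do
    if [ "$pos" -gt "$tbi_limit" ]; then
        echo "$pos: beyond limit"
    else
        echo "$pos: within limit"
    fi
done
```

The last position returned by the query (530164817) falls under that limit, while the last two records in the file lie beyond it, which would be consistent with the cutoff observed. If this is indeed the cause, building a CSI index instead (`./tabix -C moved.vcf.gz`) may be worth trying, since that format is not tied to the same coordinate range.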
Thanks for your support.