gene14/vcf2structure

 
 

vcf2structure

Convert bi-allelic SNPs stored in VCF format to STRUCTURE format. The script needs two inputs: a VCF file containing bi-allelic SNPs, and a population map. The population map is a CSV file with sample IDs in the first column and population labels in the second column. The columns must be named sample_id and population.
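To make the population map layout concrete, here is a minimal sketch that builds a small map in memory and checks it for the two required column names. The sample IDs and population labels are made up, and load_popmap is a hypothetical helper, not a function from vcf2structure.py. The script itself relies on pandas; the sketch uses only the standard library so it can run without the dependencies. The mapping of labels to integer codes at the end is an assumption, included because STRUCTURE's population column holds integers.

```python
import csv
import io

# Hypothetical population map; the sample IDs and labels are invented.
POPMAP_CSV = """sample_id,population
ind1,north
ind2,north
ind3,south
"""

REQUIRED = ("sample_id", "population")


def load_popmap(handle):
    """Read a population map and return {sample_id: population}.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED if name not in header]
    if missing:
        raise ValueError(f"population map is missing columns: {missing}")
    popmap = {}
    for row in reader:
        if row["sample_id"] in popmap:
            raise ValueError(f"duplicate sample ID: {row['sample_id']}")
        popmap[row["sample_id"]] = row["population"]
    return popmap


popmap = load_popmap(io.StringIO(POPMAP_CSV))

# Assumption: give each population label an integer code (1, 2, ...),
# assigned in alphabetical order of the labels.
labels = sorted(set(popmap.values()))
pop_codes = {label: code for code, label in enumerate(labels, start=1)}

print(popmap)
print(pop_codes)
```

A file whose header uses other names, such as id and pop, would be rejected by this check. That is the kind of mistake the naming requirement above is meant to prevent.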

Dependencies

Requires scikit-allel, numpy and pandas. I recommend creating a Python virtual environment and installing the dependencies from the requirements file:

python3 -m venv .venv

source .venv/bin/activate

pip install -r requirements.txt

How to run

The following example reads a VCF file called example.vcf and a population map called example_popmap.csv, and writes a STRUCTURE file called example.str. The fourth argument controls whether each sample is written on a single row. Here it is set to False, so each sample is encoded over two rows:

./vcf2structure.py example.vcf example_popmap.csv example.str False

Or alternatively:

python3 vcf2structure.py example.vcf example_popmap.csv example.str False

If you instead want each sample encoded on a single row, with every locus spread over two columns (one allele per column), set the fourth argument to True.
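The difference between the two layouts is easiest to see on a toy data set. The sketch below is not the code from vcf2structure.py. It is a minimal illustration that assumes genotypes are given as pairs of allele indices (0 for the reference allele, 1 for the alternate allele, -1 for a missing call), that alleles are written out unchanged, and that missing calls are written as -9. The encode function, the sample names and the population codes are all hypothetical, and the script may use different allele or missing-data codes.

```python
MISSING = -9  # assumed missing-data code; STRUCTURE lets you configure this value


def encode(sample_ids, pop_codes, genotypes, one_row_per_sample):
    """Return STRUCTURE-style rows as lists of strings (hypothetical helper).

    genotypes[i][j] is an (allele_a, allele_b) pair for sample i at locus j,
    with 0 = reference, 1 = alternate and -1 = missing call.
    """
    rows = []
    for sample, pop, calls in zip(sample_ids, pop_codes, genotypes):
        # Replace missing calls (-1) with the assumed STRUCTURE missing code.
        calls = [tuple(MISSING if a < 0 else a for a in pair) for pair in calls]
        if one_row_per_sample:
            # One row: the two alleles of each locus sit in adjacent columns.
            alleles = [a for pair in calls for a in pair]
            rows.append([sample, str(pop)] + [str(a) for a in alleles])
        else:
            # Two rows: first allele of every locus, then second allele.
            for copy in (0, 1):
                rows.append([sample, str(pop)] + [str(pair[copy]) for pair in calls])
    return rows


# Toy input: two samples, three loci (all values invented).
sample_ids = ["ind1", "ind2"]
pop_codes = [1, 2]
genotypes = [
    [(0, 1), (1, 1), (-1, -1)],
    [(0, 0), (0, 1), (1, 1)],
]

two_rows = encode(sample_ids, pop_codes, genotypes, one_row_per_sample=False)
one_row = encode(sample_ids, pop_codes, genotypes, one_row_per_sample=True)

for row in two_rows:
    print("\t".join(row))
print()
for row in one_row:
    print("\t".join(row))
```

With the fourth argument set to False, the toy input produces four rows with three locus columns each. With True, it produces two rows with six locus columns each. Whichever layout you choose, the STRUCTURE parameter file has to describe the same layout.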

Archival notice

The functionality of this script has been integrated into VCF2PopGen, so this repository has been archived.
