This is a collection of scripts and modules for bioinformatics file access.
- xzFile, xzopen()
- access to various compressed files; currently recognizes gzip (.gz), bzip2 (.bz2), and bgzip (.bgz, .b.gz) from the samtools package
- tsvFile, tsvRecord, tsv
- tab-separated files with named fields; users can also define preprocessing functions for reading and writing fields
- vcfFile, vcf
- VCF file access; depends on PyVCF, yet provides a convenient and flexible interface
- samFile, sam
- SAM file access, based on pysam. pysam also provides an interface to tabix (random access to TSV files with genome positions), which can be accessed from BioUtil.sam
- fastqFile, fastaFile
- FASTA/FASTQ file I/O, based on lh3's readfq
- cachedFasta
- fetch the sequence of a region from a large FASTA file. This module is based on faidx through pysam's pysam.FastaFile. Since v0.1.2, the old name fastaReader is deprecated because it is easily confused with the fastaFile reader
- faidx
- experimental interface to pyfaidx
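
The first two items above (opening compressed files by suffix, and reading tab-separated files with named fields and per-field preprocessing) can be illustrated with a minimal standard-library sketch. This is not BioUtil's own code or API: `open_by_extension`, `read_named_tsv`, `demo`, and the `converters` parameter are hypothetical names chosen for illustration. It relies on the fact that bgzip output is a series of gzip members, so Python's `gzip` module can read it sequentially.

```python
import bz2
import csv
import gzip
import os
import tempfile


def open_by_extension(path, mode="rt"):
    """Hypothetical helper (not BioUtil's xzopen): choose an opener from the suffix."""
    if path.endswith((".gz", ".bgz")):
        # Covers .gz, .b.gz and .bgz; bgzip files are gzip-compatible.
        return gzip.open(path, mode)
    if path.endswith(".bz2"):
        return bz2.open(path, mode)
    return open(path, mode)


def read_named_tsv(path, converters=None):
    """Yield one dict per row, keyed by the header line.

    `converters` maps a field name to a function applied to that field's
    raw string value (an assumed, illustrative form of preprocessing).
    """
    converters = converters or {}
    with open_by_extension(path) as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            for name, convert in converters.items():
                row[name] = convert(row[name])
            yield row


def demo():
    """Write a small gzip-compressed TSV file and read it back with typed fields."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sites.tsv.gz")
        with open_by_extension(path, "wt") as out:
            out.write("chrom\tpos\tref\n")
            out.write("chr1\t100\tA\n")
            out.write("chr2\t250\tG\n")
        return list(read_named_tsv(path, converters={"pos": int}))
```

Calling `demo()` returns the two rows as dictionaries, with `pos` converted to an integer. Random access into bgzip files (as tabix does) needs the block index and is outside what this sketch shows.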
- v0.4
- add logger class
- v0.3
- change fasta/fastq Writer methods
- v0.2
- add fastqFile, rename fastaReader to cachedFasta
- v0.1.1
- add fastaReader
- v0.1.0
- initial release; supports xzFile, tsvFile, vcfFile, samFile, and faidx
Yu XU <[email protected]>
This module is licensed under GPLv2.