Incompatible Igenomes NCBI GRCh38 genes.gtf - gene_biotype annotation missing #460

@tweep

Hi,
I'm running into issues with the format of the genes.gtf that is referenced in /conf/igenomes.config in v1.4.2.
The NCBI GTF does not include the gene_biotype attribute, so with the default igenomes.config the pipeline fails at the featureCounts step with the error message below. Can you point us to a valid GTF?

ERROR: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
  The specified gene identifier attribute is 'gene_biotype' 
  An example of attributes included in your GTF annotation is 'gene_id "DDX11L1"; gene_name "DDX11L1"; transcript_id "rna0"; tss_id "TSS31672"; 
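
For anyone wanting to check a GTF before launching the pipeline, here is a minimal sketch that counts which attribute keys appear in the 9th column. It assumes the usual tab-separated GTF layout with `key "value";` pairs; the function names `gtf_attribute_counts` and `has_attribute` are hypothetical helpers made up for this example and are not part of the pipeline.

```python
import re
from collections import Counter

# Matches `key "value"` pairs in the GTF attributes column (column 9).
_ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')


def gtf_attribute_counts(lines):
    """Count how often each attribute key occurs in column 9 of GTF lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A well-formed GTF record has at least 9 tab-separated fields.
        if len(fields) < 9:
            continue
        for key, _value in _ATTR_RE.findall(fields[8]):
            counts[key] += 1
    return counts


def has_attribute(lines, name):
    """Return True if at least one record carries the given attribute key."""
    return gtf_attribute_counts(lines)[name] > 0
```

Called as `has_attribute(open("genes.gtf"), "gene_biotype")`, this should return False for a file whose records look like the example in the error message above, since those only carry gene_id, gene_name, transcript_id and tss_id.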

S3 command used to retrieve genes.gtf:
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/ /gnet/is5/p04/data/bioinfo/ehive/reference-data/igenomes/references/Homo_sapiens/NCBI/GRCh38/Annotation/Genes/ --exclude "*" --include "genes.gtf"
