Blank/missing reads in FASTQ cause malformed output FASTQ #133

@dgroot


If one of the input FASTQ files has a missing/blank sequence, the output FASTQ will be malformed.

For example, if the input FASTQ has a valid entry followed by a blank entry like this:

@A00545
ACGTGCTGAC
+
FFF,:,FFFF
@A00546

+

where the sequence line and quality line are both blank, then the output will contain a malformed entry like the one below, which has only a "+" line and a quality string but no read ID or sequence:

@A00545
ACGTGCTGAC
+
FFF,:,FFFF
+
F:FFFFFFFF

The program should either throw an error if the FASTQ file has a blank entry, or skip the entry entirely in the output.
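Until one of those behaviours exists, a pre-check on the input can catch the problem before running the tool. The following is only a minimal sketch in Python, not part of fastp: the function name `find_blank_records` is hypothetical, and it assumes uncompressed input with strict four-line records (no wrapped sequences).

```python
import io
from typing import Iterator, TextIO, Tuple


def find_blank_records(handle: TextIO) -> Iterator[Tuple[int, str]]:
    """Yield (record_number, read_id) for records whose sequence or
    quality line is blank. Assumes strict 4-line FASTQ records."""
    record_number = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines where a header is expected
        sequence = handle.readline()
        handle.readline()  # the "+" separator line, not inspected here
        quality = handle.readline()
        record_number += 1
        if not sequence.strip() or not quality.strip():
            yield record_number, header.strip()


# The two-record input from this report: one valid entry, one blank entry.
example = "@A00545\nACGTGCTGAC\n+\nFFF,:,FFFF\n@A00546\n\n+\n\n"
blank = list(find_blank_records(io.StringIO(example)))
print(blank)  # [(2, '@A00546')]
```

For a real file, pass an open text handle instead of the `io.StringIO` wrapper, and either stop with an error or drop the reported records before processing.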

(In this case, I didn't know my FASTQ file had a blank entry; it came off the sequencer like that. But fastp ran with no errors, and I only discovered the problem during downstream analysis because the output FASTQ file was malformed.)
