Hybracter v0.11.0 - Error during LRGE step (Too few reads) #110

@llk578496

Hi @gbouras13 ,

We encountered an error while running Hybracter v0.11.0 with this command:

hybracter long-single -l 02_fastq/BA00000001.fastq.gz -s BA00000001 -c 1900000 -o 03_assembly/03.1_hybracter/BA00000001 -t 64 --subsample_depth 120 --min_depth 100 --medakaModel r1041_e82_400bps_bacterial_methylation --medaka_override

The run failed at the lrge step with the following output:

[2024-12-04T23:43:21Z INFO  lrge] Running two-set strategy with 10000 target reads and 5000 query reads
Error: Failed to generate estimate

Caused by:
    Too few reads requested: Number of reads in FASTQ file (1) is <= query number of reads (5000)
[Thu Dec  5 07:43:21 2024]
Error in rule lrge:
    jobid: 4
    input: 02_fastq/BA00000001.fastq.gz
    output: 03_assembly/03.1_hybracter/BA00000001/processing/chrom_size/BA00000001_lrge_estimated_chromosome_size.txt
    log: 03_assembly/03.1_hybracter/BA00000001/stderr/lrge/BA00000001.log (check log file(s) for error details)
    conda-env: /home/gilmansiu3/miniforge3/envs/hybracter-v0.11.0/lib/python3.12/site-packages/hybracter/workflow/conda/024e350392d5612938570ada6f9b3299_
    shell:
        
        if [[ 02_fastq/BA00000001.fastq.gz =~ \.gz$ ]]; then
        INPUT_CMD="zcat 02_fastq/BA00000001.fastq.gz"
        else
        INPUT_CMD="cat 02_fastq/BA00000001.fastq.gz"
        fi

        # Now run the awk command on the decompressed file
        if $INPUT_CMD | awk '{if (NR % 4 == 1) count++} END {exit count > 10000 ? 0 : 1}'; then
            # There are more than 10,000 reads, run lrge with defaults
            lrge -t 8 -s 42 02_fastq/BA00000001.fastq.gz -o 03_assembly/03.1_hybracter/BA00000001/processing/chrom_size/BA00000001_lrge_estimated_chromosome_size.txt
        else
            # There are less than 10,000 reads, use the all-vs-all strategy with all available reads
            lrge -t 8 -s 42 -n 10000 02_fastq/BA00000001.fastq.gz -o 03_assembly/03.1_hybracter/BA00000001/processing/chrom_size/BA00000001_lrge_estimated_chromosome_size.txt
        fi

        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile 03_assembly/03.1_hybracter/BA00000001/stderr/lrge/BA00000001.log not found.

Removing output files of failed job lrge since they might be corrupted:
03_assembly/03.1_hybracter/BA00000001/processing/chrom_size/BA00000001_lrge_estimated_chromosome_size.txt
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-12-05T074308.272662.snakemake.log
WorkflowError:
At least one job did not complete successfully.
[2024:12:05 07:43:21] ERROR: Snakemake failed
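
The error above says lrge found 1 read in the FASTQ file, so one way to cross-check is to repeat the rule's awk read count by hand, outside Snakemake. The following is a minimal sketch, assuming a standard four-line-per-record FASTQ; `count_fastq_reads` is a hypothetical helper name and is not part of Hybracter or lrge.

```shell
# count_fastq_reads (hypothetical helper): print the number of records in a
# FASTQ file, using the same rule as the awk check in the lrge rule above
# (every line where NR % 4 == 1 starts a new record).
# Assumption: each record spans exactly four lines; multi-line records
# would be miscounted.
count_fastq_reads() {
    local fastq="$1"
    if [[ "$fastq" =~ \.gz$ ]]; then
        # gzip -dc behaves like zcat and is available wherever gzip is
        gzip -dc "$fastq"
    else
        cat "$fastq"
    fi | awk 'NR % 4 == 1 { count++ } END { print count + 0 }'
}

# Usage (path taken from the command above):
# count_fastq_reads 02_fastq/BA00000001.fastq.gz
```

If this prints a number far above 10,000 while lrge still reports 1, the two tools are presumably parsing the file differently, which would be worth including in the report.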

We have also attached the log file for your reference:
hybracter.log

This sample processed successfully in v0.10.1, suggesting the input file is valid. Could you help us resolve this issue?

Thank you very much!

Best regards,
Eddie

Labels: bug (Something isn't working)
