AlexHoratio
Contributor

Hi!! I can't say I completely understand how Nextflow works, but I was getting this fatal error, only running one sample in my samplesheet:

Apr-28 12:48:50.050 [Actor Thread 8] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_TAXPROFILER:TAXPROFILER:PROFILING:KRAKENUNIQ_PRELOADEDKRAKENUNIQ (microbialdb|1)'

Caused by:
  assert sequences.size() == prefixes.size()
         |         |         |        |
         |         7         |        1
         |                   ["RAMAN_R003_R003.se"]
         [sequences/scratch, sequences/s.2010961, sequences/taxprofiler, sequences/work, sequences/28, sequences/e0e6667e4f763ccf7733cd7cc7e474, sequences/RAMAN_R003_R003.unmapped_other.fastq.gz] -- Check script '/home/s.2010961/.nextflow/assets/nf-core/taxprofiler/./workflows/../subworkflows/local/../../modules/nf-core/krakenuniq/preloadedkrakenuniq/main.nf' at line: 47

With more than one sample, I was having a "file name collision" error, probably because it was splitting at every '/' and doing that to each sample in the samplesheet.

Placing [] around reads before it gets bundled with prefixes, sorted, placed into batches, etc. stops the pathname from being broken apart.

I'm not sure if anyone else has encountered this bug, but this has fixed the pipeline for me! I can now run KrakenUniq successfully, with either one or multiple samples.

My samplesheet looks like this:

sample,run_accession,instrument_platform,fastq_1,fastq_2,fasta
RAMAN_R003,R003,OXFORD_NANOPORE,/scratch/scw1409/RAMAN/RAMAN_R003/RAMAN_R003.fastq.gz,,

databases.csv:

tool,db_name,db_params,db_type,db_path
krakenuniq,microbialdb,,long,/scratch/scw1409/dbs/KrakenUniq

I hope this is helpful. If it's a unique (hehe) error just for me (a.k.a., if I have been doing something wrong), please let me know!!

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@github-actions

This PR is against the master branch ❌

  • Do not close this PR
  • Click Edit and change the base to dev
  • This CI test will remain failed until you push a new commit

Hi @AlexHoratio,

It looks like this pull-request has been made against the AlexHoratio/taxprofiler master branch.
The master branch on nf-core repositories should always contain code from the latest release.
Because of this, PRs to master are only allowed if they come from the AlexHoratio/taxprofiler dev branch.

You do not need to close this PR, you can change the target branch to dev by clicking the "Edit" button at the top of this page.
Note that even after this, the test will continue to show as failing until you push a new commit.

Thanks again for your contribution!

@AlexHoratio AlexHoratio changed the base branch from master to dev April 28, 2025 12:48
@AlexHoratio AlexHoratio reopened this Apr 28, 2025
@jfy133
Member

jfy133 commented Apr 28, 2025

Oops, yes I see the issue - we never considered running a single sample 😅

So yeah this is a combination of having to 'batch' KrakenUniq runs into multiple samples, and a really annoying behaviour of Java/Groovy that if you give .size() on a string it'll automatically just split the string on / and count 🙄
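A minimal sketch of the likely mechanism, assuming reads reaches the module as a java.nio.file.Path rather than a plain string: Path implements Iterable<Path>, so any code that iterates or counts it like a collection sees one element per path component. This is plain Java, not the module's Groovy code; the class name, helper method, and example path are hypothetical, and List.of(reads) only stands in for the [reads] wrapping from this PR.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical demo class; not part of nf-core/taxprofiler.
public class PathIterationDemo {

    // Counts elements the way generic "treat this as a collection" code would.
    public static int countElements(Iterable<?> items) {
        int count = 0;
        for (Object ignored : items) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up path with five name elements.
        Path reads = Paths.get("/scratch/work/28/e0e666/sample.fastq.gz");

        // A bare Path is iterated component by component.
        System.out.println(countElements(reads));          // prints 5

        // Wrapped in a list, the same Path is a single element.
        System.out.println(countElements(List.of(reads))); // prints 1
    }
}
```

With a single sample there is only one bare path, so it gets split into its components; with several samples the reads presumably already arrive as a list, which would explain why the problem only showed up here.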

I think this PR makes sense, will check tests once they've completed :)

@jfy133
Member

jfy133 commented Apr 28, 2025

But don't forget changelog @AlexHoratio !

@AlexHoratio
Contributor Author

Wonderful, thank you!! 😄 I just updated the changelog.

Member

@jfy133 jfy133 left a comment

I tested the current -r dev with -profile test, which gave the expected ~6 samples, then this PR with -profile test, which also gave the expected ~6 samples, and finally this PR with a samplesheet containing just a single entry, which produced one file as expected. So all good from me :)

@jfy133 jfy133 merged commit 88d3aa7 into nf-core:dev Apr 29, 2025
15 of 26 checks passed
@jfy133
Member

jfy133 commented Apr 29, 2025

Thank you @AlexHoratio !
