Release PR 1.0.0 #16
Documentation update as required in the release PR
Thank you again for the super useful review! Important suggestions for improvement, such as:
are noted and will be implemented in a minor release to be prepared soon.
jfy133
left a comment
Output and input documentation is much improved. I think the only thing missing now is a review of the parameters: for example, in your test profiles you use fasta but the parameter documentation only shows genome...? Also, you've added help info to the trimgalore trimming parameters but these are still hidden, as are useful things such as bwaindex/bwamem2index etc.
<!-- TODO nf-core: Fill in short bullet-pointed list of the default steps in the pipeline -->
<p align="center">
<img src="docs/images/hgtseq_pipeline_metromap.png" alt="nf-core/circdna metromap" width="70%">
Ok some feedback on this, but this is much clearer now I think:
- Neon green isn't great; it's very hard to read on a white background. I would suggest using the same grey as 'pre-aligned' after the bam.
- Are you missing an input FASTA route for the alignment step?
intsingle = read_tsv(integrations, col_names = c("read_name", "chr", "position"))
data_integrations = rbind(
data_integrations,
cbind(
For next release
docs/usage.md
Outdated
### `--genome`

The user must specify the genome of interest. A list of genomes is available in the pipeline under the folder `conf/igenomes.config`, that contains illumina iGenomes reference file paths
Or is it --fasta? or do you support both?
see above.
I prefer the usage to be simplified and handled through the iGenomes config.
In some of our tests where the genomes are not in iGenomes, we created a config following the nf-core iGenomes page and still passed only the --genome param to our pipeline.
Co-authored-by: James A. Fellows Yates <[email protected]>
Hi @jfy133, just to make sure I've replied to the correct comment, I'll probably repeat myself here.
small changes in docs
added short guideline to genome parameter
Forgot to fix metromap as suggested in release PR review
Hi there,
Spoken in Slack (unfortunately before the merge): I will do one last check before we do the actual release.
jfy133
left a comment
Last few things:
- README introduction: the pipeline background is still maybe a bit too detailed. Now it's not Friday, what about:
  nf-core/hgtseq is a bioinformatics best-practice analysis pipeline built to investigate horizontal gene transfer from NGS data. The pipeline uses metagenomic classification of paired-read alignments against a reference genome to identify discrepancies in species assignment within read pairs, in order to identify potential integration sites into the host genome.
  This will also make the pipeline summary a bit clearer, I think.
- README: The metromap doesn't fully work in dark mode (but not important).
- USAGE: Does the pipeline work with single-end reads? My understanding is that it requires pairs, in which case the first line of `Input Formats` is incorrect.
- USAGE/Nextflow_schema: make sure you mention in both places that the krona/kraken2 databases can be tar.gz'd (currently only in parameters).
- We have broken markdown links under 'pipeline arguments' https://nf-co.re/hgtseq/dev/usage#pipeline-arguments, likely due to the backticks; I would remove this formatting.
- citations.md: missing bamtools (gh repo only) and bwamem2.
- JSON schema: given you have a fixed list of aligners, you should make this an `enum` (press the cog next to the parameter when in the schema build). I think you also say you support `aln`, so that should be added?
Otherwise once these are in I can give you a retroactive ✅ and we can complete the rest of the release @carpanz @lescai
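For reference on the `enum` suggestion above, here is a minimal sketch of what restricting a parameter to a fixed list means in practice. The allowed values (`bwa-mem`, `bwa-mem2`, `aln`) and the helper name `is_allowed` are assumptions made up for illustration; they are not copied from the pipeline's `nextflow_schema.json`, and real validation is done by the nf-core tooling rather than by code like this.

```python
# Hypothetical schema fragment for an aligner parameter.
# The value list is an assumption for illustration only.
ALIGNER_SCHEMA = {
    "type": "string",
    "enum": ["bwa-mem", "bwa-mem2", "aln"],
}


def is_allowed(value, schema=ALIGNER_SCHEMA):
    """Return True only if value is a string listed in the schema's enum."""
    return isinstance(value, str) and value in schema["enum"]


if __name__ == "__main__":
    for candidate in ("bwa-mem2", "aln", "bowtie2"):
        status = "accepted" if is_allowed(candidate) else "rejected"
        print(f"{candidate}: {status}")
```

With an `enum` in place, a value outside the list is rejected up front instead of failing later in the workflow.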
small adjustment to otherwise wonderful summary :)
This is a PR to make the first release.
We have successfully tested the pipeline on eight 10-exome datasets of different species (2 Human cohorts, Mouse, Macaca Mulatta, Macaca Fascicularis, Pan Paniscus, Bos Taurus and Canis Familiaris) and on an additional larger dataset of 88 Human exomes.
PR checklist
- `nf-core lint`).
- `nextflow run . -profile test,docker --outdir <OUTDIR>`).
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).