@jchchiu jchchiu commented Aug 11, 2025

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir ).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir ).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Regarding issue #1883 I have:

  • Added two params (mutect2_force_call, mutect2_force_call_tbi) to supply a path to a vcf.gz file and associated index file for force calling.
  • Added Channel to collect .vcf.gz; added Channel to collect .vcf.gz tbi (if not found will create a tbi file); added params to main workflow pipeline.
  • Added params to pipeline initialisation (raises an error if the mutect2_force_call file is not in .vcf.gz format).
  • Added TABIX creation for mutect2_force_call if user does not supply it.
  • Added params to the [somatic/tumor_only]_[ALL/mutect2] subworkflows.
  • Added params to GATK4 mutect2 module; created new command in script for force calling (mfc_command) and added to Mutect2 execution.
  • Temporarily edited some tests configs in tools to test for the force call parameter (also tests if TABIX creation works if associated index file not supplied).
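
As a rough illustration of the last two module changes (not the actual Nextflow script in the module), the optional force-calling fragment can be thought of as a string that is empty unless a VCF is supplied, and is otherwise appended to the Mutect2 call as `--alleles <vcf>`. The function names and file names below are hypothetical; only the `--alleles`, `--input`, `--reference` and `--output` flags are taken from GATK itself.

```python
from typing import List, Optional


def build_force_call_args(force_call_vcf: Optional[str]) -> str:
    """Return the extra Mutect2 argument for force calling, or '' if unset.

    Hypothetical helper that mirrors the idea of an optional 'mfc_command'
    fragment: nothing is added unless a force-call VCF was supplied.
    """
    return f"--alleles {force_call_vcf}" if force_call_vcf else ""


def build_mutect2_command(
    inputs: List[str],
    reference: str,
    output: str,
    force_call_vcf: Optional[str] = None,
) -> str:
    """Assemble an illustrative 'gatk Mutect2' command line as one string."""
    parts = ["gatk", "Mutect2"]
    parts += [f"--input {bam}" for bam in inputs]
    parts += [f"--reference {reference}", f"--output {output}"]
    extra = build_force_call_args(force_call_vcf)
    if extra:  # only append the fragment when force calling is requested
        parts.append(extra)
    return " ".join(parts)


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    print(build_mutect2_command(["tumor.bam"], "genome.fasta", "out.vcf.gz"))
    print(
        build_mutect2_command(
            ["tumor.bam"], "genome.fasta", "out.vcf.gz", "test_force_call.vcf.gz"
        )
    )
```

Keeping the fragment empty by default means existing Mutect2 runs are unchanged when the parameter is not set.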

Some questions:

  • Regarding tests:
    • For the test data I'm just thinking of creating a VCF with a few random alleles to be used to force call. Is there any other data you prefer?
    • I created test_force_call.vcf.gz which works for both tumor only and somatic mode:
##fileformat=VCFv4.2
##contig=<ID=chr21,length=46709983>
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
chr21	46664275	.	A	C	.	.	.
chr21	46664280	.	A	C	.	.	.
chr21	46664285	.	A	G	.	.	.
chr21	46664290	.	T	A	.	.	.
  • In the Mutect2 docs there is another boolean --genotype-filtered-alleles (or maybe it's now --force-call-filtered-alleles) flag that can be used in conjunction with --alleles. Is it worth it to also implement this?
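
For anyone wanting to sanity-check a hand-written force-call VCF like the one above before compressing and indexing it, a minimal sketch follows. It only checks that each record has the eight fixed tab-separated columns, that positions are sorted within a contig (tabix indexing expects position-sorted input), and that positions fall within the declared contig length. The helper names are hypothetical, and the sketch does not do the bgzip compression or tabix indexing itself.

```python
# The four test records from this comment, written with explicit tabs.
VCF_TEXT = "\n".join(
    [
        "##fileformat=VCFv4.2",
        "##contig=<ID=chr21,length=46709983>",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr21\t46664275\t.\tA\tC\t.\t.\t.",
        "chr21\t46664280\t.\tA\tC\t.\t.\t.",
        "chr21\t46664285\t.\tA\tG\t.\t.\t.",
        "chr21\t46664290\t.\tT\tA\t.\t.\t.",
    ]
) + "\n"

CHR21_LENGTH = 46709983  # taken from the ##contig header line above


def parse_records(vcf_text):
    """Return (chrom, pos, ref, alt) tuples for every non-header line."""
    records = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        fields = line.split("\t")
        if len(fields) < 8:
            raise ValueError(f"expected 8 tab-separated columns: {line!r}")
        chrom, pos, _id, ref, alt = fields[:5]
        records.append((chrom, int(pos), ref, alt))
    return records


def is_position_sorted(records):
    """True if positions never decrease between adjacent same-contig records."""
    return all(
        prev[0] != cur[0] or prev[1] <= cur[1]
        for prev, cur in zip(records, records[1:])
    )


if __name__ == "__main__":
    recs = parse_records(VCF_TEXT)
    print(len(recs), "records; sorted:", is_position_sorted(recs))
    print("within contig:", all(1 <= pos <= CHR21_LENGTH for _, pos, _, _ in recs))
```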

@jchchiu jchchiu marked this pull request as draft August 11, 2025 10:58

jchchiu commented Aug 25, 2025

Also, while I was editing ./subworkflows/local/samplesheet_to_channel/main.nf, I had a small question about bcftools annotate here.

// Fails when bcftools annotate is used but no files are supplied
if (tools && tools.split(',').contains('bcfann') && !(bcftools_annotations && bcftools_annotations_tbi && bcftools_header_lines)) {
    error("Please specify --bcftools_annotations, --bcftools_annotations_tbi, and --bcftools_header_lines, when using BCFTools annotations")
}
  • In the ./main.nf NFCORE_SAREK workflow here, the PREPARE_GENOME workflow is called for bcftools_annotation to create a bcftools_annotation_tbi file if it doesn’t exist.
  • However, in the main workflow here the PIPELINE_INITIALISATION workflow (and thus, ./subworkflows/local/samplesheet_to_channel/main.nf) is called before this to create a samplesheet to be passed into the NFCORE_SAREK workflow.
  • Thus, if a user has to specify a bcftools_annotations_tbi file in the first place for the pipeline to run, the TABIX call for bcftools_annotation is redundant.

Just a nitpick, but should either the !bcftools_annotations_tbi requirement, or the code for the TABIX creation of bcftools_annotation be removed?

If this is of interest and doesn't disrupt too much, I can open another PR with one of these updates.
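
To make the two options concrete, here is a small sketch of the check logic in Python (the real check is the Groovy condition quoted above; the function names are hypothetical). The first function reproduces the current condition, which fails whenever the tbi is missing. The second is one possible relaxed variant that would let the pipeline go on to create the index itself.

```python
def uses_bcfann(tools):
    """True if 'bcfann' is among the comma-separated tools."""
    return bool(tools) and "bcfann" in tools.split(",")


def current_check_fails(tools, annotations, annotations_tbi, header_lines):
    """Mirror of the quoted condition: all three files are required."""
    return uses_bcfann(tools) and not (
        annotations and annotations_tbi and header_lines
    )


def relaxed_check_fails(tools, annotations, header_lines):
    """Possible alternative: the tbi is optional and can be built later."""
    return uses_bcfann(tools) and not (annotations and header_lines)


if __name__ == "__main__":
    # Annotations and header lines supplied, but no index (placeholder names).
    args = ("bcfann", "annotations.vcf.gz", "header_lines.txt")
    print("current check fails:", current_check_fails(args[0], args[1], None, args[2]))
    print("relaxed check fails:", relaxed_check_fails(*args))
```

With the current condition the missing-index case errors out during initialisation, so the index creation step is never reached for it. With the relaxed variant that step becomes useful again.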

@jchchiu jchchiu marked this pull request as ready for review August 27, 2025 03:24

@FriederikeHanssen FriederikeHanssen left a comment

This looks good already. I added some comments on where changes should be handled instead.

What we are missing are updates to the output documentation of Mutect2.

pon_tbi = "${params.modules_testdata_base_path}/genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz.tbi"
ngscheckmate_bed = "${params.modules_testdata_base_path}/genomics/homo_sapiens/genome/chr21/germlineresources/SNP_GRCh38_hg38_wChr.bed"
mutect2_force_call = "${projectDir}/testing/test_force_call.vcf.gz"
// mutect2_force_call_tbi = "${projectDir}/testing/test_force_call.vcf.gz.tbi"

We should probably have one test that checks things work with a supplied tbi file.

@jchchiu jchchiu force-pushed the add_mutect2_force_calling branch from 6e49ded to f1f88fb Compare September 15, 2025 05:44

jchchiu commented Sep 15, 2025

Following "remove validation error; handled by schema", I have:

  • Removed the validation error for samplesheets (now handled in the schema) and removed the temporary tests
  • Updated the gatk4/mutect module using nf-core and any associated subworkflows
  • Updated output documentation and README
  • Added a new test config specific to mutect2 to test the parameter, but unsure if this is the best way to add a test for it (maybe change the test to tumoronly?). As of Add config for readgroups to fq2bam #1939, I've been running with nextflow run . -profile test,tools_somatic_mutect2,mutect,docker --tools mutect2 --outdir results -step variant_calling

Sorry I also messed up the commit history, should have merged instead of rebased. The previous comments are up to:
temp: added param to config tests; testing TABIX creation

Can't we add the params we need to the tests?

@jchchiu jchchiu Sep 18, 2025

Hey, I've tried two ways to do this, could you have a look at which one you prefer?

jchchiu#2: This is what makes the most sense to me: it adds test scenarios and updates the snapshot. Is this what you had in mind?

jchchiu#3: This just updates the somatic and tumor_only configs to include the parameter, but I believe it changes all mutect2 tests to always run with the force_call parameter, which is not ideal.

2nd is definitely best for me

@jchchiu jchchiu force-pushed the add_mutect2_force_calling branch from d1daaf8 to dcb320d Compare September 17, 2025 09:15
@nf-core-bot

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.3.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

jchchiu commented Nov 13, 2025

@nf-core-bot fix linting

jchchiu commented Nov 13, 2025

Heya, I've seen that 'Prepare genome refactor #1979' has been merged so I've updated my PR.

  • The linting error is from the updated mutect2 module; haven't updated the module because of the extra ', path(gzi)' in the updated module that I'm guessing should be updated in a different PR

  • I think everything should be good, just wondering whether there are any issues with the new snapshots for mutect2 using newer nf-test and nextflow versions? ( "nf-test": "0.9.3", "nextflow": "25.10.0" vs "nf-test": "0.9.2", "nextflow": "25.04.6")

  • Also want to confirm that there is no need for a mutect2_force_call input in the SAMPLESHEET_TO_CHANNEL script; if so I'll also do another quick PR after to update Add mutect2 force calling using '--alleles' flag #1952 (comment) (also removing the bcftools_annotations_tbi // Path: bcftools annotations tbi input since it'll no longer be used in the script)
