Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Conversation

althonos
Copy link
Contributor

@althonos althonos commented Jan 29, 2023

Hi, one final PR πŸ˜ƒ

In the test sequence I used for #100 I noticed the following bug: after reverse-complementing a sequence, the RBS spacer for one of the predicted gene was changing when the contig was reverse-complemented:

Forward:
CAKWEX010000332.1	Prodigal_v2.6.3	CDS	520	2178	298.9	-	0	ID=1_1;partial=00;start_type=ATG;rbs_motif=AGGA;rbs_spacer=5-10bp;gc_cont=0.636;conf=99.99;score=298.87;cscore=285.76;sscore=13.11;rscore=7.85;uscore=1.61;tscore=3.65;
CAKWEX010000332.1	Prodigal_v2.6.3	CDS	2250	2852	105.5	-	0	ID=1_2;partial=00;start_type=GTG;rbs_motif=AGGAG;rbs_spacer=5-10bp;gc_cont=0.580;conf=100.00;score=105.47;cscore=95.43;sscore=10.04;rscore=13.75;uscore=1.16;tscore=-4.87;
CAKWEX010000332.1	Prodigal_v2.6.3	CDS	2936	3754	140.5	-	0	ID=1_3;partial=00;start_type=ATG;rbs_motif=GGA/GAG/AGG;rbs_spacer=5-10bp;gc_cont=0.601;conf=100.00;score=140.50;cscore=133.74;sscore=6.75;rscore=2.03;uscore=1.08;tscore=3.65;
Reverse-complemented:
CAKWEX010000332.1_r	Prodigal_v2.6.3	CDS	12	830	136.1	+	0	ID=1_1;partial=00;start_type=ATG;rbs_motif=GGA/GAG/AGG;rbs_spacer=3-4bp;gc_cont=0.601;conf=100.00;score=136.12;cscore=133.74;sscore=2.37;rscore=-2.35;uscore=1.08;tscore=3.65;
CAKWEX010000332.1_r	Prodigal_v2.6.3	CDS	914	1516	105.5	+	0	ID=1_2;partial=00;start_type=GTG;rbs_motif=AGGAG;rbs_spacer=5-10bp;gc_cont=0.580;conf=100.00;score=105.47;cscore=95.43;sscore=10.04;rscore=13.75;uscore=1.16;tscore=-4.87;
CAKWEX010000332.1_r	Prodigal_v2.6.3	CDS	1588	3246	298.9	+	0	ID=1_3;partial=00;start_type=ATG;rbs_motif=AGGA;rbs_spacer=5-10bp;gc_cont=0.636;conf=99.99;score=298.87;cscore=285.76;sscore=13.11;rscore=7.85;uscore=1.61;tscore=3.65;

Indeed, the gene with the GGA/GAG/AGG RBS motif has a spacer detected as 3-4bp when on the forward strand, and 5-10bp on the reverse strand. The contig in question starts with the following sequence:

GGATAGGCCCCATG...

so it has both a match in the 3-4bp range (AGG) and in the 5-10bp range (GGA), but since the 5-10bp spacer has a higher score it should be the one to be selected. This actually matters on the gene score, so it could cause some predictions to change.

The problem was coming from the loops in rbs_score which skip some positions before index 0; however, when there may be a partial match (as it is the case here, with a GGA motif right on the contig edge), the positions should not be skipped, and the decision to ignore some positions should be made by the shine_dalgarno_exact and shine_dalgarno_mm functions directly.

After applying the patch, the predictions are consistent independent of the directionality of the contig, the RBS spacers and hence the gene scores match:

Forward:
CAKWEX010000332.1	Prodigal_v2.6.3	CDS	520	2178	298.9	-	0	ID=1_1;partial=00;start_type=ATG;rbs_motif=AGGA;rbs_spacer=5-10bp;gc_cont=0.636;conf=99.99;score=298.87;cscore=285.76;sscore=13.11;rscore=7.85;uscore=1.61;tscore=3.65;
CAKWEX010000332.1	Prodigal_v2.6.3	CDS	2250	2852	105.5	-	0	ID=1_2;partial=00;start_type=GTG;rbs_motif=AGGAG;rbs_spacer=5-10bp;gc_cont=0.582;conf=100.00;score=105.47;cscore=95.43;sscore=10.04;rscore=13.75;uscore=1.16;tscore=-4.87;
CAKWEX010000332.1	Prodigal_v2.6.3	CDS	2936	3754	140.5	-	0	ID=1_3;partial=00;start_type=ATG;rbs_motif=GGA/GAG/AGG;rbs_spacer=5-10bp;gc_cont=0.602;conf=100.00;score=140.50;cscore=133.74;sscore=6.75;rscore=2.03;uscore=1.08;tscore=3.65;
Reverse-complemented:
CAKWEX010000332.1_r	Prodigal_v2.6.3	CDS	12	830	140.5	+	0	ID=1_1;partial=00;start_type=ATG;rbs_motif=GGA/GAG/AGG;rbs_spacer=5-10bp;gc_cont=0.601;conf=100.00;score=140.50;cscore=133.74;sscore=6.75;rscore=2.03;uscore=1.08;tscore=3.65;
CAKWEX010000332.1_r	Prodigal_v2.6.3	CDS	914	1516	105.5	+	0	ID=1_2;partial=00;start_type=GTG;rbs_motif=AGGAG;rbs_spacer=5-10bp;gc_cont=0.580;conf=100.00;score=105.47;cscore=95.43;sscore=10.04;rscore=13.75;uscore=1.16;tscore=-4.87;
CAKWEX010000332.1_r	Prodigal_v2.6.3	CDS	1588	3246	298.9	+	0	ID=1_3;partial=00;start_type=ATG;rbs_motif=AGGA;rbs_spacer=5-10bp;gc_cont=0.636;conf=99.99;score=298.87;cscore=285.76;sscore=13.11;rscore=7.85;uscore=1.61;tscore=3.65;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant