
RiddleHe

We found a bug in extraction_utils.py: the code block is taken to be whatever lies between the last pair of triple-backtick fences, as handled by this logic:

return "\n".join(outputlines[indexlines[-2] + 1 : indexlines[-1]])

However, this logic is fragile: whenever the model places a text summary in its own fenced markdown block after the solution code, the summary is extracted instead of the Python code.
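
A minimal reproduction of the failure is sketched below. It assumes the surrounding function splits the model output into lines and records the index of every line containing a fence; the function name `extract_last_block` and the sample output are hypothetical, and only the final return expression is taken from the snippet above.

```python
FENCE = "`" * 3  # triple backtick, built programmatically to keep this example readable


def extract_last_block(model_output: str) -> str:
    """Return the content between the last two fence lines (the behaviour described above)."""
    outputlines = model_output.split("\n")
    # Indices of every line that contains a fence marker (assumed bookkeeping).
    indexlines = [i for i, line in enumerate(outputlines) if FENCE in line]
    if len(indexlines) < 2:
        return ""
    return "\n".join(outputlines[indexlines[-2] + 1 : indexlines[-1]])


# Hypothetical model output: solution code first, then a fenced summary.
sample_output = "\n".join([
    "Here is the solution:",
    FENCE + "python",
    "print(sum(map(int, input().split())))",
    FENCE,
    "Summary:",
    FENCE + "markdown",
    "The program reads integers and prints their sum.",
    FENCE,
])

extracted = extract_last_block(sample_output)
print(extracted)  # prints the summary sentence, not the solution code
```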

To fix this, we changed the extraction logic to look specifically for the last fenced block whose opening fence carries a python tag, ensuring that Python code is extracted. Only when no such block is found do we fall back to the original implementation.
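
The idea can be sketched roughly as follows. This is an illustration under assumptions, not the exact diff: the function name `extract_code`, the case-insensitive tag match, and the handling of an unclosed fence are all hypothetical choices.

```python
FENCE = "`" * 3  # triple backtick, built programmatically to keep this example readable


def extract_code(model_output: str) -> str:
    """Prefer the last python-tagged block; otherwise use the last fenced block."""
    outputlines = model_output.split("\n")
    fence_idx = [i for i, line in enumerate(outputlines) if FENCE in line]

    # Opening fences that carry a python tag (case-insensitive, an assumption).
    python_starts = [
        i for i in fence_idx
        if outputlines[i].strip().lower().startswith(FENCE + "python")
    ]
    if python_starts:
        start = python_starts[-1]
        closing = [i for i in fence_idx if i > start]
        if closing:
            return "\n".join(outputlines[start + 1 : closing[0]])
        # An unclosed python fence falls through to the original logic below.

    # Fallback: original behaviour, content between the last two fence lines.
    if len(fence_idx) < 2:
        return ""
    return "\n".join(outputlines[fence_idx[-2] + 1 : fence_idx[-1]])


# Hypothetical outputs used to exercise both paths.
with_summary = "\n".join([
    FENCE + "python",
    "print(sum(map(int, input().split())))",
    FENCE,
    FENCE + "markdown",
    "The program reads integers and prints their sum.",
    FENCE,
])
untagged = "\n".join([FENCE, "x = 1", FENCE])

tagged_result = extract_code(with_summary)
fallback_result = extract_code(untagged)
```

With these inputs, the first call selects the python-tagged block even though a summary block follows it, and the second call takes the fallback path because no python tag is present.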

@Naman-ntc
Contributor

Thanks, but I do not think relying on a python identifier is a robust solution for this. I recommend using a custom extraction format to ensure reliability.
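
One possible reading of that suggestion, sketched under assumptions: the prompt would instruct the model to wrap its final answer in dedicated sentinel markers, and extraction would look only for those. The markers `[BEGIN_CODE]` and `[END_CODE]` and the function name `extract_delimited` are hypothetical and not part of the repository.

```python
import re

# Hypothetical sentinel markers the prompt would ask the model to emit.
BEGIN, END = "[BEGIN_CODE]", "[END_CODE]"

# Non-greedy match of everything between a BEGIN/END pair, across lines.
_PATTERN = re.compile(
    re.escape(BEGIN) + r"\n?(.*?)\n?" + re.escape(END),
    re.DOTALL,
)


def extract_delimited(model_output: str) -> str:
    """Return the content of the last BEGIN/END pair, or "" if none is found."""
    matches = _PATTERN.findall(model_output)
    return matches[-1] if matches else ""


# Hypothetical model output following the custom format.
delimited_output = "\n".join([
    "Here is my answer.",
    BEGIN,
    "a = int(input())",
    "print(a * 2)",
    END,
    "Summary: doubles the input.",
])

delimited_result = extract_delimited(delimited_output)
```

Because the markers do not occur in ordinary markdown, a trailing fenced summary would not be mistaken for the solution, at the cost of requiring the prompt and the extractor to agree on the format.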
