Improve regex to detect chars immediately after inline literals and roles #30

Closed
ezio-melotti opened this issue May 12, 2022 · 1 comment

Comments

@ezio-melotti
Collaborator

ezio-melotti commented May 12, 2022

From #27 (comment): some characters (such as punctuation) are allowed immediately after inline literals and roles, whereas others (alphanumerical) aren't.

The following regex could be improved to catch only the problematic characters:

for role in re.finditer("``.+?``(?!`).", paragraph_without_roles, flags=re.DOTALL):

Replacing the last . with \w seems to cause a lot of false positives though.
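To make the difference concrete, here is a small sketch that runs both variants on a few sample sentences. The sample strings and the `matches` helper are made up for illustration. The last sample shows one way the \w variant can misfire, which may or may not be the cause of the false positives seen in practice: with re.DOTALL, the lazy .+? can stretch across the closing backticks of one literal up to the opening backticks of the next.

```python
import re

# Pattern quoted above: an inline literal followed by any single character.
ANY_CHAR = re.compile(r"``.+?``(?!`).", flags=re.DOTALL)
# Variant discussed above: only flag a word character after the literal.
WORD_CHAR = re.compile(r"``.+?``(?!`)\w", flags=re.DOTALL)


def matches(pattern, text):
    """Return the text of every match of `pattern` in `text` (illustrative helper)."""
    return [m.group(0) for m in pattern.finditer(text)]


# Made-up sample sentences, not taken from the project's test suite.
punctuation_after = "Use ``foo``, then call it."
letter_after = "Use ``foo``s in a loop."
two_literals = "Use ``foo`` and ``bar``."

# The original pattern flags the valid case too, because "." accepts the comma.
print(matches(ANY_CHAR, punctuation_after))   # ['``foo``,']
# The \w variant correctly ignores it and still catches the real problem.
print(matches(WORD_CHAR, punctuation_after))  # []
print(matches(WORD_CHAR, letter_after))       # ['``foo``s']
# But with two valid literals, .+? expands past the first closing backticks
# and treats "`` and ``" as the inside of one literal followed by "b".
print(matches(WORD_CHAR, two_literals))       # ['``foo`` and ``b']
```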

@JulienPalard
Collaborator

I think this is now fixed, not because we enhanced the regex, but because all legitimate constructions are hidden from this detector via the paragraph = clean_paragraph(paragraph) call one line before the regex.

Don't hesitate to re-open if you have an example.
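For readers who want to see the general idea of hiding legitimate constructions before running the detector, here is a minimal sketch. It is not the project's clean_paragraph implementation, which is not shown in this thread: `hide_valid_literals` and `suspicious_literals` are hypothetical names, and the rule used for what counts as legitimate (a literal not touching a word character or backtick on either side) is an assumption made for the example.

```python
import re

# Hypothetical stand-in for a cleaning step; NOT the real clean_paragraph().
# Assumption: a literal is "legitimate" when it is neither preceded nor
# followed by a word character or another backtick.
VALID_LITERAL = re.compile(r"(?<![\w`])``[^`]+``(?![\w`])")
# Detector pattern quoted earlier in the thread.
DETECTOR = re.compile(r"``.+?``(?!`).", flags=re.DOTALL)


def hide_valid_literals(paragraph):
    """Remove literals that look well-formed so the detector never sees them."""
    return VALID_LITERAL.sub("", paragraph)


def suspicious_literals(paragraph):
    """Run the detector only on what is left after hiding valid literals."""
    cleaned = hide_valid_literals(paragraph)
    return [m.group(0) for m in DETECTOR.finditer(cleaned)]


print(suspicious_literals("Use ``foo`` and ``bar``."))    # []
print(suspicious_literals("Use ``foo``, then call it."))  # []
print(suspicious_literals("Use ``foo``s in a loop."))     # ['``foo``s']
```

With the valid literals removed first, the permissive trailing . in the detector no longer matters, since anything it still matches was already rejected by the stricter cleaning rule.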
