DecimalTurn
Contributor

@DecimalTurn DecimalTurn commented Feb 17, 2025

Description

Atomic groups (also called non-backtracking subexpressions or possessive matches) aren't supported in RE2. I've removed them to improve portability across libraries that use Linguist's heuristics.

Note that atomic groups can improve performance, but the risk of excessive backtracking in the cases here is minimal, and I've added other countermeasures where needed.
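
For anyone wanting to check whether other patterns still contain atomic groups, a scan along these lines could help. This is only a minimal sketch and not part of the PR: find_atomic_groups is a hypothetical helper, and it understands nothing beyond backslash escapes and bracketed character classes.

```python
def find_atomic_groups(pattern: str) -> list[int]:
    """Return the offsets of every unescaped `(?>` outside a character class."""
    offsets = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            # A `]` right after `[` or `[^` is a literal, not the end of the class.
            if pattern[i + 1 : i + 2] == "^":
                i += 1
            if pattern[i + 1 : i + 2] == "]":
                i += 1
        elif pattern.startswith("(?>", i):
            offsets.append(i)
        i += 1
    return offsets


# The INI sub-pattern quoted in this description starts with an atomic group.
print(find_atomic_groups(r"(?>[^\s\[][^\r\n]*(?:\r?\n|\r))*"))
```

An escaped parenthesis such as \(?> or a (?> sequence inside a character class is not reported, since neither opens an atomic group.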

Win32 Message File

There was no real use of atomic grouping there since \/\*\s* can't really cause any branching in this context.

INI

The original use of an atomic group in (?>[^\s\[][^\r\n]*(?:\r?\n|\r))* can prevent some backtracking when a file contains "[InternetShortcut]" but no "URL" property afterwards. However, even without the atomic grouping, the execution time remains linear, since the group has only one way to match each whole line. I've also added a limit of 20 lines within which the "URL" property must appear, which should be more than enough.
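
As a rough illustration of the non-atomic version with the 20-line limit, here is a sketch in Python. Only the per-line sub-pattern comes from the description above. The section header, the "URL=" suffix, and the names LINE, INTERNET_SHORTCUT and looks_like_internet_shortcut are assumptions made for the example, not the actual heuristic.

```python
import re

# One property line: it must not start with whitespace or "[", and it
# includes its own line ending (LF, CRLF or CR).
LINE = r"[^\s\[][^\r\n]*(?:\r?\n|\r)"

# Assumed surrounding pattern: the section header, at most 20 property
# lines (a plain bounded group instead of an atomic one), then "URL=".
INTERNET_SHORTCUT = re.compile(
    r"^\[InternetShortcut\](?:\r?\n|\r)(?:" + LINE + r"){0,20}URL=",
    re.MULTILINE,
)


def looks_like_internet_shortcut(text: str) -> bool:
    """Return True if a URL property follows the header within 20 lines."""
    return INTERNET_SHORTCUT.search(text) is not None


sample = "[InternetShortcut]\nIDList=\nURL=https://example.com\n"
print(looks_like_internet_shortcut(sample))
```

Because each property line can only be consumed in one way, a failed match gives up lines one at a time instead of retrying different splits, and the {0,20} bound caps how many lines can be given up.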

GSC

First pattern: The atomic grouping only prevented minor backtracking caused by the presence of two alternatives.

Second pattern: This was probably the most useful case of atomic grouping among those presented here. However, simply combining (?>\w+\.)*\w+ into (\w+\.)+, as proposed in this PR, already prevents the vast majority of potential backtracking.

@DecimalTurn DecimalTurn requested a review from a team as a code owner February 17, 2025 12:40
Member

@lildude lildude left a comment

LGTM. Thanks.

Important

The changes in this PR will not appear on GitHub until the next release has been made and deployed. See here for more details.

@lildude lildude added this pull request to the merge queue Feb 23, 2025
Merged via the queue into github-linguist:main with commit 5892293 Feb 23, 2025
5 checks passed
@DecimalTurn DecimalTurn deleted the atomic_group branch February 23, 2025 13:56
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jul 2, 2025