
Conversation

lildude
Member

@lildude lildude commented Jan 10, 2025

Description

GitHub's search no longer uses go-enry and instead uses an internally developed library for language detection. That library still feeds off Linguist, so the same delays and limitations apply.

This PR updates the docs to reflect that we no longer use go-enry.

Checklist:

N/A

@lildude lildude requested a review from a team as a code owner January 10, 2025 09:54
@lildude lildude merged commit 09880c7 into main Jan 10, 2025
8 checks passed
@lildude lildude deleted the lildude/update-docs-no-go-enry branch January 10, 2025 10:01
@DecimalTurn
Contributor

Can we know what language is used for the internal library? It would help make sure that the regex syntax we use in Linguist is compatible.

@lildude
Member Author

lildude commented Jan 13, 2025

It's written in Rust.

@DecimalTurn
Contributor

DecimalTurn commented Feb 16, 2025

Is it the project mentioned here?

I'm asking because I would like to confirm what engine is used for regex patterns. For instance, if they use this implementation of regex, they would have no support for possessive quantifiers (and most non-RE2 regex patterns) as discussed here, which means we should probably try to avoid or remove them from the heuristics in Linguist.
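
To illustrate the concern, here is a minimal sketch of how patterns could be scanned for constructs that RE2-style engines generally reject. It assumes the engine has no possessive quantifiers, atomic groups, lookaround, or backreferences. The function name `find_unsupported` is hypothetical and not part of Linguist, and the scanner is deliberately simplified: it only recognises `*+`, `++` and `?+` as possessive, ignores brace quantifiers, and does not treat a leading `]` inside a character class as a literal.

```python
def find_unsupported(pattern: str) -> list[str]:
    """Return labels for constructs an RE2-style engine would likely reject.

    Hypothetical helper for illustration only; not part of Linguist.
    """
    found: list[str] = []
    i, n = 0, len(pattern)
    in_class = False          # currently inside a [...] character class
    after_quantifier = False  # previous token was a plain *, + or ? quantifier
    while i < n:
        ch = pattern[i]
        if ch == "\\":
            nxt = pattern[i + 1] if i + 1 < n else ""
            is_backref = (nxt.isdigit() and nxt != "0") or pattern.startswith("k<", i + 1)
            if not in_class and is_backref:
                found.append("backreference")
            i += 2  # skip the escaped character
            after_quantifier = False
        elif in_class:
            in_class = ch != "]"
            i += 1
        elif ch == "[":
            in_class = True
            after_quantifier = False
            i += 1
        elif ch == "(" and pattern.startswith("?", i + 1):
            rest = pattern[i + 2 : i + 4]
            if rest.startswith(">"):
                found.append("atomic group")
            elif rest.startswith(("=", "!")) or rest in ("<=", "<!"):
                found.append("lookaround")
            i += 2  # skip "(?" so the "?" is not mistaken for a quantifier
            after_quantifier = False
        elif ch in "*+?":
            if after_quantifier and ch == "+":
                found.append("possessive quantifier")
            # A suffix (+ or ?) closes the quantifier; otherwise a new one starts.
            after_quantifier = not after_quantifier
            i += 1
        else:
            after_quantifier = False
            i += 1
    return found


if __name__ == "__main__":
    for p in [r"^\s*+#include", r"\++", r"(?>a|b)(?<=x)\1"]:
        print(p, find_unsupported(p))
```

A scan like this can only flag candidates for review. Whether a given pattern is actually accepted still has to be checked against the engine itself.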

@lildude
Member Author

lildude commented Feb 17, 2025

> Is it the project mentioned here?

Yup.

> I'm asking because I would like to confirm what engine is used for regex patterns. For instance, if they use this implementation of regex, they would have no support for possessive quantifiers (and most non-RE2 regex patterns) as discussed here, which means we should probably try to avoid or remove them from the heuristics in Linguist.

That's the implementation that is used, and you raise a good point. Thanks, and thanks for #7238.

If you've got an urge to fix more regexes, we have several regexes that need fixing as they don't run linearly or are vulnerable to ReDoS. I've started main...lildude/linear-regex-redos to add a test and clean them up as and when I have the time, so feel free to continue cleaning up our regexes.
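
As a rough illustration of the kind of pattern in question, the sketch below flags one classic ReDoS shape: a group repeated with `*` or `+` whose body itself contains `*` or `+`, as in `(a+)+`. This is an assumption-laden heuristic, not the test from the branch mentioned above. The name `has_nested_unbounded_quantifier` is hypothetical, open-ended brace quantifiers such as `{2,}` are ignored, and both false positives and misses (for example overlapping alternations) are expected.

```python
def has_nested_unbounded_quantifier(pattern: str) -> bool:
    """Heuristic: True if a group repeated with * or + itself contains * or +.

    Hypothetical helper for illustration only; not part of Linguist.
    """
    # One entry per open group: has a * or + been seen inside it so far?
    stack: list[bool] = []
    i, n = 0, len(pattern)
    in_class = False  # currently inside a [...] character class
    while i < n:
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if in_class:
            in_class = ch != "]"
        elif ch == "[":
            in_class = True
        elif ch == "(":
            stack.append(False)
        elif ch == ")" and stack:
            inner = stack.pop()
            repeated = i + 1 < n and pattern[i + 1] in "*+"
            if inner and repeated:
                return True
            # An inner quantifier also counts as being inside the enclosing group.
            if stack and inner:
                stack[-1] = True
        elif ch in "*+" and stack:
            stack[-1] = True
        i += 1
    return False


if __name__ == "__main__":
    for p in [r"(a+)+$", r"(?:\s*\w+)*;", r"(a|b)+", r"(a+)b"]:
        print(p, has_nested_unbounded_quantifier(p))
```

A static check like this is only a first filter. Confirming that a regex runs in linear time needs the engine's own analysis or a measurement on adversarial input.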

@DecimalTurn
Contributor

@lildude Thanks for confirming and yes, I do have the intention to work on some more regex adjustments. I actually have some more changes that I was going to submit PRs for, so I'll get on with it.

@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jul 2, 2025