feat: add Haxe language support via tree-sitter-haxe #1307
Open
mallyskies wants to merge 2 commits into
- extract_haxe(): extracts classes, interfaces, enums, enum abstracts, typedefs, and functions from .hx files using tree-sitter-haxe grammar
- _haxe_recover_scattered(): fallback parser for files where the grammar produces scattered tokens instead of proper declaration nodes
- CR/CRLF normalization before parsing (handles old Mac \r-only files)
- detect.py: register .hx extension → Haxe language
- pyproject.toml: add haxe optional dep group; add tree-sitter-haxe to all

Tested against 5,490 .hx files; 2 empty files (both legitimately all-commented-out). Produces 82,867 nodes and 98,717 edges.
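The line-ending normalization mentioned in the commit message can be sketched as below. This is a minimal illustration, not the code from this PR: the helper name `normalize_newlines` is hypothetical, and it assumes the source is handled as raw bytes before being passed to the parser.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and lone-CR line endings to LF.

    Hypothetical helper for illustration; the PR's actual function name
    and call site are not shown in this conversation.
    """
    # Order matters: collapse CRLF first, otherwise the lone-CR pass
    # would turn each "\r\n" into two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    legacy = b"// old Mac comment\rclass Foo {}\r"
    print(normalize_newlines(legacy))
```

One consequence worth keeping in mind: line numbers are preserved, but byte offsets shift for CRLF files because each two-byte terminator becomes one byte.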
Note on parse quality
This PR depends on
Both are fixed in this pending PR to the grammar repo:
Once that is merged and a new
- README.md: add .hx to supported languages table (36 → 37 grammars)
- CHANGELOG.md: add Unreleased entry for Haxe support
- tests/fixtures/sample.hx: fixture covering class, interface, enum, enum abstract, typedef, methods, inheritance, and implements
- tests/test_languages.py: 9 tests for extract_haxe(); skipped when tree-sitter-haxe is not installed (mirrors [dm] skip pattern)
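The skip-when-missing behaviour described for the tests could look roughly like the sketch below. It uses only the standard library and is not taken from tests/test_languages.py. Two things here are assumptions: that the tree-sitter-haxe package installs an importable module named `tree_sitter_haxe`, and the class and test names, which are hypothetical.

```python
import importlib.util
import unittest

# Assumption: the PyPI package "tree-sitter-haxe" provides a module
# importable as "tree_sitter_haxe".
HAS_HAXE = importlib.util.find_spec("tree_sitter_haxe") is not None


@unittest.skipUnless(HAS_HAXE, "tree-sitter-haxe is not installed")
class HaxeExtractionTests(unittest.TestCase):
    """Hypothetical test class; skipped as a whole when the grammar is absent."""

    def test_grammar_is_importable(self):
        module = importlib.import_module("tree_sitter_haxe")
        self.assertIsNotNone(module)
```

Guarding at class level keeps the optional dependency out of the default test run while still exercising the Haxe tests wherever the grammar is installed.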
Adds Haxe (.hx) as a supported language for AST extraction.
What this does:
- detect.py: registers .hx in CODE_EXTENSIONS
- extract.py: adds extract_haxe(), which extracts classes, interfaces, enums, enum abstracts, typedefs, and functions from .hx files using the tree-sitter-haxe grammar
- extract.py: adds _haxe_recover_scattered(), a fallback for files where the grammar emits scattered tokens instead of proper declaration nodes (minified files, unsupported preprocessor patterns)
- pyproject.toml: adds haxe = ["tree-sitter-haxe"] optional dep group; adds tree-sitter-haxe to the all group

Implementation notes:
- CR/CRLF line endings are normalized before parsing. The codebase being tested against contains legacy files with \r-only Mac line endings, which would cause the // comment rule to run to EOF.
- The fallback (_haxe_recover_scattered) handles three patterns the grammar currently struggles with: bare class/enum tokens in ERROR nodes, struct typedef bodies with optional fields, and @deprecated-prefixed declarations that block declaration recognition.
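To give a feel for what a recovery pass has to find, here is a rough text-level sketch. It is not the implementation of _haxe_recover_scattered, which per the notes above deals with ERROR nodes in the parse tree. The regex, the function name `recover_declarations`, and the sample input are all hypothetical, and a plain text scan like this would also match keywords inside comments or string literals.

```python
import re

# Hypothetical pattern: a Haxe type keyword followed by an identifier.
# "enum abstract" is listed first so it takes priority over plain "enum".
_DECL_RE = re.compile(
    r"\b(enum\s+abstract|class|interface|enum|typedef)\s+([A-Za-z_]\w*)"
)


def recover_declarations(source: str) -> list[tuple[str, str]]:
    """Return (kind, name) pairs found by a plain text scan."""
    found = []
    for match in _DECL_RE.finditer(source):
        kind = " ".join(match.group(1).split())  # collapse inner whitespace
        found.append((kind, match.group(2)))
    return found


if __name__ == "__main__":
    sample = "@deprecated class Foo {}\ntypedef Point = { ?x:Int }\n"
    print(recover_declarations(sample))
```

A tree-based fallback can avoid the false positives of this approach by restricting the search to the token ranges the parser flagged as errors.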
Tested against 6,978 Haxe source files with zero parse errors.
Produces 73,419 nodes and 88,084 edges.
Dependency:
Requires pip install tree-sitter-haxe (available on PyPI). Install with:
pip install "graphifyy[haxe]"