feat: add Haxe language support via tree-sitter-haxe #1307
Note on parse quality This PR depends on
Both are fixed in this pending PR to the grammar repo: Update: to correct the above — there's actually no PyPI release of |
Thanks @mallyskies — the Haxe extractor itself is well-built (follows the
Nice-to-have: add |
f3b0361 to 63ee262
- extract_haxe(): extracts classes, interfaces, enums, enum abstracts, typedefs, and functions from .hx files using tree-sitter-haxe grammar
- _haxe_recover_scattered(): fallback parser for files where the grammar produces scattered tokens instead of proper declaration nodes
- CR/CRLF normalization before parsing (handles old Mac \r-only files)
- detect.py: register .hx extension → Haxe language
- pyproject.toml: add haxe optional dep group; add tree-sitter-haxe to all

Tested against 5,490 .hx files; 2 empty files (both legitimately all-commented-out). Produces 82,867 nodes and 98,717 edges.
- README.md: add .hx to supported languages table (36 → 37 grammars)
- CHANGELOG.md: add Unreleased entry for Haxe support
- tests/fixtures/sample.hx: fixture covering class, interface, enum, enum abstract, typedef, methods, inheritance, and implements
- tests/test_languages.py: 9 tests for extract_haxe(); skipped when tree-sitter-haxe is not installed (mirrors [dm] skip pattern)
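The skip-when-missing behaviour described for tests/test_languages.py could look roughly like the sketch below. It uses only the standard library; the real test module may well use a different mechanism (for example a pytest helper), and the class, test, and helper names here are hypothetical. The import name tree_sitter_haxe is taken from the PR description.

```python
import importlib.util
import unittest

# Assumption: the grammar package is importable as tree_sitter_haxe.
HAS_HAXE = importlib.util.find_spec("tree_sitter_haxe") is not None


@unittest.skipUnless(HAS_HAXE, "tree-sitter-haxe is not installed")
class HaxeGrammarTests(unittest.TestCase):  # hypothetical test class
    def test_grammar_module_imports(self):
        # Only runs when the optional grammar package is available.
        import tree_sitter_haxe  # noqa: F401


def run_suite() -> unittest.TestResult:
    """Run the sketch suite and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HaxeGrammarTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With the grammar absent, the test is reported as skipped rather than failed, so the suite stays green for contributors who have not installed the optional dependency.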
PyPI/Warehouse rejects any package upload whose metadata contains a direct URL/VCS dependency. graphifyy is actively published to PyPI, so the haxe extra's git+https dependency would block every future release of the package, not just fail to build for haxe users. Drop the extra entirely and document a manual pip install git+https://github.com/masquepublishing/tree-sitter-haxe.git step in the README instead, matching how the project treats every other grammar with install friction (real PyPI name, or nothing) - there is no existing precedent for a non-PyPI dependency in pyproject.toml.
aa4f473 to ef21405
@safishamsi Thanks for your help and patience. I believe I've addressed both concerns:
NOTE: test_haxe_finds_imports/test_haxe_finds_calls (with real edge-label assertions) were already in the second commit, so the edge-assertion ask should be covered, but let me know if I didn't do that the way you want.
Adds Haxe (.hx) as a supported language for AST extraction.
What this does:
- `detect.py`: registers `.hx` in `CODE_EXTENSIONS`
- `extract.py`: adds `extract_haxe()`, which extracts classes, interfaces, enums, enum abstracts, typedefs, and functions from `.hx` files using the `tree-sitter-haxe` grammar
- `extract.py`: adds `_haxe_recover_scattered()`, a fallback for files where the grammar emits scattered tokens instead of proper declaration nodes (minified files, unsupported preprocessor patterns)
- `pyproject.toml`: no `haxe` extra. `tree-sitter-haxe` has no PyPI release, and PyPI rejects packages with a direct URL/VCS dependency in `Requires-Dist`, so declaring one here would block every future `graphifyy` release. `extract_haxe()` lazy-imports `tree_sitter_haxe` with a graceful `ImportError` guard (same pattern as `dm`/`terraform`), so this is a pure packaging change with no functional impact.
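The lazy import with an `ImportError` guard mentioned above can be illustrated with a minimal sketch. This is not the code from `extract.py`: `load_optional_grammar`, `extract_haxe_sketch`, and the shape of the returned dict are all assumptions made for illustration, and the actual parsing step is omitted.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_optional_grammar(module_name: str) -> Optional[ModuleType]:
    """Import a grammar module on first use; return None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def extract_haxe_sketch(source: bytes) -> dict:
    """Hypothetical stand-in for extract_haxe(); the result shape is assumed."""
    grammar = load_optional_grammar("tree_sitter_haxe")
    if grammar is None:
        # Grammar missing: degrade gracefully instead of failing at import time.
        return {"nodes": [], "edges": [], "error": "tree-sitter-haxe not installed"}
    # Real extraction would build a parser from `grammar` here (omitted).
    return {"nodes": [], "edges": []}
```

Because the import happens inside the function, the package stays importable without the grammar, and only Haxe extraction is affected when it is missing.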
Implementation notes:
- CR/CRLF line endings are normalized before parsing. The codebase being tested against contains legacy files with `\r`-only Mac line endings, which would cause the `//` comment rule to run to EOF.
- The fallback (`_haxe_recover_scattered`) handles three patterns the grammar currently struggles with: bare `class`/`enum` tokens in ERROR nodes, struct `typedef` bodies with optional fields, and `@deprecated`-prefixed declarations that block declaration recognition.
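The line-ending normalization can be sketched in a few lines. This is an illustrative version under the assumption that the source is handled as bytes; `normalize_newlines` is a hypothetical name, not necessarily the helper used in the PR.

```python
def normalize_newlines(source: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first so its CR is not turned into a second newline.
    return source.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

For example, `normalize_newlines(b"// note\rclass A {}\r\n")` returns `b"// note\nclass A {}\n"`, so a `//` comment ends at its own line instead of swallowing the rest of the file. Line numbers are preserved, but byte offsets shift in CRLF files because each two-byte ending becomes one byte.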
Tested against 6,978 Haxe source files with zero parse errors.
Produces 73,419 nodes and 88,084 edges.
Dependency:
Requires `tree-sitter-haxe`, which is not on PyPI, so install the patched fork directly: