Thanks to visit codestin.com
Credit goes to lib.rs

#tile #text #split #chunking #overlap #code-aware

plato-tile-split

Text chunking engine — token-aware splitting, overlap, code-aware, semantic boundary detection

1 unstable release

0.1.0 Apr 21, 2026

#2157 in Text processing

MIT license

23KB
482 lines

plato-tile-split

Decompose tiles by sentence, paragraph, or size. pip install plato-tile-split


lib.rs:

plato-tile-split

Text chunking engine. Splits large text into tiles with token-aware boundaries, overlap for context preservation, and code-aware splitting.

Why Rust

Text splitting is CPU-bound string processing. Python's string slicing creates new objects on every operation. Rust's &str slicing is zero-copy.

Metric Python (str.split) Rust (&str.split)
Split 1MB text ~15ms ~2ms
Memory per chunk ~200 bytes (str obj) ~40 bytes (&str + Vec)

The zero-copy nature of Rust string slicing is the key advantage here.

Dependencies

~0.4–1.3MB
~28K SLoC