-
regex
regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
-
unicode-width
Determine displayed width of char and str types according to Unicode Standard Annex #11 rules
-
comfy-table
An easy to use library for building beautiful tables with automatic content wrapping
-
ropey
A fast and robust text rope for Rust
-
similar
A diff library for Rust
-
encoding_rs
A Gecko-oriented implementation of the Encoding Standard
-
textwrap
word wrapping, indenting, and dedenting strings. Has optional support for Unicode and emojis as well as machine hyphenation.
-
fancy-regex
regexes, supporting a relatively rich set of features, including backreferences and look-around. Aims to be compatible with Oniguruma syntax when the relevant flag is set.
-
const_format
Compile-time string formatting
-
pulldown-cmark
A pull parser for CommonMark
-
lazy-regex
lazy static regular expressions checked at compile time
-
lopdf
PDF document manipulation
-
tabled
An easy to use library for pretty-printing tables of Rust structs and enums
-
unicode-segmentation
Grapheme Cluster, Word and Sentence boundaries according to Unicode Standard Annex #29 rules
-
convert_case
Convert strings into any case
-
tokenizers
today's most used tokenizers, with a focus on performance and versatility
-
unicode-normalization
functions for normalization of Unicode strings, including Canonical and Compatible Decomposition and Recomposition, as described in Unicode Standard Annex #15
-
widestring
wide string Rust library for converting to and from wide strings, such as those often used in Windows API or other FFI libraries. Both u16 and u32 string types are provided, including support for UTF-16 and UTF-32…
-
html2text
Render HTML as plain text
-
rustybuzz
A complete harfbuzz shaping algorithm port to Rust
-
deunicode
Convert Unicode strings to pure ASCII by intelligently transliterating them. Supports Emoji and Chinese.
-
heck
case conversion library
-
ansi-to-tui
Convert ANSI color and style codes into Ratatui Text
-
fuzzy-matcher
Fuzzy Matching Library
-
emojis
✨ Lookup emoji in *O(1)* time, access metadata and GitHub shortcodes, iterate over all emoji, and more!
-
email_address
providing an implementation of an RFC-compliant EmailAddress newtype
-
prometheus-client
Open Metrics client library allowing users to natively instrument applications
-
mime_guess
detection of a file's MIME type by its extension
-
termimad
Markdown Renderer for the Terminal
-
protobuf
Protocol Buffers - Google's data interchange format
-
onig
Rust-Onig is a set of Rust bindings for the Oniguruma regular expression library. Oniguruma is a modern regex library with support for multiple character encodings and regex syntaxes.
-
const-str
compile-time string operations
-
regress
A regular expression engine targeting EcmaScript syntax
-
printpdf
reading and writing PDF files
-
linkify
Finds URLs and email addresses in plain text. Takes care to get the boundaries right with surrounding punctuation like parentheses.
-
indenter
A formatter wrapper that indents the text, designed for error display impls
-
unicase
A case-insensitive wrapper around strings
-
diff
An LCS based slice and string diffing implementation
-
text-splitter
Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from Rust and Python.
-
chardetng
A character encoding detector for legacy Web content
-
jieba-rs
Jieba Chinese Word Segmentation in Rust
-
strip-ansi-escapes
Strip ANSI escape sequences from byte streams
-
unicode-script
exposes the Unicode Script and Script_Extension properties from UAX #24
-
html-to-markdown-rs
High-performance HTML to Markdown converter using the astral-tl parser. Part of the Kreuzberg ecosystem.
-
diffy
Tools for finding and manipulating differences between files
-
lindera
A morphological analysis library
-
grok
popular Java & Ruby grok library which allows easy text and log file processing with composable patterns
-
pdf-extract
extract content from pdfs
-
roff
ROFF (man page format) generation library
-
scip
SCIP (pronunciation: "skip") is a language-agnostic protocol for indexing source code, which can be used to power code navigation functionality such as Go to definition, Find references, and Find implementations
-
harfrust
A complete HarfBuzz shaping algorithm port to Rust
-
prettydiff
Side-by-side diff for two files
-
garde
Validation library
-
pulldown-cmark-to-cmark
Convert pulldown-cmark Events back to the string they were parsed from
-
lngcnv
linguistics: display pronunciation, translate between dialects, convert between orthographies; support for multiple languages: English, Latin, Polish, Quechua, Spanish, Tikuna
-
finl_unicode
handling Unicode functionality for finl (categories and grapheme segmentation)
-
edit-distance
Levenshtein edit distance between strings, a measure for similarity
-
rust-stemmers
some popular snowball stemming algorithms
-
charabia
detect the language, tokenize the text and normalize the tokens
-
unicode-general-category
Fast lookup of the Unicode General Category property for char
-
frizbee
Fast typo-resistant fuzzy matching via SIMD Smith-Waterman, similar algorithm to FZF/FZY
-
htmd
A turndown.js inspired HTML to Markdown converter
-
lipsum
lorem ipsum text generation library. It generates pseudo-random Latin text. Use this if you need filler or dummy text for your application. The text is generated using a simple Markov chain…
-
unicode-truncate
Unicode-aware algorithm to pad or truncate str in terms of displayed width
-
rphonetic
Rust port of phonetic Apache commons-codec algorithms
-
rxing
port of the zxing barcode library
-
cruet
Adds String based inflections for Rust. Snake, kebab, camel, sentence, class, title and table cases as well as ordinalize, deordinalize, demodulize, foreign key, and pluralize/singularize…
-
indoc
Indented document literals
-
marmite
easiest static site generator
-
wana_kana
checking and converting between Japanese characters - Kanji, Hiragana, Katakana - and Romaji
-
regex-syntax
A regular expression parser
-
line-index
Maps flat TextSize offsets to/from (line, column) representation
-
synoptic
low-level, syntax highlighting library with unicode support
-
os_display
Display strings in a safe platform-appropriate way
-
unescaper
Unescape strings with escape sequences written out as literal characters
-
hyphenation
Knuth-Liang hyphenation for a variety of languages
-
fontconfig
Safe, higher-level wrapper around the Fontconfig library
-
rand_regex
Generates random strings and byte strings matching a regex
-
cow-utils
Copy-on-write string utilities for Rust
-
stop-words
Common stop words in many languages
-
zerostack
Minimalistic coding agent written in Rust, optimized for memory footprint and performance
-
ruskel
Generates skeletonized outlines of Rust crates
-
sentencex
Sentence segmentation library with wide language support optimized for speed and utility
-
fasttext
pure Rust implementation
-
unicode-properties
Query character Unicode properties according to UAX #44 and UTR #51
-
unicode-reverse
Unicode-aware in-place string reversal
-
unicode_names2
Map characters to and from their name given in the Unicode standard. This goes to great lengths to be as efficient as possible in both time and space, with the full bidirectional tables weighing barely 500 KB…
-
markdown-tui-explorer
A terminal-based markdown file browser and viewer with search, syntax highlighting, and live reload
-
html2md
binary to convert simple html documents into markdown
-
hypher
separates words into syllables
-
titlecase
Capitalize text according to a style defined by John Gruber for Daring Fireball
-
dwrote
Lightweight binding to DirectWrite
-
epub-builder
generating EPUB files
-
any_ascii
Unicode to ASCII transliteration
-
stringzilla
Search, hash, sort, fingerprint, and fuzzy-match strings faster via SWAR, SIMD, and GPGPU
-
decancer
that removes common unicode confusables/homoglyphs from strings
-
usage-lib
working with usage specs
-
mdbook-pdf
A backend for mdBook written in Rust for generating PDF based on headless chrome and Chrome DevTools Protocol
-
mupdf
Safe Rust wrapper to MuPDF
-
nucleo-matcher
plug and play high performance fuzzy matcher
-
emojic
Emoji constants
-
shiguredo_http11
HTTP/1.1 Library
-
unicode-id
Determine whether characters have the ID_Start or ID_Continue properties according to Unicode Standard Annex #31
-
stringcase
Converts string cases between camelCase, COBOL-CASE, kebab-case, and so on
-
ferritin
Human-friendly CLI for browsing Rust documentation
-
fitsio
Rust implementation of astronomy FITS file handling
-
sentencepiece
Binding for the sentencepiece tokenizer
-
boreal
evaluate YARA rules, used to scan bytes for textual and binary pattern
-
crop
A pretty fast text rope
-
icu_pattern
ICU pattern utilities
-
zawk
An efficient Awk-like language implementation in Rust with stdlib
-
diff-match-patch-rs
The fastest implementation of Myers' diff algorithm to perform the operations required for synchronizing plain text
-
str_indices
Count and convert between indexing schemes on string slices
-
unicode-xid
Determine whether characters have the XID_Start or XID_Continue properties according to Unicode Standard Annex #31
-
difflib
Port of Python's difflib library to Rust
-
tiktoken
A high-performance pure-Rust implementation of OpenAI's tiktoken BPE tokenizer
-
textsurf
Webservice for efficiently serving multiple plain text documents or excerpts thereof (by unicode character offset), without loading everything into memory
-
font-types
Scalar types used in fonts
-
mago-linter
A PHP linter that identifies common coding errors, style issues, and potential bugs, helping maintain high code quality
-
text-document
Rich text document editing library
-
mdvault
CLI tool for managing markdown vaults with structured notes, validation, and search
-
fontcull
Pure Rust font subsetting library
-
chordsketch
ChordPro command-line tool
-
aptu-cli
CLI for Aptu - Gamified OSS issue triage with AI assistance
-
omekasy
Decorate alphanumeric characters in your input with various fonts; special characters in Unicode
-
regex-cursor
regex fork that can search discontiguous haystacks
-
neo_frizbee
Fast typo-resistant fuzzy matching via SIMD Smith-Waterman, similar algorithm to FZF/FZY
-
mq-markdown
Markdown parsing and manipulation utilities for mq
-
inlyne
Introducing Inlyne, a GPU powered yet browserless tool to help you quickly view markdown files in the blink of an eye
-
typstyle
The CLI for Typstyle
-
arborium-c-sharp
C# grammar for arborium (tree-sitter bindings)
-
arrow-string
String kernels for arrow arrays
-
unicode-blocks
contains a list of all unicode blocks and provides some functions to search across them
-
unicode_categories
Query Unicode category membership for chars
-
lsp-textdocument
An LSP text document manager that maintains a map of text documents
-
idna
IDNA (Internationalizing Domain Names in Applications) and Punycode
-
hck
A sharp cut(1) clone
-
symspell
Spelling correction & Fuzzy search
-
markdown-org-extract
CLI utility for extracting tasks from markdown files with Emacs Org-mode support
-
ascii
ASCII-only equivalents to char, str and String
-
matchers
Regex matching on character and byte streams
-
kak-lsp
Kakoune Language Server Protocol Client
-
turbovault-parser
Obsidian Flavored Markdown (OFM) parser
-
flickzeug
A fork of diffy: diff, patch, and merge library featuring Myers' algorithm, unified diff format parsing, fuzzy patch application, and three-way merge with conflict detection
-
deno_media_type
Media type used in Deno
-
markdown2pdf
Create PDF with Markdown files (a md to pdf transpiler)
-
terraphim_rolegraph
Terraphim rolegraph module, which provides role handling for Terraphim AI
-
hgrep
grep tool with human-friendly search output. This is similar to the -C option of the grep command, but its output is enhanced with syntax highlighting focusing on human-readable output.
-
uncased
Case-preserving, ASCII case-insensitive, no_std string types
-
aws-sdk-geoplaces
AWS SDK for Amazon Location Service Places V2
-
sanitizer
A collection of methods and macros to sanitize struct fields
-
esed
Easy sed
-
entities
raw data needed to convert to and from HTML entities
-
cesu8
Convert to and from CESU-8 encoding (similar to UTF-8)
-
laurus
Unified search library for lexical, vector, and semantic retrieval
-
textcode
Text encoding/decoding library. Supports: UTF-8, ISO6937, ISO8859, GB2312
-
katana-markdown-linter
markdownlint-compatible Markdown linter library
-
zpl_toolchain_cli
Command-line interface for parsing, validating, formatting, and printing ZPL II label code (part of the zpl-toolchain project)
-
stfu8
Sorta Text Format in UTF-8
-
mdbook-katex
mdBook preprocessor rendering LaTeX equations to HTML
-
torudo
A terminal-based todo.txt viewer and manager with TUI interface
-
firecrawl
Official Rust SDK for Firecrawl API v2
-
gaze-cli
Gaze command-line interface
-
pdfv
Command-line interface for the pdfv validator
-
bochi
A CLI tool to interact with Android UI elements with CSS-like selectors
-
panache
An LSP, formatter, and linter for Markdown, Quarto, and R Markdown
-
uwc
Counts things in unicode text files
-
microcad
µcad Command Line Interface
-
giallo
A code highlighter giving the same output as VSCode
-
rkg
A one-liner oriented record/grid processor
-
regex-anre
full-featured, zero-dependency regular expression engine that supports both standard and ANRE regular expressions
-
braillify
Rust-based cross-platform Korean braille translation library
-
simdnbt
an unnecessarily fast nbt decoder
-
rustpython-ruff_source_file
Unofficial fork for RustPython
-
distrs
PDF, CDF, and percent-point/quantile functions for the normal and Student’s t distributions
-
languagetool-rust
LanguageTool API bindings in Rust
-
pdf
PDF reader
-
in_definite
Get the indefinite article ('a' or 'an') to match the given word. For example: an umbrella, a user.
-
content-extractor-rl-cli
RL-based article extraction from HTML using Deep Q-Networks and heuristic fallback
-
json-escape
A no_std, zero-copy, allocation-free library for streaming JSON string escaping and unescaping. Ergonomic, fast, RFC 8259 compliant, with layered APIs for iterators, I/O streaming, and low-level tokens.
-
mdbook-admonish
A preprocessor for mdbook to add Material Design admonishments
-
yggdrasil-cli
Yggdrasil is a project flattener and diff engine that turns any subset of your codebase into a single AI-ready codex (index + contents), or compares snapshots with annotated diffs
-
unicode_titlecase
add Unicode titlecase and Turkish and Azeri locale upper/lowercase utilities to chars and strings
-
mktoc
Generate Table of Contents from Markdown files
-
yore
decoding/encoding character sets according to OEM code pages
-
qpdf
Rust bindings to QPDF C++ library
-
mime-infer
detection of a file's MIME type by its extension
-
GORBIE
GORBIE! Is a minimalist notebook library for Rust
-
kubetui
An intuitive Terminal User Interface (TUI) tool for real-time monitoring and exploration of Kubernetes resources
-
treegrep
regex pattern matcher that displays results in a tree structure with an interface to jump to matched text
-
spellbook
A spellchecking library compatible with Hunspell dictionaries
-
spider_transformations
Transformation utils to use for spider
-
makefile-lossless
Lossless Parser for Makefiles
-
typst-kit
Common utilities for Typst tooling
-
lucid-lint
A cognitive accessibility linter for prose. Bilingual EN/FR. CI-native.
-
svgdx
create SVG diagrams easily
-
dprint-plugin-typescript
TypeScript and JavaScript code formatter
-
collclean
Clean up collaboration commands in LaTeX files
-
sliceslice
A fast implementation of single-pattern substring search using SIMD acceleration
-
microresolve
System 1 relay for LLM apps — sub-millisecond intent classification, safety gating, tool selection. CPU-only, continuous learning from corrections.
-
treelog
A highly customizable, optimized, and modular tree rendering library
-
mdbook-yapp
mdBook preprocessor for simple text replacements
-
fetchkit
AI-friendly web content fetching and HTML-to-Markdown conversion library
-
cirru_parser
Parser for Cirru text syntax
-
patchkit
parsing and manipulating patch files
-
savvy
R extension interface
-
zhconv
Traditional, Simplified and regional Chinese variants converter powered by MediaWiki & OpenCC rulesets and the Aho-Corasick algorithm
-
diffutils
A CLI app for generating diff files
-
chewing
(酷音) intelligent Zhuyin input method
-
uncomment
A CLI tool to remove comments from code using tree-sitter for accurate parsing
-
sqry-nl
Natural language to sqry query translation layer
-
ferrous-opencc
A pure Rust implementation of Open Chinese Convert (OpenCC), for fast and reliable conversion between Traditional and Simplified Chinese
-
tossicat
A library that converts a given tossi (Korean postposition) into the form appropriate for the word it is attached to
-
olpc-cjson
serde_json Formatter to serialize as OLPC-style canonical JSON
-
rapidfuzz
rapid fuzzy string matching library
-
resharp-grep
recursive grep with boolean constraints and regex intersection
-
bookforge-cli
CLI-first EPUB translation engine with deterministic structure rebuild and review loop
-
sds-converter
CLI for converting chemical safety SDS documents (PDF/DOCX) ↔ MHLW/JIS Z 7253 standard JSON via LLM (Claude/GPT/Gemini). Batch mode, multilingual.
-
textprep
Text preprocessing primitives: normalization, tokenization, and fast keyword matching
-
chat-gpt-lib-rs
interacting with OpenAI's ChatGPT API, providing a simple interface to make API requests and handle responses
-
citum-engine
Citum citation and bibliography processor
-
todo_lib
Collection of utilities for todo.txt format
-
rschess
chess library with the aim to be as feature-rich as possible
-
graphannis
new backend implementation of the ANNIS linguistic search and visualization system
-
simple-string-patterns
Makes it easier to match, split and extract strings in Rust without regular expressions. The parallel string-patterns crate provides extensions to work with regular expressions via the Regex library
-
rich_rust
port of Python's Rich library for beautiful terminal output
-
kham-cli
Command-line interface for the kham Thai word segmenter
-
inlinable_string
inlinable_string crate provides the InlinableString type – an owned, grow-able UTF-8 string that stores small strings inline and avoids heap-allocation – and the StringExt trait…
-
camxes-rs
Lojban PEG parser with semantic analysis - integrated camxes parser and tersmu semantic engine
-
vaporetto
pointwise prediction based tokenizer
-
llm-guard
Zero-copy guardrails for LLM input/output. Pure-Rust scanners (prompt-injection, role-override, secret leakage, PII, invisible text, deobfuscation, token limit).
-
md-tui-rs
Terminal markdown reader and HackMD-style split-screen editor with mouse support, clickable links and task lists, fuzzy search, and a directory browser
-
markdown-tool
A CLI utility for converting Markdown into AST and vice versa
-
oxideav-scribe
Pure-Rust vector font shaper + layout for the oxideav framework — TrueType / OTF outline access, GSUB ligatures, GPOS kerning, mark attachment, CBDT colour bitmaps. Pixel pipeline lives in oxideav-raster.
-
uroman
A self-contained Rust reimplementation of the uroman universal romanizer
-
mdbook-epub
An EPUB renderer for mdbook
-
unicode-joining-type
Fast lookup of the Unicode Joining Type and Joining Group properties
-
repgrep
An interactive command line replacer for ripgrep
-
dptran
run DeepL translations on the command line, written in Rust
-
mdr
A lightweight Markdown viewer with live reload and multiple rendering backends
-
jx
An interactive JSON explorer for the command line
-
quamina
Fast pattern-matching library for filtering JSON events
-
htop
HTML to PDF converter
-
stylin
Convert markdown to pandoc markdown with custom styles
-
colored_text
adding colors and styles to terminal text
-
awabi
A morphological analyzer using mecab dictionary
-
quixote
Quizzes and tests in Markdown
-
tauri-plugin-clipboard
A clipboard plugin for Tauri that supports text, html, rtf, files and image, as well as clipboard update listening
-
forbidden-strings
Out-of-band scanner for forbidden literal strings and regex patterns. Gitignore-aware, fast, dependency-light: built for CI deny-listing of leaked credentials and banned tokens.
-
ferroni
Pure-Rust Oniguruma regex engine with SIMD-accelerated search
-
sapphire-journal
Markdown-based task and note manager that keeps your data alive as plain text - timeless like fossils
-
ngrammatic
Character-oriented ngram generator and fuzzy matching library
-
text-processing-rs
Inverse Text Normalization (ITN) — convert spoken-form ASR output to written form
-
cargo-spellcheck
Checks all doc comments for spelling mistakes
-
hongdown
A Markdown formatter that enforces Hong Minhee's Markdown style conventions
-
presenterm
A terminal slideshow presentation tool
-
hermes-tool
CLI tools for Hermes - index management, simhash, sorting, and data processing
-
tiefdownconverter
A CLI tool to manage and convert Markdown-based projects
-
stringdex
A suffixtree search system for static sites
-
espeak-ng
Pure Rust port of eSpeak NG text-to-speech
-
line-ending
Detect, normalize, and convert line endings across platforms, including support for character streams. Ensures consistent handling of LF, CRLF, and CR line endings in text processing.
-
popsam-cli
CLI for AI-assisted selection of semantically representative texts
-
mcd-cli
Command line interface for Markdown CSV Document packages
-
rushdown
A 100% CommonMark-compatible GitHub Flavored Markdown parser and renderer
-
wit_owo
interacting with the Wit.ai API
-
mdx-gen
A robust Rust library for processing Markdown and converting it to HTML with support for custom blocks, enhanced table formatting, and flexible configuration options
-
cols
Smart adaptive formatting of columnar data
-
rdfless
A colorful pretty printer for RDF (Turtle/TriG/N-Triples/N-Quads/PROV-N) with ANSI colors
-
text2num
Parse and convert numbers written in English, Dutch, Spanish, Portuguese, German, Italian or French into their digit representation
-
sapling-streampager
streampager is a pager for command output or large files
-
mime_guess2
detection of a file's MIME type by its extension
-
aki-xcat
concatenate files that are plain, gzip, xz and zstd
-
prosesmasher
Deterministic prose quality validator (binstall-only stub; install via cargo binstall prosesmasher)
-
markdown_timesheet
processing markdown files to extract and format timesheet data
-
buup
Core transformation library with zero dependencies
-
indefinite
Prefix a noun with an indefinite article - a or an - based on whether it begins with a vowel
-
quickstatic
First static site generator built for Djot. Optimized for the actual content and not the themes or bells and whistles of the static site generator
-
sdml-cli
Rust CLI for Simple Domain Modeling Language (SDML)
-
mdbook-plantuml
A preprocessor for mdbook which will convert plantuml code blocks into inline SVG diagrams
-
oxyl
A fast LaTeX compiler
-
sara-cli
CLI for Sara - Requirements Knowledge Graph
-
ravelact
Static analysis CLI for GitHub Actions workflow estates
-
luciferous-case-converter
A CLI tool to convert text between different cases
-
roman-numerals-rs
Manipulate well-formed Roman numerals
-
dom-content-extraction
Content extraction via text density paper
-
spdfdiff_cli
Command-line semantic PDF diff and comparison tool with JSON, Markdown, and HTML output
-
madato
command line tool for reading and writing tabular data (XLS, ODS, CSV, YAML), and Markdown
-
ocr-rs
A lightweight and efficient OCR library based on PaddleOCR models, using the MNN inference framework for high-performance text detection and recognition
-
regexr
A high-performance regex engine built from scratch with JIT compilation and SIMD acceleration
-
hyperlink
Very fast link checker for CI
-
model2vec-rs
Official Rust Implementation of Model2Vec
-
opentalk-types-common-identifiers
Common identifier types for OpenTalk crates
-
mdvs
A database of markdown documents — schema validation and semantic search
-
yore-cli
Fast document indexer for finding duplicates and searching content
-
mlc
The markup link checker (mlc) checks for broken links in markup files
-
lo_core
Core data models and XML utilities for ODF document generation
-
harrier
A line-map and character-encoding-aware red-green tree for structured, lossless, incrementally-editable text
-
pprint
Flexible and lightweight pretty printing library for Rust
-
wordcut-engine
Word segmentation/breaking library
-
icy_sauce
handling SAUCE – Standard Architecture for Universal Comment Extensions
-
agent-spec
AI-native BDD/Spec verification tool for contract-driven agent coding
-
srgn
A grep-like tool which understands source code syntax and allows for manipulation in addition to search
-
cloakrs-cli
Command-line PII scanner and masker powered by cloakrs
-
twilight-mention
working with mentions in the Twilight ecosystem
-
simd-normalizer
SIMD-accelerated Unicode normalization (NFC, NFD, NFKC, NFKD)
-
lede
Deterministic extractive summarization — stdlib + regex only
-
xan
The CSV magician
-
ruckup
Check and update dependencies across Cargo, npm, and pyproject projects
-
obsidian-logging
A journaling/logging CLI that stores logs in Obsidian markdown files
-
unicodeit
Converts LaTeX to Unicode (rust port)
-
kitoken
Fast tokenizer for language models, supporting BPE, Unigram and WordPiece tokenization
-
mdbook-kroki-preprocessor
render kroki diagrams from files or code blocks in mdbook
-
spekter
Instant, side-by-side directory and file diff with syntax highlighting
-
llm-transpile
High-performance LLM context bridge — token-optimized document transpiler
-
asimov-cli
ASIMOV Command-Line Interface (CLI)
-
mdbook-luadoctest
An mdBook renderer that extracts Lua code blocks as doctests and writes them to a test.lua script
-
citum
CLI: render, check, convert, and manage citation styles, references, and documents
-
ib-matcher
A multilingual, flexible and fast string, glob and regex matcher. Supports 拼音匹配 (Chinese pinyin match) and ローマ字検索 (Japanese romaji match).
-
yosina
Japanese text transliteration library
-
red-sed
An experimental drop-in replacement for GNU sed, written in Rust
-
mpd_info_screen
Displays info on currently playing music from an MPD daemon
-
igrepper
The interactive grepper
-
opentalk-roomserver-modules
OpenTalk RoomServer Modules
-
fax
Decoder and Encoder for CCITT Group 3 and 4 bi-level image encodings used by fax machines, TIFF and PDF
-
mdbook-preprocessor
assist implementing an mdBook preprocessor
-
twas
A text substitution application for using random look-up tables to generate text in a manner similar to the Mad Libs game
-
deformat
Extract plain text from HTML, PDF, and other document formats
-
pdf-syntax
A low-level crate for reading PDF files
-
semtools
Semantic search and document parsing tools for the command line
-
allium-cli
CLI for checking Allium specification files
-
string_wizard
manipulate string like a wizard
-
mdbook-relative-date
An mdBook preprocessor for build-time relative date placeholders
-
html-to-markdown-cli
Command-line interface for html-to-markdown - high-performance HTML to Markdown converter
-
syara-x
Super YARA — extends YARA-compatible rules with semantic, classifier, and LLM-based matching
-
iepub
Reading and writing EPUB and MOBI e-books
-
crowbook
Render a Markdown book in HTML, PDF or Epub
-
readable-name-generator
Generate a readable name for throwaway infrastructure
-
xee-xpath
XPath 3.1 library API
-
rake
Rapid Automatic Keyword Extraction (RAKE) algorithm
-
syllabify-fr
French syllabification for learning to read; a port of LireCouleur 6
-
matcher_rs
A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matching, implemented in Rust
-
mdbook-linkcheck2
A backend for mdbook which will check your links for you
-
lindera-ko-dic-builder
A Korean morphological dictionary builder for ko-dic
-
oranda
🎁 generate beautiful landing pages for your projects
-
unicode-security
Detect possible security problems with Unicode usage according to Unicode Technical Standard #39 rules
-
ganit-core
Spreadsheet formula engine — parser and evaluator for Excel-compatible formulas
-
kbremap
Custom keyboard layouts for Windows
-
wezterm-bidi
The Unicode Bidi Algorithm (UBA)
-
lychee-lib
A fast, async link checker
-
isbn
handling ISBNs
-
litho-book
Litho Book is a modern web documentation reader specifically designed for the Litho (deepwiki-rs) documentation generation engine. It provides…
-
ctj
A command-line tool to convert CSV to JSON written in Rust
-
sile
Simon’s Improved Layout Engine
-
pasta_shiori
SHIORI DLL interface for pasta script engine
-
heatseeker
A fast, robust, and portable fuzzy finder
-
ferris-says
flavored replacement for the classic cowsay
-
wayland-clipboard-listener
impl wlr-data-control-unstable-v1, listen for clipboard
-
qem
High-performance cross-platform text engine for massive files
-
boxen
creating styled terminal boxes around text with performance optimizations
-
qdrant-rust-stemmers
some popular snowball stemming algorithms
-
vectorless
Reasoning-based Document Engine
-
acdc-parser
AsciiDoc parser using PEG grammars
-
shiguredo_toml
TOML Library
-
kas-text
Text layout and font management
-
redact-core
Core PII detection and anonymization engine - Presidio replacement
-
measured
A better way to measure your application statistics
-
inflections
High performance inflection transformation library for changing properties of words like the case
-
norad
Read and write Unified Font Object files
-
derivre
A derivative-based regular expression engine
-
name
Workspace binary for generating Rust crate names
-
dicexp
A Dice Expression Interpreter program and library for parsing (and rolling) role-playing game style dice notations (e.g. "2d8+5")
-
lipilekhika
A transliteration library for Indian Brahmic scripts
-
mdmf
Formats markdown text files into standard manuscript format for submissions. Works for short stories and multi-part novels.
-
reformat
Command-line tool for text and file reformatting
-
syntext
Hybrid code search index for agent workflows
-
kazoe
Fast wc replacement
-
rspack_error
rspack error
-
string_pipeline
A flexible, template-driven string transformation pipeline for Rust
-
nsys-curses-utils
Rust *curses utilities
-
tphrase
A translatable phrase generator
-
base-d
Universal base encoder: Encode binary data to 33+ dictionaries including RFC standards, hieroglyphs, emoji, and more
-
levenshtein_automata
Creates Levenshtein Automata in an efficient manner
-
localgpt
CLI — a local-only AI assistant
-
index-core
Core document model and semantic types for Index
-
organism-notes
Note and vault capability for Organism — vault management, source adapters, cleanup, enrichment
-
changxi
TUI EPUB Reader
-
littrs-ruff-source-file
Vendored ruff_source_file for littrs (from github.com/astral-sh/ruff)
-
wistra
AI-powered personal wiki builder
-
fuzzy-aho-corasick
Aho–Corasick automaton with fuzzy matching
-
wkhtmlapp
Convert html to pdf or image
-
aico-cli
Scriptable control over LLMs from the terminal
-
mdbook-inline-highlighting
mdBook preprocessor that enables support for inline highlighting
-
sophia_turtle
toolkit for RDF and Linked Data - parsers and serializers for the Turtle-family of syntaxes
-
llmwiki-tooling
CLI for managing LLM-wikis with Obsidian-style wikilinks
-
pipa-js
A fast, minimal ES2023 JavaScript runtime built in Rust
-
astchunk
AST-based code chunking for RAG
-
bulletty
a pretty TUI feed reader (RSS+ATOM) that stores articles locally as Markdown files
-
rusty_regex
A regex engine where geometric algebra is the execution engine
-
minimizer
Minimize files to find minimal test case
-
iwe
IWE CLI utility
-
nu_plugin_regex
nu plugin to search text with regex
-
quickmark-cli
Lightning-fast Markdown/CommonMark linter CLI tool with tree-sitter based parsing
-
chunkedrs
AI-native text chunking — recursive, markdown-aware, and semantic splitting with token-accurate boundaries
-
chunk
The fastest semantic text chunking library — up to 1TB/s chunking throughput
-
arborium-c
C grammar for arborium (tree-sitter bindings)
-
ndg-commonmark
Flavored CommonMark processor for Nix-related projects, with support for CommonMark, GFM, and Nixpkgs extensions
-
infinikey
Tool that allows programmable keyboards to send arbitrary Unicode characters
-
memory-indexer
An in-memory full-text fuzzy search indexer
-
near-facsimile
Find similar or identical text files in a directory
-
typub-ir
Semantic IR types for typub
-
yangon
A high-performance, stack-allocated string type for Rust with fixed capacity and zero heap allocations
-
rucora
High-performance, type-safe LLM agent framework with built-in tools and multi-provider support
-
rhai-autodocs
Custom documentation generator for the Rhai scripting language
-
cicero-sophia
High-performance NLU (natural language understanding) engine built in Rust for speed, accuracy, and privacy
-
lexa
Lexa CLI: hybrid local search (BM25 + binary-quantized Matryoshka KNN + cross-encoder rerank) over arbitrary file trees.
lexa index <path>, lexa search <query>, lexa watch <path>.
-
officemd_cli
CLI for OfficeMD document extraction and markdown rendering
-
koicore
core KoiLang module
-
lumis
Syntax Highlighter powered by Tree-sitter and Neovim themes
-
underthesea_core
Underthesea Core
-
xrusty
Parse documents and transform using χrust
-
koji
An interactive CLI for creating conventional commits
-
src2md
Turn source code into a Markdown document with syntax highlighting, or extract it back
-
cupel
Context window management pipeline for LLM applications
-
trawlcat
A CLI for fetching value of trawl resource while omitting surrounding quotes
-
mdbook-toc
mdbook preprocessor to add Table of Contents
-
wp-lang
WPL language crate with AST, parser, evaluator, builtins, and generators
-
inflection-rs
Inflection is a string transformation library. It singularizes and pluralizes English words, and transforms strings from CamelCase to underscored string.
-
zalgo-codec
Convert an ASCII text string into a single unicode grapheme cluster and back. Provides a macro for embedding Rust source code that has been encoded in this way.
-
sd
An intuitive find & replace CLI
-
type-safe-id
A type-safe, K-sortable, globally unique identifier
-
cmx
Rust Spectral Color Management Library
-
sbnf
A BNF-style language for writing sublime-syntax files
-
datafusion-functions
Function packages for the DataFusion query engine
-
japanese-codepoints
A high-performance Rust library for Japanese character validation and code point handling based on JIS standards
-
kham-core
Pure Rust Thai word segmentation engine — no_std compatible
-
swc_ecma_transformer
Compatibility layer for the ECMAScript standard
-
guardrails
Enforce architectural decisions AI coding tools keep ignoring
-
ul/kak-lsp
Kakoune Language Server Protocol Client
-
annatto
Converts linguistic data formats based on the graphANNIS data model as intermediate representation and can apply consistency tests
-
unidown
Convert Markdown to Unicode
-
mdbook-shiftinclude
mdbook preprocessor for file inclusion with shift
-
cskk
A kana-kanji conversion library based on the SKK (Simple Kana Kanji henkan) method, intended for use through a C ABI
-
bareun_rs
an unofficial Rust library for Bareun, a Korean morphological analyzer
-
cora-match
Multi-pattern fixed-string matcher. Aho-Corasick + SIMD on mmap'd files. NDJSON output for AI agents and pipelines.
-
giff
Visualizes the differences between the current HEAD and a specified branch in a git repository using a formatted table output in your terminal. The differences are displayed with color-coded…
-
adobe-cmap-parser
parse Adobe CMap files
-
csd
A super-fast search-and-replace tool for files
-
dacopy
A cross-platform tool for copying text into the clipboard in a shell
-
mad
A fast Markdown terminal renderer with syntax highlighting
-
gulagcleaner_rs
Ad removal tool for PDFs
-
rheo
A typesetting and static site engine based on Typst
-
nodex-core
Universal graph-based document tool — core library
-
at-commands
AT Commands builder and parser for Rust #![no_std]
-
fuzzt
Implementations of string similarity metrics. Includes Hamming, Levenshtein, OSA, Damerau-Levenshtein, Jaro, Jaro-Winkler, and Sørensen-Dice.
-
sourceright
Reference verification infrastructure for academic and legal citation workflows
-
patto
🪽 Yet another plain text format for quick note taking and task management
-
unicode-case-mapping
Fast lowercase, uppercase, and titlecase mapping for characters
-
qwen3-vl
vision-language structured-output engine over mistralrs, implementing the engine-agnostic llmtask::Task contract
-
ripgrep_all
rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc
-
ratex-parser
LaTeX parser for RaTeX
-
hayro-syntax
A low-level crate for reading PDF files
-
rwer
A fast Rust crate for WER, CER, and related ASR evaluation metrics
-
resharp
high-performance regex engine with intersection and complement operations
-
cedarwood
efficiently-updatable double-array trie in Rust (ported from cedar)
-
wikiwho
Fast Rust reimplementation of the WikiWho algorithm for fine-grained authorship attribution on large datasets. Optimized for easy integration in multi-threaded applications.
-
fasttext-pure-rs
Pure-Rust fastText inference engine for language identification and text classification
-
ecl-pipeline-topo
Pipeline topology, resource graph, and core traits for ECL pipeline runner
-
txtfp
Text fingerprinting: MinHash + LSH, SimHash, and ONNX semantic embeddings
-
nuch
A CLI to manage Markdown content and images for Nuxt Content sites
-
agentic-veritas-cli
CLI for AgenticVeritas
-
dioxus-tw-components
Components made for Dioxus
-
mad-useful
A multi-tool utility for file operations and text processing
-
dirgrab
CLI tool to concatenate file contents from directories, respecting Git context
-
etradeTaxReturnHelper
Parses etrade and revolut financial documents for transaction details (income, tax paid, cost basis) and computes total income and total tax paid according to chosen tax residency (currency)
-
adf2html
convenient way to build HTML document body from Atlassian Document Format (ADF) provided by Atlassian v3 API
-
edgeparse-cli
EdgeParse CLI — convert PDFs to Markdown, JSON, HTML
-
legalis-ru
Russian Federation jurisdiction support for Legalis-RS
-
boilerstrip
Learn site boilerplate selectors from multiple pages and convert HTML to clean Markdown
-
rst_parser
a reStructuredText parser
-
xi-unicode
Unicode utilities useful for text editing, including a line breaking iterator
-
parfit
Paragraph fit — a codebase-aware comment reflow tool that wraps prose with optimal-fit line breaking and leaves directives alone. Inspired by par.
-
lindera-tantivy
Lindera Tokenizer for Tantivy
-
blocklet
A cross-platform CLI tool that generates ASCII art using Unicode block characters, similar to figlet but with beautiful solid Unicode blocks instead of outlines or hash symbols
-
sastrawi-rs
High-performance Indonesian stemmer (Nazief-Adriani + ECS). Zero-regex, FST-powered, Rust 2024.
-
recase
Changes the convention case of input text
-
whatwg_streams
whatwg_streams for rust
-
docxide-pdf
CLI for converting DOCX files to PDF, matching Microsoft Word's output as closely as possible
-
anda_db_hnsw
A high-performance vector search library in Rust
-
udataframe_rs
A pure Rust library for data frame operations, particularly useful for processing data extracted from PDF files or OCR recognition
-
agentkit-tool-fs
Filesystem tools and session-scoped filesystem policies for agentkit
-
pager
pipe your output through an external pager
-
hunspell-lsp
Language Server Protocol implementation providing spell checking using Hunspell dictionaries
-
wdl-doc
Documentation generator for Workflow Description Language (WDL) documents
-
apisnip
A terminal user interface (TUI) tool for trimming OpenAPI specifications down to size ✂️
-
betlang
Tiny source-language detection for code
-
utokenizer
CLI tool for building a local model-tokenizer registry and counting input tokens across model families
-
sift-search
Sift — a DSL for agents to search codebases in faster, deeper, and more sophisticated ways
-
fabryk-fts
Full-text search infrastructure for Fabryk (Tantivy backend)
-
smart-patcher
Patcher based on rules
-
line-span
Find line ranges and jump between next and previous lines
-
richrs
port of the Rich Python library for beautiful terminal output
-
tabprinter
creating and printing formatted tables in the terminal. It supports various table styles and offers both color and non-color output options.
-
termdiff
Write a diff with color codes to a string
-
howmany
A blazingly fast, intelligent code analysis tool with parallel processing, caching, and beautiful visualizations
-
perl-module
Perl module resolution, import analysis, and refactoring — unified facade
-
computer-says-no
Local embedding service for text classification using ONNX models
-
index-extract
Deterministic extraction and scripting policies for Index documents
-
search-semantically
Embeddable semantic code search with multi-signal POEM ranking
-
regex-literal
delimited regular expression literals
-
markon
Turn your markdown on
-
reptar
CLI program for wrapping text
-
mle
The markup link extractor (mle) extracts links from markup files (Markdown and HTML)
-
uast
Unicode Aware Saṃskṛta Transliteration in Rust 🦀
-
mdbook-typst-math
An mdbook preprocessor to use typst to render math
-
stet
PostScript Level 3 interpreter and PDF rendering engine — library API
-
lite-strtab
storing a lot of strings in a single buffer to save memory
-
ruma-events
Serializable types for the events in the Matrix specification
-
css_lexer
A spec-compliant CSS tokenizer with zero-copy cursors and optional feature gates
-
sludge
A native GTK4/libadwaita Slack client for the Linux desktop
-
cadar
cada is a C-inspired syntax for Ada and Cadar is a transpiler
-
finetype-cli
CLI for FineType semantic type classification
-
alfrusco
building Alfred workflows with Rust
-
str-utils
some traits to extend [u8], str and Cow<str>
-
runiq
An efficient way to filter duplicate lines from input, à la uniq
-
syslog_fmt
A formatter for the 5425 syslog protocol
-
substring
method for string types
-
sloc-core
Source line analysis tool with CLI, web UI, HTML/PDF reports, and CI/CD integration
-
codeix
Fast semantic code search for AI agents — find symbols, references, and callers across any codebase
-
md-tui
A terminal markdown viewer
-
bearing
port of Apache Lucene
-
ascfix
Automatic ASCII diagram repair tool for Markdown files
-
rspack_regex
rspack regex
-
bmo-search
CLI for searching Mozilla's Bugzilla (BMO)
-
servo-xpath
A component of the servo web-engine
-
jetscii
A tiny library to efficiently search strings and byte slices for sets of ASCII characters or bytes
-
sqz-engine
Adaptive multi-pass LLM context compression engine — content-aware pipeline with AST parsing, token counting, session persistence, and budget tracking
-
readability
Port of arc90's readability project to rust
-
typub-storage
S3-compatible storage client for typub
-
yake-rust
Yake (Yet Another Keyword Extractor) in Rust
-
harfbuzz_rs
A high-level interface to HarfBuzz, exposing its most important functionality in a safe manner using Rust
-
unicode-ellipsis
truncate Unicode strings to a certain width, automatically adding an ellipsis if the string is too long
-
terraphim_automata
Automata for searching and processing knowledge graphs
-
motosan-agent-tool
Shared AI agent tool kit — traits, registry, and built-in tools for LLM agents
-
ttypr
terminal typing practice
-
reconcile-text
Intelligent 3-way text merging with automated conflict resolution
-
mdcat
cat for markdown: Show markdown documents in terminals
-
gst-plugin-textahead
GStreamer Plugin for displaying upcoming text buffers ahead of time
-
minspan
a package for determining the minimum span of one vector within another
-
latkerlo-jvotci
Tools for creating and decomposing Lojban lujvo
-
pg_propre
Lightning-fast SQL indenter and linter powered by PostgreSQL's parser
-
lo_
A modern Rust utility library delivering modularity, performance & extras ported from JavaScript Lodash
-
aki-mline
match line: a regex text filter, like the Linux grep command
-
rassa-unibreak
Pure-Rust Unicode line and word breaking for rassa
-
dprint-plugin-markdown
Markdown formatter for dprint
-
codex
Human-friendly notation for Unicode symbols
-
fitsio-pure
Pure Rust FITS file reader and writer
-
codenamr
A dead simple and lightning fast CLI for generating codenames
-
product-os-http
Product OS : Http is a derivative of the http crate restructured for both std and no_std environments
-
vibrato-rkyv
Vibrato: viterbi-based accelerated tokenizer with rkyv support for fast dictionary loading
-
mdbook-embedify
based mdbook preprocessor plugin that allows you to embed apps to your book, like youtube, codepen, giscus and many other apps
-
xxxpwn
XPath eXfiltration eXploitation Tool - Blind optimized XPath 1 injection attacks
-
aube-settings
Settings schema and loader for Aube
-
asimov-prompt
ASIMOV Software Development Kit (SDK) for Rust
-
unidoc
Unite all Markdown
-
ucp-llm
LLM-focused utilities for the Unified Content Protocol
-
mdbook-angular
mdbook renderer to run angular code samples
-
mdlux
Terminal-first Markdown renderer with ANSI and Kitty enhancements
-
threeway_merge
Git-style 3-way string merging using proven algorithms from libgit2/xdiff. Statically links xdiff (LGPL-2.1+).
-
swappy
An anagram generator
-
simdsieve
SIMD-accelerated byte pattern pre-filtering with AVX-512, AVX2, NEON, and scalar fallback
-
md-wiki
Markdown based static site generator for minimal wikis
-
bm25_turbo
The fastest BM25 information retrieval engine — 28K QPS on 8.8M docs
-
pagefind
Implement search on any static website
-
tagged-urn
Tagged URN - Flat tag-based identifier system
-
tsuki
Lua 5.4 ported to Rust
-
tendril
Compact buffer/string type for zero-copy parsing
-
ai-marketing-campaign-optimizer
AI Marketing Campaign Optimizer - Multi-language toolkit for optimizing AI-powered marketing campaigns with content analysis, strategy frameworks, and automation utilities. Inspired by https://ai-cmo.net/
-
serpl
terminal UI for search and replace, à la VS Code
-
shibuichi
zsh prompt preprocessor to add git integration
-
oxipe
Minimal typing test
-
libphext
A rust-native implementation of phext
-
token-parser
parsing texts into data structures
-
kiwi-rs
Ergonomic Rust bindings for the Kiwi Korean morphological analyzer C API
-
monster-regex
A custom regex spec
-
hwp2md
HWP/HWPX ↔ Markdown bidirectional converter
-
svgdx-pandoc
pandoc filter for svgdx codeblocks in Markdown
-
shell2batch
Converts simple basic shell scripts to Windows batch scripts
-
pomsky
A new regular expression language
-
writing-analysis
Lightweight writing analysis and NLP tools for Rust
-
fuzzy-regex
High-performance fuzzy regular expression engine combining regex with Damerau-Levenshtein distance
-
yaml-include
A lib and a CLI for recursively parsing "!include" data in yaml files
-
langextract-rust
extracting structured and grounded information from text using LLMs
-
y-octo
High-performance and thread-safe CRDT implementation compatible with Yjs
-
geoipsed
Inline decoration of IPv4 and IPv6 address geolocations
-
mdbook-termlink
mdBook preprocessor that auto-links glossary terms throughout documentation
-
agents-are-thinking
Terminal animation effects built with braille, block characters, and unicode glyphs
-
inputx-pinyin
Self-developed Mandarin Pinyin input method engine — segmenter, fuzzy syllables, FST dict, WASM-ready. Powers the Inputx IME.
-
secular
No Diacr!
-
llama-runner
A straightforward Rust library for running llama.cpp models locally on device
-
galm
pattern matching library
-
mcat
Terminal image, video, and Markdown viewer
-
bbd
Binary Braille Dump
-
cli-pdf-extract
Fast Rust CLI wrapper around pdf_oxide for LLM-friendly PDF extraction
-
bamboo-core
Vietnamese input method engine written in Rust
-
kreuzberg-cli
Command-line interface for Kreuzberg document intelligence
-
sedx
A safe, modern replacement for GNU sed with automatic backups, preview mode, and rollback
-
artificial
Typed, provider-agnostic prompt-engineering SDK for Rust
-
seeyou-cub
reading and writing the SeeYou CUB binary file format, which stores airspace data for flight navigation software
-
zuit
Command-line interface for zuit static analysis
-
dspy-rs
A DSPy rewrite (not port) in Rust
-
rustine
High-performance Gel syntax parser transforming to JSON/XML (Rust + PyO3)
-
ggemtext
Glib-oriented Gemtext API
-
lint-ai
Semantic wiki and docs linting for contradictions, stale claims, orphan pages, and missing cross-references
-
talon-core
Core retrieval engine for Talon: hybrid search (BM25 + semantic + reranker), indexing, and graph-aware ranking over markdown corpora
-
rustdoc-stripper
manipulate rustdoc comments
-
mdref
Markdown Reference finding and migration tool
-
mdvalidate
Markdown schema validation engine
-
my-notes
note taking app for taking hierarchical notes in markdown
-
pdfrs
A CLI tool to read/write PDFs and convert to/from markdown
-
nanofts
High-performance full-text search engine in Rust
-
sigrs
Interactive grep (for streaming)
-
music-comp-mt-cli
A music theory command-line tool
-
pdf-compliance
PDF compliance checking (PDF/A, PDF/UA)
-
Inscribe
A markdown preprocessor that executes code fences and embeds their output
-
embeddenator-workspace
Workspace management utilities for embeddenator development
-
pivot-pdf
A low-overhead PDF generation library for reports, invoices, and documents
-
ninede-pimbo
An inventory management app/api, made for personal use
-
core-glyph
View tree, Signal<T>, flexbox layout, and flat quad output for the Glyph UI framework
-
rust_metrics
Incremental evaluation metrics for various machine learning pipelines
-
lcat
lolcat in rust! Full unicode support, escapes for ANSI escape sequences, hue shift in Cubehelix
-
quagga
CLI tool that combines multiple text files into a single prompt suitable for Large Language Models
-
vi
An input method library for vietnamese IME
-
stam
powerful library for dealing with stand-off annotations on text. This is the Rust library.
-
commitbot
A CLI assistant that generates commit and PR messages from your diffs using LLMs
-
shvar
POSIX-compliant shell variable substitution routine
-
icu_provider_baked
Tooling for the ICU4X baked data provider
-
kiru
Fast text chunking for Rust
-
rlm-cli
Recursive Language Model (RLM) REPL for Claude Code - handles long-context tasks via chunking and recursive sub-LLM calls
-
recast-core
Engine behind the recast CLI: regex / Rhai script / tree-sitter rewrites, atomic two-phase commit, schema-locked JSON output
-
deeprl
DeepL client library with all the things (blocking)
-
lynpdf-rs
Pure Rust HTML/CSS to PDF renderer focused on Thai text layout
-
faker-rust
A high-performance, locale-aware fake data generator for Rust
-
kyu-parser
openCypher parser for KyuGraph with hand-written lexer and chumsky combinators
-
spellcode-cli
Minimal CLI frontend for codebook
-
mdbook-mermaid
mdbook preprocessor to add mermaid support
-
vader-sentimental
A faster Rust version of the original Python VaderSentiment analysis tool
-
claude-kb-cli
CLI for generating, validating, and searching Markdown knowledge bases under a .claude/ hierarchy
-
yuru-ko
Korean Hangul matching support for Yuru
-
pastebinit-rs
Just Paste It! A simple CLI tool to paste text to various pastebin services
-
typope
Pedantic source code checker for orthotypography mistakes and other typographical errors
-
unicode-casing
Titlecase helper function on characters
-
textpod
Local, web-based notetaking app inspired by 'One Big Text File' idea
-
tokmd-cockpit
Cockpit PR metrics computation and rendering for tokmd
-
fff-search
Fabulous & Fast File Finder - a fast and extremely correct file finder SDK with typo resistance, SIMD, prefiltering, and more
-
to_snake_case
that transforms strings to snake_case
-
qndx-query
Regex decomposition, candidate planner, and verifier for qndx
-
rustc_lexer
Rust lexer used by rustc. No stability guarantees are provided.
-
hypembed
Pure-Rust BERT-compatible text embedding inference for local-first applications
-
abyo-speculate
Pure Rust Speculative Decoding library for local LLMs — vanilla SD + Medusa, Qwen2 + Llama, batch-1 optimised
-
mdbook-mdinclude
An mdBook preprocessor for better markdown file inclusion
-
blinc_layout
Blinc layout engine - Flexbox layout powered by Taffy
-
kenlm-rs
Rust bindings for KenLM language model inference
-
mr_pdf
A lightweight, high-performance PDF generation library for Rust with premium layouts and charts
-
sublime_fuzzy
Fuzzy matching algorithm based on Sublime Text's string search
-
drova_plugins
Main plugins for drova
-
rsonpath-lib
Blazing fast JSONPath query engine powered by SIMD. Core library of rsonpath.
-
rustpython-common
General python functions and algorithms for use in RustPython
-
huc-tapir
Text & Annotation Processor for Indexing Resources
-
aphid
A static site generator for blogs and wikis, with wiki-links across both
-
bin-rs
Binary Reader from multi source
-
adc-lang
Array-oriented reimagining of dc, a terse RPN esolang
-
ripgrep-api
Dev-friendly API wrapper around the ripgrep implementation to be used directly in Rust projects
-
bloodtree
A hierarchical note-taking system with focus on relationships between nodes
-
ncount
A word count tool intended to derive useful stats from markdown
-
globby
Heavily opinionated glob matching library
-
awful_dataset_builder
Build LLM-ready Q/A datasets from reference text-to-question mappings produced by Awful Knowledge Synthesizer
-
ascii-canvas
canvas for drawing lines and styled text and emitting to the terminal
-
cmark-writer
A CommonMark writer implementation in Rust for serializing AST nodes to CommonMark format
-
glyphweaveforge
Convert Markdown into PDF through an explicit Rust pipeline with minimal and Typst backends
-
turndown-cdp
Convert CDP-style DOM nodes to Markdown
-
lo_writer
Writer-like document editing with Markdown and plain text import/export
-
fuzzy-muff
Fuzzy Matching Library
-
matrix-ui-serializable
Opinionated abstraction of the matrix-sdk crate with serializable structs
-
typwriter
compiling, formatting, and watching Typst documents, with PDF metadata and permission management
-
syllarust
quickly counting syllables
-
m2p
Markdown to PDF
-
mdbook-summarizer
Generate mdBook SUMMARY.md files from a book source tree
-
vibequest
A vibe-coded scripting language focused on developer happiness with a REPL implementation in Rust
-
encoding_rs2
A Gecko-oriented implementation of the Encoding Standard
-
dxpdf
A fast DOCX-to-PDF converter powered by Skia
-
truthlens
AI hallucination detector — formally verified trust scoring for LLM outputs
-
lgtmeow
🐾 "This kitty thinks it looks pretty good~"
-
mdbook-quiz
Interactive quizzes for your mdBook
-
tesseract-rs
Rust bindings for Tesseract OCR with optional built-in compilation
-
waifu-calendar
fetch your favorite characters' birthdays from AniList
-
mdbook-cmdrun
mdbook preprocessor to run arbitrary commands
-
fits-io
A pure-Rust FITS file handling library inspired by CFITSIO, focused on safety, clarity, and performance
-
amdb
Turn your codebase into AI context. A high-performance context generator for LLMs (Cursor, Claude) using Tree-sitter and Vector Search.
-
bbqr
Implementation of the bbqr spec in Rust
-
notion2prompt
CLI tool that converts Notion pages and databases into structured prompts for AI models
-
rsword_chirho
Core SWORD module library in pure Rust
-
typesense_codegen
Types for typesense generated with openapi spec
-
microcad-docgen
µcad Documentation Generators
-
mq-hir
High-level Internal Representation (HIR) for mq query language
-
mdbook-external-links2
Open external links inside your mdBooks in a different tab
-
ghat
GitHub Actions in TypeScript
-
sed-rs
A GNU-compatible sed implementation in Rust, powered by sd's regex engine
-
lexir
Lexical IR (BM25/TF-IDF) on top of postings lists
-
wikipedia_prosesize
Count Wikipedia prose size
-
glu
Unpacks a Markdown document full of code snippets into a temp directory
-
word-tally
Output a tally of the number of times unique words appear in source input
-
harmorp
Enhanced Nazief-Adriani Indonesian stemmer: iterative ECS, nasal-assimilation restoration, phonotactic guards, FST dictionary, zero-alloc hot path
-
jaarg
It can parse your arguments you should use it it's called jaarg
-
glyf-core
A fast Emmet inspired HTML and JSX abbreviation parser and expander
-
unicode-vo
Unicode vertical orientation detection
-
rdx-math
LaTeX math parser for the RDX specification
-
invlex-cli
CLI tool for inverse lexicographic (a tergo) sorting; installs the invlex binary
-
tangler
Extracts code blocks from Markdown documents
-
toml-maid
Keep your TOML files clean
-
xim-ctext
compound text en/decoder
-
leindex
MCP and semantic code search engine for AI tools and large codebases
-
agentroot-mcp
Model Context Protocol server for agentroot - AI assistant integration
-
noil
file explorer using text buffers
-
roe
Unicode case conversion
-
txt_to_md
Command converting from a txt file to a markdown file
-
faith
Agent-first Bible CLI. Multi-locale, deterministic, offline. Returns canonical JSON, supports batch and multi-translation parallel lookups.
-
labparse
Parse lab results into structured biomarker JSON
-
blazen-cabi
Hand-rolled C ABI over blazen-uniffi for the Ruby gem (via cbindgen + FFI gem) and any other FFI host
-
mdbook-codeblocks
A mdbook preprocessor to prepend customizable vignette to code blocks
-
bm25x
A fast, streaming-friendly BM25 search engine with mmap support
-
slugomatic
🐌 A simple CLI tool to slugify and unslugify text, perfect for branch names and URLs
-
nu_plugin_emoji
a nushell plugin called emoji
-
decasify
A CLI utility and library to cast strings to title-case according to locale specific style guides including Turkish support
-
rsconstruct
Rust based fast build system
-
clerr
aids in command-line error reporting
-
regect
A cli tool to quickly test regular expressions
-
panfix
parsing: linear time parsing of multifix operators
-
llmwiki
A local-first wiki search and indexing tool
-
wsl-clip
High-performance clipboard bridge for WSL2
-
opusmeta
reading and writing metadata to opus files
-
lo_odf
ODF package serializers for text, spreadsheet, presentation, drawing, formula, and database documents
-
mdbook-theme
A preprocessor and a backend to configure the theme for mdbook, especially creating a pagetoc on the right and setting full color themes from the official Ace editor
-
slice-command
slice is a command-line tool that allows you to slice the contents of a file using syntax similar to Python's slice notation
-
qj
A fast, jq-compatible JSON processor powered by simdjson
-
typg-core
Core search/discovery engine for typg (made by FontLab https://www.fontlab.com/)
-
dmos
Djot HTML renderer with advanced features
-
wildcard
matching
-
writ
A hybrid markdown editor combining raw text editing with live inline rendering
-
ccase
Command line interface to convert strings into any case
-
kfst-rs
Fast and portable HFST-compatible finite-state transducers
-
lingua-english-language-model
The English language model for Lingua, an accurate natural language detection library
-
codesearch
A fast, intelligent CLI tool with multiple search modes (regex, fuzzy, semantic), code analysis, and dead code detection for popular programming languages
-
rd2qmd-mdast
mdast types and Quarto Markdown writer for rd2qmd
-
chatpack-cli
CLI tool for parsing and converting chat exports into LLM-friendly formats
-
nexo-compliance-primitives
Reusable conversational-compliance primitives for nexo microapps (anti-loop, anti-manipulation, opt-out, PII redaction, per-user rate limit, consent tracking)
-
opencc-jieba-rs
High-performance Chinese text conversion and segmentation using Jieba and OpenCC-style dictionaries
-
tauri-plugin-thermal-printer
Plugin for Tauri to send esc/pos commands to thermal_printer
-
cron_clock
A cron expression parser and schedule explorer. Rich documentation and case studies and related upper-level libraries are available.
-
mdbook-yml-header
mdBook preprocessor for removing yml header
-
asimov-imap-module
ASIMOV module for IMAP email import
-
quillmark
engine API
-
hub-codegen
Multi-language code generator for Hub plugins from Synapse IR
-
spacemod
An easy-to-understand and powerful text search-and-replace tool
-
mdbook-catppuccin
🎊 Soothing pastel theme for mdBook
-
pdfer_forms
Fast pure-Rust PDF form filling, AcroForm inspection, and document operations (merge, split, rotate, encrypt) — a pypdf / PyPDF2 compatibility layer
-
esri_ascii_grid
reading ESRI Ascii Grid .asc files
-
ucd
Extends the char type to provide access to most fields of the UCD, Unicode Character Database, as of version 9.0.0. It aims to be compact, fast, and use minimal dependencies (only rust's core crate)…
-
mnem-graphrag
LLM-free GraphRAG algorithms over mnem's AdjacencyIndex: Leiden community detection (E1) + extractive summarization, Centroid + MMR (E4)
-
ttf2woff2
A Pure Rust library and CLI for compressing TTF fonts to WOFF2 format
-
zspell
Native Rust library for spellchecking
-
beanfmt
A fast beancount file formatter with CJK support
-
gpu-usage-waybar
display gpu usage in Waybar
-
attack-data
Request Mitre ATTACK data offline
-
fm
Non-backtracking fuzzy text matcher
-
droid-wrap
A high-level wrapper around the Android API for Rust
-
mdbook-d2
D2 diagram generator plugin for MdBook
-
mdless
A terminal-based markdown file viewer
-
omni-mdx
A highly secure, DoS-resistant MDX parser and OCP binary protocol engine
-
mdbook-typst
An mdBook backend to output Typst markup, pdf, png, or svg
-
jpreprocess
Japanese text preprocessor for Text-to-Speech application (OpenJTalk rewrite in rust language)
-
zepub-mini
Minimal crate for writing epubs (in-memory)
-
ssfmt
Excel-compatible ECMA-376 number format codes
-
cmakefmt-rs
CMake formatter
-
dbxcase
Dropbox-compatible case-folding algorithm
-
ironpress
Pure Rust HTML/CSS/Markdown to PDF converter with layout engine, LaTeX math, tables, images, custom fonts, and streaming output. No browser, no system dependencies.
-
tar2
A feature-rich tar replacement with tree view, colors, emoji, and cross-platform config
-
deemuk
Compress any text before it enters your LLM. Less tokens, same meaning.
-
obcore
A single threaded, zero dependency price-time priority limit orderbook implementation in Rust
-
peasytext
Rust client for PeasyText — text tools, glossary, and guides API
-
org-tools
Unified CLI for org-mode: lint, format, query, clock, export
-
rxlsb
Pure Rust XLSB (Excel Binary Workbook) reader/writer library
-
alyze
High-performance text analysis for full-text search
-
alisql
Analyze SQL
-
stygian-plugin
Visual data extraction fallback subsystem with CSS/XPath selectors, idempotent request handling, and composable transformation pipelines
-
yara-x-parser
A parsing library for YARA rules
-
looking-glass
reflection & type-erasure library for Rust
-
sheetsmithcli
The goto cli for sprite sheet packing
-
mantra-lang-tracing
Contains functionality needed to collect requirement traces from code or plain text files for the mantra framework
-
telegram-markdown-v2
Transform regular Markdown into Telegram MarkdownV2 (parse_mode = MarkdownV2)
-
codespan_preprocessed
Beautiful diagnostic reporting for M4 (or cpp) preprocessed text files
-
hanja_hangul
that converts Chinese characters to Korean characters. That is, converts hanja to hangul
-
sara-core
Core library for Sara - Requirements Knowledge Graph CLI
-
covy-core
Fast Rust CLI for coverage and diagnostics gating
-
shimmytok
Pure Rust tokenizer for GGUF models with llama.cpp compatibility (SentencePiece + BPE + WPM + UGM + RWKV)
-
mdbook-aquascope
Interactive Aquascope editor for your mdBook
-
capns
Core cap URN and definition system for FGND plugins
-
rsrpp
project for research paper pdf
-
fop
FOP (Formatting Objects Processor) — Apache FOP-compatible XSL-FO processor in pure Rust
-
retrofont
Retro terminal font toolkit: TDF/FIGlet parsing, rendering, conversion
-
gazenot
Gaze Not Into The Abyss, Lest You Become A Release Engineer
-
neco-fuzzy
Minimal fuzzy score core for commands, paths, and short identifiers
-
airshipper
automatic updates for the voxel RPG Veloren
-
rustpress
Purely static blog generator in Rust with incremental compilation, reverse pagination, and no backend
-
dialogi
A dialog parser
-
lex-extension-host
Runtime for the Lex extension system: registry, transports, trust gate, sandboxing
-
caseless
Unicode caseless matching
-
dig2crawl
Universal agnostic web crawler with Claude-powered CSS selector discovery
-
qrsimple-cli
Command line tool to generate QR codes
-
blitztext
fast keyword extraction and replacement in strings
-
constellate
Rust-powered CLI + live editor for curated markdown workspaces (requirements, docs, ADRs, audits, support, status-driven tasks) that build/serve/CRUD a themeable knowledge portal from a single binary
-
identstr
Immutable identifier strings with preserved quote style and normalized lookup keys
-
skera
Subsetting a font file according to provided input
-
quickmd
Quickly preview a markdown file
-
mdbook-plotly
An mdbook preprocessor that renders plot code blocks (e.g., ```plot) into interactive or static charts during book build
-
vidya
Programming reference library and queryable corpus for AGNOS
-
asimov-sdk
ASIMOV Software Development Kit (SDK) for Rust
-
marq
Markdown rendering with pluggable code block handlers
-
ens-normalize
Rust port of adraffy's ENS normalizer
-
flavortown_reader
Read Flavortown Devlogs
-
instant-segment
Fast English word segmentation
-
svgbob
Transform your ascii diagrams into happy little SVG
-
normy
Ultra-fast, zero-copy text normalization for Rust NLP pipelines & tokenizers
-
focaccia
no_std implementation of Unicode case folding comparisons
-
xml-syntax-reader
Low-level, callback-based, streaming XML tokenizer
-
jailguard
Pure-Rust prompt-injection detector with 1.5MB embedded MLP classifier. 98.40% accuracy, p50 14ms CPU inference, 8-class attack taxonomy. Apache-2.0/MIT alternative to Rebuff and Lakera Guard.
-
skimtoken
Fast token count estimation library
-
mdbook-combiner
combine mdbook summaries from multiple source into one mdbook
-
rmeow
A command line tool that aims to be a replacement for cat/bat with better highlighting
-
worf-launcher
Wayland application launcher inspired by wofi, rofi, and walker. Written in Rust with GTK4, supporting multiple modes (math, drun, file, ssh, run, emoji, search, auto), modern theming, and high performance.
-
whitespace-sifter
Sift duplicate whitespaces away!
-
gllm
Pure Rust library for local embeddings, reranking, and text generation with MoE-optimized inference and aggressive performance tuning
-
litsea
extremely compact word segmentation and model training tool implemented in Rust
-
nbv
A fast terminal-native Jupyter notebook viewer
-
aprilasr
High-level wrapper for the april-asr C api (libaprilasr) using aprilasr-sys
-
xml-3dm-cli
3DM XML Tree Differencing and Merging Tool CLI
-
uchardet-git
A simple wrapper for the C++ library uchardet (git version).
-
eadup
A native-first markup language and EADUP compiler for automated, standards-compliant document typesetting
-
mdtablefix
mdtablefix unb0rks and reflows Markdown tables so that each column has a uniform width. When the --wrap option is used, it also wraps paragraphs and list items to 80 columns.
-
mandown
Markdown to groff (man page) converter
-
index-http
Fetch abstraction for Index
-
pygmy
Ping me — notifications from AI agents (Telegram, Discord)
-
mdbook-chess
An mdbook preprocessing plugin to generate chess boards
-
misaki-rs
A self-contained, POS-aware Grapheme-to-Phoneme (G2P) engine for Rust, optimized for TTS models like Kokoro
-
phd
an esoteric gopher server
-
attuned-infer
Fast, transparent inference of human state axes from natural language
-
string-box
Create Rust string from UTF-8 string, byte string or wide string
-
string-offsets
Converts string offsets between UTF-8 bytes, UTF-16 code units, Unicode code points, and lines
-
leptos-sync-components
Leptos components for synchronization UI
-
hanconv
Convert between Chinese characters variants
-
rfham-bands
Data types to represent band plans
-
h2md
HTML to Markdown converter powered by a browser-grade HTML parser
-
wrap-ansi
A high-performance, Unicode-aware Rust library for intelligently wrapping text while preserving ANSI escape sequences, colors, styles, and hyperlinks
-
nu-explore
Nushell table pager
-
textcon
Template text files with file/directory references for AI/LLM consumption
-
varna
Multilingual language engine: phoneme inventories, G2P rules, scripts, grammar, and lexicon for 50+ languages
-
alef-docs
API reference documentation generator for alef polyglot bindings
-
pretokie
Fast, zero-allocation pretokenizers for BPE tokenizers
-
mdbook-curly-quotes
mdBook preprocessor that replaces straight quotes with curly quotes, except within code blocks or code spans
-
primd-core
Sub-millisecond predictive retrieval runtime for voice AI. Open-source VoiceAgentRAG.
-
postcode_extractor
extract and identify postcodes
-
vidyut-prakriya
A Sanskrit word generator
-
wqpl
The wq Programming Language
-
static-lang-word-lists
Runtime decompressed statically-included word lists
-
comically
fast manga & comic optimizer for e-readers
-
LitePhoton
A blazingly fast text file/csv file/etc scanner
-
async-utf8-decoder
Convert AsyncRead to incremental UTF8 string stream
-
normalized-line-endings
Line endings normalizer
-
mdbook-pandoc
A pandoc-powered mdbook backend
-
fencecat
Walkdir cat with markdown fenced code output
-
ColorShell
A small crate for coloring text for rust
-
virtual-frame
Deterministic data pipeline toolkit for LLM training — bitmask-filtered virtual views, NFA regex, Kahan summation, full audit trail. Python bindings included.
-
genpdf
User-friendly PDF generator written in pure Rust
-
creature_feature
Composable n-gram combinators that are ergonomic and bare-metal fast
-
paswitch-rs
List and swap to pulse sinks by name
-
mdv
Terminal Markdown Viewer
-
euma
color and design theme
-
fastgrep
Fast parallel grep with SIMD-accelerated search and trigram indexing
-
el_roi
simplify reading user input
-
mdbook-wordcount
Word count for mdbook, inspired by the mdbook tutorial
-
mdbook_fork4ls
Fork of mdBook for mdBook_LS
-
unicount
Alphabetic counter supporting unicode
-
chump-perception
Structured perception layer for LLM agents: rule-based extraction of entities, constraints, risk indicators, ambiguity score, and task type from raw user input. No LLM calls — fast pattern matching only.
-
scrunch
full-text-searching compression
-
lexrs-server
Production HTTP server for the lexrs lexicon library
-
anaso_site_api_models
API models for Ana.so
-
newsfresh
CLI and library for querying, filtering, and analyzing GDELT Global Knowledge Graph (GKG) v2.1 data — the world's largest open news event dataset
-
ntcip
National Transportation Communications for ITS Protocol
-
duvet
A requirements traceability tool
-
quranize
Encoding transliterations into Quran forms
-
iword-cli
CLI keyword scanner — iword-rs command-line tool
-
indian-numbers
Format numbers in Indian style (Lakh, Crore) and convert to words with Rupee support
-
picodiff
Tiny GUI app to compare text easily
-
md-scatter
split up and reassemble markdown files
-
easy_reader
easily navigating forward, backward or randomly through the lines of huge files
-
fusefiles
Concatenate a directory full of files into a single prompt for use with LLMs
-
xsample
A CLI tool to convert between various ASCII representations to IPA and vice versa
-
gaze-document
Reversible PII pseudonymization for documents — Tesseract OCR + Gaze redact → SafeBundle (clean Markdown + manifest + report)
-
tu
CLI tool to convert a natural language date/time string to UTC
-
mdlens
Token-efficient Markdown structure CLI for agents
-
googleapis-tonic-google-maps-places-v1
A Google APIs client library generated by tonic-build
-
dec_from_char
Small library for converting unicode decimal into numbers
-
im-identifiers
Extract, validate, and resolve academic identifiers — DOI, arXiv, ISBN, PMID, bibcode. Includes CLI, MCP server, and Python bindings.
-
archive-pdf-urls
Extract all links from a PDF and archive the URLs in the Internet Archive's Wayback Machine
-
dw2md
Crawl a DeepWiki repository and compile all pages into a single, LLM-friendly markdown file
-
hypha
Obsidian vault link graph traverser — neighborhood BFS, shortest path, and co-citation link suggestions
-
streplace
A tiny library for matching and replacing in strings and slices with user-defined functions
-
util-gpui-unofficial
A collection of utility structs and functions used by Zed and GPUI
-
blockwatch
Language agnostic linter that keeps your code and documentation in sync and valid
-
catbus
A Wayland IME for multilingual text input
-
nu-utils
Nushell utility functions
-
zh_num
Convert ASCII numbers and zh words
-
flowmark
A Markdown auto-formatter for clean diffs and semantic line breaks
-
ripvec-mcp
MCP + LSP server for ripvec — semantic code search, PageRank repo maps, and multi-language code intelligence
-
grapheme-utils
Handy utils for working with utf-8 [unicode] Extended Grapheme Clusters
-
gatekpr-patterns
Regex pattern registry and pre-built pattern sets for Shopify validation
-
ansic
does ansi parsing in a dynamic DSL and at compile time for efficient and zero cost ansi styling
-
pretty-console
A fluent, zero-cost API for styling terminal text with colors and attributes
-
wideword
Fast word-length bucketing for text documents using SIMD
-
patiencediff
algorithm
-
kanpyo
Japanese Morphological Analyzer
-
newline_normalizer
Zero-copy newline normalization to \n or \r\n with SIMD acceleration
-
seshat-core
Core types, traits, and intermediate representation for Seshat
-
bashtestmd
Compiles shell commands in .md files into Bash scripts for testing
-
hyli-registry
Hyli Registry - Upload and download ELF binaries
-
spel-right
A fast and lightweight spell checker and suggester
-
smart-config-commands
Command-line extensions for smart-config library
-
lepiter-cli
terminal cli and tui reader for lepiter knowledge bases
-
rst
a reStructuredText parser and renderer for the command line
-
lumin
searching and displaying local files
-
antex
Styled text and tree in terminal
-
sloc-languages
Source line analysis tool with CLI, web UI, HTML/PDF reports, and CI/CD integration
-
tess-cli
less-style terminal pager for files, pipes, and live logs — with structured-log filtering, pretty-printing (JSON/YAML/TOML/XML/HTML/CSV), ANSI passthrough, multi-file navigation, and ctags jumping. Rust, macOS + Linux.
-
normalized-path
Opinionated cross-platform, optionally case-insensitive path normalization
-
unclog
allows you to build your changelog from a collection of independent files. This helps prevent annoying and unnecessary merge conflicts when collaborating on shared codebases.
-
syllabize-es
Syllabize Spanish text, and much more
-
scout
Friendly fuzzy finder for the command line
-
mdbook-variables
mdBook preprocessor to resolve variables configured in book.toml
-
oxford_join
Join string slices with Oxford Commas!
-
grapheme-cli
Grapheme CLI for parsing, compiling, and running .gr workflows
-
ratex-font
Font metrics and symbol tables for RaTeX
-
deencode
Reverse engineer encoding errors
-
jammi-encoders
Candle-native BERT-family encoders for sentence embeddings, with built-in PEFT support via jammi-lora
-
crlf-to-lf-inplace
Fast in-place CRLF to LF line ending conversion for Rust strings. Uses memchr for good performance without custom SIMD.
-
uv-pep440
internal component crate of uv
-
neo4j_cypher
A flexible and intuitive query builder for Neo4j and Cypher
-
corsa_bind_lsp
LSP-focused clients, overlays, and virtual documents for typescript-go
-
ib-pinyin
A high-performance pinyin search and matching library
-
sel-rs
Select slices from text files by line numbers, ranges, positions, or regex
-
llm-guard-ml
ONNX-runtime-backed scanners for llm-guard. Catches paraphrased / novel prompt-injection attacks the rules tier can't. CPU by default; CUDA / CoreML / DirectML opt-in.
-
weave-content
Content DSL parser, validator, and builder for OSINT case files
-
urlcode
Convenience tool for managing URLs from the command line
-
pdfluent
Pure-Rust PDF SDK with XFA, PDF/A, digital signatures, and WASM support
-
vlazba
Lojban words generator and analyzer
-
dir2txt
Convert a directory to text
-
dprint-development
Helper functions for testing dprint plugins
-
rolldown_error
-
mdbook-exercises
An mdBook preprocessor for interactive exercises with hints, solutions, and test execution
-
recursive-file-loader
recursively load files via references in the files
-
neco-editor
Umbrella crate for editor runtime primitives with a unified text buffer
-
rspaddoc
Rust version of paddoc
-
okh-tool
A CLI tool to deal with Open Know-How (OKH) data files. Its main functionalities are: validation of and conversion between the different formats
-
scrapling
Fast, adaptive web scraping toolkit for Rust
-
teip
Masking tape to help commands "do one thing well"
-
ipset_lookup
ipset is a command-line tool that takes networks or IPs and searches through a lot of different threat feeds quickly. It can also download the feed data necessary to perform the queries…
-
booky
analyze English text
-
mq-lang
Core language implementation for mq query language
-
labgenetics
Genetic variant analysis — pathogenicity, pharmacogenomics, and polygenic risk
-
pyohwa-search
Search index builder for Pyohwa static site generator
-
laser-pdf
programmatic PDF generation with precise, predictable layout control
-
iregex
Intermediate representation for Regular Expressions
-
fsqlite-ext-fts5
FTS5 full-text search extension
-
ident_case
applying case rules to Rust identifiers
-
unindent
Remove a column of leading whitespace from a string
-
bo4e-edifact-types
Generic interchange model types for BO4E/EDIFACT conversion
-
koto_test_utils
Testing utilities for the Koto programming language
-
reword
some utility functions for human-readable formatting of words
-
affinidi-messaging-text-client
Affinidi Messaging SDK
-
unicode-display-width
Unicode 15.1.0 compliant utility for determining the number of columns required to display an arbitrary string
-
xhtml_parser
Non-validating XHTML Tree-based parser
-
slug-preserve
Case-preserving slugifier with Unicode PUA sentinel support (internal to fren)
-
confluex
Export Confluence pages to Markdown from the command line
-
ld-ownedbytes
Expose data as static slice
-
cpf_cnpj
CPF and CNPJ validator for Rust
-
atoxide-export
Export formats for the Ato electronics compiler (netlist, BOM)
-
giallo-kak
Kakoune syntax highlighter using TextMate grammars
-
mylsp
LSP helper
-
phonetik
Phonetic analysis engine for English. Rhyme detection, stress scanning, meter analysis, and syllable counting with a 126K-word embedded dictionary.
-
url_encor
A lightweight library to encode and decode special characters in urls
-
date_time_parser
Rust NLP library for parsing English natural language into dates and times
-
office2pdf
Convert DOCX, XLSX, and PPTX files to PDF using pure Rust
-
shannon-nu-pretty-hex
Pretty hex dump of bytes slice in the common style
-
greppy-cli
Sub-millisecond semantic code search and trace with AI reranking (Claude/Gemini/Ollama)
-
linkcheck2
extracting and validating links
-
tokstream-cli
CLI token stream simulator using Hugging Face tokenizers
-
bmfont_rs
Load/ save/ manipulate BMFont files
-
slop-guard
Detect AI slop patterns in prose — scores text 0-100 for ~80 regex-based rules targeting LLM writing tics
-
mdbook-abbr2
a preprocessor to add support for abbreviations to mdbook, inspired by the typst package abbr
-
google-book-scraper
downloading the contents of books hosted on books.google.com for offline viewing
-
eid-mubarakc
A CLI tool to celebrate Eid Mubarak with ASCII video art
-
rust-regex-dsl
Regular expression DSL
-
ripsecrets
A command-line tool to prevent committing secret keys into your source code
-
rob_test_sagebox_integration_001
Internal test crate for validating Sagebox packaging and README rendering. Not intended for public use.
-
scrubbers
High-throughput redaction engine + CLI
-
scrivener-mcp
MCP server for Scrivener 3 projects — AI-powered writing assistant tools
-
tmenu
TUI fuzzy finder
-
ht32-panel-daemon
Daemon with web UI for HT32 panel control
-
zettel-cli
cli app for Luhmann-style Zettelkasten management
-
ytx-cli
Extract YouTube transcripts from the terminal. Pipe-friendly, no API key needed.
-
bad-apple
A terminal-based player for videos
-
ternlang
A stack based ternary esolang
-
mdbook-tracey
mdbook preprocessor for tracey requirement annotations
-
mdtrans
Markdown parser and transformer using pest.rs, focused on flexibility to a project’s needs
-
treesearch
Structure-aware document search CLI. Fast keyword matching over hierarchical document trees.
-
prompt-input
lightweight library for user input prompts in Rust, designed to make input handling straightforward
-
basalt-tui
Basalt TUI application for Obsidian notes
-
uvie
Ultra fast Vietnamese input method engine (Telex, VNI)
-
caco3
common lib
-
heiwa
A minimalist flat file CMS
-
codebook-lsp
A code-aware spell checker with language server implementation, installable via cargo install
-
topiary-cli
CLI app for Topiary, the universal code formatter
-
tre-regex
Rust safe bindings to the TRE regex module
-
eloran
Comics and Ebook web library written in rust, with reading, search, reading status, bookmarks
-
agentic-veritas-core
Intent compilation, uncertainty detection, and truth verification for AI agents
-
mdbook-markdown
Markdown processing used in mdBook
-
ere
A compile-time alternative for POSIX extended regular expressions
-
asciidork-backend
Asciidork backend
-
md-crdt
Conflict-free replicated data types for collaborative markdown editing
-
chardetng_c
C bindings for chardetng
-
mdbook-numbering
A mdBook preprocessor that adds numbers to headings and code block lines (for mdbook 0.5.0 and above)
-
loc
Count lines of code (cloc) fast
-
plsfix
Text cleaner upper
-
integrity-calc
Text integrity detection pipeline — entropy, perplexity, burstiness, Zipf's law, Bloom threshold classification
-
matchy-paraglob
Glob pattern matching with Aho-Corasick for matchy (internal)
-
lexicmp
comparing and sorting strings lexicographically and naturally
-
md-ulb-pwrap
Markdown paragraph wrapper using Unicode Line Breaking Algorithm
-
basic-text
Basic Text strings and I/O streams
-
bwrap
A fast, lightweight, embedded systems-friendly library for wrapping text
-
markex
Fast, non-validating markup element extractor (Tag Element, MdRef, MdCodeBlock, MDSection)
-
c2pa-text
Reference implementation for embedding C2PA manifests in text using Unicode variation selectors
-
fff-query-parser
Query parser for fff file finder - includes specific syntax for various constraints like globs, extensions, regex etc
-
philiprehberger-changelog
Programmatic CHANGELOG.md parsing, generation, and manipulation following Keep a Changelog format
-
mdbook-open-on-gh
mdbook preprocessor to add an open-on-github link on every page
-
cranberry
A versatile Rust library for Russian Cyrillic transliteration
-
uzumibi-gem
Uzumibi is a mruby/edge gem for serverless environment
-
casile
The command line interface to the CaSILE toolkit, a book publishing workflow employing SILE and other wizardry
-
mdkb
Persistent memory, hybrid search, and code intelligence for Claude Code and Codex — with CLI, lifecycle hooks, and MCP
-
ria
An adapter for converting the RefractiveIndex.INFO database into a flat, key-value store
-
krafna
terminal-based alternative to Obsidian's Dataview plugin, allowing you to query your Markdown files using standard SQL syntax
-
linestats
Group similar text lines and compute numeric statistics
-
matchr
A fast fuzzy matcher library written in Rust for use in CLI tools and TUI apps
-
mdbook-pikchr
A mdbook preprocessor to render pikchr code blocks as images in your book
-
unimorph
Command-line interface for UniMorph morphological data
-
fitsort-rs
rewrite of fitsort, used to read dfits output
-
pretext
Native Unicode text preparation and paragraph layout engine for Pretext
-
autofoam
related tools
-
rust-canto
Convert Chinese characters to Jyutping (粵拼) / Yale romanization (耶魯)
-
hayai
(速い) — generic fast-match engine with pluggable normalizers and prefilters
-
runefix-core
Unicode character display width engine supporting CJK, emoji, and grapheme clusters
-
pluck-core
Core indexing, AST chunking, BM25 search, and incremental reindex for pluck — the fast, token-friendly code-reading MCP server for AI coding agents
-
zp
Copy the contents of the source file or the standard output buffer to the clipboard, with support for maintaining a history of copied content, allowing users to easily paste into another file or program
-
chamkho
Khmer, Lao, Myanmar, and Thai word segmentation/breaking library and command line
-
hawkeye-fmt
The formatter library for hawkeye cli
-
pdfvec
High-performance PDF text extraction library for vectorization pipelines
-
pdf_oxide_mcp
MCP server for PDF extraction — gives Claude, Cursor, and AI assistants the ability to read PDFs locally. Text, markdown, and HTML output. Powered by pdf_oxide.
-
mdbook-mermaid-ssr
mdbook preprocessor to add mermaid support with server-side rendering
-
mallard
A line-oriented text buffer using an immutable green/red model with cheap branching and undo/redo
-
linkup
Automatically add links to Markdown files
-
etch
Not just a text formatter, don't mark it down, etch it
-
alphabet_detector
Natural language alphabet detection library
-
wetext-rs
Text normalization library for TTS, Rust implementation of WeText
-
ctxd-core
Core types for ctxd: events, subjects, hash chains
-
picomatch-rs
Rust glob matching core for the picomatch-rs workspace
-
acroform
High-level PDF form manipulation library using lopdf
-
probly-search
A lightweight full-text search engine with a fully customizable scoring function
-
rsslide
The ultimate slide builder
-
pure-tui
A modern terminal-based word processor for Markdown and other structured text documents
-
llmvm-core
The core application for llmvm
-
glance-md
Markdown preview that scrolls with your cursor. Terminal-first, editor-optional.
-
supermarkdown
High-performance HTML to Markdown conversion for LLMs
-
gaze-recognizers
Built-in recognizers for Gaze
-
markdown-harvest
designed to extract, clean, and convert web content from URLs found in text messages into clean Markdown format. Originally created as an auxiliary component for Retrieval-Augmented Generation (RAG)…
-
spider-tendril
Send-able tendril fork (atomic refcount) for high-concurrency HTML parsing
-
lexicon-docx
Lexicon Markdown to DOCX processor for legal contracts
-
cssbox-test-harness
WPT test runner for cssbox layout engine
-
rhema_contracts_chirho
Shared type-level contracts, newtypes, DTOs, and trait definitions for the Rhema Chirho engine
-
onetools
CLI tools for the ONEcode file format
-
c2pa-text-binding
C2PA soft binding and content fingerprinting for text assets
-
gorgeous
Grammar-driven pretty printers auto-generated from BBNF grammars
-
json_to_table
pretty print JSON as a table
-
embed-src
Embed source files into any text file
-
llmtask
Engine-agnostic Task abstraction for LLM structured-output: Task trait + Grammar (JSON Schema, Lark, Regex) + ImageAnalysis
-
blogr-cli
A CLI static site generator for blogs
-
brk_string_wizard
manipulate string like a wizard
-
onig-regset
Rust-Onig is a set of Rust bindings for the Oniguruma regular expression library. Oniguruma is a modern regex library with support for multiple character encodings and regex syntaxes.
-
mdsh
Markdown shell pre-processor
-
fluxer-rust
Rust API wrapper for Fluxer
-
rust_string_utils
String utilities for rust based on org.apache.commons.lang3
-
unicode-language
detect language coverage given a list of codepoints
-
mantra-miner
your software recite mantras while it runs
-
mdbook-mermaid-mmdr
A mdbook preprocessor that renders mermaid diagrams using mermaid-rs-renderer
-
freq-calc
calculate occurrence and frequency of words, letters, etc. in text
-
pdf2pwg
Single purpose A4 page renderer rendering PDF using pdfium to PWG/URF
-
logparse-pretty-print
pretty print tree
-
pager2
pipe your output through an external pager
-
awsim-kendra
AWS Kendra intelligent search emulator for AWSim
-
atomic-plus
type extensions for the atomic standard library
-
kaff_sso
Small-buffer-optimized generic buffer and UTF-8 string type
-
unicode-matching
match Unicode open/close brackets
-
adabraka_util
A collection of utility structs and functions for Adabraka GPUI (originally from Zed - github.com/zed-industries/zed)
-
smb-server
SMB2/3 file-sharing server library with pluggable storage backends
-
zeroten-denote
Handle denote name scheme
-
aki-mcycle
mark up text with cycling color
-
lines
Utilities for iterating readers efficiently line-by-line
-
tamil-yaappu-analyzer
Tamil prosody analyzer and classifier for verse compositions
-
gemini-tokenizer
Authoritative Gemini tokenizer for Rust, ported from the official Google Python GenAI SDK
-
fast-grep
Indexed regex search. 6-25x faster than ripgrep on large codebases via sparse n-gram index, position masks, and mmap'd posting lists.
-
simstring_rust
A native Rust implementation of the SimString algorithm
-
spooky_go
Go board game engine
-
fmtm
A diff-friendly Markdown formatter that breaks lines on sensible punctuations and words to fit a line width
-
string-auto-indent
Normalizes multi-line string indentation while preserving platform-specific line endings
-
memchr-rs
Fast memchr and memchr2 implementations in Rust
-
topo-score
BM25F, heuristic, structural, and RRF fusion scoring
-
hackertyper
A local CLI alternative for hackertyper.net
-
pdf_oxide_cli
CLI for pdf-oxide — the fastest PDF toolkit. 22 commands: text extraction, PDF to markdown, search, merge, split, images, compress, encrypt, watermark, forms, and more.
-
yongcat
Korean predicate (verb/adjective) conjugation library
-
reedy
A terminal-based RSS reader with a clean TUI interface
-
lib-bcsv-jmap
reading and writing BCSV/JMap format used for Wii and GC games, including Super Mario Galaxy
-
japanese-text
Japanese text normalization library: normalization of character width, kana, Unicode, punctuation, and old-form kanji
-
mdbook-preprocessor-boilerplate
Boilerplate code for mdbook preprocessors
-
obsidian-export
associated CLI program to export an Obsidian vault to regular Markdown
-
zenpatch
A robust library for applying text-based patches, designed for AI coding agents with backtracking algorithm
-
yamlpatch
Comment and format-preserving YAML patch operations
-
pagefuse
Your pages, your way — PDF, DOCX, images and more
-
pkstate
representing, serializing, and deserializing the state of a poker hand
-
legalis-uk
United Kingdom jurisdiction support for Legalis-RS (Employment Law, UK GDPR, Consumer Rights, Contract Law, Company Law)
-
nerdle
A macro-powered compile-time nerd-font code point resolver
-
repvar
A tiny CLI tool that replaces variables of the style ${KEY} in text with their respective value. It can also be used as a Rust library
-
patine
Render Markdown beautifully in the terminal
-
caseify
A CLI tool to convert strings between different cases
-
pdflens-mcp
An MCP server for reading PDFs, coded by human, designed for AI
-
generative-artifact-protocol
Generative Artifact Protocol (GAP) — token-efficient artifact generation and updates for LLMs
-
repr
The regular-expression-as-linear-logic interpretation and its implementation
-
htmd-cli
The command line tool for htmd
-
easymark
Lightweight Markdown rendering utility that just works
-
scrybe-mcp-server
Scrybe MCP server — inbound MCP tools: open/read/section/edit/diff/find/embed/lint
-
rep-grep
wgrep/write-grep CLI
-
string-patterns
Makes it easier to work with common string patterns and regular expressions in Rust, adding convenient regex match and replace methods (pattern_match and pattern_replace) to the standard…
-
badwords-rs
filtering based on badwords (https://github.com/hughsie/badwords)
-
utf8-bytes
bytes::Bytes, but UTF-8
-
zipcodes
Query US zipcodes without SQLite
-
matchy-match-mode
Shared MatchMode enum for matchy workspace (internal)
-
markov_strings
A simplistic Markov chain text generator
-
floating-ui-dom
Rust port of Floating UI. Floating UI for the web.
-
monochora
gif to ascii art converter written in rust
-
xj_scanf
Safe reimplementation of scanf()
-
utf16_iter
Iterator by char over potentially-invalid UTF-16 in &[u16]
-
triblespace-search
Content-addressed BM25 + HNSW indexes on top of triblespace piles
-
oris-runtime
An agentic workflow runtime and programmable AI execution system in Rust: stateful graphs, agents, tools, and multi-step execution
-
chord3
Create pdf songbooks from chopro source
-
bazaar
formats and protocols
-
mdbook-qr
An mdBook preprocessor that generates a QR code using fast_qr
-
colonnade
format tabular data for display
-
filenamify
Convert a string to a valid filename
-
encoding-next
Character encoding support for Rust
-
renderdag
An ASCII or Unicode renderer for directed acyclic graphs (DAGs)
-
triton-tui
Terminal User Interface to help debugging programs written for Triton VM
-
rehuman
Unicode-safe text cleaning & typographic normalization for Rust
-
paperdown
A fast CLI tool to batch convert PDFs into Markdown using GLM-OCR
-
rewrite
Safely rewrite file contents from stdin, even when file is open as an input
-
hyperchad_markdown
Markdown to HyperChad Container conversion with GitHub Flavored Markdown support
-
zepub
Reading and writing EPUB and MOBI ebooks
-
terraphim_hooks
Unified hooks infrastructure for Terraphim AI - knowledge graph-based text replacement and validation
-
asciisavers
A small collection of ascii screensavers
-
breadchunks
Heading-aware, token-budgeted semantic chunker for Markdown — for RAG and embedding pipelines
-
hanzi-sort
Sort Chinese text by pinyin or stroke count, with polyphonic overrides and terminal-friendly output
-
krilla-rxing
Render barcodes (QR Codes, Aztec, Data Matrix, etc) using rxing into a krilla Surface (PDF)
-
stenotype
Machine stenography primitives
-
furigana
Map furigana to a word given its reading
-
paperless-api-client
Paperless-ngx API client
-
mkdlint
A style checker and lint tool for Markdown/CommonMark files, written in Rust
-
okstd
The standard library that's ok
-
mdbook-callouts
mdBook preprocessor to add Obsidian Flavored Markdown's Callouts to your book
-
pragmatic-segmenter
Rust port of pySBD v3.1.0
-
pgf2json
Application Programming Interface to load and interpret grammars compiled in Portable Grammar Format (PGF). The PGF format is produced as a final output from the GF compiler. The library…
-
agent-doc
Interactive document sessions with AI agents
-
mdbook-assets-hash
mdbook preprocessor that adds content-based cache-busting hashes to asset filenames
-
kaho
A Rust-based library for interacting with Stoat
-
ix-match
matching and moving IIQ files so they can be easily imported into IX Capture
-
llm-utl
Convert code repositories into LLM-friendly prompts with smart chunking and filtering
-
compression-prompt
Fast statistical compression for LLM prompts - 50% token reduction with 91% quality retention
-
doryen-rs
Pure rust OpenGL accelerated roguelike console API with native/wasm support
-
prompty
asset class and format for LLM prompts
-
datex
native library
-
appletheia-domain
The Domain Layer for Event-Sourcing Architecture
-
anyxml-encoding
character encoding and decoding library for XML
-
syntaxfmt
A derive macro-based library for flexible syntax tree formatting with pretty printing support
-
lingua-latvian-language-model
The Latvian language model for Lingua, an accurate natural language detection library
-
fop-cli
Command-line interface for Apache FOP - XSL-FO to PDF converter
-
zalo
A code highlighter giving the same output as VSCode
-
parquette
View and search through parquet files
-
latinga
High-performance, Zero-Copy Uzbek Cyrillic-Latin transliterator
-
ohos-input-method-sys
OpenHarmony's input method binding for rust
-
mdbook-private
An mdbook preprocessor that controls visibility of private chapters and sections within them
-
slugrs
A fast, locale-aware slugify library for Rust
-
srt2txt
Convert SRT subtitle files into clean plain text (strip timestamps, tags, merge lines)
-
zet
zet finds the union, intersection, set difference, etc of files considered as sets of lines
-
absorb
A cli tool to absorb text quickly
-
rustpython-literal
Common literal handling utilities mostly useful for unparse and repr
-
facelessvideos
Rust helpers for drafting faceless YouTube short script outlines, paired with the FacelessVideos web app
-
servo-background-hang-monitor
A component of the servo web-engine
-
mdbook-pagetoc
A mdbook plugin that provides a table of contents for each page
-
think-lint
Reasoning trace quality auditor for LLM training data
-
minimo
terminal ui library combining a lot of things from here and there and making it slightly easier to play with
-
sourceannot
render snippets of source code with annotations
-
xlsynth-pir
partial XLS IR focused on functions
-
rustyphoenixgenerator
generator from text files
-
zantetsu-vecdb
Canonical anime title matching via Kitsu dumps or remote endpoints
-
dubs
Themed name generator — like haikunator, but with categories
-
marque-engine
Pipeline orchestration: core + rules → diagnostics + fixes
-
tibco_ems
A high level API for the Tibco EMS
-
bear-query
A read-only Rust library for querying the Bear note-taking app's SQLite database with minimal interference
-
cosmic-text-tessera-fork
Pure Rust multi-line text handling
-
semantic-edit-mcp
MCP server for semantic code editing with tree-sitter
-
lindera-unidic-builder
A Japanese morphological dictionary builder for UniDic
-
liblevenshtein
Levenshtein/Universal Automata for approximate string matching using various dictionary backends
-
aprender-shell
AI-powered shell completion trained on your history
-
search-text
A fast and flexible command-line tool to recursively search for text or regex patterns in files under a directory
-
multimatch
Multi-pattern matching engine — Aho-Corasick + regex with optional Hyperscan SIMD acceleration
-
deinflector
Attempts to be a 1 to 1 reimplementation of Yomitan's MultiLanguageTransformer
-
ystd
An opinionated and batteries included `std` mirror for convenient, correct code and pleasant error messages
-
hexout
A compact and dependency-free, flexible and customizable hex dump library for Rust that provides beautiful, configurable binary data visualization
-
pbtree
A fast, generic piece-table text buffer backed by a balanced B+ tree
-
embedd
Embedding interfaces + local backends (Candle/HF)
-
sedregex
Sed-like regex library
-
thesaurus
An offline thesaurus library for Rust
-
cjrh-moreutils-isutf8
Rust implementations of the moreutils tools
-
h2m-search
Zero-config web search for h2m (DuckDuckGo, Wikipedia, SearXNG, Brave, Tavily)
-
codebase-to-prompt
bundling text files like code to single file
-
ucf
A universal code formatter
-
sk-skimmer
Fuzzy Finder in rust!
-
use-pattern
Feature-gated facade crate for RustUse pattern helpers
-
unicode-bidi-mirroring
Unicode Bidi Mirroring property detection
-
searcher_txt
A copy of grep that I made to show that I'm bad at Rust
-
sakoku
A fast CLI tool to detect non-ASCII bytes in source files
-
arxiv-cli
CLI to download papers from arXiv
-
text_trees
textual output for tree-like structures
-
omnix-common
Common functionality for omnix frontends
-
vn-settings
Various settings intended to simulate visual novels
-
limit-tldr
Code analysis library that actually fits in context - 95% token savings
-
edifact-randomize
Deterministic field randomization for German energy market EDIFACT data
-
rfham-markdown
Markdown writer (utility) for RF-Ham libraries
-
shuck-formatter
Shell script formatter with configurable style options
-
gspell
Rust bindings for gspell
-
corpa
The ripgrep of text analysis. Blazing-fast CLI for corpus-level NLP statistics.
-
spider-util
Shared utility functions and types for the spider-lib ecosystem
-
mds
A skim-based `*.md` explore and surf note-taking tool
-
document_tree
reStructuredText’s DocumentTree representation
-
rustic_print
A versatile Rust library for enhancing console output. It offers a range of features to create a more engaging and informative command-line interface.
-
badwords-core
Core profanity filter logic - normalization, transliteration, homoglyphs
-
phonetics-rs
IPA-based phonetic distance metrics: strict edit distance, listener-confusion distance, and per-phoneme acoustic and perceptual scoring. Calibrated against Mad Gab puzzle data; tunable per dialect.
-
oxide-api
A fully generated & opinionated API client for the Oxide API
-
text_io
really simple to use panicking input functions
-
dvd-term
A bouncing ASCII art DVD logo (or custom text) for the terminal
-
xfa-layout-engine
Box-model and pagination layout engine for XFA forms. Experimental — part of the PDFluent XFA stack, under active development.
-
typf-unicode
Unicode segmentation, bidi, and normalization for Typf
-
sourcefile
Retain mapping information when concatenating source files, to make error messages more useful
-
mdbook-tabs
mdBook plugin for rendering content in tabs
-
yamake
yet another make tool
-
bmux_decoration_plugin
Decoration plugin for bmux — paints pane borders and publishes scene updates
-
tectonic
A modernized, complete, embeddable TeX/LaTeX engine. Tectonic is forked from the XeTeX extension to the classic "Web2C" implementation of TeX and uses the TeXLive distribution of support files.
-
lig-rs
lig is a multipattern regex matching tool
-
graphify-detect
File discovery and classification for graphify
-
inflector-plus
Adds String based inflections for Rust. Snake, kebab, camel, word, sentence, class, title and table cases as well as ordinalize, deordinalize, demodulize, foreign key, and pluralize/singularize…
-
analiticcl
approximate string matching or fuzzy-matching system that can be used to find variants for spelling correction or text normalisation
-
typub
Universal publishing tool that converts Typst content to multiple platforms (Ghost, WordPress, Dev.to, Notion, etc.)
-
seonbi
Korean text arrow/quote/punctuation processor
-
just_progress
Just a progress display tool
-
markdown-readtime
estimate reading time for Markdown content
-
yeslogic-fontconfig-sys
Raw bindings to Fontconfig without a vendored C library
-
r-matrix
Rust port of cmatrix
-
crate-doc-cli
Access Rust crate documentation from the CLI
-
saytify
greeting and farewell messages
-
pinyin-parser
Parses a string of pinyin syllables. Covers marginal cases such as `ẑ`, `ŋ` and `ê`.
-
nib
static site generator
-
oxyl-diagnostics
Diagnostic types for oxyl
-
rio-grapheme-width
Emoji presentation and variation-sequence tables for Rio terminal. Forked from wezterm-char-props.
-
bump-bin
Increments version with semver specification
-
sakurs-core
High-performance sentence boundary detection using Delta-Stack Monoid algorithm
-
wikidot-normalize
provide Wikidot-compatible string normalization
-
cro_stem
A lightning-fast, zero-dependency Croatian stemming library written in Rust
-
re_view_text_document
view that shows a single text box
-
unicode-intervals
Search for Unicode code points intervals by including/excluding categories, ranges, and custom characters sets
-
aprender-contracts-cli
CLI for provable-contracts — validate, scaffold, verify, status, audit
-
cljvindent-core
Clojure(script), EDN indentation and alignment library
-
hanzo-extract
Content extraction with built-in sanitization via hanzo-guard
-
invisible-characters
A list of invisible characters
-
nom-grapheme-clusters
Adapter that allows nom to account for unicode grapheme clusters
-
ltk_io_ext
I/O extensions used by League Toolkit
-
tbll
tbll outputs data in tabular format
-
anycase
a case conversion library for Rust
-
officemd_markdown
Markdown renderer for OfficeMD document IR
-
pyohwa-core
Core engine for Pyohwa static site generator — config, markdown, rendering, and build pipeline
-
kiters
timestamps, request IDs, and external IDs
-
aetna-markdown
Aetna — markdown to El tree transformer
-
erebus
A CLI message generation library
-
fkys-rs
F*cking Kill Yourself lang interpreter written in Rust
-
terraphim-cli
CLI tool for semantic knowledge graph search with JSON output for automation
-
CompactPrefixRadix
a minimalistic but efficient radix tree implementation with extra prefix support
-
mdbook-indexing
mdbook preprocessor for index generation
-
lera-trigram
A trigram-based regex optimization library inspired by PostgreSQL's pg_trgm
-
awful_knowledge_synthesizer
Generate LLM-powered exam questions from YAML books, manpages, mdbooks, tealdeer pages, and code
-
indent_write
Write adapters to add line indentation
-
bangumi-api
An api implementation for Bangumi website
-
seshat-unicode
A Unicode Library for Rust. Unicode 16.0.0 ready. XID_Start and XID_Continue are also available.
-
harper-tex
The language checker for developers
-
amt-phonetic
Articulatory Moment Transform — language-agnostic phonetic name matching
-
csml_interpreter
The CSML Interpreter is the official interpreter for the CSML programming language, a DSL designed to make it extremely easy to create rich and powerful chatbots
-
mdbook-driver
High-level library for running mdBook
-
illbethejudgeofthat
Pro se custody case builder. Google Takeout to courtroom in one evening.
-
edit_core
Dependency-free text editing core for terminal and GUI editors
-
regex_regexop
preliminary function that turns a regex into a comparable FTS search query
-
flo_rope
An attributed and streaming implementation of the rope data structure
-
ansi-align
Text alignment library with ANSI escape sequence and Unicode support
-
deucalion
High-performance Windows library for capturing decoded FFXIV packets
-
webshift
Denoised web search library — fetch, clean, and rerank web content for AI agents
-
kdex
A fast CLI for indexing and searching code repositories and knowledge bases for AI-powered workflows
-
easyeditor
Easy Markdown Editor
-
swc-plugin-inferno
SWC plugin for InfernoJS
-
hemoglobin
Bloodless
-
pulumi_gestalt_core
Core Pulumi Gestalt implementation
-
fast-slice-utils
Highly optimized slice utilities using SIMD instructions when available
-
tortilla
Somewhat syntax-aware text wrapping for source code and plain text documents
-
text-document-direct-access
Entity CRUD controllers and DTOs for text-document
-
toolpath-md
Render Toolpath documents as Markdown for LLM consumption
-
biometrics
provide the vitals of a process in the form of counters, gauges, moments, and T-digests
-
pink_accents
Replacement of patterns in string to simulate speech accents
-
bullet_stream
Bulletproof printing for bullet point text
-
cjc-regex
NFA-based regex engine with no external dependencies
-
aclneko
caitsith policy abstract
-
fuzzytail
A modern, colorful tail replacement with split-pane log monitoring
-
identifier_safety
Unicode confusable character detection and canonicalization
-
tectonic_xetex_format
Tectonic/XeTeX engine data structures and their expression in TeX "format" files
-
ucm-engine
Transformation engine for the Unified Content Model
-
ADA_Standards
help you handle checks on your ADA projects, especially good to build scripts to check coding standards conformity
-
glyphana
Quickly find, inspect & collect unicode glyphs
-
kashida
Insert Kashidas/Tatweel into Arabic text, e.g. for justification purposes.
-
harper-literate-haskell
The language checker for developers
-
mdbook-math
An mdbook preprocessor that converts MathJax LaTeX math blocks to raw LaTeX notation for the LaTeX renderer
-
ttf_word_wrap
Wraps text based on character width
-
rsxiv
Tools for working with arXiv and the arXiv API
-
markless
A terminal markdown viewer with image support
-
commit_crafter
AI powered tool for Git commit message generator
-
curtana
Simplified zero-cost wrapper over llama.cpp powered by llama-cpp-2
-
snips
Keep code snippets in markdown files in sync
-
text-similarity-metrics
A high-performance Rust library for computing text similarity using multiple algorithms
-
logappend
Execute child process, read from stdin and stderr, emit into files, truncate at given total content sizes
-
genpdfi
User-friendly PDF generator written in pure Rust
-
mdpdf
A fast CLI tool to convert Markdown files to PDF
-
go22dos
go to todos
-
upid
Universally Unique Prefixed Lexicographically Sortable Identifier
-
simplematch
Fast wildcard pattern matching for strings and bytes with a simple api
-
regex_generate
Use regular expressions to generate text
-
no-crlf
A CLI tool to convert CRLF line endings to LF in text files
-
on-selected-text
A tiny Rust library that allows you to easily obtain selected text across all platforms (macOS, Windows, Linux)
-
rolldown_utils
General-purpose utilities for Rolldown
-
tool-output-truncate
Truncate tool output (file reads, command runs, search hits) before adding to LLM message history. Char-aware head/middle/tail strategies with a configurable elision marker. Zero deps.
-
finetype-model
Candle-based transformer model for FineType
-
mdbook-alerts
mdBook preprocessor to add GitHub Flavored Markdown's Alerts to your book
-
madoru
markdown task runner
-
dart_edge_core
Shared FFI primitives for Dart Edge native crates
-
grapheme_machine
Grapheme cluster text segmentation (UAX #29) state machine for streaming input
-
pii-vault
Presidio-compatible PII detection, anonymization, and reversible tokenization
-
xarray
version of the XArray with copy-on-write capabilities
-
demes-forward-capi
C API to demes-forward crate
-
rust-persian-tools
Official Rust implementation of Persian Tools
-
gitfluff
Commit message linting tool with presets, custom formats, and cleanup automation
-
reovim-module-snippet
Snippet expansion module for reovim
-
uv-requirements-txt
internal component crate of uv
-
copyit
A cross-platform clipboard tool similar to pbcopy/pbpaste
-
rustik-highlight
Rustik code highlighter
-
hmd-cli
Command-line tooling for Human Markdown documents
-
serpscraper
A CLI tool to fetch and convert search results into Markdown
-
dash-em
Enterprise-Grade Em-Dash Removal Library — SIMD-Accelerated String Processing
-
rstango-top
Small text user-interface for watching activity in the Tango distributed control system
-
ai-translator
AI-based multilingual text translation tool with support for custom prompts
-
rawgrep
Grep at the speed of raw disk
-
inlet_manifold
A general purpose highlighting library
-
blame-rs
Line-by-line authorship tracking for revisioned text
-
ankify
Generate and sync Anki flashcards from your Typst documents
-
tfidf-text-summarizer
extractive text summarization system which uses TF-IDF scores of words present in the text to rank sentences and generate a summary
-
birta
Preview markdown files in the browser with GitHub-style rendering
-
umsc
Uyghur multi-script converter for Arabic, Latin, Yengi, Cyrillic, XJUS, and Uzbek Latin scripts
-
mdbook-findrep
mdBook find / replace preprocessor
-
maddi-recipe
parsing and scaling markdown recipes
-
cfasttext-sys
fastText ffi binding
-
qmd-cli
CLI for qmd - lightweight SOTA local search engine for AI agents
-
pipefog
Stream-structured data obfuscator for JSON/YAML/CSS
-
pelagic
Small command parsing primitives for CLI tools and text interfaces
-
sphinx-rustdocgen
Executable to extract rustdoc comments for Sphinx
-
streamdown-config
Configuration loading and management for streamdown
-
gh-emoji
Convert `:emoji:` to Unicode using GitHub’s emoji names
-
modeling
tools to analyze different languages using Ctags
-
fulgur
HTML/CSS to PDF conversion library
-
nobom
remove UTF-8 BOM (Byte Order Mark) from stdin and write to stdout
-
yara-x-fmt
A code-formatting library for YARA rules
-
miktik
A unified, multi-backend tokenizer library for LLMs
-
ainl-semantic-tagger
Deterministic semantic tagging and normalization for AINL / ArmaraOS agents
-
devek
CLI for copying HTML to clipboard
-
mdbook-plugin-utils
mdBook plugins
-
bhc-lexer
Lexical analysis for BHC
-
serde-llsd-benthic
serializing and de-serializing data in Linden Lab Structured Data format. This format is used by Second Life and Open Simulator
-
combust
AI-driven local pull request workflow where Claude is the only contributor
-
ezemoji
Categorized emojis
-
artifacts-rs
Rust client for Artifacts
-
transmutation
High-performance document conversion engine for AI/LLM embeddings - 27 formats supported
-
seam
Symbolic Expressions As Markup
-
utf16_lit
macro_rules to make utf-16 literals
-
printwell-cli
Command-line tool for HTML to PDF conversion
-
hexdump
Easy hexdump to stdout or as an iterator
-
urlable
A comprehensive URL manipulation library for Rust, providing utilities for parsing, encoding, and manipulating URLs with support for query strings, path manipulation, punycode domains and more
-
like
A SQL like style pattern matching
-
case_insensitive_hashmap
A HashMap that uses case-insensitive strings as keys
-
unescape
Unescapes strings with escape sequences written out as literal characters
-
glk
Bindings for the Glk I/O interface for hosting interactive fiction interpreters
-
mpeg-syntax-dump
Dump data in the style of MPEG specification pseudocode
-
asimov-core
ASIMOV Software Development Kit (SDK) for Rust
-
character_converter
Turn Traditional Chinese script to Simplified Chinese script and vice-versa and tokenize
-
repo-grove
CLI tool for managing a collection of Git repositories
-
cosmic-text
Pure Rust multi-line text handling
-
gaze
small utility library with the goal of making it easier to scan/lex text and collections
-
obfsck
Text obfuscation library for redacting secrets, IPs, emails, users, and hostnames from logs and alerts
-
rosetta-aisp
Bidirectional prose ↔ AISP symbolic notation conversion based on the Rosetta Stone mappings
-
cai
User friendly CLI tool for AI tasks
-
himmelblau_red_asn1
A little library to encode/decode ASN1 DER
-
cloakrs-core
Core PII scanning, recognizer, and masking primitives for cloakrs
-
pukram2html
converting Pukram-formatted text to HTML
-
datadriven
Rewritable table-driven testing
-
satteri-plugin-api
Rust plugin trait, typed visitors, and runner for Sätteri
-
lexrs
Efficient lexicon data structures: Trie and DAWG
-
byteutils
that provides a collection of frequently used utility functions for working with bytes, strings, and vectors. It includes common tasks such as converting between strings and byte arrays…
-
finding
command line finding tool
-
oak-pretty-print
Syntax highlighter supporting multiple programming languages
-
regex-charclass
Manipulate and convert regex character classes
-
nu-command
Nushell's built-in commands
-
profanite-core
Kryptonite for Profanities — lightweight, obfuscation-resistant profanity filter
-
unreal-doc
generating documentation from Unreal C++ sources
-
rustrails-text
Rich text content (ActionText equivalent)
-
unsafe-tools-mimic
Size and alignment matched opaque types
-
rwkv-tokenizer
A fast RWKV Tokenizer
-
connected-papers
client for Connected Papers integrated with Semantic Scholar utilities
-
tree-sitter-stack-graphs-typescript
Stack graphs definition for TypeScript & TSX using tree-sitter-typescript
-
md-formatter
A fast, opinionated Markdown formatter
-
linurgy
Manipulate the output of multiple newlines. Replace/Insert/Append newlines with text. Input and output from stdio/files/buffers
-
liquidwar7core
Liquidwar7 core logic library, low-level things which are game-engine agnostic
-
bk-tree
A Rust BK-tree implementation
-
sentencepiece-rs
Rust runtime reimplementation of SentencePiece model loading, normalization, encoding, and decoding
-
doing-ops
Domain operations for the doing CLI
-
h2m
HTML to Markdown converter
-
reason-shell
Reason: A Shell for Research Papers
-
mecab-ko-dict
Korean morphological dictionary management: binary format, FST search, connection costs
-
falcom-sjis
Falcom-compatible Shift JIS implementation
-
easy-regex
Make long regular expressions like pseudocodes
-
ai_tokenopt
Adaptive token optimization engine for LLM inference pipelines — compresses prompts, conversation history, tool schemas, and output streams to minimize token usage while preserving response quality
-
pspp
Statistical analysis software
-
newdoc
Generate pre-populated module files formatted with AsciiDoc that are used in Red Hat and Fedora documentation
-
tree-sitter-stack-graphs-javascript
Stack graphs definition for JavaScript using tree-sitter-javascript
-
numaelis-rckive-genpdf
User-friendly PDF generator written in pure Rust
-
mdbook-footnote
mdbook preprocessor for footnotes
-
smart-markdown
Parse and render Markdown to ANSI-styled terminal output with live in-place refresh
-
stam-tools
Command-line tools for working with stand-off annotations on text (STAM)
-
mdbook-xref
a preprocessor to add support for easy cross-references in mdbook
-
daemon8-store
SurrealDB storage backend for daemon8 observations
-
utf8_iter
Iterator by char over potentially-invalid UTF-8 in &[u8]
-
arborium-cli
Command-line syntax highlighter powered by arborium
-
fret-text-nav
Text navigation utilities (word/line boundaries, selection movement) for Fret
-
ngram_rs
Facilitate creating ngrams in Rust to be used in the polars plugin
-
tengwar
Transliterate text into J.R.R. Tolkien's Tengwar.
-
ratex-unicode-font
System Unicode font discovery for RaTeX fallback rendering
-
metatron
core library
-
santoka
Translations of 668 of Taneda Santoka's free-verse haiku
-
goose-eggs
in writing Goose load tests
-
mdriver
Streaming markdown printer for the terminal with syntax highlighting
-
gramdex
k-gram / trigram indexing primitives for approximate string matching
-
snailquote
Escape and unescape strings with shell-inspired quoting
-
spellabet
Convert characters into spelling alphabet code words
-
mdbook-slides
An mdbook preprocessor that renders slide presentations from markdown
-
pdf-cos
PDF COS (Carousel Object Structure) parser -- edgeparse fork of lopdf 0.39.0 with f64 Real precision and ICC color support
-
litedoc-cli
Command-line tool for parsing and validating LiteDoc documents
-
rust_readability
A package to assess the complexity of texts using a variety of readability formulas
-
treebender
An HDPSG inspired symbolic NLP library for Rust
-
riimut
Transform latin letters to runes & vice versa
-
merge3
merge tool for three-way merges
-
bash-builtins
implement loadable builtins for bash
-
mdbook-cat-prep
a preprocessor for mdbook which provides teacher, subject, material and tag functionality
-
bitutils2
A package of tools for bit manipulations, including bit indexing, bitfields, and a variation of regular expressions for binary data
-
streamdown
A streaming markdown renderer for modern terminals (Rust port of Streamdown)
-
ultra-nlp
A NLP library
-
kreuzberg-paddle-ocr
PaddleOCR via ONNX Runtime for Kreuzberg - high-performance text recognition
-
legalis-eu
European Union jurisdiction support for Legalis-RS (GDPR, Consumer Rights, Competition, Treaties)
-
winload
Network Load Monitor — nload-like TUI tool for Windows/Linux/macOS
-
codabase
Polyglot development tool for markdown-defined data types
-
jawk
JSON AWK
-
everruns-a2ui
A2UI component catalog and prompt generator for Everruns
-
rlm-rs
Reward Language Model (RLM) verifier: weighted signals, grading, and reports
-
unicode-ccc
Unicode Canonical Combining Class detection
-
herring-automata
Automata construction for Herring
-
owl-write
A TUI for managing your writing
-
boreal-cli
CLI utility to run boreal, a YARA rules engine
-
autonomo-arabic-reshaper
Arabic text shaper + visual RTL reverser tailored for game modding (RimWorld, Unity LTR UIs). Handles tags, escapes, entities, and common UI artifacts.
-
clipboard-stream
Async stream of clipboard change events
-
cascii-core-view
Core frame display and animation library for ASCII art viewers
-
libreadability
Rust port of go-readability — extract readable content from HTML
-
mdlynx
Small, fast utility to find broken file links in Markdown documents
-
mdbook-last-changed
mdbook preprocessor to add the last modification date per page
-
gilt
Fast, beautiful terminal formatting for Rust — styles, tables, trees, syntax highlighting, progress bars, markdown
-
chess-notation-parser
Algebraic chess notation parser
-
pdf_tables
Scrape text from tables in PDF files
-
kathoey
text feminization using open corpus linguistics data
-
aufbau
Generalized prefix parsing for a class of context-dependent languages
-
grift_unicode
Unicode character operations for the Grift Scheme language
-
ik-rs
chinese segment, ik-analyzer for rust
-
aho-corasick
Fast multiple substring searching
-
asciidork-eval
Asciidork eval
-
wg-ragsmith
Semantic chunking and RAG utilities for document processing and retrieval-augmented generation
-
lister-cli
Lister: Navigate Markdown Lists
-
simdutf
Unicode validation and transcoding at billions of characters per second
-
bibleref
Structures and functions for managing Bible references
-
liteparse-pdfium
Safe Rust wrapper around PDFium for liteparse
-
pandoc_types
Rust port of pandoc-types
-
ferret
A trigram-based tool for detecting similarity in groups of text documents or program code
-
llm-message-hash
Stable canonical hash of LLM request/message structures. Recursive key-sorting JSON canonicalization + sha256, with per-provider ignore-lists so semantically-equal Anthropic/OpenAI/Bedrock requests produce the same hash…
-
uapi-version
Compare versions according to the UAPI Version Format Specification
-
ebg
Eric's Blog Generator, a simple static site generator
-
ncp-matcher
plug and play high performance fuzzy matcher
-
lo_impress
Presentation slide deck builder with ODP export
-
forbidden-bands
8-bit string handling library
-
privacy-filter-rs
OpenAI Privacy Filter — PII detection inference in pure Rust with Burn ML
-
rustkorean
processing Korean characters. It provides functionalities to check if a character is Korean, classify Korean characters, verify if a character is a leading consonant (choseong), a medial vowel (jungseong)…
-
mongodb-gridfs
Mongo GridFS
-
proper-sort
Small crate for natural sorting of strings that include number and size data
-
voice-g2p
Grapheme-to-phoneme conversion: misaki dictionary + espeak-ng fallback
-
mdbook-najan
Preprocessor for the Najan mdBook
-
jc-adf
Pure markdown <-> Atlassian Document Format (ADF) converter. Lossless via an `adf:<type>` fenced-block escape hatch.
-
aurora-semantic
Local embedded semantic search engine for source code, designed for IDE integration
-
ingest-api
Ingestion and verification service logic for immutable trace records
-
xerg
Ultra-fast grep implementation in Rust - built for maximum speed with direct output and parallel processing
-
natord-plus-plus
Natural ordering for Rust
-
yuuang-test-napi
N-API bindings
-
libappindicator-zbus
zbus implementation for libappindicator
-
eddie
Fast and well-tested implementations of edit distance/string similarity metrics: Levenshtein, Damerau-Levenshtein, Hamming, Jaro, and Jaro-Winkler
-
automata-like-programming
that provides mechanisms for controlling the flow of execution in imitation of an automaton
-
agentai
designed to simplify the creation of AI agents
-
ninmu-tools
Built-in tool implementations for the Ninmu Code ecosystem
-
escaping
configurable string escaping and unescaping
-
tcalc-rustyline
A fork of Rustyline for use specifically with tcalc
-
ratel-ai-core
Tool retrieval and ranking for AI agents — BM25 over tool catalogs. Core of the Ratel context engineering platform.
-
docloom
Programmatically compose documents and render them to Markdown or styled terminal output
-
emojicon
Find Emoji by using Emoticons and GitHub's, Bengali emoji names
-
sci-fmt
Format values with PDG-style uncertainty notation
-
lex-extension
Public surface for Lex extensions: handler trait, wire types, schema types
-
obfuskey
Cross-language compatible integer obfuscation and bit-packing library
-
safe-string
safe interface for interacting with multi-byte strings in Rust, namely IndexedStr, IndexedString, and IndexedSlice
-
slack-blocks-render
Slack blocks render is a Rust library to render Slack blocks as Markdown
-
spanned
string processing with file/line/col information and the regular rust `str` API
-
alass-util
convenience API for subtitle synchronization with alass-core
-
mdbook-presentation-preprocessor
A preprocessor for utilizing an MDBook as slides for a presentation
-
rspack_plugin_case_sensitive
rspack case sensitive plugin
-
shore-tui
Terminal UI client for the Silvershore chat daemon
-
agentfit
Fit messages to an LLM context window. Token-aware truncation with pluggable tokenizers and multiple strategies.
-
css_recess_order
Recess-based sort order for CSS properties
-
elizaos-plugin-pdf
elizaOS PDF Plugin - PDF reading and text extraction
-
unicode-width-16
Determine displayed width of `char` and `str` types according to Unicode Standard Annex #11 rules
-
cfd16-assembler
An assembler backend for the CFD-16 ISA
-
subslay
Text → emoji 💅🏻 Powered by Rust
-
md-to-incodoc
Convert markdown to incodoc
-
lex-core
Parser library for the lex format
-
readable-rs
A native Rust port of Mozilla's Readability algorithm for extracting readable content from HTML pages
-
rawk-core
Core library for an AWK interpreter with the goal to be POSIX compatible
-
sonai_metrics
Text metrics for sonai
-
oyster-md
Static site generator for Markdown with bidirectional links and HTML rendering
-
runsible-doc
1-for-1 Rust/TOML reimagining of the corresponding Ansible tool
-
retrogress
Progress bars with a thin API around complex features
-
typos-cli
Source Code Spelling Correction
-
typing_engine
A typing game engine for Japanese and English
-
opentalk-types-signaling-legal-vote
Signaling types for the OpenTalk legal vote module
-
coverio
Better code coverage reporting for Rust crates
-
mdi
markdown include
-
mago-text-edit
A text editing library for Mago
-
fastui-cosmic
Pure Rust multi-line text handling
-
eco
reasoning about breaking changes in Rust ecosystems
-
onig_sys
`onig_sys` crate contains raw rust bindings to the oniguruma library. This crate exposes a set of unsafe functions which can then be used by other crates to create safe wrappers around Oniguruma…
-
simple-regex
😎 Simple and readable way of writing regular expressions
-
unicode-canonical-combining-class
Fast lookup of the Canonical Combining Class property
-
mnem-rerank-providers
Cross-encoder reranker adapters for mnem (Cohere, Voyage, Jina). Sync, TLS-via-rustls, tokio-free.
-
pukram-formatting
A type to represent the formatting of the pukram markup language
-
oak-highlight
A lightweight syntax highlighter for Rust with support for multiple programming languages and customizable themes
-
markov
A generic markov chain implementation in Rust
-
osc66
CLI that wraps text in kitty text-sizing-protocol escape codes using harfbuzz for accurate glyph shaping and width calculation
-
gxter
A parsing library for creating and reading GTA 3/VC/SA GXT (text string) files
-
fonts
High-performance font parsing and analysis library for Grida Canvas
-
sesters
💱 Fast, offline currency converter 💴 💷 💶 💵
-
aistack
Functional text-to-function AI utilities
-
cvkg-runic-text
Natively integrated Cyber Viking text shaping and layout engine for CVKG
-
ebook
A CLI tool for reading, writing, and operating on various ebook formats
-
srx
A mostly compliant Rust implementation of the Segmentation Rules eXchange (SRX) 2.0 standard for text segmentation
-
duca
Search and read Dante's Divine Comedy from your terminal
-
mkulid
A command-line ULID generator — like uuidgen, but for ULIDs
-
slack-markdown-converter
converting standard Markdown to Slack mrkdwn format
-
facilguide
Multilingual tech guide utilities. Guides in EN, ES, FR, PT, IT.
-
grapheme
Abstractions for working with extended Unicode grapheme clusters
-
xgrammar-rs
Efficient, Flexible and Portable Structured Generation for Rust - Rust bindings for XGrammar
-
zeitgrep
Find frecent results in git repositories using regular expressions
-
ecl-adapter-slack
Slack stub source adapter for the ECL pipeline runner (validation)
-
term_grid
formatting strings into a grid layout
-
pangu
Paranoid text spacing for good readability, to automatically insert whitespace between CJK (Chinese, Japanese, Korean) and half-width characters (alphabetical letters, numerical digits and symbols)
-
satteri-mdast
Arena-allocated MDAST nodes with zero-copy references and binary buffer format for Sätteri
-
hebrew_unicode_script
A low-level library designed to ascertain whether a character belongs to the Hebrew Unicode script. It supports checks for individual characters as well as for membership within collections
-
zz-data
Data structures for Zanzarah apis
-
flxy
Full-text searching and scoring of strings
-
okh-scraper
A scraper of Open Source Hardware (OSH) projects, based on the Open Know-How (OKH) standard
-
afrim-translator
Manage the prediction system of the afrim input method
-
zen-expression
Zen Expression Language
-
trpl
A support crate for The Rust Programming Language book
-
tpt
Pure Rust implementation of the Unix concatenate (cat), word-count (wc) and echo commands
-
retrofont-cli
CLI for retrofont: Render and convert retro ASCII/ANSI art fonts
-
charx
A replacement for char::is_ascii*
-
basalt-core
core functionality for Basalt TUI application
-
univert
Universal file converter (library and CLI)
-
az-pinyin
Chinese pinyin utilities: hanzi-to-pinyin conversion, initial letter extraction, and identifier sanitization
-
kdl-xml
XML<->KDL conversion
-
lindera-ko-dic
A Korean morphological dictionary for ko-dic
-
ratex-pdf
PDF export for RaTeX DisplayList using pdf-writer
-
zpl_toolchain_core
Core parser, emitter, and validator for ZPL II label code (part of the zpl-toolchain project)
-
yozuk
Chatbot for Programmers
-
ethan-rs-wc
ethan-rs-wc (erwc) counts words, lines, characters, and bytes. Like the wc command, but more accurate and faster. Text can also be read from standard input.
-
did-toolkit
spec-compliant implementation of did-core, W3C's 'Decentralized Identity Documents'
-
zsh-metafied
Zsh bytes metafying and unmetafying utilities
-
vader_sentiment
Bindings for Rust from the original Python VaderSentiment analysis tool
-
philiprehberger-slug
Unicode-aware slug generation for URL-safe strings
-
utf64
encode utf-8 strings into utf-64, and decode them back
-
twitter_text_config
Configuration for twitter-text in Rust
-
anthropic-text-editor
A micro-CLI to apply tool calls from Anthropic for their text_editor_20250124 built-in computer use tool
-
ctags-update
Incrementally update a tags file with symbols from specific source files
-
rstring
A comprehensive set of string manipulation utilities inspired by Apache Commons Lang3 StringUtils
-
puniyu_event
puniyu event type library: a unified event model for messages, notifications, and requests
-
zine
opinionated tool to build your own magazine
-
nysiis
A fast NYSIIS (New York State Identification and Intelligence System) phonetic encoding library
-
ilyvion-util
Collection of utility functions and types for use in my personal projects
-
fierros-rag
RAG, retrieval, and connector primitives for Fierros
-
vtashkov-bf
Brainfuck interpreter
-
aki-gsub
substitute text command, replace via regex
-
pomsky-bin
Compile pomsky expressions, a new regular expression language
-
see-cat
A cute cat(1)
-
reggy
friendly, resumable regular expressions for text analytics
-
crowbook-text-processing
some utility functions for escaping text (HTML/LaTeX) and formatting it according to typographic rules (smart quotes, ellipsis, French typographic rules)
-
name-variants
Multilingual name romanization lookup tables: Chinese, Japanese, Korean, Arabic, Vietnamese, Indian, Persian, Hebrew, Thai, Greek, Turkish, Russian, Indonesian/Malay
-
tetratto-shared
Shared stuff for Tetratto
-
gitbook2text
A CLI tool to download GitBook pages and convert them to markdown and text
-
tantivy-stemmers
A collection of Tantivy stemmer tokenizers
-
ezstr
A String wrapper supporting negative indexing with grapheme indexing for slices and Regex::find_iter and Regex::find
-
textgridde-rs
dealing with Praat TextGrid files. MIT licensed.
-
tdoc
assorted CLI tools for working with FTML (Formatted Text Markup Language) documents
-
mdka-cli
CLI executable for mdka – a HTML to Markdown converter
-
dwg-core
Deterministic Writing Guard core analysis engine for spotting AI-styled prose
-
ids-apis
IDS APIs in Rust
-
webdog
static site generator fit for a dog
-
cjrh-moreutils-ts
Rust implementations of the moreutils tools
-
rust_iso15924
ISO 15924, Codes for the representation of names of scripts, is an international standard defining codes for writing systems or scripts (a "set of graphic characters used for the written form of one or more languages")…
-
qrcode2pdf
Render barcodes (QR Codes, Aztec, Data Matrix, etc) using rxing into a krilla Surface (PDF)
-
anno-cli
CLI for anno: extract entities, coreference chains, relations, and PII from text, HTML, and URLs
-
mdbook-numeq
An mdbook preprocessor for automatically numbering centered equations
-
jsrmx
command-line tool to manipulate JSON files. It can split large single-object JSON files into many files; merge multiple JSON files into one large JSON file; bundle multiple JSON files into one NDJSON file…
-
mdbook-replace
mdBook preprocessor that simply replaces text
-
ygrep-core
Core library for ygrep – fast, local, indexed code search
-
inputx-wubi
Self-developed Wubi 86 encoder, dictionary, and dataset (PHF + FST, WASM-ready). Powers the Inputx IME.
-
semantic-query
AI-powered schema validation with automatic JSON generation for type-safe responses
-
piper-phoneme-streaming
A high-performance Rust library for streaming Text-to-Phoneme (G2P) conversion
-
rrename
Opinionated tool to rename files in batch. Match regular expression, replace some characters I consider noise to kebab case
-
gazetta-render-ext
A static site generator framework. Extra render code.
-
shabdakosh
Pronunciation dictionary with ARPABET/CMUdict support for svara phonemes
-
gaze-assembly
Policy-to-pipeline assembly for Gaze
-
mdbook-quiz-schema
Schema for quizzes used in mdbook-quiz
-
reformat-plugins
Plugin system for reformat transformers
-
deagle-parse
Tree-sitter based multi-language code parser for deagle
-
text-to-ascii-art
program to convert text to ASCII art
-
suricata
components
-
rslug
fast, and configurable library to create URL-friendly slugs from strings
-
filecheck
writing tests for utilities that read text files and produce text output
-
transbot
translation robot that translates HTMLs/EPUBs/MarkDowns/TEXTs based on LLMs
-
rspdfoverlay
Overlay data on pdf files
-
sniffer-rs
that simplifies fuzzy string matching in rust
-
text-parsing
Hierarchical text processing preserving char position info
-
streampager
pager for command output or large files
-
fmd
Find Markdown files by metadata - Search by tags, frontmatter, and custom fields
-
rstype
Rust based typing trainer
-
memvid-ask-model
LLM inference module for Memvid Q&A with local and cloud model support
-
animated-emojis-rs
Noto Animated emojis
-
merlion-memory
Persistent markdown memory store for Merlion Agent
-
tyml_source
TYML: type checker for markup language
-
symbi-invis-strip
Strip invisible / steganographic Unicode code points from strings before they reach a knowledge store, a journal, or a prompt
-
pdfium-auto
Auto-download and cache PDFium binaries — zero-friction setup for pdfium-render
-
biblicist
working with Bible data
-
facet-singularize
Fast, no-regex English singularization for the facet ecosystem
-
homeboy
CLI for multi-component deployment and development workflow automation
-
lemmeknow
Identify any mysterious text or analyze strings from a file
-
mig-assembly
MIG-guided EDIFACT tree assembly — parse RawSegments into typed MIG trees
-
bstr
A string type that is not required to be valid UTF-8
-
axiomme-core
Core data-processing engine for AxiomMe local retrieval runtime
-
ohos-ime
Bindings to the inputmethod API of OpenHarmony
-
mago-fixer
Applies automated fixes and transformations to text
-
topiary-queries
tree-sitter query files compatible with Topiary
-
mdpack
Pack codebases into Markdown bundles and expand them back into files
-
dedoc
Terminal-based viewer for DevDocs documentation
-
sanitize-pii
Detect and mask personally identifiable information (PII) in strings
-
paltoquet
rule-based general-purpose tokenizers
-
axiomme-mobile-ffi
Mobile FFI boundary for AxiomMe core runtime
-
aex-scanner
Content inspection pipeline for Agent Exchange Protocol (AEX): size, magic-bytes, EICAR, regex prompt-injection
-
ascii-fmt
CLI tool to fix and align ASCII diagrams generated by AI agents
-
timeblok
A language for event scheduling in plain text
-
ranked-searcher
Search inside text files using tf-idf formula, showing the most relevant search at the top
-
rust-ai
A collection of 3rd-party AI APIs for Rust
-
puniyu_message
puniyu message chain wrapper library, providing the Message type and the message! builder macro
-
hunspell-rs
Rust bindings to the Hunspell library
-
markovify-rs
A fast, extensible Rust implementation of a Markov chain text generator, inspired by markovify
-
regex-specificity
A heuristic-based crate to calculate the specificity of a regular expression pattern against a specific string
-
office_oxide_cli
CLI for office-oxide — the fastest Office document toolkit. Extract text, convert to markdown, dump IR, and inspect DOCX/XLSX/PPTX/DOC/XLS/PPT files.
-
ramit_mygrep
A small grep-style command-line search tool and library
-
lazy-grep
A high-performance, line-oriented command-line tool for searching text with regular expressions
-
mdbook-tocjs
An mdbook preprocessor which adds extra JS and CSS files for ToC hydration
-
mdbook-image-attrs
An mdbook preprocessor for adding attributes to images
-
texto
CLI made for generating dummy text
-
rgx-cli
A terminal regex tester with real-time matching, multi-engine support, and plain-English explanations
-
large-text-core
Core library for handling large text files search and replace efficiently
-
mdqy
jq for markdown: query and transform Markdown with a hybrid selector and jq DSL
-
jmdict-fast
Blazing-fast Japanese dictionary engine with FST-based indexing
-
enum-ts
TypeScript Enum pattern matcher codegen
-
scan-rules
some macros for quickly parsing values out of text. Roughly speaking, it does the inverse of the print!/format! macros; or, in other words, a similar job to scanf from C.
-
mullama
Comprehensive Rust bindings for llama.cpp with memory-safe API and advanced features
-
frawk
an efficient Awk-like language
-
tetratto-markdown
Markdown rendering for Tetratto
-
betacode
conversion
-
llm-coding-tools-rig
Lightweight, high-performance Rig framework Tool implementations for coding tools
-
hybrid-match
Hybrid string similarity
-
tokmat
Standalone high-performance Canadian address parsing engine core
-
mq-check
Type checker for mq
-
codump
A straightforward and flexible code/comment dump tool
-
palpad
A really simple static site generator
-
rsdex_bin
a little tool that behaves as a pokedex
-
textwrap-macros
procedural macros to use textwrap utilities at compile time
-
imperative
Check for imperative mood in text
-
lethe-core-rust
High-performance hybrid retrieval engine combining BM25 lexical search with vector similarity using z-score fusion. Features hero configuration for optimal parity with splade baseline…
-
swc_ecma_regexp_ast
AST definitions of ECMAScript regular expressions
-
opencc-fmmseg
High-performance Chinese conversion library (Simplified ↔ Traditional) using OpenCC lexicons and FMM segmentation — no runtime I/O, cross-platform, and production-ready
-
futf
Handling fragments of UTF-8
-
codefmt
a markdown code block formatter
-
fop-layout
Layout engine for Apache FOP Rust implementation
-
sinstr
A single WORD small string optimization library
-
lindera-dictionary
A morphological dictionary library
-
armnod
random string generator
-
docsite-to-md
A robust Rust CLI and library for exporting documentation sites to Markdown
-
float-pretty-print
Format f64 for showing to user, not for serialisation
-
trafilatura
Extract readable content, comments, and metadata from web pages
-
use-case
Composable string casing primitives for RustUse
-
tlict
A language analysis and compilation tool for constructing and analyzing domain-specific languages
-
lingua-french-language-model
The French language model for Lingua, an accurate natural language detection library
-
panproto-expr-parser
Haskell-style surface syntax parser for panproto expressions
-
web2llm-cli
CLI for fetching web pages as clean Markdown with web2llm
-
mlrs-core
Core RLM engine — recursive LLM inference via Rhai scripting
-
typoglycemia
function to convert text to typoglycemic format with a Leet-speak variant. The function takes a string as input and returns a new string where the first and last letters of each word are unchanged…
-
orgflow
managing documents with support for tasks and notes
-
asimov-serpapi-module
ASIMOV module for data import powered by the SerpApi search data platform
-
glum
A reading-focused terminal markdown viewer
-
reqmd_cli
CLI tool for reqmd
-
maybe-valid
Traits and outcome enums for structural validation/refinement conversions
-
googleapis-tonic-google-cloud-datalabeling-v1beta1
A Google APIs client library generated by tonic-build
-
atuin-nucleo-matcher
plug and play high performance fuzzy matcher
-
csmlinterpreter
The CSML (Conversational Standard Meta Language) is a Domain-Specific Language developed for creating conversational experiences easily
-
cloudiful-redactor
Structured text redaction with reversible sessions for secrets, domains, URLs, and related sensitive values
-
mnem-embed-providers
Embedding-provider adapters for mnem (OpenAI, Ollama). Sync, TLS-via-rustls, tokio-free.
-
atog
ascii to greek - prints greek letters given latin alphabets as input
-
ascii-img2-cli
ASCII image generation CLI
-
unicode_reader
Adaptors which wrap byte-oriented readers and yield the UTF-8 data as Unicode code points or grapheme clusters
-
rust-texas
generate latex documents
-
typub-html
HTML processing utilities for typub (AST types, parsing, serialization, SVG handling, link resolution)
-
fast_symspell
Spelling correction & Fuzzy search
-
tarzi
Rust-native lite search for AI applications
-
kaiba
domain library - Core types and interfaces for AI persona system
-
lexa-mcp
rmcp stdio MCP server for the Lexa hybrid retrieval engine. Exposes search_files, index_path, list_indexed_paths, purge_path, and status to any MCP client (Codex, Claude Desktop…
-
custard
A frontmatter-querying server
-
cloc
Count, or compute differences of, lines of source code and comments
-
servo-base
A component of the servo web-engine
-
moeix
Sub-millisecond code search via sparse trigram indexing
-
vmks-exam-generator
CLI program for pseudo-randomly generating different variants of an embedded programming exam
-
latex-thebib
Clean and sort legacy TeX bibliographies written using ‘thebibliography’ via the refactor sub-command. Compile BibTeX files to legacy thebibliography TeX code using the compile sub-command…
-
mdwright
Command-line delivery for mdwright
-
routers_realtime
A Demonstration for Real-Time Map Matching
-
skp-validator-rules
Built-in validation rules for skp-validator
-
artificial-prompt
Fluent builders and helpers for composing markdown prompt fragments
-
bpetok
CLI for tokenizing text input using Byte Pair Encoding (BPE)
-
linebreak
breaking a given text into lines within a specified width
-
mdbook-lint-rulesets
Modular rulesets for mdbook-lint - standard and mdBook-specific linting rules
-
spdfdiff_types
Shared data model, diagnostics, provenance, and limits for semantic PDF diff tools
-
haqumei
Japanese Grapheme-to-Phoneme (G2P) library implemented in Rust
-
bible
A beautiful TUI Bible reader with on-demand translation downloads
-
lexical-sort
Sort Unicode strings lexically
-
hebrew_unicode_utils
Some functions for processing Hebrew unicode characters
-
gaze-pii
Reversible PII pseudonymization runtime for agentic workflows
-
bfom-lib
Brendan's Flavor of Markdown: I'll build my own markdown format, what could go wrong?
-
legalis-in
India jurisdiction support for Legalis-RS - comprehensive modeling of Indian law
-
fierros-evals
Evaluation, reporting, and production-gate primitives for Fierros
-
dnd-character
A Dungeons and Dragons character generator
-
redact-ner
Named Entity Recognition for PII detection using ONNX Runtime
-
quot
A fast and flexible command-line tool that converts text input into escaped string literals
-
undoc-cli
CLI for undoc - Microsoft Office document extraction
-
difference-rs
text diffing and assertion library
-
typst-count
Count words and characters in Typst documents
-
globset
Cross platform single glob and glob set matching. Glob set matching is the process of matching one or more glob patterns against a single candidate path simultaneously, and returning all of the globs that matched.
-
convert_case_extras
Extra features for convert_case
-
mdbookkit
Support library for mdBook preprocessors in the mdbookkit project
-
chonkie
🦛 Chonkie, now in Rust 🦀: No-nonsense, ultra-fast, ultra-light chunking library
-
hy-mt
A lightweight machine translation inference library for Tencent Hunyuan MT models
-
marker-typ
Generate markdown documentation from typst doc comments
-
shannon-nu-explore
Nushell table pager
-
sqlitepipe
piping the output of a command into sqlite databases
-
blinc_noto_emoji
Drop-in NotoColorEmoji fallback for blinc_text. Add to your Cargo.toml and the bundled subset auto-registers at binary init via a #[ctor] function, with no source changes required…
-
yy1
Tiny utility to convert KiCad centroid files into Neoden YY1 pick and place machine format
-
docx_mcp_rust
A Rust-based MCP (Model Context Protocol) server for creating and manipulating DOCX files
-
mnem-extract
Statistical, embedding-based entity + relation extraction for mnem (KeyBERT-style, co-occurrence PMI). Sync, no LLM, no network.
-
enya-search
Full-text search index for Enya codebase (metrics, alerts, commits)
-
code-to-pdf
Generates a syntax-highlighted PDF of your source code
-
kl-hyphenate
Knuth-Liang hyphenation for a variety of languages
-
rustdoc-md
Convert Rust documentation JSON into clean, organized Markdown files
-
vimspell
Native Rust library for spellchecking based on vimspell database and algorithm
-
strs_tools
Tools to manipulate strings
-
index-readability
Main-content extraction prototype for Index
-
wordcutw
A C-interface wrapper for Wordcut - a Lao/Thai word segmentation/breaking library
-
neco-editor-viewport
Viewport geometry calculations for editor rendering
-
stylish-style
Internal implementation details of stylish-core
-
mono
Mono repository automation toolkit
-
rpdfium-text
Text extraction for rpdfium
-
fabryk-mcp-content
Content and source MCP tools for Fabryk (ContentItemProvider, SourceProvider)
-
yeslogic-ucd-generate
A program for generating packed representations of the Unicode character database that can be efficiently searched with support for additional tables
-
awful_book_sanitizer
CLI to clean up OCR-mangled book excerpts into readable text using OpenAI-compatible APIs
-
mecab-furigana-rs
MeCab-based furigana and romaji annotation for Japanese text — no Python, no kakasi
-
journey-cli
A CLI-based journal application with automatic timestamping, vault management, and Obsidian integration
-
pdfgen
PDF rendering library
-
telegram-escape
Escape text for Telegram's MarkdownV2 format
-
tiktag
CLI for multilingual text anonymization with a built-in ONNX NER model
-
cn-font-split
A revolutionary font subsetter that supports CJK and any characters. Supports multithreaded splitting of otf, ttf, and woff2 fonts, with fine-grained control over package size…
-
csvpretty
A command-line tool that formats CSV input into tables with Unicode box-drawing characters
-
kiroku-tui
terminal-based personal journaling and note-taking tool
-
simd-utf16-len
SIMD-accelerated UTF-16 length calculation from UTF-8 strings
-
mtf
Markdown Table Formatter
-
rust-regex-dsl-creator
Regular expression DSL derive macros
-
ai-stringprep
A no_std fork of stringprep
-
gremlh
A CLI tool to find and fix invisible 'gremlin' characters (homoglyphs, zero-width spaces, Bidi overrides) in source code
-
transmd
Convert any English PDF document into Chinese Markdown
-
skanda_engine
A zero-dependency, ultra-high-performance retrieval engine designed for the next generation of RAG
-
gut-cli
A tiny goose that roasts your git typo
-
markdown2json
Reads a markdown file or directory of markdown documents and emits a structured JSON
-
apple-notes-exporter
CLI tool for exporting Apple Notes to Markdown
-
whatwg-infra
Tiny Rust-based implementation of the WHATWG Infra Standard
-
sibylline-clean
Prompt injection detection primitives
-
vidyut-lipi
A Sanskrit transliterator
-
opendev-runtime
Runtime services: approval rules, cost tracking, interrupt token, plan management, error handling
-
citum-server
Citum JSON-RPC server for citation and bibliography processing
-
matchy-extractor
Fast extraction of IPs, domains, emails, hashes from text (internal)
-
libretranslate
A wrapper for the LibreTranslate web API
-
okane-golden
supporting Golden Testing
-
content-canonical
Content canonicalization and text normalization library
-
mdbook-trunk
mdBook plugin which bundles packages using Trunk and includes them as iframes
-
yaml2toml
1-for-1 Rust/TOML reimagining of the corresponding Ansible tool
-
allsorts-azul
Azul’s fork of the allsorts font parser / shaping engine / subsetter. Adds pixel-snap hinting fixes + assorted bug fixes to YesLogic’s upstream. Intended to be upstreamed; use the official allsorts crate if you can.
-
loran-render
Loran — Markdown to terminal renderer (text and TUI)
-
unicode-charname
functions for retrieving Unicode character name properties as described in Unicode Standard Annex #44
-
ttlint
Small, fast utility to lint text
-
parfill
Alias for parfit — paragraph fit, a codebase-aware comment reflow tool. Installs a parfill binary with identical behaviour. See https://github.com/caldempsey/parfit.
-
mdbook-git
Insert git commit files and diffs into mdbook
-
xee-ir
Xee intermediate representation and compilation to bytecode
-
pdf-min
Very minimal crate for writing PDFs
-
matchkit
Vocabulary types for multi-pattern matching — Match struct, Matcher trait, shared errors
-
graphify-ingest
URL fetching and content ingestion for graphify
-
rustyink
Blazing fast static site generator
-
md-word-count
counting words in Markdown text. The intent is to match the behavior of LibreOffice and Microsoft Word.
-
pdf_to_text
PDF to text
-
uresamp
HIFI uresamp delivers ultrasonic-fidelity text resampling via adaptive 64-bit floating-point spectral mapping, preserving Unicode 32-bit codepoint integrity with zero-phase distortion
-
mdiew
A lightweight macOS markdown viewer with live reload
-
markdown-translator
A translation library with DeepLX API integration, rate limiting, and smart text chunking
-
artificial-openai
OpenAI backend adapter for the Artificial prompt-engineering SDK
-
aki-mcolor
mark up text with color
-
fortune-rs
classic BSD fortune program
-
pdf_semantic
Semantic PDF layout model construction for document comparison and diffing
-
hlight
dedicated to delivering exceptional syntax highlighting capabilities
-
table-grep
A grep-like tool for searching CSV and Parquet table files
-
jq-rs
Run jq programs to extract data from json strings
-
normalize-context
Frontmatter-filtered context resolution for normalize: hierarchical .normalize/context/ walk with YAML frontmatter matching
-
nova-cite
Smart citation management with CrossRef/Zotero integration
-
fret-render-text
Renderer-owned Parley text shaping and wrapping utilities for Fret
-
autotex
Continuously compile TeX and LaTeX
-
opendev-repl
REPL loop and command handling for OpenDev
-
engish
A language utility for sampling and building words
-
colgrep
Semantic code search powered by ColBERT
-
chordsketch-render-ireal
iReal Pro chart renderer (SVG skeleton)
-
gram-data
Unified gram CLI and library for validating gram notation
-
kiri-engine
Core Rust engine for Kiri Japanese morphological analyzer
-
img2epub
Convert images to EPUB
-
chatlog
Extract and save agent chat logs (Claude, Codex, Gemini CLI) as local Markdown files
-
plato-kernel
Plato Kernel - Event sourcing + Constraint-Theory + Git runtime
-
clima
A minimal Markdown reader in the terminal
-
json_keyquotes_convert
convert JSON from and to JSON without key-quotes
-
text-editing
string with utilities for editing
-
mdpdf-cli
Markdown to styled PDF — syntax-highlighted, themed, dense or normal layout
-
ucfirst
Uppercase the first letter of a string
-
emoji
Every emoji, their metadata, and localized annotations
-
artificial-types
Reusable prompt fragments and helper types for the Artificial prompt-engineering SDK
-
fileslug
Filename-aware slug generator — slugifies file names (preserving extensions, dotfiles, version numbers) and arbitrary text
-
beautiful-md
A CLI tool to format and beautify Markdown files
-
mdbook-langtabs
An mdbook preprocessor that adds language tabs for code blocks
-
mdbook-linkcheck
A backend for mdbook which will check your links for you
-
lo_lok
LibreOfficeKit-like in-process runtime: Office handle, document handles, command dispatch and tile rendering
-
unimorph-cli
Command-line interface for UniMorph morphological data
-
floem-editor-core
The core of the floem text editor
-
pdf-redact
GDPR-compliant PDF redaction: permanent content removal
-
token-count
Count tokens for LLM models using exact tokenization
-
jpreprocess-njd
Japanese text preprocessor for Text-to-Speech application (OpenJTalk rewrite in rust language)
-
udpipe-rs
Rust bindings for UDPipe - a trainable pipeline for tokenization, tagging, lemmatization and dependency parsing of CoNLL-U files
-
ticker-sniffer
extracting multiple stock ticker symbols from a text document
-
mdbook-renderer
assist implementing an mdBook renderer
-
markast
Rust-powered markdown to HTML renderer with customizable styles
-
mdbook-typst-pdf
mdbook typst pdf backend
-
skim-common
Fuzzy Finder in rust!
-
enma
serving anime and manga information 📦
-
cin
that simplifies command-line input in Rust, especially mimicking C++-style input
-
convergio-reports
Convergio Think Tank — professional research report generation service
-
skill-tree
generate graphviz files to show roadmaps
-
luau-lexer
A lexer for the luau language
-
matcher
UCFP matching layer for semantic and perceptual search over indexed fingerprints
-
bookforge-store
SQLite checkpoint and job store for BookForge
-
htmls
parsing HTML and extracting HTML elements or text
-
ucd-general-category-ranges
Unicode character ranges by general category
-
collie-search
Index-backed code search. Faster than grep on large repos.
-
include-preprocessor
Tooling for C preprocessor style include directives
-
odtgen
Flat ODT writer
-
typub-config
Configuration types for typub
-
graphify-hooks
Git hook integration for graphify
-
tgrep
Toy grep that honors .gitignore
-
re-x
AI-native regex CLI — Test, validate, explain. Built for coding agents.
-
obsidian-cli-inspector
Local-first CLI/TUI for indexing and querying Obsidian vaults
-
mdfrier
A markdown parser that produces styled terminal lines
-
mudssky_utils
A comprehensive Rust utility library providing common functionality for everyday programming tasks
-
dm-index
Index and changelog generator for documentation trees
-
segtok
Sentence segmentation and word tokenization tools
-
genpdf-json
PDF generator using JSON data
-
philiprehberger-mask
Data masking and redaction for strings, emails, and sensitive data
-
markdown-to-ansi
Render Markdown as ANSI-formatted terminal text
-
untangle
Module-level dependency graph analyzer for Python, Ruby, Go, and Rust
-
mdbook-figure
a preprocessor that adds support for numbered figures in mdbook
-
print-positions
providing string segmentation on grapheme clusters and ANSI escape sequences for accurate length arithmetic based on visible print positions
-
serenity_utils
provide additional utilities for Discord bots created with serenity
-
perl-source-editing
Source text editing heuristics for insertion points and display truncation
-
scrybe-cli
Scrybe CLI — headless render/lint/mermaid/panel command-line tool
-
data-gov-catalog
Async client for the data.gov Catalog API (DCAT-US 3 search)
-
msbwt2
multi-string BWT query library
-
text-document-cli
CLI for text-document
-
scanlex
lexical scanner for parsing text into tokens
-
shiplog-render-md
Markdown self-review packet renderer for canonical shiplog data
-
block-id
generating opaque, unique, and short string values from (unsigned) integers
-
language-tokenizer
Text tokenizer for linguistic purposes, such as text matching. Supports more than 40 languages, including English, French, Russian, Japanese, Thai etc.
-
ccwc-mh
A CLI tool to count words, characters, and lines (WC clone in Rust)
-
zhconv-cli
Convert Traditional/Simplified Chinese and regional words of Taiwan/Hong Kong/mainland China/Singapore based on Wikipedia and OpenCC rulesets…
-
textnonce
Text based random nonce generator
-
inputx-pinyin-wasm
WASM bindings for inputx-pinyin — Mandarin Pinyin IME engine, browser/Node ready. Powers the Inputx IME web surface.
-
r4d
Text oriented macro processor
-
epub2mdbook
convert EPUB files to MDBook format
-
matchy-literal-hash
O(1) exact string matching via memory-mapped hash tables (internal)
-
chinese_dictionary
A searchable Chinese / English dictionary with helpful utilities
-
mdmux
A terminal UI for browsing markdown files and rendering them in a cmux split
-
hmd
Custom Markdown Engine for my personal blog
-
langdetect-rs
Language detection in Rust. Port of Mimino666's langdetect.
-
hyphertool
Hyphertool is a command-line tool for syllabification and hyphenisation
-
spellchk
A blazingly fast spellchecker CLI for any text file
-
open-english-pronouncing-dictionary
OpenEPD: open, fused English IPA pronunciation dictionary (~280k US English words). Misaki + CMUdict + WikiPron, canonical IPA with provenance and frequency-derived rarity. CC-BY-SA 4.0.
-
rushdown-footnote
Footnote extension for rushdown markdown parser
-
directwrite
A safe abstraction for interacting with DirectWrite, intended initially to be used with direct2d for easy text rendering
-
rd2qmd-package
Package-level operations for converting R documentation to Quarto Markdown
-
wildcard_ex
extended wildcards that allows VB-like specifications
-
rsdex
a little tool that behaves as a pokedex
-
quillmark-cli
Command-line interface for the Quillmark Markdown rendering system
-
sff
SemanticFileFinder (sff): Fast semantic file finder using sentence embeddings. Searches .txt, .md, .mdx files.
-
hangeul
Korean alphabet manipulation library
-
car-active-planner
Active planner for CAR — generates, scores, and selects proposals via inference
-
cartog-languages
Tree-sitter language extractors for cartog code graph
-
vectradb-chunkers
Chunking utilities for VectraDB in Rust
-
mecab-sys
FFI binding and safe wrappers of MeCab
-
lightgrep
A fast, ergonomic grep-like tool in Rust
-
ssexp
A powerful parser for s-expressions
-
ichoose
Interactive terminal list selection (lib+bin)
-
ruchydbg
ML-powered debugger for Ruchy with SBFL fault localization
-
graphrag
Knowledge Graph RAG: meta-crate that bundles graphrag-core and graphrag-cli
-
ink-md
The most advanced terminal markdown reader
-
ainu-utils
A collection of utilities for the Ainu language
-
talon-cli
Talon CLI: hybrid retrieval over Obsidian vaults and markdown corpora, with grounded answers, MCP server, and agent-native output
-
open-redact-pdf-content
PDF content stream parsing and operation modeling for open-redact-pdf
-
drail
CLI-first code intelligence for AI agents
-
slabs
AST-aware code chunking and late chunking for RAG
-
pullup
Convert between markup formats
-
mecab-ko-hangul
Hangul processing utilities - jamo decomposition/composition, syllable handling, normalization
-
merge-engine
A non-LLM merge conflict resolver using structured merge, Version Space Algebra, and search-based techniques
-
pandoc
API that wraps calls to the pandoc 2.x executable
-
content-extractor-rl
RL-based article extraction from HTML using Deep Q-Networks and heuristic fallback
-
fuse-rust
Fuse is a super lightweight library which provides a simple way to do fuzzy searching. Fuse-Rust is a port of Fuse-Swift, written purely in rust
-
repub-rs
binary for converting mhtml webpages into remarkable-style summarized epubs
-
broken-md-links
A command-line tool and library to detect broken links in Markdown files
-
cwe-data
Request CWE data offline
-
rsticle
Treat source files as articles / narrative documentation
-
somedoc
A very simple document model and markup generator
-
husk-lexer
Lexer for the Husk programming language
-
twars-url2md
A powerful CLI tool that fetches web pages and converts them to clean Markdown format using Monolith for content extraction and htmd for conversion
-
pulldown-cmark-to-flowed
Convert Markdown to Plain Text with format=flowed
-
aasvg
Convert ASCII art diagrams to SVG with automatic light/dark mode support
-
mnem-sparse-providers
Learned-sparse encoder adapters for mnem (SPLADE, BGE-M3-sparse, opensearch-doc-v3-distill). Sync, TLS-via-rustls, tokio-free.
-
awk-rs
A 100% POSIX-compatible AWK implementation in Rust
-
redactor
Secure PDF redaction library with Type3 font support using MuPDF
-
slugify
Macro for flexible slug generation
-
askama-markdown-cmark
Askama filter for markdown, using pulldown-cmark
-
mdbook-svgbob
SvgBob mdbook preprocessor which swaps code-blocks with neat SVG
-
streamdown-syntax
Syntax highlighting for streamdown via syntect
-
izihawa-tantivy-stacker
term hashmap used for indexing
-
vndb_tags_get
convert VNDB tag list (JSON to markdown)
-
speedgrep
grep tool
-
sansaccent
Converts French strings into URL-friendly slugs by removing accents and special characters
-
cairn-extract
Rule-based claim extraction from markdown with confidence scoring
-
alpha-counter
Alphabetic counter
-
dictator-frontmatter
Markdown frontmatter decree for Dictator structural linter
-
wimbd
A CLI for inspecting and analyzing large text datasets
-
ascend-tools-tui
Interactive TUI for the Ascend Instance web API
-
shannon-nu-utils
Nushell utility functions
-
tars-bin
A small, fast, static site generator
-
phonologist
Parse phonemes in the International Phonetic Alphabet
-
satteri-ast
MDAST and HAST node types, codecs, tree operations, and conversion for Sätteri
-
waken_snowball
Snowball stemming algorithms for 33 languages
-
agentic-veritas-ffi
FFI bindings for AgenticVeritas
-
streamdown-plugin
Plugin system for streamdown extensibility
-
ah-ah-ah
VUN token! TWO tokens! Count all the beautiful tokens ... offline! Ah-ah-ah!
-
supermarkdown-cli
CLI for supermarkdown HTML to Markdown conversion
-
lo_uno
UNO-like service registry framework
-
cuteness
Cute static site (+ server) generator with a bunch of plugins :3
-
rfgrep
Advanced recursive file grep utility with comprehensive file type classification - search, list, and analyze 153+ file formats with intelligent filtering and safety policies
-
ised
An interactive tool for find-and-replace across many files
-
bilingual
A cmdline tool used for markdown translation via calling Chinese translation api cloud services
-
hancat-core
A library that automatically handles particle (tossi) conversion and predicate conjugation through a single function, using {word, affix} templates
-
terraphim-repl
Offline-capable REPL for semantic knowledge graph search
-
stopstream
Streaming-safe stop-sequence detector for LLM token streams. Handles partial matches at chunk boundaries.
-
unword
MS Word .doc (OLE/CFB) to Markdown converter
-
strike48-siem-core
Shared SIEM types and traits for case management, alerts, and search
-
detone
Decompose Vietnamese tone marks
-
mdvalidate-utils
functions for mdvalidate
-
oxide-browser-sh
Self-healing AI-driven browser automation for Rust Oxide. Built on chromiumoxide with accessibility-tree-first targeting, an LLM-friendly feedback loop, and token-efficient Markdown extraction via oxide-compress.
-
avatarr-parser
Release-name parser ported from Sonarr v4.0.17.2952
-
prosesmasher-app-core
Internal core checks crate for the prosesmasher workspace. Published to support the workspace dependency graph.
-
langsan
sanitizing language model input and output
-
block-list
A minimalist hosts-based tool for managing block lists and ad-blocking
-
cloakrs-locales
Locale-specific PII recognizers for cloakrs
-
jntajis
port of jntajis-python providing character transliteration functionality for Japanese text processing
-
bogrep
Full-text search for bookmarks from multiple browsers
-
chx
A TUI hex editor
-
pathmut
Command line utility for manipulating path strings
-
cmls-knowledge-core
Core indexing, graph, search, and schema library for Cumulus Knowledge
-
kreuzberg-tesseract
Rust bindings for Tesseract OCR with cross-compilation, C++17, and caching improvements
-
rhema_module_chirho
Self-contained SQLite module format (.rhema) for distributing Bible modules
-
asimov-chromium-module
ASIMOV module for Chromium (and Brave, Google Chrome) bookmark import
-
emoji-remover
A fast command-line tool to remove emojis from source code files
-
cockpitctl-render
Deterministic markdown and annotation rendering for cockpitctl reports
-
slugify-core
Fast, Unicode-aware slug generation library with multi-language bindings
-
termaid
Render Mermaid flowchart, stateDiagram-v2, and sequenceDiagram files in the terminal
-
lingua-spanish-language-model
The Spanish language model for Lingua, an accurate natural language detection library
-
pdforg-spell
Spell checker for PDF Office
-
madskills
The toolchain for madskilling: lint, format, and wrangle Agent Skills like you mean it
-
pii
PII detection and anonymization with deterministic, capability-aware NLP pipelines
-
vibrato
viterbi-based accelerated tokenizer
-
aki-xtee
copy standard input to each file and standard output
-
forgetless
Smart context optimization for LLMs that compresses massive content to fit your token budget
-
lucide-dioxus
Dioxus port of Lucide
-
scrivener
reading and writing Scrivener 3 projects
-
lopdf-parang
A fork of lopdf optimized for PDF text extraction — lazy streams, O(1) object slicing, zlib-rs
-
tantivy-tokenizer-api
Tokenizer API of tantivy
-
fastn-builtins
fastn: Full-stack Web Development Made Easy
-
latexmk-diff-head
LaTeX compilation tool that generates diff PDFs against Git commits
-
highlight-spans
Tree-sitter ObjectScript highlight spans as attr/start/end tuples
-
rustruut
Text-to-IPA converter and phonetic translator for Rust, powered by the Goruut phonemization engine
-
deliminator
Universal code documentation generator
-
mistral_ocr_gui
GUI tool for Mistral OCR - convert documents to Markdown using Mistral AI
-
utf-64
The next-generation text encoding standard using 64 bits per character
-
tectonic_bridge_core
Exposing core backend APIs to the Tectonic C/C++ code
-
microslop
Turn your text into beautifully chaotic, glitchy, Wandoze-level slop
-
html-index
Generate an HTML index
-
taboc
A table of contents generator for markdown documents
-
oak-tex
TeX/LaTeX document preparation system parser with support for typesetting commands and macros
-
jrl
Journaling terminal app that prompts you with questions from time to time when you open a new terminal and lets you rate, describe, and take notes on your day, as well as view past entries
-
wind-wiki
LLM-powered Wiki SDK — Ingest, Query, and Lint pipelines
-
char-ranges
Iterate chars and their start and end byte positions
-
rtranslate
dependency-free Rust wrapper for Google Translate public web API
-
tiefdownlib
manage and convert TiefDown projects
-
tracey-proto
Protocol definitions for the tracey spec coverage daemon RPC
-
jdpub
Annotate source documents with Japanese readings and definitions
-
byteforge
A next-generation byte-level transformer with multi-signal patching and SIMD optimization
-
koruma-collection
A collection of common validators using koruma
-
moonwave
generating documentation from comments in Lua source code
-
ragrs
Fast local RAG in Rust. Index, query, verify.
-
acorns
Generate an AsciiDoc release notes document from tracking tickets
-
lex-analysis
Semantic analysis for the lex format
-
mdbook-numthm
An mdbook preprocessor for automatically numbering theorems, lemmas, etc
-
encoding_c
C API for encoding_rs
-
csvpp
Compile csv++ source code to a target spreadsheet format
-
unicode-rs
A comprehensive Unicode character library for Rust applications with theme support
-
zen-rs
generating non-interactive content like cards or files
-
pretty-xmlish
Pretty print XML-ish data with unicode art
-
mq-run
Command-line interface for mq Markdown processing tool
-
yar_markdown
Markdown handling for yar
-
arborium-theme
Theme support for arborium syntax highlighting
-
pepl-lexer
PEPL lexer: source text to token stream
-
httpwg
Test cases for RFC 9113 (HTTP/2)
-
mermd
Terminal Markdown renderer with Mermaid flowcharts drawn as ASCII art
-
semire_core
An extension to my former semire_read crate now with more functionality
-
pulldown-cmark-escape
An escape library for HTML created in the pulldown-cmark project
-
finetype-core
Core taxonomy and data generation for FineType
-
passwordkit
generate passwords and validate requirements
-
dictx
A fast, colorful terminal dictionary with offline indexes and optional AI explanations
-
patchlib
Tooling for working with patch files
-
mcd-core
Core parser, validator, and exporter for Markdown CSV Document packages
-
awful_news_vibes
Daily news meta-analysis pipeline with AI-powered clustering and D3 visualizations
-
ztlgr
Terminal-based note-taking app with Zettelkasten methodology
-
fsrc
Embed source files into any text file
-
notidium
Developer-focused, local-first note-taking with semantic search and MCP integration
-
unicode-normalization-alignments
functions for normalization of Unicode strings, including Canonical and Compatible Decomposition and Recomposition, as described in Unicode Standard Annex #15
-
spider_agent_html
HTML processing utilities for spider_agent — cleaning, content analysis, and diffing
-
bible-io
working with Bible text data structures
-
codetypo-dict
Source Code Spelling Correction
-
u8lit
Custom literal to convert strings to UTF-8 bytes
-
mecab-ko
Korean morphological analyzer: a pure Rust implementation of MeCab-Ko
-
mdja
Markdown parser optimized for Japanese: CommonMark + GFM support, table of contents generation, reading time calculation
-
roff-cli
Skillful man page to JSON/Markdown converter - human readable, AI-friendly
-
sik
A fast and concurrent command-line tool for searching patterns in files
-
create_broken_files
Create broken files from other ones
-
seams
High-throughput sentence extractor for Project Gutenberg texts with dialog-aware detection
-
indentify
writing text that requires indents a little easier
-
pdfluent-lopdf
PDF document manipulation
-
lightweight_config
easily parsing plain-text configuration files
-
doc_loader
A comprehensive toolkit for extracting and processing documentation from multiple file formats (PDF, TXT, JSON, CSV, DOCX) with Python bindings
-
kd-rust
A crystal clear command-line dictionary
-
rustling
A blazingly fast library for computational linguistics
-
television-nucleo-matcher
plug and play high performance fuzzy matcher
-
plato-tile-split
Text chunking engine — token-aware splitting, overlap, code-aware, semantic boundary detection
-
lindera-ipadic-builder
A Japanese morphological dictionary builder for IPADIC
-
markdown-peek
Markdown previewer in browser and terminal
-
yuru-zh
Chinese pinyin matching support for Yuru
-
ld-lucivy-bitpacker
Lucivy-sub crate: bitpacking
-
grep-app-cli
CLI for grep.app — search code across 1M+ public GitHub repos
-
tokie
Blazingly fast tokenizer - 50x faster tokenization, 10x smaller model files, 100% accurate drop-in replacement for HuggingFace
-
univiz
A command-line tool for analyzing Unicode strings, providing detailed information about graphemes, code points, and UTF-8 byte sequences
-
ucp-cli
Command-line interface for Unified Content Protocol
-
komito
A fast, reliable semantic commit message validator and version bumper with gitmoji support
-
wp-mini-epub
Minimal async WP to EPUB downloader | Extremely minimal
-
fontheight
Find out the vertical extents your font reaches on shaped words
-
mask-pii
A lightweight library to mask PII (Personally Identifiable Information) like emails and phone numbers
-
neco-wrap
Word wrap engine with pluggable line-breaking and character-width policies
-
snapper-fmt
Semantic line break formatter for Org, LaTeX, Markdown, and plaintext
-
rlex
A cursor-based, utf-8 Vec<char> lexer
-
n_gram
training n-gram language models
-
tracery
Text-expansion library
-
neco-textpatch
Deterministic text patch helpers for narrow structured edits
-
use-match
Shared match-result primitives for RustUse pattern helpers
-
joyful
Generate delightful, random word combinations - Rust port of the joyful TypeScript library
-
harfbuzz
Rust bindings to the HarfBuzz text shaping engine
-
nlprule
A fast, low-resource Natural Language Processing and Error Correction library
-
ricat
A Rust-based implementation of the classic UNIX cat command
-
dmos-cli
Djot HTML renderer with advanced features - CLI
-
evfmt
Emoji Variation Formatter
-
legalis-fr
French jurisdiction support for Legalis-RS (Code civil, Code de commerce, Code du travail)
-
yamth
Markdown To HTML, A fast Markdown to HTML converter with live reload
-
rag-cli-cuda
CUDA-accelerated build of rag-cli — local semantic search powered by candle + NVIDIA GPU
-
readability-js
wrapper for Mozilla's Readability.js library
-
yuru-ja
Japanese phonetic matching support for Yuru
-
ucp-translator-html
HTML to UCM document translator
-
character-set
High performance set.contains(char)
-
sublime-syntaxes
Precompiled Sublime Text syntax definitions for languages not in syntect's defaults
-
mcplint-report
Output formatters (text, JSON, Markdown, SARIF) for mcplint
-
tuillem-plugin
External process plugin host for tuillem
-
fontlift-platform-mac
macOS platform implementation for fontlift
-
lindera-unidic
A Japanese morphological dictionary for UniDic
-
kiromi-ai-embed-onnx
ONNX/fastembed-rs embedder plugin for kiromi-ai-memory. Bundles multilingual-e5-small by default.
-
anslatortray
translate from English to Pig Latin!
-
anno-eval
Evaluation harnesses, datasets, and muxer-backed sampling for anno
-
panko
A small, zero-copy text tokenizer that crumbles strings into Words, Symbols, and Newlines
-
hayro-font
A parser for CFF and Type1 fonts
-
glyph-names
Mapping of characters to glyph names according to the Adobe Glyph List Specification
-
viks
vim-like key crate
-
fop-types
Core types for Apache FOP Rust implementation
-
lo_calc
Spreadsheet formula parser/evaluator and CSV conversion
-
smoltok-core
Byte-Pair Encoding tokenizer implementation in Rust
-
anno-metrics
Shared evaluation/analysis primitives for anno (metrics + cluster encoders)
-
clarifai_grpc
The official Clarifai gRPC Rust client
-
type1-encoding-parser
parse encodings from Type1 font files
-
stam-python
STAM is a library for dealing with standoff annotations on text, this is the python binding
-
parserst
A recursive-descent reST parser and renderer
-
waterui-text
Text and typography components for WaterUI
-
phaier_markdown
A markdown parser and renderer
-
plato-tile-import
Tile import/export from Markdown, JSON, CSV, and plaintext sources
-
chardet
rust version of chardet
-
rune-redact
Redact PII and secrets from text — emails, IPs, credit cards, phones, SSNs, bearer tokens, and high-entropy secret strings
-
aqp3
Congress.gov legislation text query syntax parser
-
crawdad
ChaRActer-Wise Double-Array Dictionary
-
ohos-ime-binding
OpenHarmony's input method binding for rust
-
fontcull-font-types
Scalar types used in fonts. (Vendored fork for fontcull)
-
crawdad-rkyv
Crawdad: ChaRActer-Wise Double-Array Dictionary with rkyv support
-
rs-stem-words
Normalizes the input words
-
turbo-json-checker
A pushdown automaton low memory JSON bytes stream checker returning the JSON root-type followed by its start and end index in the Reader
-
turndown-core
Core Markdown AST and serialization for turndown
-
pdfluent-extract
PDF content extraction: images, text with positions, and full-text search
-
crossandra
A fast and simple lexical tokenization library
-
almanaculum
Core types and traits for analysis
-
regex-chunker
Iterate over the data in a Read type in a regular-expression-delimited way
-
indent
Functions for indenting multiline strings
-
unobtanium-segmenter
A text segmentation toolbox for search applications inspired by charabia and tantivy
-
pulldown_typst
A pull parser for Typst markup
-
tessera-embeddings
Multi-paradigm embedding library: ColBERT, dense, sparse, vision-language, and time series models
-
mdbook-tiny
Use mdbook to generate tiny and fast static sites
-
hebrew_accents
finding, filtering, and displaying Hebrew accents, specifically focusing on the Tiberian accent system as documented by the Masoretes
-
nib-cli
A CLI for yet another static site generator, Nib
-
aqp3-cli
Congress.gov legislation text search query syntax validator
-
wicket
Wikipedia corpus knowledge extractor
-
sre-engine
A low-level implementation of Python's SRE regex engine
-
lzy-codec
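A variable-length text encoding and decoding scheme with support for Unicode. Its encoding efficiency and storage size are fully superior to UTF-8, and it will replace UTF-8 in the future as the new universal encoding standard.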
-
klieo-pii-patterns
Shared PII detection patterns (regex source strings) consumed by klieo-ops and klieo-ops-evidence-verify
-
htmlescape
HTML entity encoding and decoding
-
memchunk
The fastest semantic text chunking library — up to 1TB/s chunking throughput
-
voirs-g2p
Grapheme-to-Phoneme conversion for VoiRS speech synthesis
-
tectonic_engine_bibtex
The bibtex program as a reusable crate
-
json-carver
Digital forensics tool that reads (carves) JSON strings from a dump. Think of it as a more accurate and faster replacement for the strings(1) utility.
-
shift_or_euc
Detects among the Japanese legacy encodings
-
wchar
Procedural macros for compile time UTF-16 and UTF-32 wide strings
-
fastripgrep
Fast regex search with sparse n-gram indexing — faster than ripgrep on every pattern type
-
spongebob
convert text to spongebob case a.k.a tHe MoCkInG sPoNgEbOb MeMe
-
typub-adapters-core
Core adapter interface and types for typub
-
devup-editor-markdown
Markdown ↔ Document conversion (import + export) for devup-editor
-
scanix
search for a text or pattern in files. A fast and lightweight text tool.
-
case
A set of letter case string helpers
-
office_oxide_mcp
MCP server for Office document extraction — gives Claude, Cursor, and AI assistants the ability to read DOCX/XLSX/PPTX/DOC/XLS/PPT files locally. Powered by office_oxide.
-
opentalk-types-signaling-meeting-notes
Signaling types for the OpenTalk meeting-notes module
-
mdbook-translator
A translation preprocessor plugin for mdBook that automatically translates documents using the DeepSeek API
-
mago-reference
Mago Reference is a library for analyzing PHP codebases by providing advanced symbol search capabilities
-
rtl-flip-detect
Detect right-to-left override (U+202E) and other bidi-control characters that flip rendering of strings. Used in filename-spoof and prompt-injection attacks. Zero deps.
-
utf8-io
Traits and types for UTF-8 I/O
-
grep-regex
Use Rust's regex library with the 'grep' crate
-
traverze
CLI for full-text search built on Tantivy and Lindera
-
wagyan
CLI tool to convert text into extruded ASCII STL meshes (TTF/OTF supported)
-
dictutils
Dictionary utilities for Mdict and other formats
-
mupdf-sys
Rust FFI binding to MuPDF
-
rtss
A command-line tool to annotate stdout/stderr with elapsed times
-
csv-groupby
execute a sql-like group-by on arbitrary text or csv files
-
avila-tokenizers
The most complete tokenizer library in Rust - BPE, WordPiece, Unigram, with native support for GPT, BERT, Llama, Claude
-
flutmax-decompile
.maxpat JSON to .flutmax source decompiler
-
prettychars
Unicode text styling and named glyph lookup with zero runtime overhead
-
drova_sdk
Sdk for absolute converter of formats for dalet
-
screenplay-doc-parser-rs
Tools to parse Screenplay-formatted documents into semantically-typed structs
-
brevis
An XML processor for a more comfortable light markup syntax
-
marknest
Markdown workspace analyzer and PDF converter CLI
-
embedsearch
Lightweight embeddable full-text search engine in pure Rust
-
skimple
interface for the skim fuzzy-matcher
-
markdown-strip
Strip Markdown formatting (headers, bold, italic, links, code, blockquotes) to plain text. Conservative, fast, zero deps.
-
popsam-core
Core library for AI-assisted selection of semantically representative texts
-
diffy-imara
Tools for finding and manipulating differences between files
-
trailfix
Trim trailing whitespace and ensure single newline at EOF
-
axonml-text
Text processing utilities for the Axonml ML framework
-
orion_cfmt
Format output without Rust code segment in binary to reduce the ultimate binary size
-
ucp-translator-markdown
Markdown translator for UCP
-
rusty_phoenix_file_to_crate
allows integrating text files as canvases inside a Rust program, for example to generate customizable HTML pages or Markdown
-
line-numbers
Find line numbers in strings by byte offsets, quickly
-
capec-data
Request CAPEC data offline
-
cai-tui
Terminal UI for Coding Agent Insights
-
anyxml-uri
URI library for XML
-
translitrs
Transliteration utility for Serbian language
-
rsmarkdownlint
Rust version of markdownlint
-
east-asian-width
Determine the display width of Unicode characters in East Asian contexts
-
penmanship
A Unicode character lookup library for converting text patterns to Unicode characters
-
gbk2utf8
CLI tool to detect and convert GBK-encoded source files to UTF-8 safely
-
mq-repl
Read-Eval-Print Loop (REPL) for mq query language
-
rosetta-aisp-llm
LLM fallback for AISP conversion using Claude SDK - extends rosetta-aisp with AI-powered conversion
-
mnem-ner-providers
NER provider adapters for mnem. Ships RuleNer (heuristic, zero-dep) and NullNer. Future: GLiNER ONNX.
-
alizain
Zero-dependency crate for mathematical Unicode text styles: mono, bold, italic, sans_bold
-
codebook_downloader
Dictionary downloading utility for the Codebook spell checker
-
rascii_art
Advanced ASCII Art Generator
-
text-tokenizer
Custom text tokenizer
-
legalis-viz
Visualization engine for legal statutes - decision trees, flowcharts, and dependency graphs
-
tmpltr
Template-based document generation CLI
-
cwc
A word counter utility that properly handles CJK and Unicode text
-
pecto-python
Python behavior extractor (FastAPI, Flask, Django)
-
xore-config
XORE configuration management module: unified configuration loading and path management
-
ratex-font-loader
Shared lazy font loading and caching for RaTeX renderers
-
encoding-next-index-tradchinese
Index tables for traditional Chinese character encodings
-
smallgrep
A Lite version of a CLI tool grep made with rust
-
bard
Creates PDF and HTML songbooks out of easy-to-write Markdown sources
-
mistral_ocr
CLI tool to convert PDF, image, and document files into Markdown using Mistral AI's OCR API
-
opentalk-report-generation
OpenTalk report generation functionality
-
rho-hashline
Line-level hashing for file editing anchors
-
tui-syntax
Tree-sitter based syntax highlighting for TUI applications
-
cadi-scraper
CADI Scraper/Chunker utility for converting source code repos and file data into reusable CADI chunks
-
docket
markdown to HTML documentation rendering
-
oxipdf
A standalone, general-purpose native Rust PDF layout and generation engine
-
plato-room-search
Full-text search engine for PLATO rooms — inverted index, BM25 ranking, fuzzy matching
-
docanvil
A Rust-based static documentation generator that converts Markdown into HTML sites
-
rascii_art_img
Advanced ASCII Art Generator. Fork for imgii.
-
waterui-str
String utilities for WaterUI
-
upf
reading UPF text into typed structs and writing validated UpfData back to UPF
-
pdf_text
Positioned PDF text and glyph extraction for semantic diff and comparison pipelines
-
mdbook-asciinema
mdbook asciinema preprocessor
-
md2logseq
Convert standard Markdown (GFM) to Logseq block format
-
jpreprocess-jpcommon
Japanese text preprocessor for Text-to-Speech application (OpenJTalk rewrite in rust language)
-
zed-collections
Standard collection type re-exports used by Zed and GPUI
-
prezzy
Make any CLI output beautiful. Zero config. Just pipe.
-
lontar
Comprehensive document generation library for Rust — write once, render everywhere
-
twoslash-rust
Twoslash for Rust - extract type information from Rust code using rust-analyzer
-
opentalk-types-signaling-whiteboard
Signaling types for the OpenTalk whiteboard module
-
semantic-commands
A lightweight Rust framework for defining and executing semantic commands using text embeddings
-
utf8next
function for getting the next character and its length in bytes from a string
-
spikard-ffi
Rust-centric multi-language HTTP framework with polyglot bindings
-
zg
Small query-normalization helpers for the zg search tooling
-
bookforge-core
Core IR, segmentation, configuration, and progress types for BookForge
-
gstring
String with support for Unicode graphemes
-
wdpe
WebDynpro Parse Engine
-
rustpython-wtf8
WTF-8 for use in RustPython
-
html2markdown
HTML to Markdown converter using AST-to-AST transformation
-
bashdoc
generating documentation/help menu for user defined bash functions
-
typdiff
A diff tool for Typst documents, similar to latexdiff
-
kham-capi
C FFI bindings for the kham Thai word segmenter
-
capitalize
Change first character to upper case and the rest to lower case, and other common alternatives
-
fea-rs
Tools for working with Adobe OpenType Feature files
-
open-redact-pdf-text
Text extraction and search geometry for open-redact-pdf
-
exine
Universal Markdown extraction engine. 37+ formats, zero external dependencies, 10-96× faster than Pandoc.
-
rust-functions
A collection of Rust utility functions (starting with format_number)
-
hanconv-cli
Convert between Chinese characters variants
-
typos-dict
Source Code Spelling Correction
-
shannon-nuon
Support for the NUON format
-
pandoc_ast
deserializes and serializes the markdown ast for writing pandoc filters
-
rushdown-fenced-div
Fenced div extension for rushdown markdown parser
-
console-mermaid
Pure Rust CLI for rendering Mermaid graphs inside your terminal
-
doxygen-bindgen
Converts Doxygen comments into Rustdoc markdown
-
braille-bar
Render a percentage as a fixed-width braille bar
-
lowcharts
draw low-resolution graphs in terminal
-
pleias-stratum-client
Official Rust client for the Stratum document-chunking API
-
idna-cli
Encode/decode Unicode domain names to/from IDNA ASCII
-
rfham-antennas
Data types to represent antennas
-
qplan
Compile a typed query AST (qexpr) into execution-friendly plans (prime-ideal: planning)
-
pulldown-cmark-toc
Generate a table of contents from a Markdown document
-
persian-tools-cli
cli for rust-persian-tools crate
-
rpmvercmp-rs
RPM version comparison (rpmvercmp) in Rust — no_std and no_alloc friendly
-
wtf8
WTF-8 encoding. https://simonsapin.github.io/wtf-8/
-
spider-client
Spider Cloud client
-
man_parser
roff parser for converting man pages to JSON/Markdown
-
re2
Wrapper for the re2 C++ regex library
-
unified-diff
GNU unified diff format
-
rbook-cli
Experimental command-line interface for rbook
-
neco-textview
Text position primitives: line/column ↔ byte offset, UTF-16 mapping, selection/caret model
-
tidyvcf
command-line tool to convert VCF files to tab/comma separated tables
-
uclanr
A random word picker that gives you actually useful words
-
officemd_docling
Docling JSON conversion for OfficeMD document IR
-
viddy
A modern watch command
-
tagsearch
Filter plaintext files based on @keyword tags
-
illuminate-string
A comprehensive Rust library for advanced string manipulation and processing
-
servo-layout-api
A component of the servo web-engine
-
dossiers
home for your specs, policies, and process docs
-
qpprint
console printing/formatting
-
inkline
Display colorized ASCII art and images directly in the terminal
-
oxilean-lint
OxiLean linter - Static analysis and lint rules
-
regexml
XPath compatible regex engine
-
unicode_names2_generator
Generates the perfect-hash function used by unicode_names2
-
legalis-ca
Canada jurisdiction support for Legalis-RS (Charter of Rights, Federal/Provincial Law, Quebec Civil Law)
-
liwe
IWE core library
-
win-utf8-rs
a function to enable UTF-8 for windows
-
checkstream-proxy
High-performance HTTP/SSE proxy server for LLM guardrails with sub-10ms latency
-
ansi-width
Calculate the width of a string when printed to the terminal
-
mdbook-treesitter
mdBook preprocessor for html adding tree-sitter highlighting support
-
lingua-russian-language-model
The Russian language model for Lingua, an accurate natural language detection library
-
ito-domain
Domain models and repositories for Ito
-
savagestr
SAVAGE string encoder/decoder. If it can encode or decode using the specified code page or encoding name, it does so; otherwise it falls back to the savage way of encoding/decoding with String::from_utf8_lossy()
-
text-block-permutation-optimizer
If TSP would meet Text processing
-
response-validator
Detect and truncate hallucinated conversation turns in LLM responses
-
rs-case-converter
Converts the string to the specified case
-
lindera-wasm
A morphological analysis library for WebAssembly
-
mdbook
Creates a book from markdown files
-
mat-o-viewer
A modern terminal file viewer combining cat, less, and grep with syntax highlighting and markdown rendering
-
cruxi-validate
Validator combinators and built-in validators for Cruxi
-
rust_tokenizers
High performance tokenizers for Rust
-
qmd
Lightweight SOTA local search engine for AI agents in Rust
-
legalis-cn
China (中国) jurisdiction support for Legalis-RS - Socialist civil law with Chinese characteristics
-
asimov-openai-module
ASIMOV OpenAI module
-
ctx-telemetry
Local telemetry and reporting utilities for CTX
-
html2text-cli
Render HTML as plain text
-
lindera-ipadic-neologd
A Japanese morphological dictionary for IPADIC NEologd
-
prosesmasher-domain-types
Internal domain types crate for the prosesmasher workspace. Published to support the workspace dependency graph.
-
vidyut-kosha
A Sanskrit key-value store
-
aozora2
Aozora Bunko format converter CLI
-
legalis-us
United States jurisdiction support for Legalis-RS (Common Law)
-
COXave
Instruments for codings
-
g2-unicode-jp
convert Japanese Half-width-kana[半角カナ] and Wide-alphanumeric[全角英数] into normal ones
-
rushdown-emoji
Emoji extension for rushdown markdown parser
-
pdf-annot
PDF annotation engine — parsing and typed access to all annotation types per ISO 32000-2 §12.5
-
diff-man
diff utility lib
-
ld-lucivy-stacker
term hashmap used for indexing
-
rosie
Interface for the Rosie Pattern Language, for efficient and maintainable text pattern matching and search
-
text-document-io
Import/export for text-document: plain text, Markdown, HTML, LaTeX, DOCX
-
datatroll
a robust and user-friendly Rust library for efficiently loading, manipulating, and exporting data stored in CSV files
-
swift-check
High-performance, robust, and expressive searching and validation (uses SIMD on x86_64, aarch64, and WASM)
-
wcount
CLI word counting tool
-
convert-to-spaces
Convert tabs to spaces in a string
-
markdown-extract
Extract sections of a markdown file
-
cloakrs-adapters
Format adapters for scanning text, JSON, CSV, logs, and SQL with cloakrs
-
wubi
Self-developed Wubi 86 encoder, dictionary, and dataset (PHF + FST, WASM-ready)
-
java_string
Java strings, tolerant of invalid UTF-16 encoding
-
smt-str
working with SMT-LIB strings in Rust
-
scribe-patterns
Advanced pattern matching and search algorithms for Scribe
-
tiktoken-stream
Streaming token counter for partial LLM responses. Accumulates token count across chunks without holding the full text. Pluggable estimator function. Zero deps.
-
lucide-icon-name
Lucide icon names
-
jfmt
command-line tool for formatting json files in both readable and compact formats. It supports stdin/stdout shell usage, as well as working on files directly.
-
nlpo3
Thai natural language processing library, with Python and Node bindings
-
record-query
doing record analysis and transformation
-
rustpython-parser-vendored
RustPython parser vendored third-party crates
-
ncase
Enforce a case style
-
git-blamediff
A program to automatically annotate changes to a file in git(1)
-
mdbook-files
Preprocessor for mdbook which renders files from a directory as an interactive widget
-
static_table
creates pretty tables at compile time
-
arborium-highlight
Unified syntax highlighting for arborium - works with both static Rust grammars and WASM plugins
-
asimov-module-cli
ASIMOV Module Command-Line Interface (CLI)
-
ai-context-gen
A context generator for Rust repositories that creates structured markdown files with relevant information for LLMs and AI agents
-
use-glob
Predictable glob helper utilities for RustUse
-
incredimo
just another font for your terminal
-
tracey-config
Configuration types for tracey spec coverage tool
-
gh_page_tool
A github gh-pages tool for static blog site
-
rune-tokenize
Approximate token counting, budget checking, text truncation, and overlapping chunk splitting for LLM context windows
-
dm-meta
YAML frontmatter parser and validator for technical documentation
-
ik-mini-epub
Minimal async IK to EPUB downloader | Extremely minimal
-
hmd-core
Core IR, diagnostics, and validation types for Human Markdown documents
-
pdf-engine
Unified PDF rendering engine — page rendering, text extraction, thumbnails
-
clipcount
Counting words from the clipboard content
-
selmr
Package to create and use Simple Explainable Language Multiset Representations
-
processors-rs
Embed anything at lightning speed
-
crabular-cli
A CLI tool for generating ASCII tables
-
mecab
Safe Rust wrapper for mecab, a Japanese-language part-of-speech and morphological analyzer library
-
perl-heredoc
Heredoc collector and processor for Perl — handles multi-line heredoc syntax including indentation stripping and CRLF normalization
-
mcd-wasm
Raw WebAssembly bindings for Markdown CSV Document packages
-
mdlite
A super-lightweight terminal Markdown reader
-
markdown-code-runner
Automatically update Markdown files with code block output
-
spangrep
Grep out a span of lines
-
fontheight-cli
Find out the vertical extents your font reaches on shaped words
-
maproom
Semantic code search powered by embeddings and SQLite
-
meme_generator_utils
Meme generator utils
-
rhema_ai_chirho
AI integration: LLM providers, embeddings, vector search, query expansion
-
codetypo-vars
Source Code Spelling Correction
-
mut-str
A toolkit for working with mutable string slices (&mut str)
-
hr-shape
Command-line utilities for HarfRust text shaping library
-
googleapis-tonic-google-cloud-documentai-v1beta3
A Google APIs client library generated by tonic-build
-
microformats-types
A representation of the known objects of Microformats
-
mdbook-permalinks
Generate permalinks in mdBook using paths
-
next-plaid-cli
Semantic code search powered by ColBERT
-
rst_renderer
a reStructuredText renderer
-
rust-mando
Convert Chinese characters to pinyin with jieba word segmentation
-
md2pdf-rs
A CLI tool to convert Markdown to PDF using Typst
-
assert-text
the testing macro tools
-
lo_math
LaTeX formula parser with MathML and ODF emission
-
pii-masker
Rust port of the HydroXai PII masker with a library API and CLI
-
tree-sitter-stack-graphs-python
Stack graphs definition for Python using tree-sitter-python
-
lindera-cc-cedict-builder
A Chinese morphological dictionary builder for CC-CEDICT
-
toonconv
CLI tool for converting JSON to TOON (Token-Oriented Object Notation) format
-
opentalk-roomserver-module-legal-vote
OpenTalk RoomServer Module Legal Vote
-
mintyml-cli
Creates HTML from MinTyML, a minimalist alternative syntax to HTML
-
scrybe-core
Scrybe core — AST, Document, ContentAddressable, Plugin trait, Workspace
-
wrapr
wrap your code for ai
-
committed
Nitpicking commit history since beabf39
-
domrs
Document builder and serializer
-
clparse
A command line tool for parsing CHANGELOG.md files that use the Keep A Changelog format
-
markdowndown
acquiring markdown from URLs with smart handling
-
heckle
Semi-joke case conversion library: Spongebob Case and Billy Mays Mode
-
mdbook-summary
Summary parser for mdBook
-
atomr-agents-ingest
Document loaders, text splitters, and CachedEmbedder for atomr-agents
-
wikiext
extracting and processing Wikipedia data, implemented in Rust
-
blz-cli
CLI for blz – fast local llms.txt search
-
rjot
A minimalist, command-line jotting utility that's fast, private, and git-friendly
-
strslice
that provides zero copy string iterators for working with string slices. The library offers iterators similar to standard Rust string methods
-
minigrep_rd
searching through lines of text
-
quickmark-core
Lightning-fast Markdown/CommonMark linter core library with tree-sitter based parsing
-
philiprehberger-str-utils
String manipulation utilities — truncation, case conversion, padding, and whitespace operations
-
ucd-util
A small utility library for working with the Unicode character database
-
codebase-to-markdown
convert codebase to markdown format
-
typed-oid
Typed Object IDs
-
lindera-ipadic
A Japanese morphological dictionary for IPADIC
-
mq-http
HTTP server for mq scripts
-
output-sanitize-rs
Strip dangerous HTML/SQL/shell snippets from LLM output before render, query, or shell sinks. Rust port of @mukundakatta/llm-output-sanitizer. Zero deps.
-
strval
Parse strings into values
-
inkjet
A batteries-included syntax highlighting library for Rust, based on tree-sitter
-
lil-tabby
A macro-based library for creating visually appealing tables with automatic column spanning
-
pinyin2ch
converting Chinese Pinyin to Chinese characters with various levels of detail
-
dartboard-picker-core
Icon catalog and glyph source helpers for dartboard pickers
-
hayro-cmap
A parser for CMap files
-
libruskel
Generates skeletonized outlines of Rust crates
-
minislug
A tiny, dependency-free slugifier that turns any &str/String into a safe cross-platform filename
-
nonsense
Lorem ipsum placeholder text generator with clipboard integration
-
elicit_regex
Elicitation-enabled Regex newtype — JsonSchema-compatible wrapper around regex::Regex with serde support
-
oxidize-html
A backend-agnostic HTML parser, style engine, layout engine, and painter. Emits flat draw commands with no UI framework dependency.
-
unicode-box-drawing
Unicode box-drawing characters
-
tfon
Bitmap font parsing / conversion
-
inline_flexstr
copy/clone-efficient inline string type for Rust
-
text-replacer
Takes a String, or Bytes and replaces each word found with a same word from the provided dictionary
-
vecgrep
Semantic grep — like ripgrep, but with vector search
-
mnem-llm-providers
Text-generation adapters for mnem (OpenAI chat, Ollama chat) for HyDE, multi-query, and future LLM-in-the-loop features. Sync, TLS-via-rustls, tokio-free.
-
flerp
CLI tool that does XYZ
-
pulldown_mdbook
A pull parser for mdBook
-
mdka-node
Node.js bindings for mdka – a HTML to Markdown converter
-
uniwhat
Display the unicode characters text
-
rok-utils
Laravel/AdonisJS-inspired utility helpers for the Rok ecosystem
-
apiari-tui
Shared TUI design system for Apiari tools — theme, scroll, and common widgets
-
typub-passes
Semantic IR passes for typub
-
lucide-yew
Yew port of Lucide
-
kiri_nif
Erlang NIF wrapper for Kiri Japanese morphological analyzer
-
mdbook-keeper
An improved testing experience for mdbook
-
yeslogic-unicode-script
Fast lookup of the Unicode Script property
-
ascii-cleaner
Detect, Remove and Replace non-ASCII characters
-
lindera-python
A Python binding for Lindera
-
sdaas-rs
Official Rust SDK for SDaaS — Semantic Delta as a Service
-
rawk-cli
The rawk cli, which is an AWK interpreter clone. The goal is to be POSIX compatible.
-
playbill
ASCII art title generator with random gradient effects
-
code_generator
A code generator (Currently only targets C)
-
fount
A terminal-based Fountain screenplay editor
-
libabbs
aosc-os-abbs maintenance
-
use-text
Composable text primitives for RustUse
-
human_regex
A regex library for humans
-
doxx
Terminal document viewer for .docx files
-
legalis-th
Thailand jurisdiction support for Legalis-RS - Thai legal system with Buddhist Era calendar, FBA, BOI, PDPA, Labor law
-
llama-tokenizer
Tokenizer crate for llama.rs — deterministic text-to-token conversion
-
citecite
Citation-marker [1] [2] injector + parser for RAG outputs. Round-trips between sources and rendered text.
-
abbreviation_extractor
extracting abbreviations from text
-
padzapp
An ergonomic, context-aware scratch pad library with plain text storage
-
panache-formatter
Core formatting engine for Pandoc markdown, Quarto, and RMarkdown
-
aki-stats
output the statistics of text, like the Linux wc command
-
a2a-agents-common
Common utilities for building A2A Protocol agents
-
verso-reader
A terminal EPUB reader with vim navigation, a Kindle-style library, and Markdown highlight export
-
ascii_table_rs
Elegant ASCII table renderer for Rust CLI and terminal apps
-
skribo
low-level text layout
-
ansi-escape-sequences
High-performance Rust library for detecting, matching, and processing ANSI escape sequences in terminal text with zero-allocation static regex patterns
-
rucora-providers
LLM provider implementations for rucora (OpenAI, Anthropic, Gemini, Ollama, etc.)
-
mdbook-svgdx
mdbook preprocessor to convert svgdx fenced code blocks into inline SVG images
-
emoji-search
Fast fuzzy emoji searcher and picker for the terminal
-
rustdoc-markdown
convert Rust documentation to Markdown, for use with LLMs
-
htmlsnob_rules
HTML validator, formatter and autofixer
-
mdbook-typstpdf
An mdBook backend that generates PDF output using Typst
-
libopenlipc-sys
Wrapper around liblipc to interact with Kindle dbus-based LIPC events
-
diff_core
Semantic PDF comparison engine for matching document blocks and reporting meaningful changes
-
cvxtract
LLM-powered structured extraction from CVs/resumes — PDF, DOCX, HTML, TXT input; typed Rust structs output
-
lipgloss-tree
A tree component for terminal user interfaces, styled with Lip Gloss
-
mdx-cli
A fast, beautiful terminal markdown viewer with gradient headings, syntax highlighting, and 8 themes
-
blinc_noto_symbols
Drop-in Noto Sans text/symbol fallback for blinc_text. Bundles a subset of Noto Sans Regular that covers math operators, arrows, currency symbols, Latin-1 supplement punctuation, and…
-
ratex-lexer
LaTeX lexer for RaTeX
-
mdloc
command-line tool for processing image links in Markdown files. Download remote images and convert them to Base64 embedded format or local file references.
-
llm-wiki-lib
LLM-powered Wiki SDK — Ingest, Query, and Lint pipelines
-
wordshk_tools
A combination of parsers and other tools for words.hk (粵典)
-
crate2bib
Create BibLaTeX entries for crates hosted on crates.io
-
litsea-cli
Litsea is an extremely compact word segmentation and model training tool implemented in Rust
-
maybe-regex
Wrapper for strings that may be either a regex or a plain-text string
-
rust-tfidf
calculate TF-IDF (Term Frequency - Inverse Document Frequency) for generic documents
-
divvunspell-bin
Spellchecker for ZHFST/BHFST spellers, with case handling and tokenization support
-
the-other-tui-markdown
Convert Markdown to ratatui Text with configurable theming and per-element rendering
-
typship
A cli for typst packages
-
opentalk-types-signaling-meeting-report
Signaling types for the OpenTalk meeting-report module
-
rushdown-diagram
Diagram visualization extension for rushdown markdown parser
-
ogrep
searching in indentation-structured texts
-
git2prompt
command-line tool that takes a GitHub repository URL, downloads its contents, and generates a single text file optimized for use as input to AI tools
-
three-dcf-core
Document-to-dataset encoding library for LLM training data preparation. Converts PDFs, Markdown, HTML into structured formats optimized for machine learning.
-
aki-resort
sort lines of text. You can use regex to specify the KEY.
-
aki-txpr-macro
the more easy to use libaki-*
-
sakurs-cli
Command-line interface for Sakurs sentence boundary detection
-
turndown
An opinionated Rust port of Turndown.js
-
shiplog-cluster-llm
LLM-assisted workstream clustering with OpenAI-compatible backends and repo fallback
-
lera_regexop
preliminary function that turns a regex into a comparable FTS search query
-
mdansi
A blazing-fast Markdown-to-ANSI terminal renderer with built-in syntax highlighting
-
constr
Constant string generics
-
zero-width-strip
Strip zero-width and bidi-control Unicode characters from text. Defends against invisible-payload prompt injection. Zero deps.
-
encoding-next-index-simpchinese
Index tables for simplified Chinese character encodings
-
elizaos-plugin-knowledge
Knowledge plugin for elizaOS, providing Retrieval Augmented Generation (RAG) capabilities
-
array_tool
Helper methods for processing collections
-
holy-carpet
customizable blog creator
-
use-ascii
ASCII detection and classification helpers for RustUse
-
typos-vars
Source Code Spelling Correction
-
jpreprocess-window
Japanese text preprocessor for Text-to-Speech application (OpenJTalk rewrite in rust language)
-
lindera-ipadic-neologd-builder
A Japanese morphological dictionary builder for IPADIC NEologd
-
agentic-veritas-mcp
MCP server for AgenticVeritas
-
grep-searcher
Fast line oriented regex searching as a library
-
mecrab
A high-performance, thread-safe morphological analyzer compatible with MeCab, written in pure Rust
-
low-expectations
GX-inspired validation engine for content validation
-
lingua-italian-language-model
The Italian language model for Lingua, an accurate natural language detection library
-
mdbook-rustdoc-links
Link to Rust API docs by name in mdBook
-
rs-emoji-list
Prints the known emoji info
-
clip-sanitize
Meta-library for robust text sanitization, repair, and normalization
-
asimov-ollama-module
ASIMOV Ollama module
-
tabwriter
Elastic tabstops
-
caribon
A repetition detector program and library
-
mq-conv
A CLI tool for converting various file formats to Markdown
-
use-text-line
Composable line-level text primitives for RustUse
-
chinese-telegraph
unicode to chinese telegraph code conversion
-
mdbook-tikz
mdBook preprocessor that renders TikZ and tikzcd diagrams to inline SVG
-
content-ingest
Content ingestion, validation, and normalization pipeline for text and binary data
-
cabocha
Safe Rust wrapper for cabocha, a Japanese-language dependency structure analyzer library
-
datalab-cli
A powerful CLI for converting, extracting, and processing documents using the Datalab API
-
is_printable
Determine whether a given text-based value is printable
-
orly
Download O'Reilly books as EPUB
-
pad
padding strings at runtime
-
neco-decor
Platform-agnostic decoration data model for text editor highlight, marker, and widget ranges
-
rustmax-doctest
Doctest runner for rustmax crate examples
-
ogam
A markup language for story writers
-
ssg_whiz
Static site generation orchestration utilities
-
uwl
A management stream for bytes and characters
-
md2bb
CLI Tool to convert markdown to old school bbcode
-
annotationreport
Extract PDF annotations to a summary file
-
opentalk-roomserver-types-legal-vote
OpenTalk RoomServer Types Legal Vote
-
cairn-ingest
Artifact loading and normalization for Cairn ingestion
-
skills-ref-rs
agentskills library for validating, parsing, and managing Agent Skills
-
polished_scancodes
handling and mapping keyboard scancodes in Rust
-
font-map
Macros and utilities for parsing font files
-
grep-matcher
A trait for regular expressions, with a focus on line oriented search
-
rustybara-icc
ICC color profile storage and transform engine for the rustybara workspace
-
bom-strip
Strip UTF-8/16/32 BOM bytes and stray U+FEFF code points from text before parsing or hashing. Zero deps.
-
kawat-output
Output format converters for kawat (TXT, MD, JSON, XML, CSV, TEI)
-
anno-graph
Graph/KG export adapters for anno: converts extraction output to lattix::KnowledgeGraph and N-Triples
-
eml2md
Convert EML files to Markdown
-
regex-pii-rs
Regex-only PII detector for emails, phones, SSNs, credit cards, and prefixed API keys. Rust port of pii-sentry. Zero deps.
-
mdxport
Markdown to PDF via Typst — comrak AST, in-process compilation, LaTeX math support
-
pcre2
High level wrapper library for PCRE2
-
regex-automata
Automata construction and matching using regular expressions
-
defuddle-rs
extracting main content and metadata from HTML web pages
-
rheo-epub
A typesetting and static site engine based on Typst
-
cindex
CSV indexing library
-
mdast_util_to_markdown
Markdown to AST
-
real_time_note_taker
A terminal UI tool to take time stamped notes in real time
-
csv_to_table
pretty print CSV as a table
-
typub-assets-ast
AST-level asset processing for typub
-
fireplace-deluxe
A cozy fireplace in your terminal
-
wpm-rust
A terminal typing tester
-
diff_report
Stable JSON, Markdown, HTML, and AI-review reports for semantic PDF comparison
-
aozora2text
Convert Aozora Bunko format to plain text
-
trace-redact
Redact sensitive fields (api keys, tokens, emails, phone numbers) from agent traces before exporting to OTel or a log sink. Zero deps.
-
fmtt
A diff-friendly text formatter that breaks lines on sensible punctuations and words to fit a line width
-
hoogle-syntax
Haskell syntax highlighting and tokenization for hoogle-tui
-
mdbook-ts
An mdBook preprocessor that uses tree-sitter to extract code snippets from source files
-
lo_draw
Vector drawing page builder with ODG export
-
vesti
A preprocessor that compiles into LaTeX
-
tectonic_io_base
Basic types for Tectonic's pluggable I/O backend system
-
kiri-kotoba
Input text processing for Kiri Japanese morphological analyzer
-
hayro-write
rewriting pages of a PDF file
-
html-auto-p
function like wpautop in WordPress. It uses a group of regex replacements to identify text formatted with newlines and replace double line breaks with HTML paragraph tags.
-
mdwright-mathrender
Math-renderer compatibility profiles and math-body checking for mdwright
-
servo-canvas
A component of the servo web-engine
-
count-md
configurable command-line tool and Rust library for Unicode-aware, Markdown-aware, HTML-aware word counting in Markdown documents
-
glifnames
Mapping of characters to glyph names according to the Adobe Glyph List Specification
-
rdocx-cli
CLI tool for inspecting, converting, and manipulating DOCX files
-
nile-library
supporting nile
-
aki-unbody
output first or last n lines, like the Linux head and tail commands
-
text_utils_s
edit array. Example delete duplicate in array. Clear string
-
stringzz
strings and opcodes extraction from various file formats
-
arinamcnulty-markdown-parser
Markdown parser - university project
-
ghost-lib
Ghost Librarian — ultra-lightweight local-LLM RAG engine with Context Distillation
-
mdbook-html
mdBook HTML renderer
-
sixbit
Small packed strings
-
regex-lite
A lightweight regex engine that optimizes for binary size and compilation time
-
fmtm_ytmimi_markdown_fmt
Fork of @ytmimi's Markdown formatter; powers FMTM
-
slay-the-saves
Save file parser for Slay the Spire 2
-
term-gpt
A fast, colorful ChatGPT CLI for your terminal!
-
token-dict
basic dictionary based tokenization
-
rhema_accel_chirho
FPGA acceleration: packed hierarchical bit-domain engine (Chi-Rho patent)
-
text-scatters
A cut-up technique generator from text and ebook files in the terminal
-
rdx-parser
Parser for RDX (Reactive Document eXpressions) documents
-
shell-color
shell-color provides a portable, reliable way of determining color support for applications spawned by the shell
-
summera
TUI for webpage summarisation
-
text-document-search
Find and replace use cases for text-document
-
langram
Natural language detection library
-
llmtext
Turns any website into a single LLM-ready markdown file
-
legalis-au
Australia jurisdiction support for Legalis-RS (Commonwealth Constitution, ACL, Fair Work, Mabo)
-
pretext-render
Shaping-backed text rasterization helpers for Pretext layouts
-
libreoffice-pure
Pure-Rust LibreOffice-compatible document generation CLI
-
basen
Convert binary data to ASCII with a variety of supported bases
-
laurus-server
gRPC server for the Laurus search engine
-
semtree-embed
Embedding trait and backends (fastembed, openai)
-
lingua-portuguese-language-model
The Portuguese language model for Lingua, an accurate natural language detection library
-
md2ast
Markdown → JSON AST for CleverScript / Relay hosts (WASM, JNI, Rust)
-
chinese_detection
Classify a string as either English, Chinese, or Pinyin
-
miette-arborium
Arborium-powered syntax highlighter for miette diagnostics
-
typf-os-mac
Single-pass text rendering using macOS CoreText
-
tokenx-rs
Fast token count estimation for LLMs at 96% accuracy without a full tokenizer
-
nash-parse
Parser for the nash programming language
-
scrybe-render
Scrybe render — Markdown-to-HTML pipeline, syntect, KaTeX/Mermaid
-
eytan-minigrep
minigrep from "the book"
-
haqumei-cli
Command-line interface for the Haqumei G2P (Grapheme-to-Phoneme) engine
-
alemat
type-safe building of MathML
-
codespec
A specification standard and CLI for AI-driven software projects
-
lumis-cli
Syntax Highlighter CLI powered by Tree-sitter and Neovim themes
-
satteri-pulldown-cmark
A fork of the pulldown-cmark crate with MDX extensions, used in the satteri project
-
harfbuzz-sys
Rust bindings to the HarfBuzz text shaping engine
-
lindera-cc-cedict
A Chinese morphological dictionary for CC-CEDICT
-
terminal_tools
Power-Terminal TUI - fuzzy file and text finder, process manager, git browser, and more
-
asimov-x-module
ASIMOV module
-
mq-task
A task runner using Markdown
-
xhtmlchardet
Character set detection for XML and HTML
-
use-utf8
UTF-8 validation and truncation helpers for RustUse
-
popsam-py
Python extension crate for AI-assisted selection of semantically representative texts
-
wdl-lint
Lint rules for Workflow Description Language (WDL) documents
-
tuillem-tui
Ratatui TUI layer for tuillem
-
cli-boxes
Unicode box drawing characters for creating beautiful CLI interfaces
-
ucd-parse
parsing data files in the Unicode character database
-
nano_banana_pro_prompt
High-quality integration for https://supermaker.ai/blog/nano-banana-pro-prompt-use-cases-ready-to-copy-paste/
-
ps-hash
Generates 64-byte ascii hashes with 256 bits of security
-
floating-ui-utils
Rust port of Floating UI. Utilities for Floating UI.
-
llm-tui
A Terminal User Interface (TUI) for interacting with Large Language Models (LLM) using llm-cli
-
notmecab
tokenizing text with mecab dictionaries. Not a mecab wrapper.
-
folderwalk
Folder walking tool
-
string-width
Accurate Unicode string width calculation for terminal applications, handling emoji, East Asian characters, combining marks, and ANSI escape sequences
-
bookforge-epub
EPUB reading, validation, and deterministic rebuild support for BookForge
-
poriborton
Interconversion between Unicode and various Bengali ANSI encodings
-
rushdown-meta
Meta(YAML frontmatter) extension for rushdown markdown parser
-
emoji-sanitize
Normalize or strip emoji-related Unicode (presentation selectors, variation selectors, zero-width joiners) from text before LLM input. Zero deps.
-
typf-render-color
Color glyph renderer for Typf (COLR v0/v1, SVG, sbix/CBDT bitmap)
-
pinot
Fast, high-fidelity OpenType parser
-
calculator-tui
A command-line calculator with symbolic math support
-
lipgloss-list
A list component for terminal user interfaces, styled with Lip Gloss
-
legalis-vn
Vietnam jurisdiction support for Legalis-RS - Vietnamese legal system with socialist market economy, Labor Code, Enterprise, Investment
-
stringmatch
Allow the use of regular expressions or strings wherever you need string comparison
-
relux-parser
Internal: parser for Relux. No semver guarantees.
-
svgbob_cli
Transform your ascii diagrams into happy little SVG
-
quillmark-typst
Typst backend for Quillmark
-
textframe
query plain text documents by unicode offset without loading them all into memory
-
matrixcode-tui
MatrixCode TUI - Terminal UI library for AI Code Agent
-
uiua-doc-gen
Documentation generator for Uiua libraries
-
rakugaki
rendering TTF/OTF font characters as ASCII art in the terminal
-
ascii2ext-ascii-rs
convert ascii text to extended ascii and back
-
ufofmt
A fast, flexible UFO source file formatter based on the Norad library
-
to_fraktur
Function that converts any string to fraktur font
-
papyrus-core
PDF-to-Markdown conversion engine with smart heading detection, bold/italic text extraction, and CommonMark output. Pure Rust, best-effort parsing for corrupted PDFs.
-
r2md
Entire codebase to single markdown or pdf file
-
regex-cli
A command line tool for debugging, ad hoc benchmarking and generating regular expressions
-
hmd-format
Low-diff formatter for Human Markdown documents
-
goofy-animals
Generate a name in adjective-adjective-animal form
-
open-redact-pdf-redact
Redaction planning and content rewriting for open-redact-pdf
-
niho
A command-line tool for converting romanized Japanese text to Japanese characters
-
crate2bib-cli
A CLI tool for the crate2bib crate
-
encoding_c_mem
C API for encoding_rs::mem
-
vn-nlp
Vietnamese NLP library — tokenization, normalization, segmentation
-
mq-formatter
Code formatter for mq query language
-
ratex-ffi
C ABI FFI exports for RaTeX
-
charname
Incredibly simple library that just gives you the Unicode name for a character
-
dioxus-typst
Typst component for Dioxus
-
oxidoc-text
Shared tokenization pipeline for oxidoc — used by both build-time and query-time search
-
rucora-skills
Skills system for rucora (YAML command templates)
-
html-compare
compare html files
-
wicket-cli
Wikipedia corpus knowledge extractor
-
llm-text
processing text for LLM consumption
-
contextgrep
Grep your documents with context — fast offline search for PDFs, DOCX, Markdown and code
-
ripgrep
line-oriented search tool that recursively searches the current directory for a regex pattern while respecting gitignore rules. ripgrep has first class support on Windows, macOS and Linux.
-
kelp
A convert tool for Japanese
-
tuillem-markdown
Terminal markdown rendering for tuillem
-
substring-replace
developer-friendly methods to manipulate strings with character indices
-
termio
styling terminal output with CSS-like syntax
-
ferogram-parsers
Telegram HTML and Markdown entity parsers for ferogram
-
opentalk-types-signaling-transcription
Signaling types for the OpenTalk transcription module
-
fontcull-klippa
Subsetting a font file according to provided input. (Vendored fork for fontcull)
-
change-case-rs
Convert strings between camelCase, snake_case, PascalCase, kebab-case, and more
-
jpreprocess-naist-jdic
Japanese text preprocessor for Text-to-Speech application (OpenJTalk rewrite in rust language)
-
mecab-ko-dict-builder
Korean morpheme dictionary builder: generates binary dictionaries from CSV
-
doccy
brace based markup language
-
uwurs
UwUify your strings with uwurs!
-
loggrep
A smarter log parser with color-coded severity, time filtering, regex matching, and stats
-
indent_tokenizer
Generate tokens based on indentation
-
termwrap
Wrap Unicode text with ANSI color codes
-
ps-str
String transcoding library
-
lindera-nodejs
A Node.js binding for Lindera
-
use-wildcard
wildcard matching helpers for RustUse
-
xmldecl
Extracts an encoding from an ASCII-based bogo-XML declaration in text/html in a Web-compatible way
-
typub-engine
Build engine for typub (pipeline, rendering, assets, project)
-
glowpub
A glowfic to epub converter
-
rhema_testkit_chirho
Shared test fixtures, generators, golden harnesses, differential runners
-
use-regex
Practical regex utility helpers for RustUse
-
table_to_html
interface to convert a tabled::Table into an HTML table (<table>)
-
stream-chunkrec
Recombine LLM streaming token deltas into stable text. Buffers partial words, handles UTF-8 fragments across chunks. Zero deps.
-
mdwright-document
Recognised Markdown document facts with stable source coordinates
-
prompt-fence-strip
Strip code fences, leading prose, and trailing chatter from LLM output so the structured payload survives. Zero deps.
-
krilla-svg
Converting SVG files to PDF
-
tiny-grep
grep-like text search utility written in Rust
-
advent-ocr
Converts ASCII-art representations of letters generated by Advent of Code puzzles into a String containing those letters
-
string-replace-all
String replacement utility inspired by JavaScript, allowing pattern-based substitutions with support for both exact matches and regex patterns
-
flabild
A fast Markov chain-based fake word generator that produces pronounceable pseudo-words
-
lingua-dutch-language-model
The Dutch language model for Lingua, an accurate natural language detection library
-
ipa-translate
translating between IPA and ASCII text
-
random-zh
generating random Chinese characters
-
vds
Visibly distinguishable string types for identifiers and codes
-
semtree-store
Vector store trait and backends (usearch)
-
rdocx-oxml
WordprocessingML XML element types for OOXML
-
write16
A UTF-16 analog of the Write trait
-
mitex-lexer
Lexer for MiTeX
-
katha-parsers
Parser adapters for EPUB, DOCX, and PDF document ingestion
-
secret-mask
Mask known secret patterns (API keys, JWTs, AWS access keys, GitHub tokens) in log lines before they reach stdout/files/sinks. Zero deps.
-
marisa-rs
Safe Rust wrapper for the marisa-trie C++ library
-
sesdiff
Generates a shortest edit script (Myers' diff algorithm) to indicate how to get from the strings in column A to the strings in column B. Also provides the edit distance (levenshtein).
-
thediff
Difference between 2 files in percentages
-
brainwires-rag
Codebase indexing + hybrid retrieval (vector + BM25) for the Brainwires Agent Framework. Includes AST-aware chunking via tree-sitter (12 languages), Git history search, and reranking…
-
cargo-cargofmt
Cargo file formatter
-
lexmatch
lexicon matching tool that, given a lexicon of words or phrases, identifies all matches in a given target text. Uses suffix arrays.
-
asimov-dataset-cli
ASIMOV Dataset Command-Line Interface (CLI)
-
magic_string
magic string
-
rheo-pdf
A typesetting and static site engine based on Typst
-
probe-code
AI-friendly, fully local, semantic code search tool for large codebases
-
orgflow-tui
A terminal user interface for orgflow - manage notes and tasks with a smooth workflow
-
words-count
Count the words and characters, with or without whitespaces
-
homoglyph-detect
Detect Cyrillic/Greek lookalike chars masquerading as ASCII. For prompt-injection and phishing defense. Zero deps.
-
rheo-html
A typesetting and static site engine based on Typst
-
mdbook-latex
An mdbook backend for generating LaTeX and PDF documents
-
esl01-drawdag
Parse an ASCII DAG into parent relations
-
dala
Dalia is a light weight formula language
-
mq-crawler
Directory crawler for batch Markdown file processing
-
yeslogic-unicode-blocks
Functions to access and search Unicode blocks
-
typos
Source Code Spelling Correction
-
mitex-spec
Specification Library for MiTeX
-
serde_ssml
A robust Rust library for parsing, manipulating, and generating Speech Synthesis Markup Language (SSML) documents
-
aki-json-pick
The json pick out command
-
pray
A tui tool for preparing a prompt to the llms
-
kiri-native
Native Rust accelerator for Kiri Japanese morphological analyzer
-
table-reformatter
reformat tables in markdown files
-
dcsv
Dynamic CSV reader, writer, editor
-
laurus-mcp
MCP (Model Context Protocol) server for the Laurus search engine
-
golia-pinyin
Self-developed Mandarin Pinyin input method engine — segmenter, fuzzy syllables, FST dict, WASM-ready
-
kodegen_native_notify
KODEGEN.ᴀɪ: Memory-efficient, Blazing-Fast, MCP tools for code generation agents
-
asimov-readwise-module
ASIMOV module
-
varcon
Source Code Spelling Correction
-
batuta-common
Shared utilities for the Batuta stack: formatting, system info, display helpers
-
ucd-generate
A program for generating packed representations of the Unicode character database that can be efficiently searched
-
papyrus-cli
Command-line tool for PDF-to-Markdown conversion with smart heading detection, bold/italic extraction, and CommonMark output. Pure Rust, pipe-friendly.
-
ascii_help
help you quickly convert ASCII codes
-
shoco
port to Rust, a fast compressor for short strings
-
tuillem-db
SQLite storage layer for tuillem
-
kras
Detect, highlight and pretty print almost any structured data inside plain text
-
typf-os
Platform-native linra text rendering dispatcher
-
ratex-render
Raster and image rendering for RaTeX math typesetting
-
oxyl-parser
Parser and AST types for oxyl
-
lingua-irish-language-model
The Irish language model for Lingua, an accurate natural language detection library
-
neco-syntax-textmate
TextMate-style syntax loading and tokenization on top of syntect
-
graphrag-cli
Modern Terminal User Interface (TUI) for GraphRAG operations
-
flux-tui
Fast and lightweight Terminal UI drawing library
-
typf-shape-icu-hb
ICU + HarfBuzz shaping backend for Typf
-
venus-sync
Sync engine for Venus - converts .rs notebooks to .ipynb
-
typub-adapter-wordpress
WordPress platform adapter for typub
-
unitoken
Fast BPE tokenizer/trainer with a Rust core and Python bindings
-
ratex-svg
SVG export for RaTeX DisplayList (vector output, optional KaTeX webfonts)
-
pdfrust
PDF parser
-
mq-lsp
Language Server Protocol implementation for mq query language
-
kizame
(刻め!) - CLI for MeCrab morphological analyzer and data pipeline
-
scivex-nlp
Scivex — Tokenization, embeddings, and text processing
-
wikiext-cli
Wikiext is a tool for extracting and processing Wikipedia data, implemented in Rust
-
opengrep
Advanced AST-aware code search tool with tree-sitter parsing and AI integration capabilities
-
strloin
copy on write slices of a string
-
fast-str
A flexible, easy-to-use, immutable, efficient String replacement for Rust
-
pdf-docx
PDF to DOCX conversion with text, tables, and images
-
rushdown-highlighting
Syntax-highlighting extension for rushdown markdown parser
-
rsticle-cli
Command line tool to convert source files into narratives/articles
-
rushdown-definition-list
Definition list extension for rushdown markdown parser
-
rins_markdown_parser
markdown parser written in Rust
-
mdwright-latex
TeX math-body parsing, Unicode layout, and source translation for mdwright
-
lindera-sqlite
Lindera tokenizer for SQLite FTS5 extension
-
sayit
String replacements using regex
-
fast-unescape
'unescapes' an escaped string with escape sequences into a literal one
-
markdown-to-html
Markdown parser that runs at hyper speeds!
-
distri-formatter
Shared event formatting logic for Distri (plain text, HTML, etc.)
-
soft-ascii-string
char/str/string wrappers which add an "is-ascii" soft constraint
-
mandate
Convert Markdown or YAML manuals into roff manpages
-
neco-editor-search
Text search engine for editor buffers
-
m2h
Convert Markdown to HTML with syntax highlighting
-
lex-babel
Format conversion library for the lex format
-
rhema_ingest_chirho
SWORD/OSIS/TEI/IMP importers and normalization into canonical corpus
-
typub-adapter-astro
Astro Content Collection adapter for typub - outputs Markdown with YAML frontmatter
-
mitex-parser
Parser for MiTeX
-
ascii_converter
converting between different ASCII representations
-
toklab-core
Pure-Rust core for toklab: bulk tokenizer + counter for OpenAI BPE encodings
-
rucora-embed
Embedding providers for rucora
-
ascii_tree
generates ASCII trees
-
portmanteau
create portmanteaux
-
dekor
styling and character repository in Rust
-
lingua-chinese-language-model
The Chinese language model for Lingua, an accurate natural language detection library