Production-grade parser for HQL (Hive Query Language), a modern, verb-first query language designed for high-concurrency, low-latency applications.
| Workload | Throughput | Latency |
|---|---|---|
| PING (micro-query) | 10.8M ops/sec | 93 ns |
| GET (micro-query) | 5.1M ops/sec | 197 ns |
| Simple FIND | 1.1M ops/sec | 904 ns |
| INSERT (10 fields) | 500K ops/sec | 1.9 us |
| TXN (5 statements) | 200K ops/sec | 5.0 us |
| Complex FIND (3 JOINs) | 170K ops/sec | 5.9 us |
Measured with Criterion on a single core. No heap allocations in the lexer hot path.
Add to your Cargo.toml:
[dependencies]
hql-parser = "0.9"Parse a query:
use hql_parser::ir::IrStmt;
let ir = hql_parser::parse(b"FIND users WHERE active == true SELECT name, email").unwrap();
for stmt in &ir.stmts {
match stmt {
IrStmt::Find(find) => println!("querying: {}", find.target),
_ => {}
}
}Every call to parse() runs five stages:
source bytes --> Lexer --> Micro-router --> Parser --> Validator --> IR
-
Lexer — Byte-level tokeniser. Zero-copy, binary-search keyword lookup over a 68-entry sorted table. Each token is 12 bytes.
-
Micro-router — Pattern-matches
PING,GET, andDELqueries directly from the token stream, bypassing the full parser entirely. -
Parser — Recursive-descent with Pratt expression parsing. Single-token lookahead, no backtracking. Handles all 11 statement types.
-
Validator — Post-parse semantic checks: JOIN count, TXN size, null ordering rejection, path depth limits.
-
IR — Converts the AST into a flat, span-free intermediate representation with owned strings, ready for query planning.
| Statement | Description |
|---|---|
FIND |
Query with JOIN, WHERE, SELECT DISTINCT, GROUP BY, HAVING, ORDER BY, LIMIT, OFFSET |
INSERT |
Single-row VALUES or multi-row BATCH [(...), (...)] |
UPDATE |
SET with optional WHERE and RETURNING |
DELETE |
Optional WHERE and RETURNING |
UPSERT |
KEY-based insert-or-update with ON CONFLICT SET |
TXN |
Atomic transaction with IF/ELSE and REQUIRE guards |
REQUIRE |
Guard clause — aborts with error code on failure |
IF/ELSE |
Conditional execution inside TXN blocks |
DEFINE |
Schema definition with typed fields, size limits, and constraints |
ALTER |
ADD/DROP/RENAME FIELD, ADD/DROP INDEX |
EXPLAIN |
Query plan inspection |
- Arithmetic:
+-*/% - Comparison:
==!=>>=<<= - Boolean:
ANDORNOT - Membership:
IN [...]andNOT IN [...] - Range:
BETWEEN low, high - String matching:
STARTS_WITHCONTAINSMATCHES - Aliasing:
SELECT expr AS alias - Functions:
now(),count(*),sum(),avg(),min(),max(),coalesce(),is_null(),verify_hash(),affected_rows(),exists()
Every error carries a machine-readable code, a human-readable message, and a byte-level span pointing to the exact source location:
match hql_parser::parse_str("FIND $invalid") {
Ok(_) => {}
Err(e) => {
println!("[{}] {}", e.code, e.message);
println!(" at bytes {}..{}", e.span.start, e.span.end);
if !e.expected.is_empty() {
println!(" expected: {}", e.expected.join(", "));
}
}
}All limits are enforced at the earliest possible parsing stage to prevent resource exhaustion from untrusted input:
| Limit | Value | Stage |
|---|---|---|
| Max tokens per query | 4,096 | Lexer |
| Max identifier length | 128 bytes | Lexer |
| Max string literal | 64 KB | Lexer |
| Max nesting depth | 32 | Parser |
| Max TXN statements | 64 | Parser |
| Max JOIN count | 16 | Validator |
| Max path depth | 8 segments | Validator |
The crate enforces #![deny(unsafe_code)] — zero unsafe blocks. The lexer
has been brute-force tested against all 65,536 two-byte input combinations
with zero panics.
cargo run --example basic_parse
cargo run --example error_handling
cargo run --example micro_queries
cargo run --example transactioncargo benchHTML reports are generated in target/criterion/.
Licensed under Apache 2.0. Copyright 2025 HiveQL.