A modern, high-performance regular expression library for Zig
Features • Installation • Quick Start • Documentation • Performance
zig-regex is a comprehensive regular expression engine for Zig, featuring Thompson NFA construction with matching time linear in the input length, extensive pattern support, and advanced optimization capabilities. It has zero external dependencies and gives full memory control through Zig allocators.
| Feature | Syntax | Description |
|---|---|---|
| Literals | `abc`, `123` | Match exact characters and strings |
| Quantifiers | `*`, `+`, `?`, `{n}`, `{m,n}` | Greedy repetition |
| Lazy Quantifiers | `*?`, `+?`, `??`, `{n,m}?` | Non-greedy repetition |
| Possessive Quantifiers | `*+`, `++`, `?+`, `{n,m}+` | Atomic repetition (no backtracking) |
| Alternation | `a\|b\|c` | Match any alternative |
| Character Classes | `\d`, `\w`, `\s`, `\D`, `\W`, `\S` | Predefined character sets |
| Custom Classes | `[abc]`, `[a-z]`, `[^0-9]` | User-defined character sets |
| Unicode Classes | `\p{Letter}`, `\p{Number}`, `\X` | Unicode property support |
| Anchors | `^`, `$`, `\A`, `\z`, `\Z`, `\b`, `\B` | Position matching |
| Wildcards | `.` | Match any character |
| Groups | `(...)` | Capturing groups |
| Named Groups | `(?P<name>...)`, `(?<name>...)` | Named capturing groups |
| Non-capturing | `(?:...)` | Grouping without capture |
| Atomic Groups | `(?>...)` | Possessive grouping |
| Lookahead | `(?=...)`, `(?!...)` | Positive/negative lookahead |
| Lookbehind | `(?<=...)`, `(?<!...)` | Positive/negative lookbehind |
| Backreferences | `\1`, `\2`, `\k<name>` | Reference previous captures |
| Conditionals | `(?(condition)yes\|no)` | Conditional patterns |
| Escaping | `\\`, `\.`, `\n`, `\t`, etc. | Special character escaping |
- Hybrid Execution Engine: Automatically selects between Thompson NFA (O(n×m)) and optimized backtracking
- AST Optimization: Constant folding, dead code elimination, quantifier simplification
- NFA Optimization: Epsilon transition removal, state merging, transition optimization
- Pattern Macros: Composable, reusable pattern definitions
- Type-Safe Builder API: Fluent interface for programmatic pattern construction
- Thread Safety: Safe concurrent matching with proper synchronization
- C FFI: Complete C API for interoperability
- WASM Support: WebAssembly compilation target
- Profiling & Analysis: Built-in performance profiling and pattern linting
- Comprehensive API: `compile`, `find`, `findAll`, `replace`, `replaceAll`, `split`, iterator support
- Zero Dependencies: Only Zig standard library
- Linear Time Matching: Thompson NFA simulation guarantees O(n×m) worst-case time for patterns that run on the NFA engine
- Memory Safety: Full control via Zig allocators, no hidden allocations
- Extensive Tests: Comprehensive test suite with 150+ test cases
- Battle-Tested: Compliance tests against standard regex behavior
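
The hybrid engine entry in the list above implies a per-pattern decision. The following is one way such a decision could look, purely as an illustration: `Engine`, `PatternFeatures`, and `chooseEngine` are hypothetical names, and the actual selection rules in zig-regex are not documented in this README. The sketch assumes that constructs a plain state-set simulation cannot express (backreferences being the classic case) force the backtracking path.

```zig
const std = @import("std");

/// The two execution strategies named in the feature list.
const Engine = enum { thompson_nfa, backtracking };

/// Hypothetical summary of what a parsed pattern uses; not a zig-regex type.
const PatternFeatures = struct {
    backreferences: bool = false,
    lookaround: bool = false,
    atomic_or_possessive: bool = false,
    conditionals: bool = false,
};

/// Assumed rule: stay on the NFA path unless the pattern uses a construct
/// that this sketch treats as requiring backtracking.
fn chooseEngine(features: PatternFeatures) Engine {
    const needs_backtracking = features.backreferences or
        features.lookaround or
        features.atomic_or_possessive or
        features.conditionals;
    return if (needs_backtracking) Engine.backtracking else Engine.thompson_nfa;
}

test "engine selection sketch" {
    // A pattern such as \d{3}-\d{4} uses none of the flagged constructs.
    try std.testing.expectEqual(Engine.thompson_nfa, chooseEngine(.{}));
    // A pattern such as (a)\1 needs a backreference.
    try std.testing.expectEqual(Engine.backtracking, chooseEngine(.{ .backreferences = true }));
}
```

Under this assumption, the linear-time bound would cover only patterns that stay on the `thompson_nfa` branch.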
// build.zig.zon
.{
    .name = .your_project, // enum literal, as required since Zig 0.14
    .version = "0.1.0",
    .dependencies = .{
        .regex = .{
            .url = "https://github.com/zig-utils/zig-regex/archive/main.tar.gz",
            .hash = "...", // zig will provide this
        },
    },
}

// build.zig
const regex = b.dependency("regex", .{
    .target = target,
    .optimize = optimize,
});
exe.root_module.addImport("regex", regex.module("regex"));

git clone https://github.com/zig-utils/zig-regex.git
cd zig-regex
zig build

const std = @import("std");
const Regex = @import("regex").Regex;

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Simple matching
    const regex = try Regex.compile(allocator, "\\d{3}-\\d{4}");
    defer regex.deinit();

    if (try regex.find("Call me at 555-1234")) |match| {
        std.debug.print("Found: {s}\n", .{match.slice}); // "555-1234"
    }
}

const regex = try Regex.compile(allocator, "(?P<year>\\d{4})-(?P<month>\\d{2})-(?P<day>\\d{2})");
defer regex.deinit();

if (try regex.find("Date: 2024-03-15")) |match| {
    const year = match.getCapture("year"); // "2024"
    const month = match.getCapture("month"); // "03"
    const day = match.getCapture("day"); // "15"
}

// Match any Unicode letter
const regex = try Regex.compile(allocator, "\\p{Letter}+");

// Match emoji
const emoji_regex = try Regex.compile(allocator, "\\p{Emoji}");

// Match grapheme clusters
const grapheme_regex = try Regex.compile(allocator, "\\X+");

// Prevent catastrophic backtracking
const regex = try Regex.compile(allocator, "(?>a+)b");
const poss_regex = try Regex.compile(allocator, "a++b");

// Neither matches "aaaa": there is no 'b', and once a+ has consumed every 'a'
// the atomic group and the possessive quantifier give nothing back, so the
// attempt fails without retrying shorter runs of 'a'.
try std.testing.expect(try regex.find("aaaa") == null);
try std.testing.expect(try poss_regex.find("aaaa") == null);

// Match different patterns based on a condition
const regex = try Regex.compile(allocator, "(a)?(?(1)b|c)");

try std.testing.expectEqualStrings("ab", (try regex.find("ab")).?.slice);
try std.testing.expectEqualStrings("c", (try regex.find("c")).?.slice);

const Builder = @import("regex").Builder;
var builder = Builder.init(allocator);
defer builder.deinit();

const pattern = try builder
    .startGroup()
    .literal("https?://")
    .oneOrMore(Builder.Patterns.word())
    .literal(".")
    .oneOrMore(Builder.Patterns.alpha())
    .endGroup()
    .build();

const regex = try Regex.compile(allocator, pattern);
defer regex.deinit();

const MacroRegistry = @import("regex").MacroRegistry;
const CommonMacros = @import("regex").CommonMacros;

var macros = MacroRegistry.init(allocator);
defer macros.deinit();

// Load common macros
try CommonMacros.loadInto(&macros);

// Define custom macros
try macros.define("phone", "\\d{3}-\\d{4}");
try macros.define("email", "${email_local}@${email_domain}");

// Expand macros in patterns
const pattern = try macros.expand("Contact: ${email} or ${phone}");
defer allocator.free(pattern);

- API Reference - Complete API documentation
- Advanced Features Guide - Detailed feature explanations
- Architecture - Design and implementation
- Examples - Real-world usage examples
- Performance Guide - Optimization tips
- Limitations - Known constraints and workarounds
zig-regex uses Thompson NFA construction to guarantee O(n×m) worst-case time complexity:
- n = input string length
- m = pattern length
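This avoids the catastrophic (exponential) backtracking that pure backtracking engines can hit on patterns with nested or ambiguous repetition. The bound applies to patterns that run on the NFA engine; patterns that the hybrid engine routes to the backtracking path fall outside it.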
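
To make the bound concrete, here is a minimal, self-contained sketch of lockstep NFA simulation. It is not zig-regex's implementation: the `Edge` type, the hand-built edge table for `(?:a|aa)*c`, and the `matches` function are all assumptions made for illustration. Every input byte is checked against every edge once, which is where the n×m product comes from.

```zig
const std = @import("std");

/// One labeled edge of an epsilon-free NFA (hypothetical type for this sketch).
const Edge = struct { from: u6, byte: u8, to: u6 };

/// Hand-built NFA for `(?:a|aa)*c`; state 0 is the start, state 2 accepts.
const edges = [_]Edge{
    .{ .from = 0, .byte = 'a', .to = 0 }, // the `a` alternative
    .{ .from = 0, .byte = 'a', .to = 1 }, // first half of `aa`
    .{ .from = 1, .byte = 'a', .to = 0 }, // second half of `aa`
    .{ .from = 0, .byte = 'c', .to = 2 }, // trailing `c`
};
const start_state: u6 = 0;
const accept_state: u6 = 2;

/// Returns true if the whole input is accepted (anchored at both ends).
/// The set of active states is a bitmask, so no position is visited twice.
fn matches(input: []const u8) bool {
    var current: u64 = @as(u64, 1) << start_state;
    for (input) |byte| {
        var next: u64 = 0;
        for (edges) |edge| {
            const active = ((current >> edge.from) & 1) == 1;
            if (active and edge.byte == byte) next |= @as(u64, 1) << edge.to;
        }
        if (next == 0) return false; // no state survived this byte
        current = next;
    }
    return ((current >> accept_state) & 1) == 1;
}

test "lockstep simulation" {
    try std.testing.expect(matches("aaac"));
    try std.testing.expect(!matches("aaaa"));
}
```

A backtracking matcher tries the `a` and `aa` alternatives one at a time, so on a long run of `a` with no trailing `c` it can revisit the same positions exponentially often. The state set above processes each input byte exactly once.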
Pattern: /\d{3}-\d{4}/
Input: 1000-byte string
Time: ~850ns (M1 MacBook Pro)
Pattern: /(?:a|b)*c/
Input: 10000 'a's + 'c'
Time: Linear growth (no exponential backtracking)
Run benchmarks: zig build bench
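
For a quick one-off measurement outside the bundled benchmarks, something along these lines may be enough. It is a sketch under assumptions: it relies on the `Regex.compile`, `find`, and `deinit` calls shown in the Quick Start, on the `regex` module being wired up as in the installation steps, and on `std.time.Timer` from the standard library. A single timed call is noisy, so treat the number as a rough indication only.

```zig
const std = @import("std");
const Regex = @import("regex").Regex;

pub fn main() !void {
    // An arena releases everything at once, so this sketch does not depend
    // on how individual match results are freed.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    // 10000 'a's followed by a single 'c', as in the second result above.
    var input: [10_001]u8 = undefined;
    @memset(input[0..10_000], 'a');
    input[10_000] = 'c';

    const regex = try Regex.compile(allocator, "(?:a|b)*c");
    defer regex.deinit();

    // Time a single search over the whole input.
    var timer = try std.time.Timer.start();
    const found = try regex.find(&input);
    const elapsed_ns = timer.read();

    std.debug.print("matched: {}, elapsed: {d} ns\n", .{ found != null, elapsed_ns });
}
```

Repeating the search in a loop and building with `-Doptimize=ReleaseFast` would give steadier figures.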
# Build library
zig build
# Run tests
zig build test
# Run examples
zig build example
# Run benchmarks
zig build bench
# Generate documentation
zig build docs

See TODO.md for the complete development roadmap and planned features.
- Zig 0.15.1 or later
- No external dependencies
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
MIT License - see LICENSE file for details.
Inspired by:
- Ken Thompson's NFA construction algorithm
- RE2 (Google's regex engine)
- Rust's regex crate
- PCRE (Perl Compatible Regular Expressions)