1 unstable release: 0.1.0 (new), May 16, 2026
zero-width-strip
Strip zero-width and bidi-control Unicode characters from text.
Zero-width characters (U+200B–U+200F, U+2060, U+FEFF, etc.) and bidi embedding and override controls (U+202A–U+202E) are invisible in most renderers but are preserved by most tokenizers. That makes them a convenient covert channel for prompt-injection attacks: "invisible instructions" hidden inside otherwise plain-looking text.
This crate strips them.
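To see why these characters are easy to miss, it can help to dump a string's code points. The snippet below is a minimal sketch using only the standard library; `code_points` is a hypothetical helper for illustration and is not part of this crate.

```rust
/// Hypothetical debugging helper (not part of this crate): formats every
/// code point of `s` as `U+XXXX`, exposing characters a renderer hides.
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}
```

For instance, `"h\u{200B}i"` usually renders exactly like `"hi"`, yet it contains three `char`s and compares unequal to `"hi"`; `code_points` makes the extra U+200B visible.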
Example
use zero_width_strip::{strip, has_invisible};
// U+200B (zero-width space) and U+202E (right-to-left override) render as nothing.
let dirty = "hello\u{200B}\u{202E}world";
assert!(has_invisible(dirty));
assert_eq!(strip(dirty), "helloworld");
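Covers U+200B–U+200F (zero-width spaces, joiners, and directional marks), U+202A–U+202E (bidi embeddings and overrides), U+2060–U+2064 (word joiner and invisible operators), U+2066–U+2069 (bidi isolates), U+180E (Mongolian vowel separator), and U+FEFF (zero-width no-break space, also used as the byte order mark). Zero dependencies. Licensed under MIT or Apache-2.0.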
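The covered ranges can be expressed as a single `char` predicate. The following is a rough sketch of that idea using only the standard library, not the crate's actual source; the names `is_invisible`, `strip_sketch`, and `has_invisible_sketch` are made up for illustration, and the crate's real signatures and return types may differ.

```rust
/// Hypothetical predicate: true if `c` falls in one of the ranges listed above.
fn is_invisible(c: char) -> bool {
    matches!(
        c,
        '\u{200B}'..='\u{200F}'       // zero-width spaces, joiners, LRM/RLM
            | '\u{202A}'..='\u{202E}' // bidi embeddings and overrides
            | '\u{2060}'..='\u{2064}' // word joiner, invisible operators
            | '\u{2066}'..='\u{2069}' // bidi isolates
            | '\u{180E}'              // Mongolian vowel separator
            | '\u{FEFF}'              // zero-width no-break space / BOM
    )
}

/// Sketch of stripping: keep every character the predicate does not match.
fn strip_sketch(input: &str) -> String {
    input.chars().filter(|c| !is_invisible(*c)).collect()
}

/// Sketch of detection: true if any character matches the predicate.
fn has_invisible_sketch(input: &str) -> bool {
    input.chars().any(is_invisible)
}
```

Under these assumptions, `strip_sketch("hello\u{200B}\u{202E}world")` yields `"helloworld"`. One trade-off of removing the whole U+200B–U+200F range is that U+200D (zero-width joiner) also holds emoji sequences together and U+200C affects shaping in some scripts, so stripping can alter how legitimate text renders.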