Primary Function: Execute Unicode smuggling attacks including Trojan Source, homoglyph substitution, and invisible character encoding to hide malicious code in plain sight
noseeum is a modular offensive security framework for executing Unicode-based attacks
noseeum encodes its payloads in a fashion similar to that seen in the "GlassWorm" malware of late 2025
noseeum packages a range of obfuscation and encoding techniques into an extensible CLI
Below is a screencap of the VirusTotal analysis of the unencoded PowerShell malware (BEFORE processing with noseeum) as well as its "MITRE ATT&CK Tactics and Techniques" chart
- Note the 8/62 detection rate
- Hash = f6adc7db3ce7e756bcfd995c6bfeae1480e4626ab4c049644754903e2610a104
Below is a screencap of the VirusTotal analysis of the zero-width-character-encoded PowerShell malware (AFTER processing with noseeum) as well as its "MITRE ATT&CK Tactics and Techniques" chart
- Note the 0/62 detection rate
- Hash = b700553732b9c8c2843885dc4f1122d2471beac47d682e67863f81cbb6d9a55f
Noseeum provides a single, clean command-line interface powered by Python's click library
- Modular Architecture: Each attack vector is a self-contained module, allowing for rapid development and integration of new exploits
- Multiple Attack Vectors:
  - Bidi (Trojan Source): Make malicious code appear as harmless comments
  - Homoglyph: Evade signature-based detection and confuse human analysts by substituting characters with visually identical ones
  - Invisible Ink: Hide payloads steganographically within benign text or generate imperceptible prompts to jailbreak LLMs
  - File Steganography: Encode entire files as zero-width character sequences and decode them back
  - Language-Specific Exploits: Target unique weaknesses in Python, JavaScript, and Java
  - Normalization Exploitation: Craft payloads that normalize differently across system components (parser vs. scanner)
  - Unassigned Planes / Variation Selectors: Generate syntactically valid identifiers using characters from unassigned Unicode planes (U+20000–U+2FFFD)
  - Payload-injection via Identifier Characters: Encode malicious data within language constructs like object properties, class names, or function names
- Advanced Language Modules:
  - Go: Exploits Go's configurable lexer and permissive Unicode handling
  - Kotlin: Uses permissive frontend with restrictive backend to create compilation-failing code
  - JavaScript: Performs AST-level manipulations and low-entropy payload generation
  - Swift: Leverages ambiguous identifier handling and unassigned planes support
- Globally Installable: Can be installed as a system-wide command-line tool using pip
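From a reviewer's point of view, the normalization item in the list above comes down to a name that looks the same but changes when a component applies NFKC (Python, for instance, NFKC-normalizes identifiers while parsing). The sketch below is not taken from noseeum; the function name `nfkc_unstable_identifiers` is hypothetical and only the standard `unicodedata` module is used. It flags names that are not stable under NFKC so they can be inspected by hand.

```python
import unicodedata


def nfkc_unstable_identifiers(identifiers):
    """Return (original, normalized) pairs for names that change under NFKC.

    Hypothetical helper for illustration; not part of the noseeum API.
    """
    changed = []
    for name in identifiers:
        normalized = unicodedata.normalize("NFKC", name)
        if normalized != name:
            changed.append((name, normalized))
    return changed


if __name__ == "__main__":
    # "\ufb01le" begins with the ligature U+FB01, which NFKC expands to "fi".
    for original, normalized in nfkc_unstable_identifiers(["file", "\ufb01le"]):
        print(f"{original!r} normalizes to {normalized!r}")
```

A check like this is cheap to run over the identifiers a tokenizer reports, and any hit is worth a second look in review.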
noseeum also includes a scanner that identifies these same Unicode smuggling vulnerabilities in source code
- File Vulnerability Scanning: Scan individual files for Unicode smuggling vulnerabilities
- Multi-Language Support: Detect vulnerabilities across Python, JavaScript, Java, and other languages
- Comprehensive Detection: Identifies various types of Unicode exploits including Bidi, homoglyphs, and invisible characters
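To make the detection idea concrete, here is a minimal standalone sketch of what scanning for Bidi controls and invisible characters can look like. It is an assumption-laden illustration, not noseeum's detector: the names `BIDI_CONTROLS`, `INVISIBLE`, and `scan_text` are hypothetical, and the character sets are a small, commonly cited subset rather than a complete list.

```python
import unicodedata

# Bidirectional control characters associated with Trojan Source.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}

# A few common zero-width / invisible characters (illustrative subset).
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def scan_text(text):
    """Return (line, column, kind, code point, name) for each suspicious character."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                kind = "bidi-control"
            elif ch in INVISIBLE:
                kind = "invisible"
            else:
                continue
            findings.append(
                (line_no, col, kind, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
            )
    return findings


if __name__ == "__main__":
    sample = "ok = 1\nx = 'a\u202eb'\ny\u200b = 2"
    for finding in scan_text(sample):
        print(finding)
```

Reporting line and column alongside the code point matters here because the characters in question are, by design, not visible in most editors.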
noseeum can be installed as a globally accessible command-line tool:
- Clone the repository:
    git clone <repository_url>
    cd noseeum
- Install required data files: Before using the framework, you need to generate the required registry files:
    python3 create_registry.py   # Creates homoglyph_registry.json
    python3 create_nfkc_map.py   # Creates nfkc_map.json
- Install the package:
    pip install .
  or using the Makefile:
    make install
This will install the noseeum command globally on your system, making it accessible from any directory
To remove the globally installed package:
    make uninstall
All functionality is accessed through the noseeum command
View all available commands:
    noseeum --help
View attack-specific commands:
    noseeum attack --help
Scan a file for vulnerabilities:
    noseeum detect --file /path/to/your/file.js
For a complete breakdown of every command, option, and argument, refer to the USAGE.md document
This project uses a Makefile to streamline common development tasks.
- make install: Sets up the development environment, installs dependencies from requirements.txt, creates required data files, and installs the noseeum package in editable mode
- make uninstall: Removes the noseeum package from your system
- make clean: Deletes all build artifacts, such as build/, dist/, and .egg-info/ directories
The framework is organized as follows:
- noseeum/: Main Python package containing:
  - attacks/: Individual modules for each attack vector
  - core/: Core engine, grammar database, and integration components
  - detector/: Scanning and detection functionality
  - utils/: Helper utilities and error handling
  - data/: Embedded data files (homoglyph_registry.json, nfkc_map.json)
- create_registry.py: Script to generate the homoglyph registry
- create_nfkc_map.py: Script to generate the NFKC mapping
- Fixed critical logic bug in homoglyph identifier replacement that could cause incorrect output
- Added Python 3.8+ compatibility by replacing Python 3.9+ type annotations
- Improved error handling by replacing bare except clauses with proper exception types
- Consolidated duplicate code by moving file encoding logic to shared utilities
- Enhanced CLI consistency by standardizing error output with click.echo()
- Completed language support by adding grammar definitions for Java, Rust, C, and C++
- Improved path validation with more reliable directory traversal prevention
- Added pytest dependency to requirements for proper test execution
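Several of the items above (Python 3.8-compatible annotations, specific exception types, and directory traversal prevention) can be illustrated together. The sketch below is one common way to write such a check, not the project's actual code; `resolve_inside` is a hypothetical name. It uses `typing.Optional` rather than newer annotation syntax so it runs on Python 3.8, and it catches only `ValueError` rather than using a bare `except`.

```python
import os
import tempfile
from typing import Optional


def resolve_inside(base_dir: str, user_path: str) -> Optional[str]:
    """Return the resolved path if it stays inside base_dir, otherwise None.

    Hypothetical helper for illustration; not part of the noseeum API.
    """
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_path))
    try:
        inside = os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised, for example, on Windows when the paths are on different drives.
        return None
    return candidate if inside else None


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    print(resolve_inside(root, "reports/out.txt"))  # a path under root
    print(resolve_inside(root, "../escape.txt"))    # None
```

Comparing with `os.path.commonpath` instead of a string prefix avoids treating a sibling such as `/data-old` as if it were inside `/data`.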
Run the test suite with:
pip install -e ".[dev]" # Install with dev dependencies
pytest tests/