```
__        __   _     ____                       _____
\ \      / /__| |__ / ___|  ___ __ _ _ __  _ __|___ / _ __
 \ \ /\ / / _ \ '_ \\___ \ / __/ _` | '_ \| '_ \ |_ \| '__|
  \ V  V /  __/ |_) |___) | (_| (_| | | | | | | |__) | |
   \_/\_/ \___|_.__/|____/ \___\__,_|_| |_|_| |_|____/|_|
```
A comprehensive web reconnaissance tool for red team assessments. It scans and maps a website before the attack phase, helping security professionals understand the attack surface of a target.
- Web Crawling: Recursively crawls the target website to discover pages and endpoints
- Resource Mapping: Creates a directory structure that mirrors the website's organization
- File Downloading: Downloads code files (.js, .php, .css, .html) for analysis
- Deep Analysis: Scans JavaScript files for additional links and resources
- Security Analysis: Checks for potentially dangerous code and security issues
- Function Usage Tracking: Reports on how frequently each function is used across the codebase
- API Endpoint Detection: Automatically identifies and catalogs API endpoints and routes
- Technology Stack Identification: Detects software versions and creates a technology profile
- Domain Scope Control: Option to limit scanning to the target domain or include external domains
- Media Filtering: Option to skip media files (images, videos) to save space and time
- Comprehensive Reporting: Generates detailed reports of the scan findings and security issues
- JSON Data Exports: Provides structured JSON files for integration with other tools
- JSON Dump: Creates a structured JSON file with all discovered files and directories
- Target Organization: Organizes all outputs by target site in a structured directory layout
- Automatic Code Beautification: All downloaded `.js`, `.html`, and `.css` files are automatically beautified/pretty-printed before analysis and reporting, making security findings easier to review and learn from.
- Comprehensive Pattern Documentation: All vulnerability patterns are now well documented, with clear descriptions, risk levels, OWASP categories, and mitigation guidance for educational value.
- Improved URL Handling: URLs missing a scheme (http/https) are now auto-corrected and a warning is shown in the CLI.
- Better Error Reporting: Improved error and warning messages for file handling and BeautifulSoup parsing.
- Robustness: Type checks and error handling improved for HTML parsing and code analysis.
- Beautified code is now used for all analysis and reporting (not just saved to disk).
- Security pattern coverage and documentation improved for all major vulnerability types.
- CLI and reporting usability enhancements.
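The scheme auto-correction described above can be sketched as follows. This is a minimal illustration, not WebScann3r's actual code; `normalize_url` is a hypothetical helper name:

```python
from urllib.parse import urlparse

def normalize_url(url: str) -> tuple[str, bool]:
    """Prepend https:// when the scheme is missing.

    Returns the normalized URL and a flag telling the caller
    whether a correction (and hence a CLI warning) is needed.
    """
    if not urlparse(url).scheme:
        return f"https://{url}", True
    return url, False
```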
- Clone the repository:

  ```
  git clone https://github.com/108806/webscann3r.git
  cd webscann3r
  ```

- Install requirements:

  ```
  pip install -r requirements.txt
  ```

- Make the script executable:

  ```
  chmod +x webscann3r.py
  ```
Basic usage:

```
./webscann3r.py https://example.com
```

```
usage: webscann3r.py [-h] [-d DOWNLOADS] [-r REPORTS] [-a [DEPTH]] [-m] [-z]
                     [-t] [-j THREADS] [--timeout TIMEOUT] [-v] [-q]
                     url

WebScann3r - A Web Scanning and Mapping Tool for Red Teams

positional arguments:
  url                   Target URL to scan

options:
  -h, --help            show this help message and exit
  -d DOWNLOADS, --downloads DOWNLOADS
                        Base directory for downloads (default: ./targets)
  -r REPORTS, --reports REPORTS
                        Base directory for reports (default: ./targets)
  -a [DEPTH], --all-domains [DEPTH]
                        Scan all linked domains with specified depth
                        (default: unlimited depth)
  -m, --media           Download media files (images, videos, etc.)
  -z, --archives        Download archive files (zip, tar, etc.)
  -t, --text            Download text files (txt, md, etc.)
  -j THREADS, --threads THREADS
                        Number of concurrent threads (default: 10)
  --timeout TIMEOUT     Request timeout in seconds (default: 30)
  -v, --verbose         Enable verbose output
  -q, --quiet           Suppress all output except errors
```
- Basic scan of a website:

  ```
  ./webscann3r.py https://example.com
  ```

- Scan with more threads for faster operation:

  ```
  ./webscann3r.py https://example.com -j 20
  ```

- Scan and download media files as well:

  ```
  ./webscann3r.py https://example.com -m
  ```

- Scan all linked domains (not just the target):

  ```
  ./webscann3r.py https://example.com -a
  ```

- Scan all linked domains with a depth limit of 1 (only direct links):

  ```
  ./webscann3r.py https://example.com -a 1
  ```

- Complete scan with all file types:

  ```
  ./webscann3r.py https://example.com -a -m -z -t
  ```

- Verbose output for debugging:

  ```
  ./webscann3r.py https://example.com -v
  ```
After scanning, WebScann3r generates several reports in the target-specific reports directory:
- security_report.md: Details all potential security issues found in the code
- function_usage_report.md: Shows how many times each function is called
- final_report.md: A comprehensive summary of the scan results
- discovered_files_dirs.json: A structured JSON file containing all discovered files and directories
- discovered_endpoints.json: A JSON file listing all detected API endpoints and routes
- discovered_versions.json: A JSON file containing all detected software versions and technology stack information
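Since the JSON exports are intended for integration with other tools, a consumer can parse them in a few lines. A minimal sketch, assuming `discovered_endpoints.json` holds a flat JSON array of endpoint strings (the real layout may differ between WebScann3r versions):

```python
import json

def load_endpoints(json_text: str) -> list[str]:
    """Parse an endpoints export and return a sorted, de-duplicated list."""
    return sorted(set(json.loads(json_text)))

# Inline sample data standing in for the report file:
sample = '["/api/users", "/api/login", "/api/users"]'
print(load_endpoints(sample))
```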
The reports are organized by target site with timestamps in a structure like:
```
targets/
└── example.com_20250515_120000/
    ├── downloads/
    │   └── (downloaded files)
    └── reports/
        ├── final_report.md
        ├── security_report.md
        ├── function_usage_report.md
        ├── discovered_files_dirs.json
        ├── discovered_endpoints.json
        └── discovered_versions.json
```
This structure ensures each scan is isolated and timestamped for better organization.
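The per-scan directory name combines the target host with a timestamp. A sketch of that naming scheme, inferred from the example above rather than taken from WebScann3r's source:

```python
from datetime import datetime
from urllib.parse import urlparse

def target_dir_name(url: str, now: datetime) -> str:
    """Build a per-scan directory name like example.com_20250515_120000."""
    host = urlparse(url).netloc
    return f"{host}_{now.strftime('%Y%m%d_%H%M%S')}"
```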
WebScann3r is designed to help both red teamers and defenders by distinguishing between potential attack surfaces (sinks) and actual vulnerabilities. This section clarifies what you should expect in the reports:
- Sinks (in `sinks.md`):
  - These are code locations where dangerous functions (like `exec`, `eval`, `system`, etc.) are called, regardless of what input is used.
  - Sinks are not always vulnerabilities. They are places a red teamer should consider for fuzzing or further review.
  - Example: `exec($foo)` will be listed as a sink, even if `$foo` is not user-controlled.

- Vulnerabilities (in `security_report.md`):
  - Only code patterns that match known dangerous usage (e.g., user input passed to `exec`, like `exec($_GET['cmd'])`) are reported as vulnerabilities.
  - These are the issues most likely to be exploitable without further manual investigation.

- Not every use of a dangerous function is a vulnerability. Many are safe, or only dangerous if user input is involved.
- Reporting every sink as a vulnerability would create too much noise and lead to false positives.
- This approach helps you focus on real issues, while still giving you the option to review all potentially risky code.
- You may see many `exec` or `eval` calls in `sinks.md`, but only a few (or none) in `security_report.md`. This is expected and correct. If you believe a real vulnerability has been missed, check the relevant code and consider improving the regex patterns or reporting logic.
- Start with `security_report.md` to find likely vulnerabilities.
- Use `sinks.md` to guide manual code review, fuzzing, or further dynamic testing.
- Review the function usage and endpoint reports for additional context.

If you are unsure whether a finding is a vulnerability or just a sink, consult this section before opening a bug report.
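The sink/vulnerability distinction boils down to how strict the matching pattern is. A toy illustration with two regexes for PHP-style code (WebScann3r's real patterns are more extensive; these are simplified):

```python
import re

# Any call to a dangerous function is a sink...
SINK_RE = re.compile(r"\b(exec|eval|system)\s*\(")
# ...but only a call fed directly by user input is flagged as a vulnerability.
VULN_RE = re.compile(r"\b(exec|eval|system)\s*\(\s*\$_(GET|POST|REQUEST)\b")

def classify(line: str) -> str:
    if VULN_RE.search(line):
        return "vulnerability"
    if SINK_RE.search(line):
        return "sink"
    return "clean"
```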
- Start with a basic scan to understand the website structure
- Use the `-a` flag cautiously, as it may scan external domains
- Review the security report to identify potential vulnerabilities
- Examine the function usage report to understand the application flow
- Check downloaded code files for additional security issues or attack vectors
This tool is created for legitimate security testing purposes. Only use it on websites that you own or have explicit permission to test. Unauthorized scanning may be illegal in your jurisdiction.
This project is licensed under the MIT License - see the LICENSE file for details.