Version: 2.0.0
- What is AFDF?
- Why Does This Tool Exist?
- How It Works - High Level Overview
- Key Concepts Explained
- Features
- Prerequisites
- Installation
- Running the Application
- Using the Web Interface
- Understanding the Report
- Technical Details
- Troubleshooting
AFDF (Anti-Forensic Detection Framework) is a digital forensics tool that helps investigators analyze disk images to find:
- Hidden encrypted volumes - Areas that might contain secret encrypted data
- Wipe patterns - Signs that data was deliberately destroyed
- Evidence tampering - Modifications to the original data
- Deleted files - Files that were removed but might still be recoverable
- Unallocated space anomalies - Suspicious patterns in free disk space
Think of it like a medical scanner for hard drives - it looks deep inside to find things that aren't visible to the naked eye.
When investigators (like police or forensic experts) get a computer as evidence, they often work with a disk image - an exact copy of everything on the hard drive. Criminals often try to hide evidence by:
- Encrypting hidden volumes with special software (like VeraCrypt)
- Securely deleting files to make them unrecoverable
- Wiping entire drives to destroy evidence
- Using anti-forensic tools like DBAN, SDelete, or BleachBit
AFDF analyzes the patterns in a disk image to detect these anti-forensic techniques. It uses:
- Statistical analysis - Finding unusual patterns in the data
- Machine learning - Ensemble of Random Forest + Isolation Forest for classification
- Forensics tools - Industry-standard tools from The Sleuth Kit
- Unallocated space analysis - Detecting wipe patterns in free disk space
Here's what happens when you upload a disk image:
1. UPLOAD
You upload a disk image file (.dd, .E01, .img, .raw)
│
▼
2. FILE VALIDATION
System calculates hashes (MD5, SHA256, SHA1) for integrity
Detects file type using magic bytes
Identifies embedded filesystem
│
▼
3. RUST ANALYZER
Fast analysis of file signatures
Anti-forensic tool detection (DBAN, SDelete, BleachBit)
Timestamp anomaly detection
Hidden data detection
│
▼
4. PYTHON ENTROPY & WIPE SCAN
Reads disk in small 4KB blocks
Calculates Shannon entropy for each block
Identifies high-entropy regions
Analyzes unallocated space for wipe patterns
│
▼
5. MACHINE LEARNING (Ensemble)
Random Forest + Isolation Forest classification
Analyzes 18 features including unallocated space
Outputs: AUTHENTIC / QUESTIONABLE / TAMPERED
│
▼
6. FORENSICS TOOLS
Parses partition tables (mmls)
Analyzes filesystem metadata (fsstat)
Finds deleted file entries (fls)
Extracts artifacts (bulk_extractor)
│
▼
7. REPORT GENERATION
Creates comprehensive forensic report
Includes all findings, evidence, limitations
Formatted for use in court proceedings
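Step 2 of the flow above (file validation) could look roughly like the sketch below. It is not AFDF's actual code: the function names `hash_file` and `sniff_header` and the returned labels are hypothetical. Only the byte signatures are standard (the EWF header used by .E01 files, the NTFS OEM ID at offset 3, and the 0x55AA boot signature at offset 510).

```python
import hashlib

# EnCase / Expert Witness Format (.E01) files start with this 8-byte signature.
EWF_MAGIC = b"EVF\x09\x0d\x0a\xff\x00"


def hash_file(path, chunk_size=1024 * 1024):
    """Read the file once and return MD5, SHA-1 and SHA-256 hex digests."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def sniff_header(header: bytes) -> str:
    """Guess the container or filesystem from the first 512 bytes.

    The labels returned here are illustrative, not AFDF's own vocabulary.
    """
    if header.startswith(EWF_MAGIC):
        return "ewf"
    # NTFS boot sectors also end in 55 AA, so check the OEM ID first.
    if header[3:11] == b"NTFS    ":
        return "ntfs"
    if len(header) >= 512 and header[510:512] == b"\x55\xaa":
        return "boot-sector"
    return "unknown"
```

Hashing in chunks keeps memory use flat regardless of image size, which matters for multi-gigabyte evidence files.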
Entropy is a measure of how random or unpredictable data is. Think of it like this:
| Data Type | Entropy (bits per byte) | Example |
|---|---|---|
| Zero (all zeros) | 0.0 | 00 00 00 00 00 |
| Text (repetitive) | 2-4 | hello hello hello |
| Compressed | 6-7 | ZIP files, images |
| Encrypted | 7.5-8.0 | Random-looking data |
| Fully Random | 8.0 | A9 F3 7C 2B 1E... |
AFDF flags regions with high entropy because they might contain hidden encrypted volumes.
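As a minimal sketch of the idea, Shannon entropy per block can be computed with the standard library alone. The 4KB block size and the 7.5 threshold come from this document; the function names are hypothetical and AFDF's own scanner may differ.

```python
import math
from collections import Counter

BLOCK_SIZE = 4096     # 4KB blocks, as used by the entropy scan
HIGH_ENTROPY = 7.5    # lower bound of the "Encrypted" row in the table


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 to 8.0."""
    if not block:
        return 0.0
    total = len(block)
    return sum(
        -(n / total) * math.log2(n / total) for n in Counter(block).values()
    )


def high_entropy_offsets(data: bytes, block_size=BLOCK_SIZE, threshold=HIGH_ENTROPY):
    """Yield (offset, entropy) for every block at or above the threshold."""
    for offset in range(0, len(data), block_size):
        entropy = shannon_entropy(data[offset:offset + block_size])
        if entropy >= threshold:
            yield offset, entropy
```

A block of identical bytes scores 0.0, and a block in which all 256 byte values occur equally often scores 8.0. Compressed data can also exceed the threshold, so a flagged block is a lead to investigate rather than proof of encryption.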
A disk image is an exact copy of an entire hard drive, including:
- All files (even deleted ones)
- Deleted data that is still on the disk
- The filesystem structure
- Empty space
- Unallocated (free) space
Common formats:
- .dd or .raw - Raw binary copy
- .E01 - EnCase format (compressed)
- .img - Generic disk image
When someone wants to permanently delete data, they might "wipe" the drive by overwriting it with:
- Zeros - 00 00 00 00
- All FFs - FF FF FF FF
- Random data - A3 7F 2B 9C...
- DoD 5220.22 - Specific overwrite patterns
- Gutmann - 35-pass overwrite
AFDF detects these patterns, especially in unallocated space, to identify evidence destruction.
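The single-pass patterns above (zeros, FFs, random data) can be told apart with a simple per-block check. The sketch below is an assumption about how such a classifier might look, not AFDF's implementation: the labels, the function names, and the reuse of the 7.5 entropy threshold are illustrative. Multi-pass schemes such as DoD 5220.22 and Gutmann are not modeled here.

```python
import math
from collections import Counter


def _entropy(block: bytes) -> float:
    """Shannon entropy in bits per byte."""
    total = len(block)
    return sum(
        -(n / total) * math.log2(n / total) for n in Counter(block).values()
    )


def classify_block(block: bytes, random_threshold: float = 7.5) -> str:
    """Label one block as zero-fill, ff-fill, random-like or other."""
    if not block:
        return "empty"
    if block.count(0x00) == len(block):
        return "zero-fill"
    if block.count(0xFF) == len(block):
        return "ff-fill"
    # High entropy alone cannot separate a random wipe from encrypted or
    # compressed content, hence the cautious "random-like" label.
    if _entropy(block) >= random_threshold:
        return "random-like"
    return "other"


def wipe_summary(data: bytes, block_size: int = 4096) -> dict:
    """Count how many blocks fall into each label."""
    labels = (
        classify_block(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    )
    return dict(Counter(labels))
```

Running `wipe_summary` over unallocated space only, rather than the whole image, keeps ordinary zero-padded file data from inflating the counts.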
Chain of custody documents everyone who touched the evidence and when. It's critical for court cases to prove the evidence wasn't tampered with.
Unallocated space is the free space on a disk where deleted files can sometimes be recovered. This is a critical area for forensic analysis because:
- Secure deletion tools often overwrite this space
- Hidden volumes may leave traces here
- Wipe patterns are most visible in unallocated regions
- Shannon entropy calculation per 4KB block
- Chi-square distribution testing
- Byte frequency analysis
- Serial correlation detection
- High-entropy region identification
- Random Forest Classifier - Classification into AUTHENTIC/QUESTIONABLE/TAMPERED
- Isolation Forest - Anomaly detection in disk features
- 18 features including unallocated space analysis
- Confidence scoring and tamper probability
- Zero-fill detection
- FF-fill detection
- Random-wipe detection
- DoD 5220.22 and Gutmann patterns
- Unallocated space analysis
- File signature detection
- Anti-forensic tool detection (DBAN, SDelete, BleachBit)
- Timestamp anomaly detection
- Hidden data detection in slack space
- mmls - Partition table mapping (MBR/GPT)
- fsstat - Filesystem metadata analysis
- fls - Deleted file listing
- blkls - Unallocated space extraction
- bulk_extractor - Email, URL, IP extraction
- Magic bytes detection
- Hash calculation (MD5, SHA1, SHA256)
- Filesystem identification (NTFS, FAT32, ext2/3/4, exFAT, HFS+, APFS)
- 22-section comprehensive reports
- Customizable examiner information
- Professional formatting
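The chi-square distribution test listed among the statistical checks above compares how often each byte value appears against a perfectly uniform distribution. A minimal sketch follows, with `chi_square_uniform` as a hypothetical name. It returns only the statistic, since a p-value would need a statistics library.

```python
from collections import Counter


def chi_square_uniform(block: bytes) -> float:
    """Chi-square statistic of byte frequencies against a uniform model.

    With 256 possible byte values there are 255 degrees of freedom, so
    truly random data tends to score near 255, while structured data
    scores far higher.
    """
    if not block:
        raise ValueError("block must not be empty")
    expected = len(block) / 256
    counts = Counter(block)
    return sum((counts.get(value, 0) - expected) ** 2 / expected
               for value in range(256))
```

Used alongside entropy, this helps separate compressed data (high entropy, but often a skewed byte distribution) from encrypted or random data.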
Before you begin, you need:
- Python 3.10 or higher - Download from python.org
- Node.js 18 or higher - Download from nodejs.org
- The Sleuth Kit (TSK) - For forensics tools
- Windows: Download from sleuthkit.org
- Mac: brew install sleuthkit
- Linux: sudo apt-get install sleuthkit
- 8GB RAM minimum (16GB recommended for large images)
# 1. Clone the repository
git clone <repository-url>
cd AFDF-updated_updated_more
# 2. Install Python dependencies
pip install -r requirements.txt
# 3. Install Node.js dependencies
cd server
npm install
cd ..
# 4. Install ML API dependencies
cd server/ml-api
pip install -r requirements.txt
cd ../..
For Windows users, simply double-click the Start_AFDF.bat file in the root directory.
This shortcut starts the Frontend, the Node.js Backend, and the Machine Learning API together in a single terminal.
If you don't want to use the .bat file, you can run the same command manually:
npm run start:all
Wait a few seconds for the services to boot up, then navigate to http://localhost:8081.
If you prefer to see the logs separated into individual windows or need to restart one service independently:
Terminal 1 - ML API (Port 3002):
cd server/ml-api
python -m uvicorn main:app --port 3002
Terminal 2 - Backend Server (Port 3001):
cd server
npm start
Terminal 3 - Frontend (Port 8081):
npm run dev
Navigate to http://localhost:8081 in your web browser.
Click the upload area and select your disk image file.
- Supported formats: .dd, .E01, .img, .raw
The system will:
- Calculate file hashes (MD5, SHA1, SHA256)
- Run Rust analyzer for fast signature detection
- Analyze entropy patterns in 4KB blocks
- Detect wipe patterns in unallocated space
- Run forensics tools (partitions, filesystem, deleted files)
- Run ML ensemble classification (Random Forest + Isolation Forest)
Click on any analysis to see:
- Dashboard - Overview of findings with severity ratings
- Full Report - Detailed 22-section report
- Fill in examiner information
- Add case details
- Review all sections
- Download the final report
The AFDF report contains 22 sections:
Your name, title, organization, and qualifications.
Case number, legal authority, court case number.
- File name and size
- Acquisition tool used
- Write blocker information
- Original hash values (for integrity verification)
- MD5, SHA1, and SHA-256 hashes calculated from the uploaded file
- These can be compared with original evidence hashes
- Detected file type (using magic bytes)
- Whether extension matches actual content
- Validation status
- Detected filesystem type (NTFS, FAT32, exFAT, etc.)
- Cluster/block size
- Partition information
Timeline of evidence handling from acquisition to analysis.
- Analysis system details
- Tool versions used
- Timezone of analysis
- Mean entropy value
- Maximum entropy found
- Number of anomalous blocks
Detailed partition information including:
- Start offsets
- Sizes
- Partition types
Regions with unusual entropy that warrant further investigation:
- Offset locations
- Size
- Entropy scores
- Anomaly scores
Detailed findings from forensic analysis:
- Deleted files
- Email addresses found
- URLs discovered
- IP addresses
Events extracted from filesystem metadata.
List of files that were deleted but may still be recoverable.
Summary of emails, URLs, IPs, and phone numbers found.
Analysis of unallocated space for wipe patterns:
- Zero-fill regions
- FF-fill regions
- Random-wipe regions
- Wipe score calculation
Detected anti-forensic techniques:
- Encryption presence
- Wipe patterns
- Anti-forensic tool signatures (DBAN, SDelete, BleachBit)
- Metadata inconsistencies
- Ensemble model details (Random Forest + Isolation Forest)
- Features analyzed (18 total including unallocated space)
- Accuracy metrics
- Prediction result (AUTHENTIC/QUESTIONABLE/TAMPERED)
- Confidence score
- Ensemble breakdown
How ML findings correlate with forensic artifacts.
Detailed analysis of free disk space:
- Total unallocated bytes
- Suspicious regions count
- Wipe pattern distribution
Known limitations of the analysis:
- Encrypted volumes without keys
- Overwritten data
- False positive rates
Statement certifying the analysis was performed properly.
User Upload
│
▼
┌─────────────────────────────────────────┐
│ 1. File Validation │
│ - Hash calculation (MD5/SHA1/SHA256)│
│ - Magic bytes detection │
│ - Filesystem identification │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ 2. Rust Analyzer (Fast) │
│ - File signature detection │
│ - Anti-forensic tool detection │
│ - Timestamp anomaly detection │
│ - Hidden data detection │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ 3. Python Entropy Scan │
│ - Read disk in 4KB blocks │
│ - Calculate Shannon entropy │
│ - Chi-square testing │
│ - Identify high-entropy regions │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ 4. Wipe Detection │
│ - Extract unallocated space │
│ - Scan for wipe patterns │
│ - Zero/FF/Random classification │
│ - Calculate wipe score │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ 5. Forensics Tools │
│ - mmls: Partition tables │
│ - fsstat: Filesystem metadata │
│ - fls: Deleted files │
│ - bulk_extractor: Artifacts │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ 6. Machine Learning (Ensemble) │
│ - 18 features extracted │
│ - Random Forest classification │
│ - Isolation Forest anomaly detection │
│ - Ensemble voting │
│ - Confidence scoring │
└─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ 7. Report Generation │
│ - Combine all findings │
│ - Format for court │
│ - Include limitations │
└─────────────────────────────────────────┘
The ML ensemble analyzes these features:
- Entropy score - Shannon entropy
- Null ratio - Percentage of null bytes
- Repeating chunks - Repeated data patterns
- Timestamp anomalies - Invalid timestamps
- Wiping detected - Wipe patterns found
- Anti-forensic tool - Tool signatures detected
- Hidden data - Data in slack space
- High entropy - Above 7.5 threshold
- Unknown filesystem - No FS detected
- File size - Disk image size in GB
- Sector alignment - 512-byte alignment
- Unallocated space bytes - Free space size
- Suspicious unallocated regions - Anomalies in free space
- Zero-filled regions - Zero-wipe patterns
- Random-filled regions - Random-wipe patterns
- Wipe pattern score - Overall wipe score
- Deleted file entries - Count of deleted files
- Random Forest: 100 trees, classifies into AUTHENTIC/QUESTIONABLE/TAMPERED
- Isolation Forest: 100 trees, anomaly detection
- Weights: 60% Random Forest, 40% Isolation Forest
- Training: Synthetic forensic data (can be retrained with real data)
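The 60/40 weighting can be illustrated with a small sketch. How AFDF actually combines the two models is not stated here, so everything below is an assumption: both inputs are taken to be scores in the range 0 to 1, and the cut-offs 0.4 and 0.7 for the three verdicts are hypothetical placeholders.

```python
RF_WEIGHT = 0.6   # Random Forest share, from the list above
IF_WEIGHT = 0.4   # Isolation Forest share, from the list above


def ensemble_tamper_score(rf_tamper_prob: float, iso_anomaly: float) -> float:
    """Weighted average of two scores, each expected in [0, 1].

    rf_tamper_prob: assumed Random Forest probability of tampering.
    iso_anomaly: assumed Isolation Forest anomaly score rescaled to [0, 1].
    """
    for value in (rf_tamper_prob, iso_anomaly):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scores must be between 0 and 1")
    return RF_WEIGHT * rf_tamper_prob + IF_WEIGHT * iso_anomaly


def verdict(score: float, questionable_at: float = 0.4,
            tampered_at: float = 0.7) -> str:
    """Map a combined score to a label; both thresholds are hypothetical."""
    if score >= tampered_at:
        return "TAMPERED"
    if score >= questionable_at:
        return "QUESTIONABLE"
    return "AUTHENTIC"
```

For example, a Random Forest probability of 0.5 combined with an anomaly score of 0.5 gives a combined score of 0.5, which these placeholder thresholds would label QUESTIONABLE.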
| Category | Score | Description |
|---|---|---|
| Wipe Detection >30% | 35 | Strong evidence of wiping |
| Wipe Detection >10% | 20 | Moderate wiping evidence |
| Wipe Detection >3% | 10 | Some wiping detected |
| No Anomalies | 0 | Normal findings |
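The table above translates directly into a small lookup. The function name `wipe_score` is hypothetical, and the input is assumed to be the percentage of analyzed space that matches a wipe pattern, since the table does not say exactly what the percentage is measured against.

```python
def wipe_score(wiped_percent: float) -> int:
    """Return the score from the table for a given wipe percentage."""
    if wiped_percent > 30:
        return 35   # strong evidence of wiping
    if wiped_percent > 10:
        return 20   # moderate wiping evidence
    if wiped_percent > 3:
        return 10   # some wiping detected
    return 0        # normal findings
```

The thresholds are strict, so a value of exactly 30 falls into the moderate band and exactly 3 scores 0.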
Q: Server won't start
- Make sure port 3001 is not in use
- Make sure port 3002 (ML API) is not in use
- Check that Node.js is installed: node --version
- Check that Python is installed: python --version
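One quick way to check the ports is a short script like the one below. It is a convenience sketch, not part of AFDF. The port numbers come from this document, and `port_in_use` is a hypothetical helper.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for name, port in (("Backend", 3001), ("ML API", 3002), ("Frontend", 8081)):
        state = "in use" if port_in_use(port) else "free"
        print(f"{name} port {port}: {state}")
```

Run it before starting the services. Any port reported as in use needs to be freed first.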
Q: ML API shows "Random Forest model not found, using default"
- This is normal on first run - models will be trained automatically
- Models will be saved to server/ml-api/models/ for future use
Q: Analysis takes too long
- Larger files take more time
- The timeout is set to 10 minutes for Python analysis
- Try increasing the block size in the configuration so that fewer blocks are processed
Q: No suspicious regions found
- This could mean the disk is clean
- Or the file might be encrypted/compressed
- Check if high entropy regions exist
Q: Can't detect filesystem
- For encrypted containers, the filesystem is inside the encrypted data
- E01 files may need decompression first
- Check whether the disk is raw or has a known filesystem
Q: All disk images show same results
- This was a bug that has been fixed
- Each analysis now uses unique output directories
- Restart the server to apply fixes
MIT License - See LICENSE file for details
For issues and questions, please open an issue on GitHub.