Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Using gait analysis to study a person’s walking patterns in order to determine if a video or media depicts the real individual or a manipulated deepfake.

Notifications You must be signed in to change notification settings

a2rizing/DeepFake-Detection

Repository files navigation

🎭 DeepFake Detection using Gait Analysis

Python TensorFlow OpenCV MediaPipe

A comprehensive machine learning pipeline that detects deepfake videos by analyzing human gait patterns using MediaPipe pose estimation and advanced ML algorithms.

📋 Table of Contents

🎯 Overview

This project implements a novel approach to deepfake detection by analyzing human gait patterns extracted from videos. Unlike traditional facial analysis methods, gait analysis is more robust against sophisticated deepfake generation techniques that primarily focus on facial features.

How It Works:

  1. Pose Extraction: Uses MediaPipe to extract 33 body landmarks from video frames
  2. Gait Analysis: Processes keypoints to extract meaningful gait features (coordinates + joint angles)
  3. Machine Learning: Trains multiple models (Traditional ML + Deep Learning) to classify authentic vs deepfake videos
  4. Detection: Analyzes new videos to determine authenticity with confidence scores

🧬 Features

  • Multiple Model Support: RandomForest, SVM, Logistic Regression, KNN, Gradient Boosting, LSTM, CNN, Hybrid models
  • Advanced Preprocessing: Normalization, feature engineering, sequence resampling
  • Hyperparameter Tuning: Grid Search and Random Search optimization
  • Comprehensive Evaluation: Confusion matrices, ROC curves, precision-recall analysis
  • Real-time Detection: Process MP4 videos with confidence scoring
  • Batch Processing: Analyze multiple videos simultaneously
  • Visualization Tools: Gait pattern visualization and model performance charts
  • Robust Pipeline: Error handling, logging, and reproducible results

🏗️ Project Structure

DeepFake-Detection/
├── 📁 data/
│   ├── 🎥 *.mp4                     # Input video files
│   ├── 📄 gait_keypoints.csv        # Extracted keypoints
│   └── 📁 processed/
│       ├── 🧮 X.npy                 # Feature matrix
│       ├── 🏷️ y.npy                  # Labels
│       └── 📋 labels.json           # Label mapping
├── 📁 src/
│   ├── 📁 preprocessing/
│   │   ├── 🔧 extract_gait.py       # Extract gait from videos
│   │   └── 🔄 preprocess_gait.py    # Feature engineering
│   ├── 📁 models/
│   │   ├── 🎯 train_baseline.py     # Train ML models
│   │   ├── 🔬 hyperparameter_tuning.py  # Optimize models
│   │   └── 🧠 deep_learning_models.py   # Neural networks
│   └── 📁 utils/
│       └── 📊 visualize_gait.py     # Visualization tools
├── 📁 models/                       # Saved trained models
├── 📁 evaluation_results/           # Model evaluation outputs
├── 🚀 detect_deepfake.py           # Main detection script
├── 📈 evaluation_and_visualization.py  # Comprehensive evaluation
├── 🎨 visualize_menu.py            # Visualization interface
├── 📋 requirements.txt             # Dependencies
└── 📚 README.md                    # This file

⚙️ Installation

Prerequisites

  • Python 3.8+
  • Webcam or MP4 video files
  • At least 4GB RAM (8GB recommended)
  • CUDA-compatible GPU (optional, for deep learning)

1. Clone Repository

git clone https://github.com/your-username/DeepFake-Detection.git
cd DeepFake-Detection

2. Create Virtual Environment

# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python3 -m venv venv
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Verify Installation

python -c "import cv2, mediapipe, sklearn, numpy, pandas; print('✅ All dependencies installed successfully!')"

🚀 Quick Start

30-Second Demo

  1. Add your videos to the data/ folder (MP4 format)
  2. Extract gait patterns:
    python src/preprocessing/extract_gait.py
  3. Preprocess data:
    python src/preprocessing/preprocess_gait.py
  4. Train models:
    python src/models/train_baseline.py
  5. Detect deepfakes:
    python detect_deepfake.py data/your_video.mp4

That's it! 🎉 You'll get a prediction with confidence score.

📖 Detailed Usage Guide

1. Data Preprocessing

Extract Gait from Videos

# Extract from all MP4 files in data folder
python src/preprocessing/extract_gait.py

# Output: data/gait_keypoints.csv with pose landmarks

What it does:

  • Processes all .mp4 files in the data/ directory
  • Extracts 33 MediaPipe pose landmarks per frame
  • Saves combined results with video identification
  • Reports detection statistics

Feature Engineering

# Convert raw keypoints to ML-ready features
python src/preprocessing/preprocess_gait.py

# Outputs:
# - data/processed/X.npy (feature matrix)
# - data/processed/y.npy (labels)
# - data/processed/labels.json (label mapping)

What it does:

  • Normalizes gait sequences using torso length
  • Computes joint angles (hip-knee-ankle, elbow angles)
  • Resamples sequences to uniform length (64 frames)
  • Creates synthetic deepfake data for training
  • Generates final feature matrix (70 features per frame)

2. Gait Visualization

# Interactive visualization menu
python visualize_menu.py

# Available options:
# 1. Static pose plots
# 2. Animated gait sequences
# 3. Joint angle analysis
# 4. Feature distribution plots

Visualization Features:

  • 📊 Pose landmark scatter plots
  • 🎬 Animated walking sequences
  • 📈 Joint angle time series
  • 🎯 Feature distribution analysis
  • 👥 Person-wise comparisons

3. Model Training

Train Baseline Models

# Train multiple ML algorithms
python src/models/train_baseline.py

# Models trained:
# - RandomForest
# - SVM
# - Logistic Regression
# - K-Nearest Neighbors
# - Gradient Boosting

Output:

  • Trained models saved in models/ directory
  • Performance metrics and comparison
  • Best model automatically selected
  • Cross-validation results

Training Configuration

# Customize training in train_baseline.py
MODELS_CONFIG = {
    'test_size': 0.2,        # 20% data for testing
    'cv_folds': 5,           # 5-fold cross-validation
    'random_state': 42,      # Reproducible results
    'n_jobs': -1            # Use all CPU cores
}

4. Hyperparameter Tuning

# Optimize model parameters
python src/models/hyperparameter_tuning.py

# Methods:
# 1. Grid Search (thorough)
# 2. Random Search (faster)
# 3. Model comparison
# 4. Best model selection

Tuning Process:

  1. Grid Search: Exhaustive parameter exploration
  2. Random Search: Efficient parameter sampling
  3. Cross-validation: Robust performance estimation
  4. Model Selection: Best performing configuration

5. Deep Learning Models

# Train neural networks (requires TensorFlow)
python src/models/deep_learning_models.py

# Architectures:
# - LSTM (temporal patterns)
# - CNN (spatial features)
# - Hybrid (CNN + LSTM)

Deep Learning Features:

  • 🧠 LSTM: Captures temporal gait dynamics
  • 🖼️ CNN: Extracts spatial pose patterns
  • 🔄 Hybrid: Combines spatial and temporal analysis
  • 📊 Advanced Metrics: Precision, recall, F1-score, AUC

6. Model Evaluation

# Comprehensive model analysis
python evaluation_and_visualization.py

# Generates:
# - Confusion matrices
# - ROC curves
# - Precision-recall curves
# - Model comparison charts
# - Detailed performance report

Evaluation Outputs:

  • 📊 evaluation_results/model_comparison.png
  • 📈 evaluation_results/combined_roc_curves.png
  • 📋 evaluation_results/evaluation_report.txt
  • 🎯 Individual model performance plots

7. Deepfake Detection

Single Video Analysis

# Analyze one video
python detect_deepfake.py path/to/video.mp4

# With custom threshold
python detect_deepfake.py video.mp4 --threshold 0.7

# Save results to JSON
python detect_deepfake.py video.mp4 --output results.json

Batch Processing

# Analyze all videos in directory
python detect_deepfake.py data/ --batch

# Batch with output file
python detect_deepfake.py data/ --batch --output batch_results.json

Detection Output:

{
  "video_path": "data/test_video.mp4",
  "prediction": "Deepfake",
  "deepfake_probability": 0.824,
  "confidence": 0.824,
  "threshold": 0.5,
  "model_type": "RandomForest",
  "frames_analyzed": 127
}

🎥 Testing with Your Videos

Supported Formats

  • ✅ MP4 (recommended)
  • ✅ AVI
  • ✅ MOV
  • ✅ WEBM

Video Requirements

  • Resolution: 480p or higher recommended
  • Duration: 2-30 seconds optimal
  • Content: Clear view of person walking/moving
  • Quality: Good lighting, minimal occlusion

Test Video Preparation

# 1. Add videos to data folder
cp your_videos/*.mp4 data/

# 2. Verify video format
python -c "import cv2; cap = cv2.VideoCapture('data/test.mp4'); print(f'Video OK: {cap.isOpened()}')"

# 3. Quick detection test
python detect_deepfake.py data/test.mp4

Expected Results

  • Authentic Video: prediction: "Authentic", confidence > 0.6
  • Deepfake Video: prediction: "Deepfake", confidence > 0.6
  • Low Confidence: May indicate poor video quality or edge cases

📊 Model Performance

Baseline Performance (Example)

Model Accuracy Precision Recall F1-Score AUC
RandomForest 0.892 0.885 0.891 0.888 0.924
SVM 0.876 0.872 0.879 0.875 0.918
Logistic Regression 0.834 0.829 0.831 0.830 0.902
KNN 0.823 0.818 0.825 0.821 0.889
Gradient Boosting 0.887 0.883 0.884 0.883 0.921

Deep Learning Performance

Model Accuracy Precision Recall F1-Score Parameters
LSTM 0.914 0.908 0.912 0.910 124K
CNN 0.901 0.896 0.903 0.899 89K
Hybrid 0.923 0.919 0.921 0.920 156K

Performance Factors

  • Data Quality: Clean pose detection improves accuracy
  • Video Length: 2-10 seconds optimal for gait analysis
  • Movement Type: Walking/running works best
  • Background: Minimal interference preferred

🛠️ Troubleshooting

Common Issues

1. "No module named 'tensorflow'"

# Install TensorFlow
pip install tensorflow

# For GPU support
pip install tensorflow-gpu

2. "MediaPipe pose detection failed"

# Check video file
python -c "import cv2; print(cv2.VideoCapture('your_video.mp4').isOpened())"

# Verify MediaPipe
python -c "import mediapipe as mp; print('MediaPipe OK')"

3. "No pose landmarks detected"

  • Ensure person is visible in video
  • Check lighting conditions
  • Verify video quality and resolution
  • Try different video or adjust detection confidence

4. "Insufficient training data"

# Check data files
ls data/processed/
# Should contain: X.npy, y.npy, labels.json

# Regenerate if missing
python src/preprocessing/preprocess_gait.py

5. "Model not found"

# Train models first
python src/models/train_baseline.py

# Check models directory
ls models/

Performance Optimization

Speed Up Training

# Reduce cross-validation folds
cv_folds = 3  # instead of 5

# Use fewer hyperparameters
n_iter = 20  # for random search

# Limit model complexity
max_depth = 10  # for tree-based models

Memory Optimization

# Process videos in smaller batches
batch_size = 16  # for deep learning

# Reduce sequence length
target_frames = 32  # instead of 64

# Use feature selection
from sklearn.feature_selection import SelectKBest

Debug Mode

# Enable verbose logging
export PYTHONPATH="."
python -v detect_deepfake.py video.mp4

# Check intermediate outputs
python src/preprocessing/extract_gait.py --debug

# Validate data integrity
python -c "import numpy as np; X = np.load('data/processed/X.npy'); print(f'Data shape: {X.shape}, No NaN: {not np.isnan(X).any()}')"

🤝 Contributing

We welcome contributions! Here's how to get started:

Development Setup

# Fork and clone the repository
git clone https://github.com/your-username/DeepFake-Detection.git
cd DeepFake-Detection

# Create development branch
git checkout -b feature/your-feature-name

# Install development dependencies
pip install -r requirements.txt
pip install pytest black flake8  # for testing and code formatting

Code Style

# Format code
black src/ --line-length 88

# Check style
flake8 src/ --max-line-length 88

# Run tests
pytest tests/

Contribution Areas

  • 🚀 New Models: Implement additional ML algorithms
  • 📊 Visualization: Create new plot types and analysis tools
  • 🎥 Video Processing: Support more formats and preprocessing options
  • 📱 Mobile Support: Optimize for mobile deployment
  • 🔧 Performance: Speed and memory optimizations
  • 📚 Documentation: Improve guides and examples

Pull Request Process

  1. Create feature branch
  2. Add tests for new functionality
  3. Update documentation
  4. Ensure all tests pass
  5. Submit pull request with clear description

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🎯 Quick Commands Reference

# Complete pipeline (first time)
python src/preprocessing/extract_gait.py
python src/preprocessing/preprocess_gait.py
python src/models/train_baseline.py
python detect_deepfake.py your_video.mp4

# Hyperparameter tuning
python src/models/hyperparameter_tuning.py

# Deep learning training
python src/models/deep_learning_models.py

# Comprehensive evaluation
python evaluation_and_visualization.py

# Batch detection
python detect_deepfake.py data/ --batch --output results.json

# Visualization
python visualize_menu.py

📞 Support


Made with ❤️ for advancing deepfake detection research

**⭐ Star this repo if you find it useful!**DeepFake-Detection Using gait analysis to study a person’s walking patterns in order to determine if a video or media depicts the real individual or a manipulated deepfake.

About

Using gait analysis to study a person’s walking patterns in order to determine if a video or media depicts the real individual or a manipulated deepfake.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages